comparison tools/blastxml_to_top_descr/blastxml_to_top_descr.xml @ 11:98f8431dab44 draft
Uploaded v0.1.0, now also handles extended tabular BLAST output.
author | peterjc |
---|---|
date | Fri, 13 Jun 2014 07:07:35 -0400 |
parents | |
children | fe1ed74793c9 |
10:09a68a90d552 | 11:98f8431dab44 |
---|---|
1 <tool id="blastxml_to_top_descr" name="BLAST top hit descriptions" version="0.1.0"> | |
2 <description>Make a table from BLAST output</description> | |
3 <version_command interpreter="python">blastxml_to_top_descr.py --version</version_command> | |
4 <command interpreter="python"> | |
5 blastxml_to_top_descr.py | |
6 -f "$input.in_format" | |
7 #if $input.in_format == "tabular": | |
8 --qseqid $input.qseqid | |
9 --sseqid $input.sseqid | |
10 --salltitles $input.salltitles | |
11 #end if | |
12 -o "${tabular_file}" | |
13 -t ${topN} | |
14 "${in_file}" | |
15 </command> | |
16 <stdio> | |
17 <!-- Assume anything other than zero is an error --> | |
18 <exit_code range="1:" /> | |
19 <exit_code range=":-1" /> | |
20 </stdio> | |
21 <inputs> | |
22 <conditional name="input"> | |
23 <param name="in_format" type="select" label="Input format"> | |
24 <option value="blastxml" selected="true">BLAST XML</option> | |
25 <option value="tabular">Tabular</option> | |
26 </param> | |
27 <when value="blastxml"> | |
28 <param name="in_file" type="data" format="blastxml" label="BLAST results as XML"/> | |
29 </when> | |
30 <when value="tabular"> | |
31 <param name="in_file" type="data" format="tabular" label="BLAST results as tabular"/> | |
32 <param name="qseqid" type="data_column" data_ref="in_file" | |
33 multiple="False" numerical="False" default_value="1" value="1" | |
34 label="Column containing query ID (qseqid)" | |
35 help="This is column 1 in standard BLAST tabular output" /> | |
36 <param name="sseqid" type="data_column" data_ref="in_file" | |
37 multiple="False" numerical="False" default_value="2" value="2" | |
38 label="Column containing match ID (sseqid)" | |
39 help="This is column 2 in standard BLAST tabular output"/> | |
40 <param name="salltitles" type="data_column" data_ref="in_file" | |
41 multiple="False" numerical="False" default_value="25" value="25" | |
42 label="Column containing descriptions (salltitles)" | |
43 help="This is column 25 in the default extended BLAST tabular output"/> | |
44 </when> | |
45 </conditional> | |
46 <param name="topN" type="integer" min="1" max="100" optional="false" label="Number of descriptions" value="3"/> | |
47 </inputs> | |
48 <outputs> | |
49 <data name="tabular_file" format="tabular" label="Top $topN descriptions from $input.in_file.name" /> | |
50 </outputs> | |
51 <requirements> | |
52 </requirements> | |
53 <tests> | |
54 <test> | |
55 <param name="in_format" value="blastxml" /> | |
56 <param name="in_file" value="blastp_four_human_vs_rhodopsin.xml" ftype="blastxml" /> | |
57 <param name="topN" value="3" /> | |
58 <output name="tabular_file" file="blastp_four_human_vs_rhodopsin_top3.tabular" ftype="tabular" /> | |
59 </test> | |
60 <test> | |
61 <param name="in_format" value="tabular" /> | |
62 <param name="in_file" value="blastp_four_human_vs_rhodopsin_converted_ext.tabular" ftype="tabular" /> | |
63 <param name="topN" value="3" /> | |
64 <output name="tabular_file" file="blastp_four_human_vs_rhodopsin_top3_positive.tabular" ftype="tabular" /> | |
65 </test> | |
66 </tests> | |
67 <help> | |
68 | |
69 **What it does** | |
70 | |
71 NCBI BLAST+ (and the older NCBI 'legacy' BLAST) can output in a range of | |
72 formats including text, tabular and a more detailed XML format. You can | |
73 do a lot of things with tabular files in Galaxy (sorting, filtering, joins, | |
74 etc.), but until BLAST+ 2.2.28 the tabular output never included the | |
75 hit descriptions (titles) found in the other output formats. | |
76 | |
77 This tool turns a BLAST XML file into a simple tabular file with | |
78 one row per query sequence, containing the query identifier and then | |
79 the three (by default) top hit descriptions (i.e. the first three). If | |
80 a query doesn't have that many hits, then these entries are left blank. | |
81 | |
82 This tool can also be used with the tabular output from BLAST+ instead, | |
83 provided the relevant columns are specified. The default settings will | |
84 work with the default 25 column extended output from the BLAST+ tools | |
85 wrapped in Galaxy. Note that if a query has *no* hits, it does not appear in | |
86 the BLAST tabular output. | |
87 | |
88 **Example Usage** | |
89 | |
90 One simple usage would be to take a transcriptome assembly or set of | |
91 gene predictions, run a BLAST search against the NCBI NR database, and | |
92 then use this tool to make a table of the top three BLAST hits. This | |
93 can give you a 'quick and dirty' crude annotation, potentially enough | |
94 to spot some problems (e.g. bacterial contamination could be very | |
95 obvious). | |
96 | |
97 **References** | |
98 | |
99 If you use this Galaxy tool in work leading to a scientific publication please | |
100 cite: | |
101 | |
102 Peter J.A. Cock, Björn A. Grüning, Konrad Paszkiewicz and Leighton Pritchard (2013). | |
103 Galaxy tools and workflows for sequence analysis with applications | |
104 in molecular plant pathology. PeerJ 1:e167 | |
105 http://dx.doi.org/10.7717/peerj.167 | |
106 | |
107 This wrapper is available to install into other Galaxy Instances via the Galaxy | |
108 Tool Shed at http://toolshed.g2.bx.psu.edu/view/peterjc/blastxml_to_top_descr | |
109 | |
110 </help> | |
111 </tool> |
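
The tabular mode described in the help text can be illustrated with a short Python sketch. This is not the code of blastxml_to_top_descr.py: the function name `top_descriptions` is hypothetical, and counting each subject only once per query is an assumption about how the `--sseqid` column might be used. The 1-based column defaults mirror the `qseqid`, `sseqid` and `salltitles` parameters in the wrapper above.

```python
from collections import OrderedDict


def top_descriptions(lines, qseqid=1, sseqid=2, salltitles=25, top_n=3):
    """Yield one row per query: the query ID, then top_n hit descriptions.

    Column numbers are 1-based, as in the Galaxy wrapper. Illustrative
    sketch only, not the tool's actual implementation.
    """
    hits = OrderedDict()  # query ID -> {subject ID: description}, in file order
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query = fields[qseqid - 1]
        subject = fields[sseqid - 1]
        seen = hits.setdefault(query, OrderedDict())
        # Assumption: repeated rows for the same subject count as one hit.
        if subject not in seen and len(seen) < top_n:
            seen[subject] = fields[salltitles - 1]
    for query, seen in hits.items():
        descriptions = list(seen.values())
        # Leave entries blank when a query has fewer than top_n hits.
        yield [query] + descriptions + [""] * (top_n - len(descriptions))


# Tiny three-column example (a real extended BLAST+ file has 25 columns).
example = [
    "q1\ts1\tdesc A\n",
    "q1\ts1\tdesc A\n",
    "q1\ts2\tdesc B\n",
    "q2\ts3\tdesc C\n",
]
for row in top_descriptions(example, salltitles=3, top_n=2):
    print("\t".join(row))
```

Because BLAST tabular output has no rows for queries without hits, such queries are absent from the result, as the help text notes.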