comparison tools/ncbi_blast_plus/ncbi_macros.xml @ 11:4c4a0da938ff draft
Uploaded v0.0.22, now wraps BLAST+ 2.2.28 allowing extended tabular output to include the hit descriptions as column 25.
Supports $GALAXY_SLOTS.
Includes more tests and heavy use of macros.
author | peterjc |
---|---|
date | Thu, 05 Dec 2013 06:55:59 -0500 |
parents | |
children | 623f727cdff1 |
10:70e7dcbf6573 | 11:4c4a0da938ff |
---|---|
1 <macros> | |
2 <xml name="output_change_format"> | |
3 <change_format> | |
4 <when input="out_format" value="0" format="txt"/> | |
5 <when input="out_format" value="0 -html" format="html"/> | |
6 <when input="out_format" value="2" format="txt"/> | |
7 <when input="out_format" value="2 -html" format="html"/> | |
8 <when input="out_format" value="4" format="txt"/> | |
9 <when input="out_format" value="4 -html" format="html"/> | |
10 <when input="out_format" value="5" format="blastxml"/> | |
11 </change_format> | |
12 </xml> | |
13 <xml name="input_out_format"> | |
14 <param name="out_format" type="select" label="Output format"> | |
15 <option value="6">Tabular (standard 12 columns)</option> | |
16 <option value="ext" selected="True">Tabular (extended 25 columns)</option> | |
17 <option value="5">BLAST XML</option> | |
18 <option value="0">Pairwise text</option> | |
19 <option value="0 -html">Pairwise HTML</option> | |
20 <option value="2">Query-anchored text</option> | |
21 <option value="2 -html">Query-anchored HTML</option> | |
22 <option value="4">Flat query-anchored text</option> | |
23 <option value="4 -html">Flat query-anchored HTML</option> | |
24 <!-- | |
25 <option value="-outfmt 11">BLAST archive format (ASN.1)</option> | |
26 --> | |
27 </param> | |
28 </xml> | |
29 <xml name="input_scoring_matrix"> | |
30 <param name="matrix" type="select" label="Scoring matrix"> | |
31 <option value="BLOSUM90">BLOSUM90</option> | |
32 <option value="BLOSUM80">BLOSUM80</option> | |
33 <option value="BLOSUM62" selected="true">BLOSUM62 (default)</option> | |
34 <option value="BLOSUM50">BLOSUM50</option> | |
35 <option value="BLOSUM45">BLOSUM45</option> | |
36 <option value="PAM250">PAM250</option> | |
37 <option value="PAM70">PAM70</option> | |
38 <option value="PAM30">PAM30</option> | |
39 </param> | |
40 </xml> | |
41 <xml name="stdio"> | |
42 <stdio> | |
43 <!-- Anything other than zero is an error --> | |
44 <exit_code range="1:" /> | |
45 <exit_code range=":-1" /> | |
46 <!-- In case the return code has not been set properly, check stderr too --> | |
47 <regex match="Error:" /> | |
48 <regex match="Exception:" /> | |
49 </stdio> | |
50 </xml> | |
51 <xml name="input_query_gencode"> | |
52 <param name="query_gencode" type="select" label="Query genetic code"> | |
53 <!-- See http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi for details --> | |
54 <option value="1" selected="True">1. Standard</option> | |
55 <option value="2">2. Vertebrate Mitochondrial</option> | |
56 <option value="3">3. Yeast Mitochondrial</option> | |
57 <option value="4">4. Mold, Protozoan, and Coelenterate Mitochondrial Code and the Mycoplasma/Spiroplasma Code</option> | |
58 <option value="5">5. Invertebrate Mitochondrial</option> | |
59 <option value="6">6. Ciliate, Dasycladacean and Hexamita Nuclear Code</option> | |
60 <option value="9">9. Echinoderm Mitochondrial</option> | |
61 <option value="10">10. Euplotid Nuclear</option> | |
62 <option value="11">11. Bacteria and Archaea</option> | |
63 <option value="12">12. Alternative Yeast Nuclear</option> | |
64 <option value="13">13. Ascidian Mitochondrial</option> | |
65 <option value="14">14. Flatworm Mitochondrial</option> | |
66 <option value="15">15. Blepharisma Macronuclear</option> | |
67 <option value="16">16. Chlorophycean Mitochondrial Code</option> | |
68 <option value="21">21. Trematode Mitochondrial Code</option> | |
69 <option value="22">22. Scenedesmus obliquus mitochondrial Code</option> | |
70 <option value="23">23. Thraustochytrium Mitochondrial Code</option> | |
71 <option value="24">24. Pterobranchia mitochondrial code</option> | |
72 </param> | |
73 </xml> | |
74 <xml name="input_db_gencode"> | |
75 <param name="db_gencode" type="select" label="Database/subject genetic code"> | |
76 <!-- See http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi for details --> | |
77 <option value="1" selected="True">1. Standard</option> | |
78 <option value="2">2. Vertebrate Mitochondrial</option> | |
79 <option value="3">3. Yeast Mitochondrial</option> | |
80 <option value="4">4. Mold, Protozoan, and Coelenterate Mitochondrial Code and the Mycoplasma/Spiroplasma Code</option> | |
81 <option value="5">5. Invertebrate Mitochondrial</option> | |
82 <option value="6">6. Ciliate, Dasycladacean and Hexamita Nuclear Code</option> | |
83 <option value="9">9. Echinoderm Mitochondrial</option> | |
84 <option value="10">10. Euplotid Nuclear</option> | |
85 <option value="11">11. Bacteria and Archaea</option> | |
86 <option value="12">12. Alternative Yeast Nuclear</option> | |
87 <option value="13">13. Ascidian Mitochondrial</option> | |
88 <option value="14">14. Flatworm Mitochondrial</option> | |
89 <option value="15">15. Blepharisma Macronuclear</option> | |
90 <option value="16">16. Chlorophycean Mitochondrial Code</option> | |
91 <option value="21">21. Trematode Mitochondrial Code</option> | |
92 <option value="22">22. Scenedesmus obliquus mitochondrial Code</option> | |
93 <option value="23">23. Thraustochytrium Mitochondrial Code</option> | |
94 <option value="24">24. Pterobranchia mitochondrial code</option> | |
95 </param> | |
96 </xml> | |
97 <xml name="input_conditional_nucleotide_db"> | |
98 <conditional name="db_opts"> | |
99 <param name="db_opts_selector" type="select" label="Subject database/sequences"> | |
100 <option value="db" selected="True">Locally installed BLAST database</option> | |
101 <option value="histdb">BLAST database from your history</option> | |
102 <option value="file">FASTA file from your history (see warning note below)</option> | |
103 </param> | |
104 <when value="db"> | |
105 <param name="database" type="select" label="Nucleotide BLAST database"> | |
106 <options from_file="blastdb.loc"> | |
107 <column name="value" index="0"/> | |
108 <column name="name" index="1"/> | |
109 <column name="path" index="2"/> | |
110 </options> | |
111 </param> | |
112 <param name="histdb" type="hidden" value="" /> | |
113 <param name="subject" type="hidden" value="" /> | |
114 </when> | |
115 <when value="histdb"> | |
116 <param name="database" type="hidden" value="" /> | |
117 <param name="histdb" type="data" format="blastdbn" label="Nucleotide BLAST database" /> | |
118 <param name="subject" type="hidden" value="" /> | |
119 </when> | |
120 <when value="file"> | |
121 <param name="database" type="hidden" value="" /> | |
122 <param name="histdb" type="hidden" value="" /> | |
123 <param name="subject" type="data" format="fasta" label="Nucleotide FASTA file to use as database"/> | |
124 </when> | |
125 </conditional> | |
126 </xml> | |
127 <xml name="input_conditional_protein_db"> | |
128 <conditional name="db_opts"> | |
129 <param name="db_opts_selector" type="select" label="Subject database/sequences"> | |
130 <option value="db" selected="True">Locally installed BLAST database</option> | |
131 <option value="histdb">BLAST database from your history</option> | |
132 <option value="file">FASTA file from your history (see warning note below)</option> | |
133 </param> | |
134 <when value="db"> | |
135 <param name="database" type="select" label="Protein BLAST database"> | |
136 <options from_file="blastdb_p.loc"> | |
137 <column name="value" index="0"/> | |
138 <column name="name" index="1"/> | |
139 <column name="path" index="2"/> | |
140 </options> | |
141 </param> | |
142 <param name="histdb" type="hidden" value="" /> | |
143 <param name="subject" type="hidden" value="" /> | |
144 </when> | |
145 <when value="histdb"> | |
146 <param name="database" type="hidden" value="" /> | |
147 <param name="histdb" type="data" format="blastdbp" label="Protein BLAST database" /> | |
148 <param name="subject" type="hidden" value="" /> | |
149 </when> | |
150 <when value="file"> | |
151 <param name="database" type="hidden" value="" /> | |
152 <param name="histdb" type="hidden" value="" /> | |
153 <param name="subject" type="data" format="fasta" label="Protein FASTA file to use as database"/> | |
154 </when> | |
155 </conditional> | |
156 </xml> | |
157 <xml name="input_conditional_pssm"> | |
158 <conditional name="db_opts"> | |
159 <param name="db_opts_selector" type="select" label="Protein domain database (PSSM)"> | |
160 <option value="db" selected="True">Locally installed BLAST database</option> | |
161 <!-- TODO - define new datatype | |
162 <option value="histdb">BLAST protein domain database from your history</option> | |
163 --> | |
164 </param> | |
165 <when value="db"> | |
166 <param name="database" type="select" label="Protein domain database"> | |
167 <options from_file="blastdb_d.loc"> | |
168 <column name="value" index="0"/> | |
169 <column name="name" index="1"/> | |
170 <column name="path" index="2"/> | |
171 </options> | |
172 </param> | |
173 <param name="histdb" type="hidden" value="" /> | |
174 <param name="subject" type="hidden" value="" /> | |
175 </when> | |
176 <!-- TODO - define new datatype | |
177 <when value="histdb"> | |
178 <param name="database" type="hidden" value="" /> | |
179 <param name="histdb" type="data" format="blastdbd" label="Protein domain database" /> | |
180 <param name="subject" type="hidden" value="" /> | |
181 </when> | |
182 --> | |
183 </conditional> | |
184 </xml> | |
185 <xml name="input_conditional_choose_db_type"> | |
186 <conditional name="db_opts"> | |
187 <param name="db_type" type="select" label="Type of BLAST database"> | |
188 <option value="nucl" selected="True">Nucleotide</option> | |
189 <option value="prot">Protein</option> | |
190 </param> | |
191 <when value="nucl"> | |
192 <param name="database" type="select" label="Nucleotide BLAST database"> | |
193 <options from_file="blastdb.loc"> | |
194 <column name="value" index="0"/> | |
195 <column name="name" index="1"/> | |
196 <column name="path" index="2"/> | |
197 </options> | |
198 </param> | |
199 </when> | |
200 <when value="prot"> | |
201 <param name="database" type="select" label="Protein BLAST database"> | |
202 <options from_file="blastdb_p.loc"> | |
203 <column name="value" index="0"/> | |
204 <column name="name" index="1"/> | |
205 <column name="path" index="2"/> | |
206 </options> | |
207 </param> | |
208 </when> | |
209 </conditional> | |
210 </xml> | |
211 <xml name="input_parse_deflines"> | |
212 <param name="parse_deflines" type="boolean" label="Should the query and subject defline(s) be parsed?" truevalue="-parse_deflines" falsevalue="" checked="false" help="This affects the formatting of the query/subject ID strings"/> | |
213 </xml> | |
214 <xml name="input_filter_query_default_false"> | |
215 <param name="filter_query" type="boolean" label="Filter out low complexity regions (with SEG)" truevalue="-seg yes" falsevalue="-seg no" checked="false" /> | |
216 </xml> | |
217 <xml name="input_filter_query_default_true"> | |
218 <param name="filter_query" type="boolean" label="Filter out low complexity regions (with SEG)" truevalue="-seg yes" falsevalue="-seg no" checked="true" /> | |
219 </xml> | |
220 <xml name="input_max_hits"> | |
221 <param name="max_hits" type="integer" value="0" label="Maximum hits to show" help="Use zero for default limits"> | |
222 <validator type="in_range" min="0" /> | |
223 </param> | |
224 </xml> | |
225 <xml name="input_evalue"> | |
226 <param name="evalue_cutoff" type="float" size="15" value="0.001" label="Set expectation value cutoff" /> | |
227 </xml> | |
228 <xml name="input_word_size"> | |
229 <param name="word_size" type="integer" value="0" label="Word size for wordfinder algorithm" help="Use zero for default, otherwise minimum 2."> | |
230 <validator type="in_range" min="0" /> | |
231 </param> | |
232 </xml> | |
233 <xml name="input_strand"> | |
234 <param name="strand" type="select" label="Query strand(s) to search against database/subject"> | |
235 <option value="-strand both">Both</option> | |
236 <option value="-strand plus">Plus (forward)</option> | |
237 <option value="-strand minus">Minus (reverse complement)</option> | |
238 </param> | |
239 </xml> | |
240 <xml name="requirements"> | |
241 <requirements> | |
242 <requirement type="binary">@BINARY@</requirement> | |
243 <requirement type="package" version="2.2.28">blast+</requirement> | |
244 </requirements> | |
245 <version_command>@BINARY@ -version</version_command> | |
246 </xml> | |
247 <xml name="advanced_options"> | |
248 <conditional name="adv_opts"> | |
249 <param name="adv_opts_selector" type="select" label="Advanced Options"> | |
250 <option value="basic" selected="True">Hide Advanced Options</option> | |
251 <option value="advanced">Show Advanced Options</option> | |
252 </param> | |
253 <when value="basic" /> | |
254 <when value="advanced"> | |
255 <yield /> | |
256 </when> | |
257 </conditional> | |
258 </xml> | |
259 <token name="@THREADS@">-num_threads "\${GALAXY_SLOTS:-8}"</token> | |
260 <token name="@BLAST_DB_SUBJECT@"> | |
261 #if $db_opts.db_opts_selector == "db": | |
262 -db "${db_opts.database.fields.path}" | |
263 #elif $db_opts.db_opts_selector == "histdb": | |
264 -db "${os.path.join($db_opts.histdb.extra_files_path,'blastdb')}" | |
265 #else: | |
266 -subject "$db_opts.subject" | |
267 #end if | |
268 </token> | |
269 <token name="@BLAST_OUTPUT@">-out "$output1" | |
270 ##Set the extended list here so when we add things, saved workflows are not affected | |
271 #if str($out_format)=="ext": | |
272 -outfmt "6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen salltitles" | |
273 #else: | |
274 -outfmt $out_format | |
275 #end if | |
276 </token> | |
277 <token name="@ADVANCED_OPTIONS@">$adv_opts.filter_query | |
278 ## Need int(str(...)) because $adv_opts.max_hits is an InputValueWrapper object not a string | |
279 ## Note -max_target_seqs overrides -num_descriptions and -num_alignments | |
280 #if (str($adv_opts.max_hits) and int(str($adv_opts.max_hits)) > 0): | |
281 -max_target_seqs $adv_opts.max_hits | |
282 #end if | |
283 #if (str($adv_opts.word_size) and int(str($adv_opts.word_size)) > 0): | |
284 -word_size $adv_opts.word_size | |
285 #end if | |
286 $adv_opts.parse_deflines | |
287 </token> | |
288 <!-- @ON_DB_SUBJECT@ is for use with @BLAST_DB_SUBJECT@ --> | |
289 <token name="@ON_DB_SUBJECT@">#if str($db_opts.db_opts_selector)=='db' | |
290 ${db_opts.database} | |
291 #elif str($db_opts.db_opts_selector)=='histdb' | |
292 ${db_opts.histdb.name} | |
293 #else | |
294 ${db_opts.subject.name} | |
295 #end if</token> | |
296 <token name="@REFERENCES@"> | |
297 Peter J.A. Cock, Björn A. Grüning, Konrad Paszkiewicz and Leighton Pritchard (2013). | |
298 Galaxy tools and workflows for sequence analysis with applications | |
299 in molecular plant pathology. PeerJ 1:e167 | |
300 http://dx.doi.org/10.7717/peerj.167 | |
301 | |
302 Christiam Camacho et al. (2009). | |
303 BLAST+: architecture and applications. | |
304 BMC Bioinformatics. 15;10:421. | |
305 http://dx.doi.org/10.1186/1471-2105-10-421 | |
306 | |
307 This wrapper is available to install into other Galaxy Instances via the Galaxy | |
308 Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus | |
309 </token> | |
310 <token name="@OUTPUT_FORMAT@">**Output format** | |
311 | |
312 Because Galaxy focuses on processing tabular data, the default output of this | |
313 tool is tabular. The standard BLAST+ tabular output contains 12 columns: | |
314 | |
315 ====== ========= ============================================ | |
316 Column NCBI name Description | |
317 ------ --------- -------------------------------------------- | |
318 1 qseqid Query Seq-id (ID of your sequence) | |
319 2 sseqid Subject Seq-id (ID of the database hit) | |
320 3 pident Percentage of identical matches | |
321 4 length Alignment length | |
322 5 mismatch Number of mismatches | |
323 6 gapopen Number of gap openings | |
324 7 qstart Start of alignment in query | |
325 8 qend End of alignment in query | |
326 9 sstart Start of alignment in subject (database hit) | |
327 10 send End of alignment in subject (database hit) | |
328 11 evalue Expectation value (E-value) | |
329 12 bitscore Bit score | |
330 ====== ========= ============================================ | |
331 | |
332 The BLAST+ tools can optionally output additional columns of information, | |
333 but this takes longer to calculate. Most (but not all) of these columns are | |
334 included by selecting the extended tabular output. The extra columns are | |
335 included *after* the standard 12 columns. This is so that you can write | |
336 workflow filtering steps that accept either the 12 or 25 column tabular | |
337 BLAST output. Galaxy now uses this extended 25 column output by default. | |
338 | |
339 ====== ============= =========================================== | |
340 Column NCBI name Description | |
341 ------ ------------- ------------------------------------------- | |
342 13 sallseqid All subject Seq-id(s), separated by ';' | |
343 14 score Raw score | |
344 15 nident Number of identical matches | |
345 16 positive Number of positive-scoring matches | |
346 17 gaps Total number of gaps | |
347 18 ppos Percentage of positive-scoring matches | |
348 19 qframe Query frame | |
349 20 sframe Subject frame | |
350 21 qseq Aligned part of query sequence | |
351 22 sseq Aligned part of subject sequence | |
352 23 qlen Query sequence length | |
353 24 slen Subject sequence length | |
354 25 salltitles All subject title(s), separated by '<>' | |
355 ====== ============= =========================================== | |
356 | |
357 The third option is BLAST XML output, which is designed to be parsed by | |
358 another program, and is understood by some Galaxy tools. | |
359 | |
360 You can also choose several plain text or HTML output formats which are designed to be read by a person (not by another program). | |
361 The HTML versions use basic webpage formatting and can include links to the hits on the NCBI website. | |
362 The pairwise output (the default on the NCBI BLAST website) shows each match as a pairwise alignment with the query. | |
363 The two query-anchored outputs show a multiple sequence alignment between the query and all the matches, | |
364 and differ in how insertions are shown (marked as insertions or with gap characters added to the other sequences). | |
365 </token> | |
366 <token name="@FASTA_WARNING@">.. class:: warningmark | |
367 | |
368 You can also search against a FASTA file of subject (target) | |
369 sequences. This is *not* advised because it is slower (only one | |
370 CPU is used), but more importantly gives e-values for pairwise | |
371 searches (very small e-values which will look overly significant). | |
372 In most cases you should instead turn the other FASTA file into a | |
373 database first using *makeblastdb* and search against that. | |
374 </token> | |
375 <token name="@SEARCH_TIME_WARNING@">.. class:: warningmark | |
376 | |
377 **Note**. Database searches may take a substantial amount of time. | |
378 For large input datasets it is advisable to allow overnight processing. | |
379 | |
380 ----- | |
381 </token> | |
382 </macros> |
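
The macros and tokens above are meant to be shared by the individual BLAST+ tool wrappers. As a rough illustration only, the sketch below shows how a wrapper saved next to `ncbi_macros.xml` might import and expand them. The tool `id`, `name`, `version`, the `query` parameter and the output label are hypothetical and are not taken from this changeset; only the macro and token names come from the file above. It uses `blastp` because the shared `filter_query` macros emit `-seg`, which is a protein-search option.

```xml
<!-- Hypothetical wrapper (example_blastp.xml); a sketch, not the real tool file -->
<tool id="example_blastp_wrapper" name="Example BLASTP" version="0.0.1">
    <macros>
        <!-- @BINARY@ is referenced by the "requirements" macro, so define it here -->
        <token name="@BINARY@">blastp</token>
        <import>ncbi_macros.xml</import>
    </macros>
    <expand macro="requirements" />
    <command>
blastp
-query "$query"
@BLAST_DB_SUBJECT@
-evalue $evalue_cutoff
@BLAST_OUTPUT@
@THREADS@
#if $adv_opts.adv_opts_selector=="advanced":
@ADVANCED_OPTIONS@
#end if
    </command>
    <expand macro="stdio" />
    <inputs>
        <!-- "query" is an assumed parameter name for this sketch -->
        <param name="query" type="data" format="fasta" label="Protein query sequence(s)" />
        <expand macro="input_conditional_protein_db" />
        <expand macro="input_evalue" />
        <expand macro="input_out_format" />
        <!-- The nested elements replace the <yield /> inside "advanced_options" -->
        <expand macro="advanced_options">
            <expand macro="input_filter_query_default_false" />
            <expand macro="input_max_hits" />
            <expand macro="input_word_size" />
            <expand macro="input_parse_deflines" />
        </expand>
    </inputs>
    <outputs>
        <!-- The name must be output1 because @BLAST_OUTPUT@ writes to "$output1" -->
        <data name="output1" format="tabular" label="blastp $query.name vs @ON_DB_SUBJECT@">
            <expand macro="output_change_format" />
        </data>
    </outputs>
    <help>
@SEARCH_TIME_WARNING@

@FASTA_WARNING@

@OUTPUT_FORMAT@

@REFERENCES@
    </help>
</tool>
```

Note that `@ADVANCED_OPTIONS@` reads `$adv_opts.filter_query`, `$adv_opts.max_hits`, `$adv_opts.word_size` and `$adv_opts.parse_deflines`, so a wrapper using that token would need all four parameters inside its `advanced_options` block, as sketched here.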