annotate tools/ncbi_blast_plus/ncbi_blastdbcmd_wrapper.xml @ 11:4c4a0da938ff draft
Uploaded v0.0.22, now wraps BLAST+ 2.2.28 allowing extended tabular output to include the hit descriptions as column 25.
Supports $GALAXY_SLOTS.
Includes more tests and heavy use of macros.
author | peterjc |
---|---|
date | Thu, 05 Dec 2013 06:55:59 -0500 |
parents | 70e7dcbf6573 |
children | 623f727cdff1 |
rev | line source |
---|---|
<tool id="ncbi_blastdbcmd_wrapper" name="NCBI BLAST+ blastdbcmd entry(s)" version="0.0.22">
    <description>Extract sequence(s) from BLAST database</description>
    <macros>
        <token name="@BINARY@">blastdbcmd</token>
        <import>ncbi_macros.xml</import>
    </macros>
    <expand macro="requirements" />
    <command>
## The command is a Cheetah template which allows some Python based syntax.
## Lines starting hash hash are comments. Galaxy will turn newlines into spaces
blastdbcmd -dbtype $db_opts.db_type -db "${db_opts.database.fields.path}"

##TODO: What about -ctrl_a and -target_only as advanced options?

#if $id_opts.id_type=="file":
-entry_batch "$id_opts.entries"
#else:
##Perform some simple search/replaces to remove whitespace
##and make it comma separated, and escape any pipe characters
-entry "$id_opts.entries.replace('\r',',').replace('\n',',').replace(' ','').replace(',,',',').replace(',,',',').strip(',').replace('|','\|')"
#end if

##When building a BLAST database, to ensure unique IDs makeblastdb will
##do things like turning a FASTA entry with ID of ERP44 into lcl|ERP44
##(if using -parse_seqids) or simply assign it an ID using the record
##number like gnl|BL_ORD_ID|123 (to cope with duplicate IDs in the FASTA
##file). In -parse_seqids mode, a duplicate FASTA ID gives an error.
##
##The BLAST plain text and XML output will contain these BLAST IDs, but
##the tabular output does not (at least, not in BLAST 2.2.25+).
##Therefore in general, Galaxy users won't care about the (internal)
##BLAST identifiers.
##
##The blastdbcmd FASTA output will also contain these IDs, but in the
##context of the BLAST tabular output they are not helpful. Therefore
##to recover the original ID as used in the FASTA file for makeblastdb
##we need a little post processing.
##
##We remove NCBI's lcl|... or gnl|BL_ORD_ID|123 prefixes
##using sed, however the exact syntax differs for Mac OS X's sed

#if str($outfmt)=="blastid":
-out "$seq"
#else if sys.platform == "darwin":
| sed -E 's/^>(lcl\||gnl\|BL_ORD_ID\|[0-9]* )/>/1' > "$seq"
#else:
| sed 's/>\(lcl|\|gnl|BL_ORD_ID|[0-9]* \)/>/1' > "$seq"
#end if
    </command>
    <expand macro="stdio" />
    <inputs>
        <expand macro="input_conditional_choose_db_type" />
        <conditional name="id_opts">
            <param name="id_type" type="select" label="Type of identifier list">
                <option value="file">From file</option>
                <option value="prompt">User entered</option>
            </param>
            <when value="file">
                <param name="entries" type="data" format="txt,tabular" label="Sequence identifier(s)" help="Plain text file with one ID per line (i.e. single column tabular file)"/>
            </when>
            <when value="prompt">
                <param name="entries" type="text" label="Sequence identifier(s)" help="Comma or new line separated list." optional="False" area="True" size="10x30"/>
            </when>
        </conditional>
        <param name="outfmt" type="select" label="Output format">
            <option value="original">FASTA with original identifiers</option>
            <option value="blastid">FASTA with BLAST assigned identifiers</option>
        </param>
    </inputs>
    <outputs>
        <data name="seq" format="fasta" label="Sequences from ${db_opts.database.fields.name}" />
    </outputs>
    <help>

**What it does**

Extracts FASTA formatted sequences from a BLAST database
using the NCBI BLAST+ blastdbcmd command line tool.

.. class:: warningmark

**BLAST assigned identifiers**

When a BLAST database is constructed from a FASTA file, the
original identifiers can be replaced with BLAST assigned
identifiers, partly to ensure uniqueness. For example, sometimes
a prefix of 'lcl|' is added (lcl is short for local),
or an arbitrary name starting 'gnl|BL_ORD_ID|' is created.

If you are using the tabular output from BLAST, it will contain
the original identifiers - not the BLAST assigned identifiers
suitable for use with the blastdbcmd tool.

If you are using the XML or plain text output, this will also
contain the BLAST assigned identifiers. However, this means
getting a list of BLAST assigned identifiers isn't straightforward.

-------

**References**

10
70e7dcbf6573
Uploaded v0.0.20, handles dependencies via package_blast_plus_2_2_26, development moved to GitHub, RST README, MIT licence, citation information, more tests, percentage identity option to BLASTN, cElementTree to ElementTree fallback.
peterjc
parents:
9
diff
changeset
|
If you use this Galaxy tool in work leading to a scientific publication please
cite the following papers:

@REFERENCES@
    </help>
</tool>
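
The identifier handling in the command template can be checked outside Galaxy. What follows is a minimal Python sketch, not part of the wrapper, of the two string transformations described in the template comments: the replace chain applied to user-entered IDs before they are passed to -entry, and the substitution that strips the lcl| or gnl|BL_ORD_ID|N prefixes from FASTA header lines. The function names normalise_entries and strip_blast_prefix are hypothetical, and the regular expression follows the anchored Mac OS X sed variant.

```python
import re


def normalise_entries(text: str) -> str:
    """Mirror the Cheetah replace chain used to build the -entry argument.

    Newlines become commas, spaces are dropped, doubled commas are
    collapsed (two passes, as in the template), leading and trailing
    commas are stripped, and pipe characters are backslash-escaped.
    """
    return (
        text.replace("\r", ",")
        .replace("\n", ",")
        .replace(" ", "")
        .replace(",,", ",")
        .replace(",,", ",")
        .strip(",")
        .replace("|", "\\|")
    )


# Same pattern as the sed -E expression: a leading '>' followed by
# either 'lcl|' or 'gnl|BL_ORD_ID|<digits><space>'.
_BLAST_PREFIX = re.compile(r"^>(lcl\||gnl\|BL_ORD_ID\|[0-9]* )")


def strip_blast_prefix(line: str) -> str:
    """Remove a BLAST assigned prefix from one FASTA header line.

    Lines that do not start with such a prefix (including sequence
    lines) are returned unchanged.
    """
    return _BLAST_PREFIX.sub(">", line, count=1)
```

Because the pattern is anchored at the start of the line, only header lines are rewritten, so the function can be mapped over every line of blastdbcmd FASTA output.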