annotate tools/ncbi_blast_plus/ncbi_blastdbcmd_wrapper.xml @ 9:9dabbfd73c8a draft

Uploaded v0.0.19, adds wrappers for rpsblast and rpstblastn with new blastdb_d.loc file for their protein domain database. Also includes other minor improvements.
author peterjc
date Thu, 25 Apr 2013 09:38:37 -0400
parents 393a7a35383c
children 70e7dcbf6573
rev   line source
9
9dabbfd73c8a Uploaded v0.0.19, adds wrappers for rpsblast and rpstblastn with new blastdb_d.loc file for their protein domain database.
peterjc
parents: 5
diff changeset
1 <tool id="ncbi_blastdbcmd_wrapper" name="NCBI BLAST+ blastdbcmd entry(s)" version="0.0.5">
5
393a7a35383c Uploaded v0.0.14 adding local BLAST database support.
peterjc
parents:
diff changeset
2 <description>Extract sequence(s) from BLAST database</description>
9
3 <requirements>
4 <requirement type="binary">blastdbcmd</requirement>
5 <requirement type="package" version="2.2.26+">blast+</requirement>
6 </requirements>
7 <version_command>blastdbcmd -version</version_command>
5
8 <command>
9 ## The command is a Cheetah template which allows some Python based syntax.
10 ## Lines starting with hash hash are comments. Galaxy will turn newlines into spaces
11 blastdbcmd -dbtype $db_opts.db_type -db "${db_opts.database.fields.path}"
12
13 ##TODO: What about -ctrl_a and -target_only as advanced options?
14
15 #if $id_opts.id_type=="file":
16 -entry_batch "$id_opts.entries"
17 #else:
18 ##Perform some simple search/replaces to remove whitespace
19 ##and make it comma separated, and escape any pipe characters
20 -entry "$id_opts.entries.replace('\r',',').replace('\n',',').replace(' ','').replace(',,',',').replace(',,',',').strip(',').replace('|','\|')"
21 #end if
22
23 ##When building a BLAST database, to ensure unique IDs makeblastdb will
24 ##do things like turning a FASTA entry with ID of ERP44 into lcl|ERP44
25 ##(if using -parse_seqids) or simply assign it an ID using the record
26 ##number like gnl|BL_ORD_ID|123 (to cope with duplicate IDs in the FASTA
27 ##file). In -parse_seqids mode, a duplicate FASTA ID gives an error.
28 ##
29 ##The BLAST plain text and XML output will contain these BLAST IDs, but
30 ##the tabular output does not (at least, not in BLAST 2.2.25+).
31 ##Therefore in general, Galaxy users won't care about the (internal)
32 ##BLAST identifiers.
33 ##
34 ##The blastdbcmd FASTA output will also contain these IDs, but in the
35 ##context of the BLAST tabular output they are not helpful. Therefore
36 ##to recover the original ID as used in the FASTA file for makeblastdb
37 ##we need a little post processing.
38 ##
39 ##We remove NCBI's lcl|... or gnl|BL_ORD_ID|123 prefixes
40 ##using sed; however, the exact syntax differs for Mac OS X's sed
41
42 #if str($outfmt)=="blastid":
43 -out "$seq"
44 #else if sys.platform == "darwin":
45 | sed -E 's/^>(lcl\||gnl\|BL_ORD_ID\|[0-9]* )/>/1' > "$seq"
46 #else:
47 | sed 's/^>\(lcl|\|gnl|BL_ORD_ID|[0-9]* \)/>/1' > "$seq"
48 #end if
49 </command>
50 <stdio>
51 <!-- Anything other than zero is an error -->
52 <exit_code range="1:" />
53 <exit_code range=":-1" />
54 <!-- Suspect blastdbcmd sometimes fails to set error level -->
55 <regex match="Error:" />
9
56 <regex match="Exception:" />
5
57 </stdio>
58 <inputs>
59 <conditional name="db_opts">
60 <param name="db_type" type="select" label="Type of BLAST database">
61 <option value="nucl" selected="True">Nucleotide</option>
62 <option value="prot">Protein</option>
63 </param>
64 <when value="nucl">
65 <param name="database" type="select" label="Nucleotide BLAST database">
66 <options from_file="blastdb.loc">
67 <column name="value" index="0"/>
68 <column name="name" index="1"/>
69 <column name="path" index="2"/>
70 </options>
71 </param>
72 </when>
73 <when value="prot">
74 <param name="database" type="select" label="Protein BLAST database">
75 <options from_file="blastdb_p.loc">
76 <column name="value" index="0"/>
77 <column name="name" index="1"/>
78 <column name="path" index="2"/>
79 </options>
80 </param>
81 </when>
82 </conditional>
83 <conditional name="id_opts">
84 <param name="id_type" type="select" label="Type of identifier list">
85 <option value="file">From file</option>
86 <option value="prompt">User entered</option>
87 </param>
88 <when value="file">
89 <param name="entries" type="data" format="txt,tabular" label="Sequence identifier(s)" help="Plain text file with one ID per line (i.e. single column tabular file)"/>
90 </when>
91 <when value="prompt">
92 <param name="entries" type="text" label="Sequence identifier(s)" help="Comma or new line separated list." optional="False" area="True" size="10x30"/>
93 </when>
94 </conditional>
95 <param name="outfmt" type="select" label="Output format">
96 <option value="original">FASTA with original identifiers</option>
97 <option value="blastid">FASTA with BLAST assigned identifiers</option>
98 </param>
99 </inputs>
100 <outputs>
101 <data name="seq" format="fasta" label="Sequences from ${db_opts.database.fields.name}" />
102 </outputs>
103 <help>
104
105 **What it does**
106
107 Extracts FASTA formatted sequences from a BLAST database
108 using the NCBI BLAST+ blastdbcmd command line tool.
109
110 .. class:: warningmark
111
112 **BLAST assigned identifiers**
113
114 When a BLAST database is constructed from a FASTA file, the
115 original identifiers can be replaced with BLAST assigned
116 identifiers, partly to ensure uniqueness. For example, sometimes
117 a prefix of 'lcl|' is added (lcl is short for local),
118 or an arbitrary name starting 'gnl|BL_ORD_ID|' is created.
119
120 If you are using the tabular output from BLAST, it will contain
121 the original identifiers - not the BLAST assigned identifiers
122 suitable for use with the blastdbcmd tool.
123
124 If you are using the XML or plain text output, this will
125 contain the BLAST assigned identifiers. However, this means
126 getting a list of BLAST assigned identifiers isn't straightforward.
127
128 -------
129
130 **References**
131
132 Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997. Nucleic Acids Res. 25:3389-3402.
133
134 Schaffer et al. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. 2001. Nucleic Acids Res. 29:2994-3005.
135
136 </help>
137 </tool>
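
The two text transformations in the command template above (turning the user-entered identifier list into a comma separated `-entry` argument, and stripping the `lcl|` or `gnl|BL_ORD_ID|123 ` prefix from FASTA header lines) can be sketched in Python. This is only an illustration of the intended behaviour and is not part of the wrapper; the function names `normalise_entries` and `strip_blast_prefix` are hypothetical.

```python
import re


def normalise_entries(text: str) -> str:
    """Mirror the Cheetah replace chain applied to the -entry value.

    Hypothetical helper for illustration only: newlines become commas,
    spaces are dropped, doubled commas are collapsed (two passes, as in
    the template), and pipe characters are escaped for the shell.
    """
    return (
        text.replace("\r", ",")
        .replace("\n", ",")
        .replace(" ", "")
        .replace(",,", ",")
        .replace(",,", ",")
        .strip(",")
        .replace("|", "\\|")
    )


# Same pattern as the anchored sed -E expression in the template.
_BLAST_PREFIX = re.compile(r"^>(lcl\||gnl\|BL_ORD_ID\|[0-9]* )")


def strip_blast_prefix(line: str) -> str:
    """Drop a BLAST assigned prefix from a FASTA header line, if present."""
    return _BLAST_PREFIX.sub(">", line, count=1)


if __name__ == "__main__":
    print(normalise_entries("ERP44\r\n gi|123 ,\nABC"))
    print(strip_blast_prefix(">lcl|ERP44 example protein"))
    print(strip_blast_prefix(">gnl|BL_ORD_ID|123 ERP44 example protein"))
```

Sequence lines and headers without one of these prefixes pass through `strip_blast_prefix` unchanged, which matches the behaviour of the sed substitution.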