comparison mothur/tools/mothur/chimera.pintail.xml @ 2:e990ac8a0f58

Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
author jjohnson
date Tue, 07 Jun 2011 17:39:06 -0400
parents fcc0778f6987
children ce6e81622c6a
1:fcc0778f6987 2:e990ac8a0f58
1 <tool id="mothur_chimera_pintail" name="Chimera.pintail" version="1.16.0"> 1 <tool id="mothur_chimera_pintail" name="Chimera.pintail" version="1.19.0">
2 <description>Find putative chimeras using pintail</description> 2 <description>Find putative chimeras using pintail</description>
3 <command interpreter="python"> 3 <command interpreter="python">
4 mothur_wrapper.py 4 mothur_wrapper.py
5 --cmd='chimera.pintail' 5 --cmd='chimera.pintail'
6 ## --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.pintail\.chimeras$:'$out_file,'^\S+\.pintail\.accnos$:'$out_accnos,'^\S+\.freq$:'$out_freq,'^\S+\.quan$:'$out_quantile 6 ## --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.pintail\.chimeras$:'$out_file,'^\S+\.pintail\.accnos$:'$out_accnos,'^\S+\.freq$:'$out_freq,'^\S+\.quan$:'$out_quantile
7 #set results = ["'^mothur.\S+\.logfile$:'" + $logfile.__str__] 7 #set results = ["'^mothur.\S+\.logfile$:'" + $logfile.__str__]
8 #set results = $results + ["'^\S+\.pintail\.chimeras$:'" + $out_file.__str__] 8 #set results = $results + ["'^\S+\.pintail\.chimeras$:'" + $out_file.__str__]
9 #set results = $results + ["'^\S+\.pintail\.accnos$:'" + $out_accnos.__str__] 9 #set results = $results + ["'^\S+\.pintail\.accnos$:'" + $out_accnos.__str__]
10 --outputdir='$logfile.extra_files_path' 10 --outputdir='$logfile.extra_files_path'
11 --fasta=$fasta 11 --fasta=$fasta
12 --template=$alignment.template 12 --reference=$alignment.template
13 $filter 13 $filter
14 #if $mask.source == 'default': 14 #if $mask.source == 'default':
15 --mask=default 15 --mask=default
16 #elif $mask.source == 'history': 16 #elif $mask.source == 'history':
17 --mask=$mask.input 17 --mask=$mask.input
36 --processors=2 36 --processors=2
37 </command> 37 </command>
38 <inputs> 38 <inputs>
39 <param name="fasta" type="data" format="fasta" label="fasta - Candiate Sequences"/> 39 <param name="fasta" type="data" format="fasta" label="fasta - Candiate Sequences"/>
40 <conditional name="alignment"> 40 <conditional name="alignment">
41 <param name="source" type="select" label="Select Template from" help=""> 41 <param name="source" type="select" label="Select Reference Template from" help="">
42 <option value="hist">History</option> 42 <option value="hist">History</option>
43 <option value="ref">Cached Reference</option> 43 <option value="ref">Cached Reference</option>
44 </param> 44 </param>
45 <when value="ref"> 45 <when value="ref">
46 <param name="template" type="select" label="template - Select an alignment database " help=""> 46 <param name="template" type="select" label="reference - Select an alignment database " help="">
47 <options from_file="mothur_aligndb.loc"> 47 <options from_file="mothur_aligndb.loc">
48 <column name="name" index="0" /> 48 <column name="name" index="0" />
49 <column name="value" index="1" /> 49 <column name="value" index="1" />
50 </options> 50 </options>
51 </param> 51 </param>
52 </when> 52 </when>
53 <when value="hist"> 53 <when value="hist">
54 <param name="template" type="data" format="fasta" label="template - Template to align with" help=""/> 54 <param name="template" type="data" format="fasta" label="reference - Reference to align with" help=""/>
55 </when> 55 </when>
56 </conditional> 56 </conditional>
57 <param name="filter" type="boolean" falsevalue="" truevalue="--filter=true" checked="false" label="filter - Apply a 50% soft vertical filter"/> 57 <param name="filter" type="boolean" falsevalue="" truevalue="--filter=true" checked="false" label="filter - Apply a 50% soft vertical filter"/>
58 <!-- mask --> 58 <!-- mask -->
59 <conditional name="mask"> 59 <conditional name="mask">
135 135
136 .. _Mothur: http://www.mothur.org/wiki/Main_Page 136 .. _Mothur: http://www.mothur.org/wiki/Main_Page
137 137
138 **Command Documenation** 138 **Command Documenation**
139 139
140 The chimera.pintail_ command identifies putative chimeras using the pintail approach. 140 The chimera.pintail_ command identifies putative chimeras using the pintail approach. It looks at the variation between the expected differences and the observed differences in the query sequence over several windows.
141 141
142 This method was written using the algorithms described in the paper_ "At Least 1 in 20 16S rRNA Sequence Records Currently Held in the Public Repositories is Estimated To Contain Substantial Anomalies" by Kevin E. Ashelford 1, Nadia A. Chuzhanova 3, John C. Fry 1, Antonia J. Jones 2 and Andrew J. Weightman 1.
143
144
145 From www.bioinformatics-toolkit.org_
146
147 The Pintail algorithm is a technique for determining whether a 16S rDNA sequence is anomalous. It is based on the idea that the extent of local base differences between two aligned 16S rDNA sequences should be roughly the same along the length of the alignment (having allowed for the underlying pattern of hypervariable and conserved regions known to exist within the 16S rRNA gene). In other words, evolutionary distance between two reliable sequences should be constant along the length of the gene.
148
149 In contrast, if an error-free sequence is compared with an anomalous sequence, evolutionary distance along the alignment is unlikely to be constant, especially if the anomaly in question is a chimera and formed from phylogenetically different parental sequences.
150
151 The Pintail algorithm is designed to detect and quantify such local variations and in doing so generates the Deviation from Expectation (DE) statistic. The higher the DE value, the greater the likelihood that the query is anomalous.
152
153 The algorithm works as follows
154
155 The sequence to be checked (the query) is first globally aligned with a phylogenetically similar sequence known to be error-free (the subject). At regular intervals along the resulting alignment, the local evolutionary distance between query and subject is estimated by recording percentage base mismatches within a sampling window of fixed length. The resulting array of percentages (observed percentage differences) reflects variations in evolutionary distance between the query and subject along the length of the 16S rRNA gene. Subtracting observed percentage differences from an equivalent array of expected percentage differences (predicted values for error-free sequences), we obtain a set of deviations, the standard deviation of which (Deviation from Expectation, DE) summarises the variation between observed and expected datasets. The greater the DE value, the greater the disparity there is between observed and expected percentage differences, and the more likely it is that the query sequence is anomalous.
156
157
158 .. _paper: http://www.ncbi.nlm.nih.gov/pubmed/16332745
159 .. _www.bioinformatics-toolkit.org: http://www.bioinformatics-toolkit.org/Help/Topics/pintailAlgorithm.html
142 .. _chimera.pintail: http://www.mothur.org/wiki/Chimera.pintail 160 .. _chimera.pintail: http://www.mothur.org/wiki/Chimera.pintail
143 161
144 162
145 </help> 163 </help>
146 </tool> 164 </tool>
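
The help text added in this revision describes how Pintail derives the Deviation from Expectation (DE) statistic: windowed percentage mismatches between an aligned query and subject, subtracted from expected percentages, then summarised by a standard deviation. A minimal Python sketch of that description follows. It is not mothur's implementation. The function names, the window and step parameters, the use of a population (rather than sample) standard deviation, and the treatment of every non-identical column (including gaps) as a mismatch are all assumptions made for illustration. The expected percentages are taken as a caller-supplied input because the help text does not say how they are produced.

```python
import math


def window_mismatch_percent(query, subject, window, step):
    """Percentage of mismatched columns in each sampling window.

    Assumes query and subject are already globally aligned to equal length.
    Hypothetical helper; any non-identical column counts as a mismatch.
    """
    if len(query) != len(subject):
        raise ValueError("query and subject must be aligned to the same length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    percents = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        s = subject[start:start + window]
        mismatches = sum(1 for a, b in zip(q, s) if a != b)
        percents.append(100.0 * mismatches / window)
    return percents


def deviation_from_expectation(observed, expected):
    """Standard deviation of (expected - observed) across windows.

    Assumption: population standard deviation; the help text only says
    "the standard deviation" of the deviations.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if not observed:
        raise ValueError("at least one window is required")
    deviations = [e - o for o, e in zip(observed, expected)]
    mean = sum(deviations) / len(deviations)
    variance = sum((d - mean) ** 2 for d in deviations) / len(deviations)
    return math.sqrt(variance)


if __name__ == "__main__":
    # Toy aligned pair: the second half of the query differs completely,
    # which is the kind of uneven distance a chimera might produce.
    observed = window_mismatch_percent("AAAAAAAAAA", "AAAAATTTTT", window=5, step=5)
    # Hypothetical expected values for an error-free sequence.
    expected = [50.0, 50.0]
    print(observed, deviation_from_expectation(observed, expected))
```

A query whose mismatches are spread evenly along the alignment yields a DE near zero under this sketch, while one whose mismatches are concentrated in part of the alignment yields a larger DE, matching the help text's statement that a higher DE makes an anomaly more likely.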