comparison cmbuild.xml @ 3:2c2c5e5e495b draft
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/infernal commit 9eeedfaf35c069d75014c5fb2e42046106bf813c-dirty
author | bgruening |
---|---|
date | Fri, 04 Mar 2016 07:24:53 -0500 |
parents | fac157e22e1b |
children | c47a7c52ac4f |
comparison
2:fac157e22e1b | 3:2c2c5e5e495b |
---|---|
69 <param name="model_construction_opts_selector" type="select" label="These options control how consensus columns are defined in an alignment" help=""> | 69 <param name="model_construction_opts_selector" type="select" label="These options control how consensus columns are defined in an alignment" help=""> |
70 <option value="--fast" selected="true">automatic (--fast)</option> | 70 <option value="--fast" selected="true">automatic (--fast)</option> |
71 <option value="--hand">user defined (--hand)</option> | 71 <option value="--hand">user defined (--hand)</option> |
72 </param> | 72 </param> |
73 <when value="--fast"> | 73 <when value="--fast"> |
74 <param name="symfrac" type="float" value="0.5" size="5" | 74 <param name="symfrac" type="float" value="0.5" |
75 label="Define the residue fraction threshold necessary to define a consensus (--symfrac)" help=""/> | 75 label="Define the residue fraction threshold necessary to define a consensus (--symfrac)" help=""/> |
76 </when> | 76 </when> |
77 <when value="--hand"/> | 77 <when value="--hand"/> |
78 </conditional> | 78 </conditional> |
79 | 79 |
91 <when value="--wpb"/> | 91 <when value="--wpb"/> |
92 <when value="--wgsc"/> | 92 <when value="--wgsc"/> |
93 <when value="--wnone"/> | 93 <when value="--wnone"/> |
94 <when value="--wgiven"/> | 94 <when value="--wgiven"/> |
95 <when value="--wblosum"> | 95 <when value="--wblosum"> |
96 <param name="wid" type="float" value="0.5" size="5" | 96 <param name="wid" type="float" value="0.5" |
97 label="Percent identity for clustering the alignment (--wid)" help=""/> | 97 label="Percent identity for clustering the alignment (--wid)" help=""/> |
98 </when> | 98 </when> |
99 </conditional> | 99 </conditional> |
100 | 100 |
101 | 101 |
104 <option value="--eent" selected="true">entropy weighting strategy (--eent)</option> | 104 <option value="--eent" selected="true">entropy weighting strategy (--eent)</option> |
105 <option value="--enone">Turn off the entropy weighting strategy (--enone)</option> | 105 <option value="--enone">Turn off the entropy weighting strategy (--enone)</option> |
106 </param> | 106 </param> |
107 <when value="--enone"/> | 107 <when value="--enone"/> |
108 <when value="--eent"> | 108 <when value="--eent"> |
109 <param name="ere" type="float" value="0.59" size="5" | 109 <param name="ere" type="float" value="0.59" |
110 label="Set the target mean match state relative entropy (--ere)" help=""/> | 110 label="Set the target mean match state relative entropy (--ere)" help=""/> |
111 | 111 |
112 <param name="eminseq" type="integer" value="" size="5" | 112 <param name="eminseq" type="integer" value="" |
113 label="Define the minimum allowed effective sequence number (--eminseq)" help=""/> | 113 label="Define the minimum allowed effective sequence number (--eminseq)" help=""/> |
114 | 114 |
115 <param name="ehmmre" type="float" value="" size="5" | 115 <param name="ehmmre" type="float" value="" |
116 label="Set the target HMM mean match state relative entropy (--ehmmre)" help=""/> | 116 label="Set the target HMM mean match state relative entropy (--ehmmre)" help=""/> |
117 | 117 |
118 <param name="eset" type="integer" value="" size="5" | 118 <param name="eset" type="integer" value="" |
119 label="Set the effective sequence number for entropy weighting (--eset)" help=""/> | 119 label="Set the effective sequence number for entropy weighting (--eset)" help=""/> |
120 </when> | 120 </when> |
121 </conditional> | 121 </conditional> |
122 | 122 |
123 | 123 |
184 <, J. Mol. Biol. 243:574, 1994). This is the default. |
220 which algorithm gets used. | 248 - *--wgsc*: Use the Gerstein/Sonnhammer/Chothia weighting algorithm ([Gerstein et al.](http://ac.els-cdn.com/0022283694900124/1-s2.0-0022283694900124-main.pdf?_tid=6ed29974-3044-11e5-8949-00000aacb35f&acdnat=1437550798_aaa62caa2c812bb81013f967e7b119ee), J. Mol. Biol. 236:1067, 1994). |
221 | 249 - *--wnone*: Turn sequence weighting off; e.g. explicitly set all sequence weights to 1.0. |
222 * --wpb Use the Henikoff position-based sequence weighting scheme [Henikoff and Henikoff, J. Mol. Biol. 243:574, 1994]. This is the default. | 250 - *--wgiven*: Use sequence weights as given in annotation in the input alignment file. If no weights were given, assume they are all 1.0. The default is to determine new sequence weights by the Gerstein/Sonnhammer/Chothia algorithm, ignoring any annotated weights. |
223 * --wgsc Use the Gerstein/Sonnhammer/Chothia weighting algorithm [Gerstein et al, J. Mol. Biol. 235:1067, 1994]. | 251 - *--wblosum*: Use the BLOSUM filtering algorithm to weight the sequences, instead of the default GSC weighting. Cluster the sequences at a given percentage identity (see --wid); assign each cluster a total weight of 1.0, distributed equally amongst the members of that cluster. |
224 * --wnone Turn sequence weighting off; e.g. explicitly set all sequence weights to 1.0. | 252 |
225 * --wgiven Use sequence weights as given in annotation in the input alignment file. If no weights were given, assume they are all 1.0. The default is to determine new sequence weights by the Gerstein/Sonnhammer/Chothia algorithm, ignoring any annotated weights. | 253 |
226 * --wblosum Use the BLOSUM filtering algorithm to weight the sequences, instead of the default GSC weighting. Cluster the sequences at a given percentage identity (see --wid); assign each cluster a total weight of 1.0, distributed equally amongst the members of that cluster. | 254 **Options controlling effective sequence number** |
227 * --wid Controls the behavior of the --wblosum weighting option by setting the percent identity for clustering the alignment. | 255 |
228 | 256 |
229 | 257 After relative weights are determined, they are normalized to sum to a total effective sequence number, eff nseq. This number may be the actual number of sequences in the alignment, but it is almost always smaller than that. The default entropy weighting method (--eent) reduces the effective sequence number to reduce the information content (relative entropy, or average expected score on true homologs) per consensus position. The target relative entropy is controlled by a two-parameter function, where the two parameters are settable with --ere and --esigma. |
230 Options controlling effective sequence number | 258 |
231 | 259 - *--eent*: Use the entropy weighting strategy to determine the effective sequence number that gives a target mean match state relative entropy. This option is the default, and can be turned off with --enone. The default target mean match state relative entropy is 0.59 bits for models with at least 1 basepair and 0.38 bits for models with zero basepairs, but can be changed with --ere. The default of 0.59 or 0.38 bits is automatically changed if the total relative entropy of the model (summed match state relative entropy) is less than a cutoff, which is 6.0 bits by default, but can be changed with the expert, undocumented --eX option. If you really want to play with that option, consult the source code. |
232 | 260 - *--enone*: Turn off the entropy weighting strategy. The effective sequence number is just the number of sequences in the alignment. |
233 After relative weights are determined, they are normalized to sum to a total effective sequence number, eff nseq. This | 261 - *--ere*: Set the target mean match state relative entropy. By default the target relative entropy per match position is 0.59 bits for models with at least 1 basepair and 0.38 for models with zero basepairs. |
234 number may be the actual number of sequences in the alignment, but it is almost always smaller than that. The default | 262 - *--eminseq*: Define the minimum allowed effective sequence number. |
235 entropy weighting method (--eent) reduces the effective sequence number to reduce the information content (relative | 263 - *--ehmmre*: Set the target HMM mean match state relative entropy. Entropy for basepairing match states is calculated using marginalized basepair emission probabilities. |
236 entropy, or average expected score on true homologs) per consensus position. The target relative entropy is controlled | 264 - *--eset*: Set the effective sequence number for entropy weighting. |
237 by a two-parameter function, where the two parameters are settable with --ere and --esigma. | 265 |
238 | 266 |
239 * --eent Use the entropy weighting strategy to determine the effective sequence number that gives a target mean match state relative entropy. This option is the default, and can be turned off with --enone. The default target mean match state relative entropy is 0.59 bits for models with at least 1 basepair and 0.38 bits for models with zero basepairs, but can be changed with --ere. The default of 0.59 or 0.38 bits is automatically changed if the total relative entropy of the model (summed match state relative entropy) is less than a cutoff, which is 6.0 bits by default, but can be changed with the expert, undocumented --eX option. If you really want to play with that option, consult the source code. | 267 |
240 * --enone Turn off the entropy weighting strategy. The effective sequence number is just the number of sequences in the alignment. | 268 **Options for refining the input alignment** |
241 * --ere Set the target mean match state relative entropy. By default the target relative entropy per match position is 0.59 bits for models with at least 1 basepair and 0.38 for models with zero basepairs. | 269 |
242 * --eminseq Define the minimum allowed effective sequence number. | 270 - *--refine*: Attempt to refine the alignment before building the CM using expectation-maximization (EM). A CM is first built from the initial alignment as usual. Then, the sequences in the alignment are realigned optimally (with the HMM banded CYK algorithm, optimal means optimal given the bands) to the CM, and a new CM is built from the resulting alignment. The sequences are then realigned to the new CM, and a new CM is built from that alignment. This is continued until convergence, specifically when the alignments for two successive iterations are not significantly different (the summed bit scores of all the sequences in the alignment changes less than 1% between two successive iterations). |
243 * --ehmmre Set the target HMM mean match state relative entropy. Entropy for basepairing match states is calculated using marginalized basepair emission probabilities. | 271 - *Turn on the local alignment algorithm*: allows the alignment to span two or more subsequences if necessary (e.g. if the structures of the query model and target sequence are only partially shared), allowing certain large insertions and deletions in the structure to be penalized differently than normal indels. The default is to globally align the query model to the target sequences. |
244 * --eset Set the effective sequence number for entropy weighting. | 272 - *--gibbs sampling*: Modifies the behavior of --refine so Gibbs sampling is used instead of EM. The difference is that during the alignment stage the alignment is not necessarily optimal; instead, an alignment (parsetree) for each sequence is sampled from the posterior distribution of alignments as determined by the Inside algorithm. Due to this sampling step --gibbs is non-deterministic, so different runs with the same alignment may yield different results. This is not true when --refine is used without the --gibbs option, in which case the final alignment and CM will always be the same. When --gibbs is enabled, the --seed "number" option can be used to seed the random number generator predictably, making the results reproducible. The goal of the --gibbs option is to help expert RNA alignment curators refine structural alignments by allowing them to observe alternative high scoring alignments. |
245 | 273 - *--Random seed*: Seed the random number generator with an integer >= 0. This option can only be used in combination with --gibbs. If the given number is nonzero, stochastic sampling of alignments will be reproducible; the same command will give the same results. If the given number is 0, the random number generator is seeded arbitrarily, and stochastic samplings may vary from run to run of the same command. The default seed is 0. |
246 | 274 - *--Turn off the truncated alignment algorithm*: With --refine, turn off the truncated alignment algorithm. There is more information on this in the cmalign manual page. |
247 | 275 - *--cyk algorithm*: With --refine, align with the CYK algorithm. By default the optimal accuracy algorithm is used. There is more information on this in the cmalign manual page. |
248 Options for refining the input alignment | 276 |
249 ---------------------------------------- | 277 |
250 | |
251 * --refine Attempt to refine the alignment before building the CM using expectation-maximization (EM). A CM is first built from the initial alignment as usual. Then, the sequences in the alignment are realigned optimally (with the HMM banded CYK algorithm, optimal means optimal given the bands) to the CM, and a new CM is built from the resulting alignment. The sequences are then realigned to the new CM, and a new CM is built from that alignment. This is continued until convergence, specifically when the alignments for two successive iterations are not significantly different (the summed bit scores of all the sequences in the alignment changes less than 1% between two successive iterations). | |
252 * -l Turn on the local alignment algorithm, which allows the alignment to span two or more subsequences if necessary (e.g. if the structures of the query model and target sequence are only partially shared), allowing certain large insertions and deletions in the structure to be penalized differently than normal indels. The default is to globally align the query model to the target sequences. | |
253 * --gibbs Modifies the behavior of --refine so Gibbs sampling is used instead of EM. The difference is that during the alignment stage the alignment is not necessarily optimal; instead, an alignment (parsetree) for each sequence is sampled from the posterior distribution of alignments as determined by the Inside algorithm. Due to this sampling step --gibbs is non-deterministic, so different runs with the same alignment may yield different results. This is not true when --refine is used without the --gibbs option, in which case the final alignment and CM will always be the same. When --gibbs is enabled, the --seed "number" option can be used to seed the random number generator predictably, making the results reproducible. The goal of the --gibbs option is to help expert RNA alignment curators refine structural alignments by allowing them to observe alternative high scoring alignments. | |
254 * --seed Seed the random number generator with an integer >= 0. This option can only be used in combination with --gibbs. If the given number is nonzero, stochastic sampling of alignments will be reproducible; the same command will give the same results. If the given number is 0, the random number generator is seeded arbitrarily, and stochastic samplings may vary from run to run of the same command. The default seed is 0. | |
255 * --cyk With --refine, align with the CYK algorithm. By default the optimal accuracy algorithm is used. There is more information on this in the cmalign manual page. | |
256 * --notrunc With --refine, turn off the truncated alignment algorithm. There is more information on this in the cmalign manual page. | |
257 | |
258 | 278 |
259 For further questions please refer to the Infernal Userguide_. | 279 For further questions please refer to the Infernal Userguide_. |
260 | 280 |
261 .. _Userguide: http://selab.janelia.org/software/infernal/Userguide.pdf | 281 .. _Userguide: http://selab.janelia.org/software/infernal/Userguide.pdf |
262 | 282 |
263 | |
264 How do I cite Infernal? | |
265 ----------------------- | |
266 | |
267 The recommended citation for using Infernal 1.1 is E. P. Nawrocki and S. R. Eddy, Infernal 1.1: 100-fold faster RNA homology searches , Bioinformatics 29:2933-2935 (2013). | |
268 | |
269 **Galaxy Wrapper Author**:: | |
270 | |
271 * Bjoern Gruening, University of Freiburg | |
272 | 283 |
273 ]]> | 284 ]]> |
274 </help> | 285 </help> |
286 | |
287 <citations> | |
288 <citation type="doi">10.1093/bioinformatics/btt509</citation> | |
289 <citation type="bibtex"> | |
290 @ARTICLE{bgruening_galaxytools, | |
291 Author = {Björn Grüning, Cameron Smith, Torsten Houwaart, Nicola Soranzo, Eric Rasche}, | |
292 keywords = {bioinformatics, ngs, galaxy, cheminformatics, rna}, | |
293 title = {{Galaxy Tools - A collection of bioinformatics and cheminformatics tools for the Galaxy environment}}, | |
294 url = {https://github.com/bgruening/galaxytools} | |
295 } | |
296 </citation> | |
297 </citations> | |
298 | |
275 </tool> | 299 </tool> |
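
The help text in this comparison describes two weighting steps in words: with --wblosum each cluster receives a total weight of 1.0 shared equally among its members, and the relative weights are then normalized to sum to the effective sequence number. The Python sketch below illustrates only that arithmetic. It is not Infernal's implementation: the function names `cluster_weights` and `normalize_to_effective_number` are hypothetical, and the cluster labels are assumed to be given (the percent-identity clustering controlled by --wid is not reproduced here).

```python
from collections import Counter


def cluster_weights(cluster_ids):
    """Return one weight per sequence so that each cluster sums to 1.0.

    cluster_ids is a hypothetical input: one cluster label per sequence,
    assumed to come from some earlier clustering step.
    """
    sizes = Counter(cluster_ids)
    # Every member of a cluster of size n gets weight 1/n.
    return [1.0 / sizes[cid] for cid in cluster_ids]


def normalize_to_effective_number(weights, eff_nseq):
    """Rescale relative weights so that they sum to eff_nseq."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w * eff_nseq / total for w in weights]


if __name__ == "__main__":
    # Three sequences: two in cluster "a", one in cluster "b".
    relative = cluster_weights(["a", "a", "b"])
    print(relative)
    # Rescale to an illustrative effective sequence number of 1.5.
    print(normalize_to_effective_number(relative, 1.5))
```

Keeping the two steps in separate functions mirrors the help text, which treats the choice of relative weighting scheme and the choice of effective sequence number as independent option groups.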