comparison ITSx.xml @ 1:b433586432d7 draft

Uploaded
author okorol
date Tue, 24 Mar 2015 16:18:36 -0400
parents f82c70f54bd7
children 7c914d783d36
0:f82c70f54bd7 1:b433586432d7
11 <requirement type="package" version="3.1b1">hmmer</requirement> 11 <requirement type="package" version="3.1b1">hmmer</requirement>
12 </requirements> 12 </requirements>
13 <inputs> 13 <inputs>
14 <param name="input" type="data" format="fasta" label="Input Fasta"/> 14 <param name="input" type="data" format="fasta" label="Input Fasta"/>
15 <param name="cpu" type="integer" value="1" label="cpu"/> 15 <param name="cpu" type="integer" value="1" label="cpu"/>
16 <param name="complement" type="boolean" checked="true" truevalue="--complement T" falsevalue="--complement F" label="Checks both DNA strands against the database"/> 16 <param name="complement" type="boolean" checked="true" truevalue="--complement T" falsevalue="--complement F" label="Check both DNA strands against the database"/>
17 <param name="heuristics" type="boolean" checked="false" truevalue="--heuristics T" falsevalue="--heuristics F" label="Use HMMER's heuristic filtering"/> 17 <param name="heuristics" type="boolean" checked="false" truevalue="--heuristics T" falsevalue="--heuristics F" label="Use HMMER's heuristic filtering"/>
18 <param name="reset" type="boolean" checked="false" truevalue="--reset T" falsevalue="--reset F" label="Re-creates the HMM-database before ITSx is run"/> 18 <param name="reset" type="boolean" checked="false" truevalue="--reset T" falsevalue="--reset F" label="Re-creates the HMM-database before ITSx is run"/>
19 <param name="preserve" type="boolean" checked="false" truevalue="--preserve T" falsevalue="--preserve F" label=" Preserve sequence headers instead of printing out ITSx headers"/> 19 <param name="preserve" type="boolean" checked="false" truevalue="--preserve T" falsevalue="--preserve F" label=" Preserve sequence headers instead of printing out ITSx headers"/>
20 </inputs> 20 </inputs>
21 21
36 <regex match="analysis" source="both" level="log"/> 36 <regex match="analysis" source="both" level="log"/>
37 <regex match="ERROR" source="both" level="fatal"/> 37 <regex match="ERROR" source="both" level="fatal"/>
38 <regex match="error" source="both" level="fatal"/> 38 <regex match="error" source="both" level="fatal"/>
39 </stdio> 39 </stdio>
40 40
41 <test></test> 41 <tests>
42 <test>
43 <param name="input" value="test-data/testITSsequences.fasta"/>
44 <param name="cpu" value="1" />
45 <param name="complement" value="--complement T"/>
46 <param name="reset" value="--reset F" />
47 <param name="preserve" value="--preserve F" />
48
49 <output name="positions" file="test-data/expectedOutput.positions.txt" />
50 <output name="fullfasta" file="test-data/expectedOutput.full.fasta" />
51 <output name="summary" file="test-data/expectedOutput.summary.txt" />
52 <output name="problematic" file="test-data/expectedOutput.problematic.txt" />
53 </test>
54 </tests>
55
42 <help> 56 <help>
43 ITSx -- Identifies ITS sequences and extracts the ITS region 57 **What it does**
44 58
59 Identifies ITS sequences and extracts the ITS regions
60
61 ITSx is an open source software utility to extract the highly variable ITS1 and ITS2 subregions from ITS sequences, which are commonly used as molecular barcodes for e.g. fungi. Because including parts of the neighbouring, highly conserved ribosomal genes (SSU, 5.8S and LSU rRNA sequences) in the sequence identification process can lead to severely misleading results, ITSx identifies and extracts only the ITS regions themselves.
62
63 ------
64
65
66 **Info**
67
68 Galaxy wrapper:
69
70 Microbial Biodiversity Bioinformatics Group
71 Agriculture and Agri-Food Canada
72
73 Contact: Oksana Korol, oksana.korol[at]agr.gc.ca
74 mbb[at]agr.gc.ca
75
76
77 ITSx tool:
78
79 Version: 1.0.11
45 Source code available at: 80 Source code available at:
46 http://microbiology.se/software/itsx 81 http://microbiology.se/software/itsx
47 82
48 Version: 1.0.6
49 ITSx -- Identifies ITS sequences and extracts the ITS region
50 Copyright (C) 2012-2013 Johan Bengtsson-Palme et al. 83 Copyright (C) 2012-2013 Johan Bengtsson-Palme et al.
51 Contact: Johan Bengtsson-Palme, johan[at]microbiology.se 84 Contact: Johan Bengtsson-Palme, johan[at]microbiology.se
52 Programmer: Johan Bengtsson-Palme 85 Programmer: Johan Bengtsson-Palme
53 86
54 Full installation instructions can be found in the User's Guide.
55 A quick installation guide follows below.
56 87
57 ITSx requires Perl and HMMER3. 88 **Citation**
58 89
59 1) Perl is usually installed on Unix-like systems by default. If not, it can be retrieved from http://www.perl.org/ 90 Bengtsson-Palme, Johan and Ryberg, Martin and Hartmann, Martin and Branco, Sara and Wang, Zheng and Godhe, Anna and De Wit, Pierre and Sánchez-García, Marisol and Ebersberger, Ingo and de Sousa, Filipe and Amend, Anthony and Jumpponen, Ari and Unterseher, Martin and Kristiansson, Erik and Abarenkov, Kessy and Bertrand, Yann J. K. and Sanli, Kemal and Eriksson, K. Martin and Vik, Unni and Veldre, Vilmar and Nilsson, R. Henrik. Improved software detection and extraction of ITS1 and ITS2 from ribosomal ITS sequences of fungi and other eukaryotes for analysis of environmental sequencing data. Methods in Ecology and Evolution, 4;10:914-919, 2013.
60 91
61 2) HMMER3 can be found at http://hmmer.janelia.org/software
62 Download it and follow the on site instructions for installation.
63
64 3) Obtain the ITSx package from http://microbiology.se/software/itsx
65 Unpack the tarball and move into the newly created "ITSx" directory.
66
67 4) Copy the ITSx file and the ITSx_db directory to your preferred bin directory.
68
69 5) To test if ITSx was successfully installed type "ITSx --help" on the command-line. You should now see the ITSx help message.
70
71 To run ITSx, you need a FASTA-formatted input file. You can e.g. use the test.fasta file supplied with the package. To check for ITS sequences in the test file, type "ITSx -i test.fasta -o test" on the command line. If you are on a multicore machine, you might want to use the "--cpu 2" option to speed up the process by using two (or more) cores.
72
73 New features in this version:
74 - Fixed a bug causing over-reporting of chimeras
75
76
77 If you encounter a bug or some other strange behaviour, please report it to:
78 johan[at]microbiology.se
79
80 This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program, in a file called 'license.txt'. If not, see: http://www.gnu.org/licenses/.
81
82 ----
83
84 Usage: ITSx -i [input file] -o [output file]
85
86 Options:
87
88 -i {file} : DNA FASTA input file to investigate
89
90 -o {file} : Base for the names of output file(s)
91
92 -p {directory} : A path to a directory of HMM-profile collections representing ITS conserved regions, default is in the same directory as ITSx itself
93
94 --date {T or F} : Adds a date and time stamp to the output directory, off (F) by default
95
96 --reset {T or F} : Re-creates the HMM-database before ITSx is run, off (F) by default
97
98 Sequence selection options:
99
100 -t {character code} : Profile set to use for the search, see the User's Guide (comma-separated), default is all
101
102 -E {value} : Domain E-value cutoff for a sequence to be included in the output, default = 1e-5
103
104 -S {value} : Domain score cutoff for a sequence to be included in the output, default = 0
105
106 -N {value} : The minimal number of domains that must match a sequence before it is included, default = 2
107
108 --selection_priority {sum, domains, eval, score} : Selects what will be of highest priority when determining the origin of the sequence, default is sum
109
110 --search_eval {value} : The E-value cutoff used in the HMMER search, high numbers may slow down the process, cannot be used with the --search_score option, default is 0.01
111
111 --search_score {value} : The score cutoff used in the HMMER search, low numbers may slow down the process, cannot be used with the --search_eval option, default is to use the E-value cutoff, not score
113
114 --allow_single_domain {e-value,score or F} : Allow inclusion of sequences that only find a single domain, given that they meet the given E-value and score thresholds, on with parameters 1e-9,0 by default
115
116 --allow_reorder {T or F} : Allows profiles to be in the wrong order on extracted sequences, off (F) by default
117
118 --complement {T or F} : Checks both DNA strands against the database, creating reverse complements, on (T) by default
119
120 --cpu {value} : the number of CPU threads to use, default is 1
121
121 --multi_thread {T or F} : Multi-thread the HMMER-search, on (T) if the number of CPUs (--cpu option) is > 1, else off (F) by default
123
124 --heuristics {T or F} : Selects whether to use HMMER's heuristic filtering, off (F) by default
125
126 Output options:
127
128 --summary {T or F} : Summary of results output, on (T) by default
129
130 --graphical {T or F} : 'Graphical' output, on (T) by default
131
132 --fasta {T or F} : FASTA-format output of extracted ITS sequences, on (T) by default
133
134 --preserve {T or F} : Preserve sequence headers in input file instead of printing out ITSx headers, off (F) by default
135
136 --save_regions {SSU,ITS1,5.8S,ITS2,LSU,all,none} : A comma separated list of regions to output separate FASTA files for, 'ITS1,ITS2' by default
137
138 --anchor {integer or HMM} : Saves an additional number of bases before and after each extracted region. If set to 'HMM' all bases matching the corresponding HMM will be output, default = 0
139
140 --partial {integer} : Saves additional FASTA-files for full and partial ITS sequences longer than the specified cutoff, default = 0 (off)
141
142 --concat {T or F} : Saves a FASTA-file with concatenated ITS sequences (with 5.8S removed), off (F) by default
143
144 --minlen {integer} : Minimum length the ITS regions must be to be outputted in the concatenated file (see above), default = 0
145
146 --positions {T or F} : Table format output containing the positions ITS sequences were found in, on (T) by default
147
148 --table {T or F} : Table format output of sequences containing probable ITS sequences, off (F) by default
149
150 --not_found {T or F} : Saves a list of non-found entries, on (T) by default
151
152 --detailed_results {T or F} : Saves a tab-separated list of all results, off (F) by default
153
154 --truncate {T or F} : Truncates the FASTA output to only contain the actual ITS sequences found, on (T) by default
155
156 --silent {T or F} : Suppresses printing progress info to stderr, off (F) by default
157
158 --graph_scale {value} : Sets the scale of the graph output, if value is zero, a percentage view is shown, default = 0
159
160 --save_raw {T or F} : Saves all raw data for searches etc. instead of removing it on finish, off (F) by default
161
162 -h : displays this help message
163
164 --help : displays this help message
165
166 --bugs : displays the bug fixes and known bugs in this version of ITSx
167
168 --license : displays licensing information
169 92
170 </help> 93 </help>
171 94
172 </tool> 95 </tool>