comparison metaphlan.xml @ 3:e88fa24fa837

updated to version 1.6.0
author nsegata
date Wed, 06 Jun 2012 10:41:23 -0400
parents 1f80b01e1490
children 80b22b31633f
2:1f80b01e1490 3:e88fa24fa837
1 <tool id="metaphlan" name="MetaPhlAn" version="1.0"> 1 <tool id="metaphlan" name="MetaPhlAn" version="1.6.0">
2 <requirements> 2 <requirements>
3 <requirement type="package">metaphlan</requirement> 3 <requirement type="package">metaphlan</requirement>
4 <requirement type="package" version="2.2.25+">blast</requirement> 4 <requirement type="package">bowtie2</requirement>
5 </requirements> 5 </requirements>
6 <description>Metagenomic Phylogenetic Analysis</description> 6 <description>Metagenomic Phylogenetic Analysis</description>
7 <command> 7 <command>
8 metaphlan.py 8 metaphlan.py
9 #if str($source.type) == "fasta": 9 $input
10 ${source.fasta_input} 10 --bowtie2db ${GALAXY_DATA_INDEX_DIR}/shared/metaphlan/bowtie2db/mpa
11 #else: 11 --no_map
12 ${source.blast_input} 12 -o $output
13 #end if 13 --bt2_ps $PresetsForBowtie2
14 ${metaphlan_out}
15 --nproc 4
16 #if str($source.type) == "fasta":
17 --blastout metagenome.outfmt6.txt
18 --evalue ${source.evalue}
19 #end if
20 --lib_dir ${GALAXY_DATA_INDEX_DIR}/shared/metaphlan
21 --min_cu_len ${min_cu_len}
22 --min_nreads ${min_nreads}
23 </command> 14 </command>
24 15
25 <inputs> 16 <inputs>
17 <param format="fasta" name="input" type="data" label="Input metagenome (multi-fasta of metagenomic reads, loaded with the Get Data module, see below for an example)"></param>
18 <param name="PresetsForBowtie2" type="select" format="text">
19 <label>Sensitivity options for read-marker similarity (as described by BowTie2)</label>
20 <option value="very-sensitive-local">Very Sensitive Local</option>
21 <option value="sensitive-local">Sensitive Local</option>
22 <option value="very-sensitive">Very Sensitive</option>
23 <option value="sensitive">Sensitive</option>
24 </param>
25 </inputs>
26 <outputs>
27 <data format="tabular" name="output" />
28 </outputs>
29
30 <tests>
31 </tests>
26 32
27 <conditional name="source"> 33 <help>
28 <param name="type" type="select" label="Input Type">
29 <option value="fasta">multi-fasta file containing metagenomic reads</option>
30 <option value="blast">NCBI BLAST output file</option>
31 </param>
32 <when value="fasta">
33 <param format="fasta" name="fasta_input" type="data" label="from"/>
34 <param name="evalue" type="float" size="15" value="0.00001" label="evalue threshold for the blasting" />
35 </when>
36 <when value="blast">
37 <param format="tabular" name="blast_input" type="data" label="from"/>
38 </when>
39 </conditional>
40 34
41 <param name="tax_lev" type="select" label="Taxonomic Level" help="The taxonomic level for the relative abundance output"> 35 .. class:: infomark
42 <option value="a">All taxonomic levels</option> 36
43 <option value="k">Kingdoms (Bacteria and Archaea) only</option> 37 **Input example:** You can try out MetaPhlAn using the synthetic dataset (250,000 reads) available at: http://huttenhower.sph.harvard.edu/sites/default/files/LC1.fna . There is no need to download the file, you can just copy-and-paste the dataset address in the "Upload File" module inside the "Load Data" link here in the left panel.
44 <option value="p">Phyla only</option> 38
45 <option value="c">Classes only</option> 39 .. class:: infomark
46 <option value="o">Orders only</option> 40
47 <option value="f">Families only</option> 41 **Computational time:** Unless the server is overloaded, you should expect the tool to process ~10,000 reads per second. The synthetic metagenome linked above (250,000 reads) should take no more than 30 seconds to complete.
48 <option value="g">Genera only</option> 42
49 <option value="s">Species only</option> 43 .. class:: infomark
50 </param> 44
51 <param name="min_cu_len" type="integer" value="10000" help="min_cu_len" label="Minimum total nucleotide length for the unique markers for estimating the abundance without considering children clade abundances" /> 45 **Tip:** If your input is in FASTQ you can convert it to FASTA using the corresponding Galaxy module included in the "Convert Format" tools.
52 <param name="min_nreads" type="integer" value="5" help="min_nreads" label="minimum total reads assigned to a clade for estimating the abundance without considering children clade abundances" /> 46
53 </inputs> 47 ---------
54 <outputs>
55 <data format="tabular" name="metaphlan_out" label="MetaPhlAn on ${on_string}" />
56 <data format="tabular" name="blast_out" from_work_dir="metagenome.outfmt6.txt" label="MetaPhlAn BLAST on ${on_string}">
57 <filter>source['type'] == "fasta"</filter>
58 </data>
59 </outputs>
60 <tests>
61 </tests>
62 <help>
63 48
64 **What it does** 49 **What it does**
65 MetaPhlAn is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data. MetaPhlAn relies on unique clade-specific marker genes identified from reference genomes, allowing orders of magnitude speedups and unambiguous taxonomic assignments.
66 50
67 MetaPhlAn main features are: 51 MetaPhlAn (Metagenomic Phylogenetic Analysis) is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data. MetaPhlAn relies on unique clade-specific marker genes identified from reference genomes, allowing orders of magnitude speedups and unambiguous taxonomic assignments.
68 52
69 More than 100x computational speedup compared to Blast-based approaches or other available methods with species level resolution 53 Although MetaPhlAn can use both BlastN and BowTie2 in the read-to-marker mapping step, this Galaxy module uses only BowTie2 for computational reasons.
70 Higher accuracy in estimating the true composition of microbial communities in terms of organismal relative abundance 54
71 Unambiguous read-to-taxa assignments as conserved inter-clade sequences are removed from the reference sequence data 55 For additional information about MetaPhlAn and the MetaPhlAn command line package, please refer to http://huttenhower.sph.harvard.edu/metaphlan or to the paper reported below. Please note that most of the additional parameters that can be tuned with the command line version are set here to the default values.
72 56
73 --------- 57 ---------
74 58
75 **Inputs** 59 **Inputs**
76 60
77 The input file can be a multi-fasta file containing metagenomic reads OR a NCBI BLAST output file (-outfmt 6 format) of the metagenomic read fasta file against the metaflan database. 61 The input file must be a multi-fasta file containing metagenomic reads loaded with the "Get Data" module in the left panel. Reads can be as short as ~40 nt although lengths higher than 70 nt are recommended.
78 62
79 **outputs** 63 A synthetic metagenome you can use as sample input is available at http://huttenhower.sph.harvard.edu/sites/default/files/LC1.fna
80 64
81 The output is a tab-separated output file of the predicted taxon relative abundances. 65 **Outputs**
82 If the input is a multi-fasta file then the output from the BLAST operation is also provided as an additional output.
83 66
84 --------- 67 The output is a two column tab-separated plain file reporting the predicted microbial clades present in the metagenomic samples and the corresponding relative abundances.
85 68
86 **Settings**:: 69 All taxonomic levels from domain to species will be reported, and each higher taxonomic level contains the sum of the abundances of its taxonomic leaf nodes (usually species) and, possibly, some lower level "unclassified" clades.
87
88 --tax_lev TAXONOMIC_LEVEL
89 The taxonomic level for the relative abundance
90 output:
91 'a' : all taxonomic levels
92 'k' : kingdoms (Bacteria and Archaea) only
93 'p' : phyla only
94 'c' : classes only
95 'o' : orders only
96 'f' : families only
97 'g' : genera only
98 's' : species only
99 [default 'a']
100 --evalue evalue threshold for the blasting
101 [default 1e-6]
102 --min_cu_len minimum total nucleotide lenght for the unique
103 markers for estimating the abundance without
104 considering children clade abundances
105 [default 10000]
106 --min_nreads minimum total reads assigned to a clade for
107 estimating the abundance without considering
108 children clade abundances
109 [default 5]
110 70
111 ----- 71 -----
112 72
113 **Citation** 73 **Citation and contacts**
114 74
115 If you find MetaPhlAn useful in your research, please cite our paper: 75 If you find MetaPhlAn useful in your research, please cite our paper:
116 Nicola Segata, Levi Waldron, Annalisa Ballarini, Vagheesh Narasimhan, Olivier Jousson, Curtis Huttenhower. 76
117 "Fast and accurate metagenomic profiling of microbial community composition using unique clade-specific marker genes" 77 | `Nicola Segata`_, Levi Waldron, Annalisa Ballarini, Vagheesh Narasimhan, Olivier Jousson, `Curtis Huttenhower`_.
118 ***in review*** 78 | **"Fast and accurate metagenomic profiling of microbial community composition using unique clade-specific marker genes"**
79 | Nature Methods, 2012 (in press)
80
81 .. _Nicola Segata: nsegata@hsph.harvard.edu
82 .. _Curtis Huttenhower: chuttenh@hsph.harvard.edu
83
84 If you have any questions or comments, feel free to `contact us`_. Additional information is available at http://huttenhower.sph.harvard.edu/metaphlan and in the FAQ on the same page. You can also join and use our user group at https://groups.google.com/d/forum/metaphlan-users
85
86 .. _contact us: nsegata@hsph.harvard.edu
87
119 88
120 </help> 89 </help>
121 </tool> 90 </tool>
122 91
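
Reviewer note: the new help text describes the output as a two-column, tab-separated file of clade names and relative abundances. A minimal sketch of how a downstream script might read such a file is given below. It is not part of the changeset. The function name `read_profile`, the `#` comment convention, and the sample clade strings are assumptions made for illustration only.

```python
import csv
import io


def read_profile(handle):
    """Parse a two-column, tab-separated profile into {clade: abundance}."""
    profile = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and header/comment lines.
        # Assumption: such lines, if present, start with '#'.
        if not row or row[0].startswith("#"):
            continue
        clade, abundance = row[0], float(row[1])
        profile[clade] = abundance
    return profile


# Hypothetical sample content, invented for illustration (not real tool output).
sample = (
    "k__Bacteria\t100.0\n"
    "k__Bacteria|p__Firmicutes\t60.5\n"
    "k__Bacteria|p__Proteobacteria\t39.5\n"
)

profile = read_profile(io.StringIO(sample))

# List clades from most to least abundant.
for clade, abundance in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(f"{abundance:6.2f}\t{clade}")
```

In practice the handle would come from opening the tabular dataset downloaded from the Galaxy history, for example `open("profile.tsv", newline="")`, where the file name is again hypothetical.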