comparison alignment/phylocatenator.xml @ 0:5b9a38ec4a39 draft default tip

First commit of old repositories
author osiris_phylogenetics <ucsb_phylogenetics@lifesci.ucsb.edu>
date Tue, 11 Mar 2014 12:19:13 -0700
-1:000000000000 0:5b9a38ec4a39
1 <tool id="phylocatenator" name="Phylocatenator" version="1.0.1">
2 <description>Produces concatenated sequence file from phytab file of aligned sequences</description>
3 <command interpreter="perl">
4 phylocatenator.pl $input1 $genes $species $mingene $species_file $models_file $out_file1 $partition_file $html_file > $phylocat_log
5 </command>
6 <inputs>
7 <!-- <display> $input1 with $genes and $species</display> -->
8 <param name="input1" type="data" format="tabular" label="Table containing aligned genes"/>
9 <param name="genes" type="integer" value="0" label="Genes" help="Minimum genes required per species. 0 retains all species. " />
10 <param name="mingene" type="integer" value="35" label="Min" help="Minimum length of an aligned gene family to be included " />
11 <param name="species" type="integer" value="4" label="Species" help="Minimum species per gene. 0 retains all genes." />
12 <param name="species_file" type="data" format="txt" optional="true" label="Text list of species" help="Only species in this list can be retained in the concatenated file" />
13 <param name="models_file" type="data" format="tabular" optional="true" label="Table of: Models LUT" help="To partition data by model (protein, dna, binary, etc) according to a LUT (lookup table)" />
14 <param name="outtype" type="select" label="Write as">
15 <option value="R">RAxML_phylip</option>
16 </param>
17 </inputs>
18 <outputs>
19 <data format="txt" name="out_file1" metadata_source="input1" />
20 <data format="txt" name="phylocat_log" label="${tool.name} on ${on_string}: Log File" />
21 <data format="html" name="html_file" label="${tool.name} on ${on_string}: html Table" />
22 <data format="txt" name="partition_file" label="${tool.name} on ${on_string}: Partition File" >
23 </data>
24 </outputs>
25 <help>
26 **What it does**
27
28 This tool produces a concatenated data set for phylogenetics when not all genes are sampled for all species.
29
30 ------
31
32 **Basic Example**
33
34 The input data must be in tab-delimited phytab format: column 1 is the species name, column 2 the gene family, column 3 the individual gene name, and column 4 the sequence.
35 Sequences of each gene family must be aligned::
36
37 species1 gene1 genenameA acgttagcgcgctatagc
38 species2 gene1 genenameB acgttag--cgctataaa
39 species3 gene1 genenameC acgttagcgcgctatagc
40 species4 gene1 genenameD acgttagcgcgctatagc
41 species1 gene2 genenameE --gttagtttgcta
42 species3 gene2 genenameF gtgttagtttgcta
43
44 Two parameters, $genes and $species, set thresholds for
45 inclusion of data. $species is the minimum number of species that
46 must contain a particular gene family for it to be included. $genes is the minimum number of gene families
47 that a species must have to be included in the dataset.
48
49 Running phylocatenator on the above data with 0 for genes and 0 for species yields::
50
51 4 32
52 species1 acgttagcgcgctatagc--gttagtttgcta
53 species2 acgttag--cgctataaa??????????????
54 species3 acgttagcgcgctatagcgtgttagtttgcta
55 species4 acgttagcgcgctatagc??????????????
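
The concatenation above can be sketched in Python. This is a minimal illustration, not the logic of phylocatenator.pl itself: the function name ``concatenate`` and its arguments are hypothetical, and it assumes that gene families are filtered by the species threshold before species are filtered by the gene threshold, and that columns keep the order in which gene families first appear.

```python
def concatenate(rows, min_genes=0, min_species=0):
    """Concatenate phytab rows given as (species, family, gene_name, sequence)."""
    families = {}  # gene family -> {species: aligned sequence}, in input order
    for species, family, _gene_name, seq in rows:
        families.setdefault(family, {})[species] = seq

    # Assumption: gene families are filtered first, then species.
    kept = {fam: seqs for fam, seqs in families.items() if len(seqs) >= min_species}
    widths = {fam: len(next(iter(seqs.values()))) for fam, seqs in kept.items()}

    # Species keep the order in which they first appear in the retained families.
    species_order = []
    for seqs in kept.values():
        for sp in seqs:
            if sp not in species_order:
                species_order.append(sp)
    species_order = [
        sp for sp in species_order
        if sum(sp in seqs for seqs in kept.values()) >= min_genes
    ]

    # Missing gene families are padded with "?" to the family's alignment width.
    matrix = [
        (sp, "".join(kept[fam].get(sp, "?" * widths[fam]) for fam in kept))
        for sp in species_order
    ]
    return len(matrix), sum(widths.values()), matrix
```

Called on the six example rows with both thresholds at 0, the sketch returns 4 species, a total length of 32, and the four concatenated sequences shown above.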
56
57 **Optional Functionality**
58
59 I. You may enter a list of species.
60 Species not in this list will not be written to the output file.
61 For example, a species list of::
62
63 species1
64 species2
65
66
67 would change the above output to::
68
69 species1 acgttagcgcgctatagc--gttagtttgcta
70 species2 acgttag--cgctataaa??????????????
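
The effect of the species list can be sketched as a simple filter. The helper name ``filter_species`` is hypothetical, and whether phylocatenator applies the list before or after the gene and species thresholds is not stated here, so this sketch only filters an already concatenated matrix.

```python
def filter_species(matrix, species_text):
    """Keep only (species, sequence) pairs whose species is named in the list."""
    # One species name per line; blank lines are ignored.
    wanted = {name.strip() for name in species_text.splitlines() if name.strip()}
    return [(sp, seq) for sp, seq in matrix if sp in wanted]
```

With the four-species matrix above and a list containing species1 and species2, only those two rows remain.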
71
72 II. Table of partition models
73
74 You may enter a table of models for each gene family/partition. Phylocatenator will then sort the data so that all gene families
75 with the same model are adjacent, and create the corresponding partition file, which specifies each model for RAxML.
76 Currently, data can only be partitioned into valid RAxML models.
77
78 The format is a tab-delimited file as follows::
79
80 gene1 WAG
81 gene2 JTT
82 gene3 DNA
83 gene4 WAG
84
85 Valid models include the following::
86
87 BIN = binary morphological data
88 MULTI = multistate morphological data
89 DNA = DNA data
90 WAG = one of several protein models listed in the RAxML help documents
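
The sorting and partition step can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual code: the function name ``partition_lines`` is hypothetical, one partition line is written per gene family, and the lines follow the RAxML partition file convention of ``MODEL, name = start-end``. The exact file that phylocatenator writes may differ.

```python
def partition_lines(family_lengths, models):
    """family_lengths: {family: alignment length}; models: {family: model name}."""
    # A stable sort by model name keeps families with the same model adjacent.
    ordered = sorted(family_lengths, key=lambda fam: models[fam])
    lines, start = [], 1
    for fam in ordered:
        end = start + family_lengths[fam] - 1  # 1-based, inclusive column range
        lines.append(f"{models[fam]}, {fam} = {start}-{end}")
        start = end + 1
    return ordered, lines
```

For the example table above, the two WAG families end up next to each other in the concatenated alignment, each with its own column range.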
91
92 III. Attribute
93
94 You may enter a table with an attribute/value for each gene family/partition. Phylocatenator will then select the data based
95 on that value.
96
97 The format is a tab-delimited file as follows::
98
99 gene1 3.1
100 gene2 2.2
101 gene3 0.9
102 gene4 6.5
103
104 You can choose gene partitions based on the attribute value.
105 For example, if the numbers above represent rates of evolution, you could
106 choose to include 'slow' genes with a rate less than 2.5.
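
A selection of this kind can be sketched in a few lines. The function name ``select_families`` and the ``max_value`` argument are hypothetical; the sketch assumes a two-column tab-delimited table and a strict "less than" comparison.

```python
def select_families(table_text, max_value):
    """Return gene families whose attribute value is below max_value."""
    selected = []
    for line in table_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        family, value = line.split("\t")
        if max_value > float(value):
            selected.append(family)
    return selected
```

With the table above and a cutoff of 2.5, the sketch selects gene2 and gene3.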
107
108 ------
109
110 **Additional Information**
111
112 http://osiris-phylogenetics.blogspot.com/2012/10/phylocatenator.html
113 Please direct questions or comments to ucsb_phylogenetics@lifesci.ucsb.edu or, if you can, enter them on the osiris_phylogenetics site at bitbucket.org.
114
115 ------
116
117 **Citation**
118
119 This tool is part of the Osiris Phylogenetics Tool Package for Galaxy. If you make extensive use of this tool in a publication, please consider citing the following.
120
121 The current Osiris citation is here:
122
123 http://osiris-phylogenetics.blogspot.com/2012/10/citation.html
124
125 The tool was first used in this paper:
126
127 Oakley, Todd H, Joanna M Wolfe, Annie R Lindgren, and Alexander K Zaharoff. 2012. Phylotranscriptomics to Bring the Understudied into the Fold: Monophyletic Ostracoda, Fossil Placement, and Pancrustacean Phylogeny. Molecular Biology and Evolution. doi:10.1093/molbev/mss216.
128
129 </help>
130 </tool>