annotate tools/align_back_trans/align_back_trans.xml @ 7:883842b81796 draft default tip

"Update all the pico_galaxy tools on main Tool Shed"
author peterjc
date Fri, 16 Apr 2021 22:26:52 +0000
parents b27388e5a0bb
children
rev   line source
6
b27388e5a0bb v0.0.10 removed unused reference to muscle format
peterjc
parents: 5
diff changeset
1 <tool id="align_back_trans" name="Thread nucleotides onto a protein alignment (back-translation)" version="0.0.10">
0
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
2 <description>Gives a codon aware alignment</description>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
3 <requirements>
4
c8469274d136 v0.0.8 Using Biopython 1.67 from Tool Shed or Conda package
peterjc
parents: 3
diff changeset
4 <requirement type="package" version="1.67">biopython</requirement>
0
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
5 </requirements>
4
c8469274d136 v0.0.8 Using Biopython 1.67 from Tool Shed or Conda package
peterjc
parents: 3
diff changeset
6 <version_command>
c8469274d136 v0.0.8 Using Biopython 1.67 from Tool Shed or Conda package
peterjc
parents: 3
diff changeset
7 python $__tool_directory__/align_back_trans.py --version
6
b27388e5a0bb v0.0.10 removed unused reference to muscle format
peterjc
parents: 5
diff changeset
8 </version_command>
4
c8469274d136 v0.0.8 Using Biopython 1.67 from Tool Shed or Conda package
peterjc
parents: 3
diff changeset
9 <command detect_errors="aggressive">
c8469274d136 v0.0.8 Using Biopython 1.67 from Tool Shed or Conda package
peterjc
parents: 3
diff changeset
10 python $__tool_directory__/align_back_trans.py $prot_align.ext '$prot_align' '$nuc_file' '$out_nuc_align' '$table'
2
9fbf29a8c12b v0.0.6 use format_source; v0.0.5 more explicit error msg, citation info
peterjc
parents: 1
diff changeset
11 </command>
0
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
12 <inputs>
6
b27388e5a0bb v0.0.10 removed unused reference to muscle format
peterjc
parents: 5
diff changeset
13 <param name="prot_align" type="data" format="fasta,clustal" label="Aligned protein file" help="Multiple sequence file in FASTA, ClustalW or PHYLIP format." />
0
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
14 <param name="table" type="select" label="Genetic code" help="Tables from the NCBI, these determine the start and stop codons">
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
15 <option value="1">1. Standard</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
16 <option value="2">2. Vertebrate Mitochondrial</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
17 <option value="3">3. Yeast Mitochondrial</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
18 <option value="4">4. Mold, Protozoan, Coelenterate Mitochondrial and Mycoplasma/Spiroplasma</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
19 <option value="5">5. Invertebrate Mitochondrial</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
20 <option value="6">6. Ciliate Macronuclear and Dasycladacean</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
21 <option value="9">9. Echinoderm Mitochondrial</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
22 <option value="10">10. Euplotid Nuclear</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
23 <option value="11">11. Bacterial</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
24 <option value="12">12. Alternative Yeast Nuclear</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
25 <option value="13">13. Ascidian Mitochondrial</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
26 <option value="14">14. Flatworm Mitochondrial</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
27 <option value="15">15. Blepharisma Macronuclear</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
28 <option value="16">16. Chlorophycean Mitochondrial</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
29 <option value="21">21. Trematode Mitochondrial</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
30 <option value="22">22. Scenedesmus obliquus</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
31 <option value="23">23. Thraustochytrium Mitochondrial</option>
4
c8469274d136 v0.0.8 Using Biopython 1.67 from Tool Shed or Conda package
peterjc
parents: 3
diff changeset
32 <option value="24">24. Pterobranchia Mitochondrial</option>
c8469274d136 v0.0.8 Using Biopython 1.67 from Tool Shed or Conda package
peterjc
parents: 3
diff changeset
33 <option value="25">25. Candidate Division SR1 and Gracilibacteria</option>
c8469274d136 v0.0.8 Using Biopython 1.67 from Tool Shed or Conda package
peterjc
parents: 3
diff changeset
34 <!-- TODO, these are not in Biopython 1.67
c8469274d136 v0.0.8 Using Biopython 1.67 from Tool Shed or Conda package
peterjc
parents: 3
diff changeset
35 <option value="26">26. Pachysolen tannophilus Nuclear</option>
c8469274d136 v0.0.8 Using Biopython 1.67 from Tool Shed or Conda package
peterjc
parents: 3
diff changeset
36 <option value="27">27. Karyorelict Nuclear</option>
c8469274d136 v0.0.8 Using Biopython 1.67 from Tool Shed or Conda package
peterjc
parents: 3
diff changeset
37 <option value="28">28. Condylostoma Nuclear</option>
c8469274d136 v0.0.8 Using Biopython 1.67 from Tool Shed or Conda package
peterjc
parents: 3
diff changeset
38 <option value="29">29. Mesodinium Nuclear</option>
c8469274d136 v0.0.8 Using Biopython 1.67 from Tool Shed or Conda package
peterjc
parents: 3
diff changeset
39 <option value="30">30. Peritrich Nuclear</option>
c8469274d136 v0.0.8 Using Biopython 1.67 from Tool Shed or Conda package
peterjc
parents: 3
diff changeset
40 <option value="31">31. Blastocrithidia Nuclear</option>
c8469274d136 v0.0.8 Using Biopython 1.67 from Tool Shed or Conda package
peterjc
parents: 3
diff changeset
41 -->
0
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
42 <option value="0">Don't check the translation</option>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
43 </param>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
44 <param name="nuc_file" type="data" format="fasta" label="Unaligned nucleotide sequences" help="FASTA format, using same identifiers as your protein alignment" />
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
45 </inputs>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
46 <outputs>
2
9fbf29a8c12b v0.0.6 use format_source; v0.0.5 more explicit error msg, citation info
peterjc
parents: 1
diff changeset
47 <data name="out_nuc_align" format_source="prot_align" metadata_source="prot_align" label="${prot_align.name} (back-translated)"/>
0
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
48 </outputs>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
49 <tests>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
50 <test>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
51 <param name="prot_align" value="demo_prot_align.fasta" />
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
52 <param name="nuc_file" value="demo_nucs.fasta" />
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
53 <param name="table" value="0" />
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
54 <output name="out_nuc_align" file="demo_nuc_align.fasta" />
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
55 </test>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
56 <test>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
57 <param name="prot_align" value="demo_prot_align.fasta" />
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
58 <param name="nuc_file" value="demo_nucs_trailing_stop.fasta" />
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
59 <param name="table" value="11" />
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
60 <output name="out_nuc_align" file="demo_nuc_align.fasta" />
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
61 </test>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
62 </tests>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
63 <help>
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
64 **What it does**
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
65
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
66 Takes an input file of aligned protein sequences (typically FASTA or Clustal
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
67 format), and a matching file of unaligned nucleotide sequences (FASTA format,
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
68 using the same identifiers), and threads the nucleotide sequences onto the
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
69 protein alignment to produce a codon aware nucleotide alignment - which can
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
70 be viewed as a back translation.
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
71
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
72 If you specify one of the standard NCBI genetic codes (recommended), then the
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
73 translation is verified. This will allow fuzzy matching if stop codons in the
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
74 protein sequence have been represented as X, and will allow for a trailing stop
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
75 codon present in the nucleotide sequences but not the protein.
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
76
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
77 Note - the protein and nucleotide sequences must use the same identifiers.
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
78
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
79 Note - If no translation table is specified, the provided nucleotide sequences
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
80 should be exactly three times the length of the protein sequences (excluding the gaps).
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
81
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
82 Note - the nucleotide FASTA file may contain extra sequences not in the
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
83 protein alignment; they will be ignored. This can be useful if, for example,
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
84 you have a nucleotide FASTA file containing all the genes in an organism,
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
85 while the protein alignment is for a specific gene family.
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
86
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
87 **Example**
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
88
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
89 Given this protein alignment in FASTA format::
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
90
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
91 >Alpha
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
92 DEER
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
93 >Beta
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
94 DE-R
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
95 >Gamma
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
96 D--R
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
97
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
98 and this matching unaligned nucleotide FASTA file::
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
99
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
100 >Alpha
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
101 GATGAGGAACGA
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
102 >Beta
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
103 GATGAGCGU
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
104 >Gamma
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
105 GATCGG
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
106
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
107 the tool would return this nucleotide alignment::
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
108
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
109 >Alpha
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
110 GATGAGGAACGA
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
111 >Beta
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
112 GATGAG---CGU
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
113 >Gamma
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
114 GAT------CGG
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
115
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
116 Notice that all the gaps are multiples of three in length.
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
117
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
118
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
119 **Citation**
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
120
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
121 This tool uses Biopython, so if you use this Galaxy tool in work leading to a
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
122 scientific publication please cite the following paper:
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
123
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
124 Cock et al. (2009). Biopython: freely available Python tools for computational
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
125 molecular biology and bioinformatics. Bioinformatics 25(11) 1422-3.
7
883842b81796 "Update all the pico_galaxy tools on main Tool Shed"
peterjc
parents: 6
diff changeset
126 https://doi.org/10.1093/bioinformatics/btp163 pmid:19304878.
0
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
127
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
128 This tool is available to install into other Galaxy Instances via the Galaxy
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
129 Tool Shed at http://toolshed.g2.bx.psu.edu/view/peterjc/align_back_trans
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
130 </help>
2
9fbf29a8c12b v0.0.6 use format_source; v0.0.5 more explicit error msg, citation info
peterjc
parents: 1
diff changeset
131 <citations>
9fbf29a8c12b v0.0.6 use format_source; v0.0.5 more explicit error msg, citation info
peterjc
parents: 1
diff changeset
132 <citation type="doi">10.7717/peerj.167</citation>
9fbf29a8c12b v0.0.6 use format_source; v0.0.5 more explicit error msg, citation info
peterjc
parents: 1
diff changeset
133 <citation type="doi">10.1093/bioinformatics/btp163</citation>
9fbf29a8c12b v0.0.6 use format_source; v0.0.5 more explicit error msg, citation info
peterjc
parents: 1
diff changeset
134 </citations>
0
0c24e4e2177d Uploaded v0.0.3, first stable release.
peterjc
parents:
diff changeset
135 </tool>
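
The threading step described in the help text can be illustrated with a short Python sketch. This is not the code from align_back_trans.py; the function name thread_codons and its arguments are hypothetical, and it only covers the case where no translation table is given (table value 0), so the nucleotide sequence must be exactly three times the ungapped protein length. It does not verify the translation or handle a trailing stop codon.

```python
def thread_codons(gapped_protein, nucleotides, gap="-"):
    """Thread an unaligned nucleotide sequence onto a gapped protein sequence.

    Hypothetical helper for illustration only: each residue consumes one
    codon (three letters) and each alignment gap becomes three gap characters.
    """
    ungapped_length = len(gapped_protein) - gapped_protein.count(gap)
    if len(nucleotides) != 3 * ungapped_length:
        raise ValueError(
            "Expected %d nucleotides for %d residues, got %d"
            % (3 * ungapped_length, ungapped_length, len(nucleotides))
        )
    pieces = []
    position = 0  # index of the next unused nucleotide
    for letter in gapped_protein:
        if letter == gap:
            # A gap in the protein alignment spans a whole codon.
            pieces.append(gap * 3)
        else:
            pieces.append(nucleotides[position:position + 3])
            position += 3
    return "".join(pieces)


if __name__ == "__main__":
    # Sequences taken from the example in the tool help.
    for name, protein, nucs in [
        ("Alpha", "DEER", "GATGAGGAACGA"),
        ("Beta", "DE-R", "GATGAGCGU"),
        ("Gamma", "D--R", "GATCGG"),
    ]:
        print(">%s\n%s" % (name, thread_codons(protein, nucs)))
```

Because every protein gap is expanded to exactly three characters, the gaps in the result are always multiples of three in length, which is the property the help text points out.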