annotate tools/filters/axt_to_concat_fasta.xml @ 0:9071e359b9a3

Uploaded
author xuebing
date Fri, 09 Mar 2012 19:37:19 -0500
rev   line source
0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1 <tool id="axt_to_concat_fasta" name="AXT to concatenated FASTA">
2 <description>Converts an AXT formatted file to a concatenated FASTA alignment</description>
3 <command interpreter="python">axt_to_concat_fasta.py $dbkey_1 $dbkey_2 &lt; $axt_input &gt; $out_file1</command>
4 <inputs>
5 <param format="axt" name="axt_input" type="data" label="AXT file"/>
6 <param name="dbkey_1" type="genomebuild" label="Genome"/>
7 <param name="dbkey_2" type="genomebuild" label="Genome"/>
8 </inputs>
9 <outputs>
10 <data format="fasta" name="out_file1" />
11 </outputs>
12 <tests>
13 <test>
14 <param name="axt_input" value="1.axt" ftype="axt" />
15 <param name="dbkey_1" value='hg17' />
16 <param name="dbkey_2" value="panTro1" />
17 <output name="out_file1" file="axt_to_concat_fasta.dat" />
18 </test>
19 </tests>
20 <help>
21
22 .. class:: warningmark
23
24 **IMPORTANT**: AXT formatted alignments will be phased out of Galaxy in the coming weeks. They will be replaced with pairwise MAF alignments, which are already available. To try pairwise MAF alignments, use the "Extract Pairwise MAF blocks" tool in the *Fetch Sequences and Alignments* section.
25
26 --------
27
28 **Syntax**
29
30 This tool converts an AXT formatted file to FASTA format and concatenates the sequences of all alignment blocks, producing one FASTA record per genome build.
31
32 - **AXT format** The alignments are produced by Blastz, an alignment tool available from Webb Miller's lab at Penn State University. The lav-format Blastz output, which does not include the sequence, was converted to AXT format with lavToAxt. Each alignment block in an AXT file contains three lines: a summary line and two sequence lines. Blocks are separated from one another by blank lines.
33
34 - **FASTA format** a text-based format for representing both nucleotide and protein sequences, in which nucleotides or amino acids are represented using single-letter codes.
35
36 - This format contains a one-line header. It starts with a ">" symbol. The first word on this line is the name of the sequence. The rest of the line is a description of the sequence.
37 - The remaining lines contain the sequence itself.
38 - Blank lines in a FASTA file are ignored, and so are spaces or other gap symbols (dashes, underscores, periods) in a sequence.
39 - FASTA files containing multiple sequences follow the same layout, with one sequence listed right after another. This format is accepted by many multiple sequence alignment programs.
40
41 -----
42
43 **Example**
44
45 - AXT format::
46
47 0 chr19 3001012 3001075 chr11 70568380 70568443 - 3500
48 TCAGCTCATAAATCACCTCCTGCCACAAGCCTGGCCTGGTCCCAGGAGAGTGTCCAGGCTCAGA
49 TCTGTTCATAAACCACCTGCCATGACAAGCCTGGCCTGTTCCCAAGACAATGTCCAGGCTCAGA
50
51 1 chr19 3008279 3008357 chr11 70573976 70574054 - 3900
52 CACAATCTTCACATTGAGATCCTGAGTTGCTGATCAGAATGGAAGGCTGAGCTAAGATGAGCGACGAGGCAATGTCACA
53 CACAGTCTTCACATTGAGGTACCAAGTTGTGGATCAGAATGGAAAGCTAGGCTATGATGAGGGACAGTGCGCTGTCACA
54
55 - Converting the above file to concatenated FASTA format, with hg16 and mm5 selected as the two genomes, gives::
56
57 &gt;hg16
58 TCAGCTCATAAATCACCTCCTGCCACAAGCCTGGCCTGGTCCCAGGAGAGTGTCCAGGCTCAGACACAATCTTCACATTGAGATCCTGAGTTGCTGATCAGAATGGAAGGCTGAGCTAAGATGAGCGACGAGGCAATGTCACA
59 &gt;mm5
60 TCTGTTCATAAACCACCTGCCATGACAAGCCTGGCCTGTTCCCAAGACAATGTCCAGGCTCAGACACAGTCTTCACATTGAGGTACCAAGTTGTGGATCAGAATGGAAAGCTAGGCTATGATGAGGGACAGTGCGCTGTCACA
61
62 </help>
63 </tool>
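
The tool's command line calls axt_to_concat_fasta.py, which is not part of this file. As a rough illustration of the conversion described in the help text, here is a minimal Python sketch. It is not the actual script: the function name, the strict three-lines-per-block check, and the skipping of lines starting with "#" are all assumptions made for this example, and the input below is made-up toy data rather than a real alignment.

```python
def axt_to_concat_fasta(axt_text, name_1, name_2):
    """Concatenate the two sequence lines of every AXT block into two FASTA records.

    Hypothetical helper for illustration; not the tool's real implementation.
    """
    seqs_1, seqs_2 = [], []
    block = []
    # Appending an empty line guarantees that the last block is flushed.
    for line in axt_text.splitlines() + [""]:
        line = line.strip()
        if line.startswith("#"):
            # Assumption: treat lines starting with "#" as comments and skip them.
            continue
        if line:
            block.append(line)
            continue
        if not block:
            # Consecutive blank lines between blocks.
            continue
        if len(block) != 3:
            raise ValueError("expected 3 lines per AXT block, got %d" % len(block))
        # block[0] is the summary line; block[1] and block[2] are the two sequences.
        seqs_1.append(block[1])
        seqs_2.append(block[2])
        block = []
    return ">%s\n%s\n>%s\n%s\n" % (name_1, "".join(seqs_1), name_2, "".join(seqs_2))


# Toy input (made up): two short blocks separated by a blank line.
toy_axt = (
    "0 chr19 1 4 chr11 1 4 - 100\n"
    "ACGT\n"
    "ACCT\n"
    "\n"
    "1 chr19 9 11 chr11 9 11 - 50\n"
    "GGA\n"
    "G-A\n"
)
print(axt_to_concat_fasta(toy_axt, "hg16", "mm5"), end="")
```

The two names passed in play the role of dbkey_1 and dbkey_2 from the tool's inputs: they become the FASTA headers, while the chromosome names and coordinates on each summary line are dropped.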