<tool id="color2nuc" name="Convert Color Space" version="1.0.0">
  <description> to Nucleotides </description>
  <command interpreter="python">convert_SOLiD_color2nuc.py $input1 $input2 $output1 </command>

  <inputs>
    <param name="input1" type="data" format="txt" label="SOLiD color coding file" />
    <param name="input2" type="select" label="Keep prefix nucleotide">
      <option value="yes">Yes</option>
      <option value="no">No</option>
    </param>
  </inputs>
  <outputs>
    <data name="output1" format="fasta" />
  </outputs>
  <!--
  <tests>
    <test>
      <param name="input1" value="convert_SOLiD_color2nuc_test1.txt" ftype="txt" />
      <param name="input2" value="no" />
      <output name="output1" file="convert_SOLiD_color2nuc_test1.out" />
    </test>
  </tests>
  -->
  <help>

.. class:: warningmark

This tool is designed for color-space files generated by an ABI SOLiD sequencer. The file must be FASTA-like: each title line starts with a ">" character, and each color-space sequence starts with a leading nucleotide.
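
As a rough illustration of the format described above, the following Python sketch checks whether a single record is FASTA-like in this sense. The function name ``is_valid_record`` and the regular expression are hypothetical and not part of the tool; the sketch assumes that colors are limited to the digits 0 to 3 and that the leading nucleotide is uppercase:

```python
import re

# Hypothetical pattern: one leading nucleotide followed by color digits 0-3.
COLOR_SEQ = re.compile(r"[ACGT][0-3]+")


def is_valid_record(title, sequence):
    """Return True if the title starts with '>' and the sequence is
    a leading nucleotide followed by one or more color digits."""
    return title.startswith(">") and COLOR_SEQ.fullmatch(sequence) is not None


print(is_valid_record(">seq1", "A013"))   # True
print(is_valid_record("seq1", "A013"))    # False: title lacks '>'
print(is_valid_record(">seq1", "0132"))   # False: no leading nucleotide
```

How the tool itself reacts to malformed records is not documented here, so treat this only as a pre-check you might run on your own input.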

-----

**What it does**

This tool converts color-space sequences to nucleotide sequences. The leading character of each sequence must be a nucleotide: A, C, G, or T.
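
The conversion can be sketched in a few lines of Python. This is a minimal illustration, not the code of ``convert_SOLiD_color2nuc.py``: the names ``TRANSITIONS`` and ``color2nuc`` and the ``keep_prefix`` parameter are assumptions made for the example, and the table is the standard SOLiD two-base encoding written out explicitly:

```python
# TRANSITIONS[base][color] gives the base that follows `base`
# when the observed color digit is `color`.
TRANSITIONS = {
    "A": {"0": "A", "1": "C", "2": "G", "3": "T"},
    "C": {"0": "C", "1": "A", "2": "T", "3": "G"},
    "G": {"0": "G", "1": "T", "2": "A", "3": "C"},
    "T": {"0": "T", "1": "G", "2": "C", "3": "A"},
}


def color2nuc(color_seq, keep_prefix=True):
    """Decode a color-space string such as 'A013' into nucleotides."""
    prefix = color_seq[0].upper()
    if prefix not in TRANSITIONS:
        raise ValueError("sequence must start with A, C, G, or T")
    bases = [prefix]
    for color in color_seq[1:]:
        # Each color, combined with the previous base, fixes the next base.
        bases.append(TRANSITIONS[bases[-1]][color])
    decoded = "".join(bases)
    return decoded if keep_prefix else decoded[1:]


print(color2nuc("A013"))                     # AACG
print(color2nuc("A013", keep_prefix=False))  # ACG
```

Because every base depends on the one before it, decoding has to proceed from left to right, starting at the leading nucleotide.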

-----

**Example**

- If the color-space file looks like this::

    >seq1
    A013
    >seq2
    T011213122200221123032111221021210131332222101

- If you would like to **keep** the leading nucleotide::

    >seq1
    AACG
    >seq2
    TTGTCATGAGAAAGACAGCCGACACTCAAGTCAACGTATCTCTGGT

- If you **do not want to keep** the leading nucleotide (the nucleotide sequence will be one character shorter than the color-space sequence)::

    >seq1
    ACG
    >seq2
    TGTCATGAGAAAGACAGCCGACACTCAAGTCAACGTATCTCTGGT

-----

**ABI SOLiD Color Coding Alignment matrix**

Each dinucleotide is represented by a single digit, 0 to 3. The matrix is symmetric, and each digit appears once for every starting nucleotide, so the leading nucleotide is needed to determine the sequence (otherwise there are four possibilities).
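
The matrix can also be generated rather than typed out: with the commonly used 2-bit encoding A=0, C=1, G=2, T=3, the color of a dinucleotide is the XOR of its two base codes. The Python sketch below uses illustrative names (``color_of``, ``decode``) that are not taken from the tool. It rebuilds the matrix and shows why a color string without its leading nucleotide has four possible decodings:

```python
BASES = "ACGT"  # the index in this string is the 2-bit code of each base


def color_of(first, second):
    """Color digit for a dinucleotide: XOR of the two base codes."""
    return BASES.index(first) ^ BASES.index(second)


def decode(prefix, colors):
    """Decode a string of color digits, starting from the given base."""
    bases = [prefix]
    for color in colors:
        # XOR is its own inverse, so the same operation recovers the next base.
        bases.append(BASES[BASES.index(bases[-1]) ^ int(color)])
    return "".join(bases)


# Print the 4x4 matrix; rows are the first base, columns the second.
for first in BASES:
    print(first, " ".join(str(color_of(first, second)) for second in BASES))

# Without a known prefix, the colors "013" fit four different sequences.
for prefix in BASES:
    print(prefix, decode(prefix, "013"))
```

For the colors ``013`` the four candidates are AACG, CCAT, GGTA, and TTGC. Only the leading nucleotide selects among them.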


.. image:: dualcolorcode.png


  </help>
</tool>