Mercurial > repos > devteam > convert_solid_color2nuc
diff convert_SOLiD_color2nuc.xml @ 0:ab28e7de2db3 draft default tip
Imported from capsule None
author | devteam |
---|---|
date | Mon, 19 May 2014 12:33:14 -0400 |
parents | |
children |
line wrap: on
line diff
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/convert_SOLiD_color2nuc.xml Mon May 19 12:33:14 2014 -0400 @@ -0,0 +1,72 @@ +<tool id="color2nuc" name="Convert Color Space" version="1.0.0"> +<description> to Nucleotides </description> +<command interpreter="python">convert_SOLiD_color2nuc.py $input1 $input2 $output1 </command> + +<inputs> + <param name="input1" type="data" format="txt" label="SOLiD color coding file" /> + <param name="input2" type="select" label="Keep prefix nucleotide"> + <option value="yes">Yes</option> + <option value="no">No</option> + </param> +</inputs> +<outputs> + <data name="output1" format="fasta" /> +</outputs> +<!-- +<tests> + <test> + <param name="input1" value="convert_SOLiD_color2nuc_test1.txt" ftype="txt" /> + <param name="input2" value="no" /> + <output name="output1" file="convert_SOLiD_color2nuc_test1.out" /> + </test> +</tests> +--> +<help> + +.. class:: warningmark + +The tool was designed for color space files generated from an ABI SOLiD sequencer. The file format must be fasta-like: the title starts with a ">" character, and each color space sequence starts with a leading nucleotide. + +----- + +**What it does** + +This tool converts a color space sequence to nucleotides. The leading character must be a nucleotide: A, C, G, or T. + +----- + +**Example** + +- If the color space file looks like this:: + + >seq1 + A013 + >seq2 + T011213122200221123032111221021210131332222101 + +- If you would like to **keep** the leading nucleotide:: + + >seq1 + AACG + >seq2 + TTGTCATGAGAAAGACAGCCGACACTCAAGTCAACGTATCTCTGGT + +- If you **do not want to keep** the leading nucleotide (the length of nucleotide sequence will be one less than the color-space sequence):: + + >seq1 + ACG + >seq2 + TGTCATGAGAAAGACAGCCGACACTCAAGTCAACGTATCTCTGGT + +----- + +**ABI SOLiD Color Coding Alignment matrix** + + Each di-nucleotide is represented by a single digit: 0 to 3. The matrix is symmetric, thus the leading nucleotide is necessary to determine the sequence (otherwise there are four possibilities). + + + .. image:: dualcolorcode.png + + +</help> +</tool>