getalleleseq: getalleleseq/getalleleseq.xml comparison

comparison getalleleseq/getalleleseq.xml @ 6:2c31c61c48df draft

Uploaded

author	boris
date	Tue, 18 Mar 2014 12:20:59 -0400
parents	c16e440a53a0
children

-:a3710b1fe374
+:2c31c61c48df
+<tool id="getalleleseq" name="FASTA from allele counts" version="0.0.1" force_history_refresh="True">
+<description>Generate major and minor allele sequences from alleles table</description>
+<command interpreter="python">getalleleseq.py
+$alleles
+-l $seq_length
+-j $major_seq
+-p $major_seq.id
+</command>
+<inputs>
+<param format="tabular" name="alleles" type="data" label="Table containing major and minor allele bases per position" help="must be tabular and follow the Variant Annotator tool output format"/>
+<param name="seq_length" type="integer" value="16569" label="Background sequence length" help="e.g. 16569 for mitochondrial variants"/>
+</inputs>
+<outputs>
+<data format="fasta" name="major_seq"/>
+</outputs>
+<tests>
+<test>
+<param name="alleles" value="test-table-getalleleseq.tab"/>
+<param name="seq_length" value="16569"/>
+<output name="major_seq" file="test-major-allele-out-getalleleseq.fa"/>
+</test>
+</tests>
+<help>
+The major allele sequence of a sample is simply the sequence consisting of the most frequent nucleotide per position.
+Replacing the major allele with the second most frequent allele at diploid positions generates the minor allele sequence.
+-----
+.. class:: infomark
+**What it does**
+It takes the table generated by the Variant Annotator tool and derives a major and a minor allele sequence per sample.
+Since all sequences share the same length, all major allele sequences are written to a single file (with one header per sample)
+to create a multiple sequence alignment in FASTA format that can be used for downstream phylogenetic analyses.
+In contrast, the minor allele sequences are reported as separate FASTA files, one per sample, to ease their downstream manipulation.
+-----
+.. class:: warningmark
+**Note**
+Please follow the format described below for the input file:
+-----
+.. class:: infomark
+**Formats**
+**Variant Annotator tool output format**
+Columns::
+1.  sample id
+2.  chromosome
+3.  position
+4.  counts for A's
+5.  counts for C's
+6.  counts for G's
+7.  counts for T's
+8.  Coverage
+9.  Number of alleles passing frequency threshold
+10. Major allele
+11. Minor allele
+12. Minor allele frequency in position
+**FASTA multiple alignment**
+See http://www.bioperl.org/wiki/FASTA_multiple_alignment_format
+-----
+**Example**
+- For the following dataset::
+S9	chrM	3	3	0	2	214	219	0	T	A	0.013698630137
+S9	chrM	4	3	249	3	0	255	0	C	N	0.0
+S9	chrM	5	245	1	1	0	247	1	A	N	0.0
+S11	chrM	6	0	292	0	0	292	1	C	.	0.0
+S7	chrM	6	0	254	0	0	254	1	C	.	0.0
+S9	chrM	6	2	306	2	0	310	0	C	N	0.0
+S11	chrM	7	281	0	3	0	284	0	A	G	0.0105633802817
+S7	chrM	7	249	0	2	0	251	1	A	G	0.00796812749004
+etc. for all covered positions per sample...
+- Running this tool with background sequence length 16569 will produce 4 files::
+1. Multiple alignment FASTA file containing the major allele sequences of samples S7, S9 and S11
+2. minor allele sequence of sample S7
+3. minor allele sequence of sample S9
+4. minor allele sequence of sample S11
+-----
+**Citation**
+If you use this tool, please cite Dickins B, Rebolledo-Jaramillo B, et al. (2014). *Accepted in Biotechniques*
+(boris-at-bx.psu.edu)
+</help>
+</tool>
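
The help text describes how the major and minor allele sequences are built, but `getalleleseq.py` itself is not part of this diff. The following is a minimal sketch of that logic, not the actual script. The names `allele_sequences` and `to_fasta` are hypothetical. It assumes positions are 1-based, that uncovered positions are filled with `N`, and that the minor allele replaces the major one wherever column 11 holds a real base (A, C, G or T); the real script may use a different criterion, such as column 9.

```python
from collections import defaultdict

BASES = set("ACGT")


def allele_sequences(rows, seq_length, fill="N"):
    """Return {sample: (major_seq, minor_seq)} from Variant Annotator-style rows.

    Assumptions (not confirmed by the tool's source): positions are 1-based,
    uncovered positions are filled with `fill`, and a minor allele is only
    substituted when column 11 is one of A, C, G, T.
    """
    major = defaultdict(lambda: [fill] * seq_length)
    minor_calls = defaultdict(dict)

    for line in rows:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        sample, pos = fields[0], int(fields[2])
        major_base, minor_base = fields[9], fields[10]
        if not 1 <= pos <= seq_length:
            continue  # ignore positions outside the background sequence
        major[sample][pos - 1] = major_base
        if minor_base in BASES:
            minor_calls[sample][pos - 1] = minor_base

    result = {}
    for sample, seq in major.items():
        # The minor sequence starts as a copy of the major sequence and
        # differs only where a minor allele was reported.
        minor_seq = list(seq)
        for idx, base in minor_calls[sample].items():
            minor_seq[idx] = base
        result[sample] = ("".join(seq), "".join(minor_seq))
    return result


def to_fasta(name, seq, width=70):
    """Format one sequence as a FASTA record wrapped at `width` columns."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = [
        "S9\tchrM\t3\t3\t0\t2\t214\t219\t0\tT\tA\t0.013698630137",
        "S9\tchrM\t4\t3\t249\t3\t0\t255\t0\tC\tN\t0.0",
    ]
    # A short background length keeps the demo output readable.
    for sample, (major_seq, minor_seq) in allele_sequences(example, 8).items():
        print(to_fasta(sample, major_seq), end="")
        print(to_fasta(sample + "_minor", minor_seq), end="")
```

Concatenating the major-allele records of all samples gives the single multi-sample FASTA described in the help, while each minor-allele record would go to its own file.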
