<tool id="deseq-hts" name="DESeq" version="1.12.1">
  <description>Determines differentially expressed transcripts from read alignments</description>
  <requirements>
    <requirement type="package" version="0.1">oqtans</requirement>
  </requirements>
  <command interpreter="bash">
./../src/deseq-hts.sh $anno_input_selected $deseq_out $deseq_out.extra_files_path/gene_map.mat
#for $i in $replicate_groups
#for $j in $i.replicates
$j.bam_alignment:#slurp
#end for

#end for
>> $Log_File </command>
  <inputs>
    <param format="gff3" name="anno_input_selected" type="data" label="Genome annotation in GFF3 file" help="A tab-delimited format for storing sequence features and annotations"/>
    <repeat name="replicate_groups" title="Replicate group" min="2">
      <repeat name="replicates" title="Replicate">
        <param format="bam" name="bam_alignment" type="data" label="BAM alignment file" help="BAM alignment file. Can be generated from SAM files using SAMtools."/>
      </repeat>
    </repeat>
  </inputs>

  <outputs>
    <data format="txt" name="deseq_out" label="${tool.name} on ${on_string}: Differential Expression"/>
    <data format="txt" name="Log_File" label="${tool.name} on ${on_string}: log"/>
  </outputs>

  <tests>
    <test>
      <!-- command:
      ./deseq-hts.sh ../test_data/deseq_c_elegans_WS200-I-regions.gff3 ../test_data/deseq_c_elegans_WS200-I-regions_deseq.txt ../test_data/genes.mat ../test_data/deseq_c_elegans_WS200-I-regions-SRX001872.bam ../test_data/deseq_c_elegans_WS200-I-regions-SRX001875.bam
      -->
      <param name="anno_input_selected" value="deseq_c_elegans_WS200-I-regions.gff3" ftype="gff3" />
      <param name="bam_alignments1" value="deseq_c_elegans_WS200-I-regions-SRX001872.bam" ftype="bam" />
      <param name="bam_alignments2" value="deseq_c_elegans_WS200-I-regions-SRX001875.bam" ftype="bam" />
      <output name="deseq_out" file="deseq_c_elegans_WS200-I-regions_deseq.txt" />
    </test>
  </tests>

  <help>

.. class:: infomark

**What it does**

DESeq_ is a tool for differential expression testing of RNA-Seq data.

.. _DESeq: http://bioconductor.org/packages/release/bioc/html/DESeq.html

`DESeq` requires:

A genome annotation file in GFF3 format, containing the necessary information about the transcripts that are to be quantified.

The BAM alignment files, which store the read alignments, grouped into replicate groups that each contain several replicates. The program will also work with only two groups containing a single replicate each. However, such an analysis has less statistical power and is therefore not recommended!
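
The grouping described above can be sketched in a few lines of Python. This is only an illustration of how the wrapper's command template appears to lay out the BAM paths: each path is followed by a colon, and groups are assumed to end up separated by whitespace. The function name and file names are hypothetical, and how deseq-hts.sh itself parses the arguments is not shown here.

```python
def build_replicate_args(replicate_groups):
    """Join BAM paths the way the command template appears to.

    Each replicate path is followed by a colon, and each group is
    separated from the next by whitespace (an assumption based on
    reading the template, not on the script that consumes it).
    """
    groups = []
    for group in replicate_groups:
        # every path in a group is written with a trailing colon
        groups.append("".join(path + ":" for path in group))
    return " ".join(groups)


# Hypothetical file names: two groups with two replicates each
args = build_replicate_args([["a1.bam", "a2.bam"], ["b1.bam", "b2.bam"]])
print(args)
```

With one replicate per group the same function yields two single-path groups, which corresponds to the minimal, less powerful setup mentioned above.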

------

**Licenses**

If **DESeq** is used to obtain results for scientific publications it
should be cited as [1]_.

**References**

.. [1] Anders, S and Huber, W (2010): `Differential expression analysis for sequence count data`_.

.. _Differential expression analysis for sequence count data: http://dx.doi.org/10.1186/gb-2010-11-10-r106

------

.. class:: infomark

**About formats**


**GFF3 format** General Feature Format is a format for describing genes
and other features associated with DNA, RNA and protein
sequences. GFF3 lines have nine tab-separated fields:

1. seqid - The name of a chromosome or scaffold.
2. source - The program that generated this feature.
3. type - The name of this type of feature. Some examples of standard feature types are "gene", "CDS", "protein", "mRNA", and "exon".
4. start - The starting position of the feature in the sequence. The first base is numbered 1.
5. stop - The ending position of the feature (inclusive).
6. score - A score between 0 and 1000. If there is no score value, enter ".".
7. strand - Valid entries include '+', '-', or '.' (for don't know/care).
8. phase - If the feature is a coding exon, phase should be a number between 0 and 2 that represents the reading frame of the first base. If the feature is not a coding exon, the value should be '.'.
9. attributes - A semicolon-separated list of tag=value pairs. Tags such as ID and Parent link related lines together into a single item.

For more information see http://www.sequenceontology.org/gff3.shtml
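
As a rough illustration of the nine fields listed above, the following Python sketch splits one GFF3 feature line into named values. The function name and the example line are made up for this illustration, and the sketch ignores comment lines, directives and escaped characters that a full parser would need to handle.

```python
GFF3_FIELDS = ("seqid", "source", "type", "start", "stop",
               "score", "strand", "phase", "attributes")


def parse_gff3_line(line):
    """Split one GFF3 feature line into a dict keyed by field name."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(GFF3_FIELDS):
        raise ValueError("expected 9 tab-separated fields")
    record = dict(zip(GFF3_FIELDS, values))
    # start and stop are 1-based, inclusive coordinates
    record["start"] = int(record["start"])
    record["stop"] = int(record["stop"])
    # attributes are semicolon-separated tag=value pairs
    record["attributes"] = dict(
        pair.split("=", 1) for pair in record["attributes"].split(";") if pair
    )
    return record


# Made-up feature line for illustration
example = "I\texample\tgene\t100\t500\t.\t+\t.\tID=gene1;Name=demo"
record = parse_gff3_line(example)
print(record["type"], record["start"], record["stop"], record["attributes"]["ID"])
```

Fields without a value keep the "." placeholder as a plain string, so callers can decide how to treat missing scores or phases.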

**SAM/BAM format** The Sequence Alignment/Map (SAM) format is a
tab-delimited text format that stores large nucleotide sequence
alignments. BAM is the binary version of a SAM file that allows for
fast and intensive data processing. The format specification and the
description of SAMtools can be found at
http://samtools.sourceforge.net/.

------

DESeq-hts Wrapper Version 0.5 (Aug 2013)

  </help>
</tool>