<tool id="deseq-hts" name="DESeq" version="1.6.1">
  <description>Determines differentially expressed transcripts from read alignments</description>
  <command>
deseq-hts/src/deseq-hts.sh $anno_input_selected $deseq_out $deseq_out.extra_files_path/gene_map.mat
#for $i in $replicate_groups
#for $j in $i.replicates
$j.bam_alignment:#slurp
#end for

#end for
>> $Log_File </command>
12 <inputs>
|
|
13 <param format="gff3" name="anno_input_selected" type="data" label="Genome annotation in GFF3 file" help="A tab delimited format for storing sequence features and annotations"/>
|
|
14 <repeat name="replicate_groups" title="Replicate group" min="2">
|
|
15 <repeat name="replicates" title="Replicate">
|
|
16 <param format="bam" name="bam_alignment" type="data" label="BAM alignment file" help="BAM alignment file. Can be generated from SAM files using the SAM Tools."/>
|
|
17 </repeat>
|
|
18 </repeat>
|
|
19 </inputs>
|
|
20
|
|
21 <outputs>
|
|
22 <data format="txt" name="deseq_out" label="DESeq result"/>
|
|
23 <data format="txt" name="Log_File" label="DESeq log file"/>
|
|
24 </outputs>
|
|
25
|
|
  <tests>
    <test>
      <!-- Equivalent command line:
      ./deseq-hts.sh ../test_data/deseq_c_elegans_WS200-I-regions.gff3 ../test_data/deseq_c_elegans_WS200-I-regions_deseq.txt ../test_data/genes.mat ../test_data/deseq_c_elegans_WS200-I-regions-SRX001872.bam ../test_data/deseq_c_elegans_WS200-I-regions-SRX001875.bam
      -->
      <param name="anno_input_selected" value="deseq_c_elegans_WS200-I-regions.gff3" ftype="gff3" />
      <repeat name="replicate_groups">
        <repeat name="replicates">
          <param name="bam_alignment" value="deseq_c_elegans_WS200-I-regions-SRX001872.bam" ftype="bam" />
        </repeat>
      </repeat>
      <repeat name="replicate_groups">
        <repeat name="replicates">
          <param name="bam_alignment" value="deseq_c_elegans_WS200-I-regions-SRX001875.bam" ftype="bam" />
        </repeat>
      </repeat>
      <output name="deseq_out" file="deseq_c_elegans_WS200-I-regions_deseq.txt" />
    </test>
  </tests>

  <help>

.. class:: infomark

**What it does**

`DESeq` is a tool for differential expression testing of RNA-Seq data.


**Inputs**

`DESeq` requires two kinds of input:

1. An annotation file in GFF3 format, containing the necessary information about the transcripts that are to be quantified.
2. The BAM alignment files, grouped into replicate groups that each contain several replicates. BAM files store the read alignments in a compressed format. They can be generated using the `SAM-to-BAM` tool in the NGS: SAM Tools section. (The script also works with only two groups that contain a single replicate each. However, such an analysis has less statistical power and is therefore not recommended.)

**Output**

`DESeq` generates a text file containing the gene name and the p-value.

------

**Licenses**

If **DESeq** is used to obtain results for scientific publications it
should be cited as [1]_.

**References**

.. [1] Anders, S and Huber, W (2010): `Differential expression analysis for sequence count data`_.

.. _Differential expression analysis for sequence count data: http://dx.doi.org/10.1186/gb-2010-11-10-r106

------

.. class:: infomark

**About formats**


**GFF3 format** General Feature Format is a format for describing genes
and other features associated with DNA, RNA and protein
sequences. GFF3 lines have nine tab-separated fields:

1. seqid - The name of a chromosome or scaffold.
2. source - The program that generated this feature.
3. type - The name of this type of feature. Some examples of standard feature types are "gene", "CDS", "protein", "mRNA", and "exon".
4. start - The starting position of the feature in the sequence. The first base is numbered 1.
5. end - The ending position of the feature (inclusive).
6. score - A floating-point score for the feature. If there is no score value, enter ".".
7. strand - Valid entries include '+', '-', or '.' (for don't know/care).
8. phase - For coding features (type "CDS"), a number from 0 to 2 giving the number of bases to skip to reach the first base of the next codon. For all other features, the value should be '.'.
9. attributes - A semicolon-separated list of tag=value pairs. Tags such as ID and Parent link related lines, for example exons to their transcript.

For more information see http://www.sequenceontology.org/gff3.shtml

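As an illustration of the nine-column layout above, the following Python sketch splits a single GFF3 feature line into named fields. It is not part of DESeq-hts: the names `Gff3Feature` and `parse_gff3_line` and the example line are made up for this illustration, and the sketch assumes a plain feature line (no comment lines, directives, or URL-escaped characters).

```python
from typing import NamedTuple, Optional


class Gff3Feature(NamedTuple):
    """Hypothetical container for the nine GFF3 columns."""
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: Optional[float]
    strand: str
    phase: Optional[int]
    attributes: dict


def parse_gff3_line(line):
    """Split one GFF3 feature line into its nine tab-separated fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated fields, got %d" % len(fields))
    seqid, source, ftype, start, end, score, strand, phase, attrs = fields

    # Column 9 is a semicolon-separated list of tag=value pairs.
    attributes = {}
    for pair in attrs.split(";"):
        if "=" in pair:
            tag, value = pair.split("=", 1)
            attributes[tag.strip()] = value.strip()

    return Gff3Feature(
        seqid,
        source,
        ftype,
        int(start),                               # 1-based start
        int(end),                                 # inclusive end
        None if score == "." else float(score),   # "." means no score
        strand,
        None if phase == "." else int(phase),     # "." for non-coding features
        attributes,
    )


# Made-up example line, not taken from the test data.
example = "I\tWormBase\texon\t11641\t11689\t.\t+\t.\tID=exon1;Parent=Transcript:Y74C9A.2"
feature = parse_gff3_line(example)
print(feature.type, feature.start, feature.end, feature.attributes["Parent"])
```

A line with a number of columns other than nine raises a `ValueError`, which can help to spot a truncated annotation file before it is passed to the tool.
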
**SAM/BAM format** The Sequence Alignment/Map (SAM) format is a
tab-delimited text format that stores large nucleotide sequence
alignments. BAM is the binary version of a SAM file that allows for
fast and intensive data processing. The format specification and the
description of SAMtools can be found on
http://samtools.sourceforge.net/.

------

DESeq-hts Wrapper Version 0.3 (Feb 2012)

  </help>
</tool>