comparison deseq-hts_1.0/galaxy/deseq.xml @ 0:94a108763d9e draft

deseq-hts version 1.0 wraps DESeq 1.6.0
author vipints
date Wed, 09 May 2012 20:43:47 -0400
parents
children 8ab01cc29c4b
-1:000000000000 0:94a108763d9e
1 <tool id="deseq-hts" name="DESeq" version="1.6.0">
2 <description>Determines differentially expressed transcripts from read alignments</description>
3 <command>
4 deseq-hts/src/deseq-hts.sh $anno_input_selected $deseq_out $deseq_out.extra_files_path/gene_map.mat
5 #for $i in $replicate_groups
6 #for $j in $i.replicates
7 $j.bam_alignment:#slurp
8 #end for
9 #end for
10 >> $Log_File </command>
11 <inputs>
12 <param format="gff3" name="anno_input_selected" type="data" label="Genome annotation in GFF3 file" help="A tab delimited format for storing sequence features and annotations"/>
13 <repeat name="replicate_groups" title="Replicate group" min="2">
14 <repeat name="replicates" title="Replicate">
15 <param format="bam" name="bam_alignment" type="data" label="BAM alignment file" help="BAM alignment file. Can be generated from SAM files using the SAM Tools."/>
16 </repeat>
17 </repeat>
18 </inputs>
19
20 <outputs>
21 <data format="txt" name="deseq_out" label="DESeq result"/>
22 <data format="txt" name="Log_File" label="DESeq log file"/>
23 </outputs>
24
25 <tests>
26 <test>
27 command:
28 ./deseq-hts.sh ../test_data/deseq_c_elegans_WS200-I-regions.gff3 ../test_data/deseq_c_elegans_WS200-I-regions_deseq.txt ../test_data/genes.mat ../test_data/deseq_c_elegans_WS200-I-regions-SRX001872.bam ../test_data/deseq_c_elegans_WS200-I-regions-SRX001875.bam
29
30 <param name="anno_input_selected" value="deseq_c_elegans_WS200-I-regions.gff3" ftype="gff3" />
31 <param name="bam_alignments1" value="deseq_c_elegans_WS200-I-regions-SRX001872.bam" ftype="bam" />
32 <param name="bam_alignments2" value="deseq_c_elegans_WS200-I-regions-SRX001875.bam" ftype="bam" />
33 <output name="deseq_out" file="deseq_c_elegans_WS200-I-regions_deseq.txt" />
34 </test>
35 </tests>
36
37 <help>
38
39 .. class:: infomark
40
41 **What it does**
42
43 `DESeq` is a tool for differential expression testing of RNA-Seq data.
44
45
46 **Inputs**
47
48 `DESeq` requires two kinds of input to run:
49
50 1. An annotation file in GFF3 format, containing the necessary information about the transcripts that are to be quantified.
51 2. The BAM alignment files, organized into replicate groups that each contain several replicates. BAM files store the read alignments in a compressed format. They can be generated using the `SAM-to-BAM` tool in the NGS: SAM Tools section. (The script also works with only two groups containing a single replicate each. However, such an analysis has less statistical power and is therefore not recommended.)
52
53 **Output**
54
55 `DESeq` generates a text file listing each gene name together with its p-value.
56
57 ------
58
59 **Licenses**
60
61 If **DESeq** is used to obtain results for scientific publications it
62 should be cited as [1]_.
63
64 **References**
65
66 .. [1] Anders, S and Huber, W (2010): `Differential expression analysis for sequence count data`_.
67
68 .. _Differential expression analysis for sequence count data: http://dx.doi.org/10.1186/gb-2010-11-10-r106
69
70 ------
71
72 .. class:: infomark
73
74 **About formats**
75
76
77 **GFF3 format** General Feature Format is a format for describing genes
78 and other features associated with DNA, RNA and protein
79 sequences. GFF3 lines have nine tab-separated fields:
80
81 1. seqid - The name of a chromosome or scaffold.
82 2. source - The program that generated this feature.
83 3. type - The name of this type of feature. Some examples of standard feature types are "gene", "CDS", "protein", "mRNA", and "exon".
84 4. start - The starting position of the feature in the sequence. The first base is numbered 1.
85 5. end - The ending position of the feature (inclusive).
86 6. score - The score of the feature, given as a floating point number. If there is no score value, enter ".".
87 7. strand - Valid entries include '+', '-', or '.' (for don't know/care).
88 8. phase - If the feature is a CDS, phase should be 0, 1 or 2, giving the number of bases to skip from the start of the feature to reach the first base of the next codon. If the feature is not a CDS, the value should be '.'.
89 9. attributes - A semicolon-separated list of tag=value pairs, such as ID and Parent, which link related features (for example the exons of one transcript) together.
90
91 For more information see http://www.sequenceontology.org/gff3.shtml
92
93 **SAM/BAM format** The Sequence Alignment/Map (SAM) format is a
94 tab-delimited text format that stores large nucleotide sequence
95 alignments. BAM is the binary version of a SAM file that allows for
96 fast and intensive data processing. The format specification and the
97 description of SAMtools can be found on
98 http://samtools.sourceforge.net/.
99
100 ------
101
102 DESeq-hts Wrapper Version 0.3 (Feb 2012)
103
104 </help>
105 </tool>
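
The **About formats** help text in this tool describes GFF3 lines as nine tab-separated fields. As an illustration only, here is a minimal Python sketch of how one such line could be split into those fields. It is not part of deseq-hts and does not reflect how the wrapper reads annotations. The names `Gff3Record` and `parse_gff3_line` are hypothetical, the example line is made up for demonstration, and the sketch skips details such as URL-decoding of escaped characters and comment or directive lines.

```python
from typing import Dict, NamedTuple, Optional


class Gff3Record(NamedTuple):
    """One GFF3 feature line (hypothetical container, for illustration)."""
    seqid: str
    source: str
    type: str
    start: int                 # 1-based start position
    end: int                   # inclusive end position
    score: Optional[float]     # None when the column holds "."
    strand: str
    phase: Optional[int]       # None when the column holds "."
    attributes: Dict[str, str]


def parse_gff3_line(line: str) -> Gff3Record:
    """Split one GFF3 feature line into its nine tab-separated fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = fields

    # Column 9 is a semicolon-separated list of tag=value pairs.
    attributes = dict(
        pair.split("=", 1) for pair in attrs.split(";") if "=" in pair
    )

    return Gff3Record(
        seqid,
        source,
        ftype,
        int(start),
        int(end),
        None if score == "." else float(score),
        strand,
        None if phase == "." else int(phase),
        attributes,
    )


# Made-up example line, not taken from the tool's test data.
example = "I\tCoding_transcript\texon\t11495\t11561\t.\t+\t.\tParent=Transcript:Y74C9A.2"
record = parse_gff3_line(example)
print(record.type, record.start, record.end, record.attributes["Parent"])
```

Treating "." as a missing value for score and phase keeps the numeric columns typed, so a caller can tell "no score" apart from a score of zero.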