comparison deseq-hts_2.0/galaxy/deseq2.xml @ 10:2fe512c7bfdf draft

DESeq2 version 1.0.19 added to the repo
author vipints <vipin@cbio.mskcc.org>
date Tue, 08 Oct 2013 08:15:34 -0400
comparison
9:e27b4f7811c2 10:2fe512c7bfdf
1 <tool id="deseq2-hts" name="DESeq2" version="1.0.19">
2 <description> Differential gene expression analysis based on the negative binomial distribution</description>
3 <command interpreter="bash">
4 ./../src/deseq2-hts.sh $anno_input_selected $deseq_out $deseq_out.extra_files_path/gene_map.mat
5 $distype
6 #for $i in $replicate_groups
7 #for $j in $i.replicates
8 $j.bam_alignment:#slurp
9 #end for
10
11 #end for
12 >> $Log_File </command>
13 <inputs>
14 <param format="gff,gtf,gff3" name="anno_input_selected" type="data" label="Genome annotation in GFF file" help="A tab-delimited format for storing sequence features and annotations"/>
15 <repeat name="replicate_groups" title="Replicate group" min="2">
16 <repeat name="replicates" title="Replicate">
17 <param format="bam" name="bam_alignment" type="data" label="BAM alignment file" help="BAM alignment file. Can be generated from SAM files using SAMtools."/>
18 </repeat>
19 </repeat>
20
21 <param name="distype" type="select" label="Select fitting of dispersions to the mean intensity">
22 <option value="parametric">Parametric</option>
23 <option value="local">Local</option>
24 <option value="mean" selected="true">Mean</option>
25 </param>
26
27 </inputs>
28
29 <outputs>
30 <data format="txt" name="deseq_out" label="${tool.name} on ${on_string}: Differential Expression"/>
31 <data format="txt" name="Log_File" label="${tool.name} on ${on_string}: log"/>
32 </outputs>
33
34 <tests>
35 <test>
36 ./deseq2-hts.sh ../test_data/deseq_c_elegans_WS200-I-regions.gff3 ../test_data/deseq_c_elegans_WS200-I-regions_deseq.txt ../test_data/genes.mat ../test_data/deseq_c_elegans_WS200-I-regions-SRX001872.bam ../test_data/deseq_c_elegans_WS200-I-regions-SRX001875.bam
37
38 <param name="anno_input_selected" value="deseq_c_elegans_WS200-I-regions.gff3" ftype="gff3" />
39 <param name="bam_alignments1" value="deseq_c_elegans_WS200-I-regions-SRX001872.bam" ftype="bam" />
40 <param name="bam_alignments2" value="deseq_c_elegans_WS200-I-regions-SRX001875.bam" ftype="bam" />
41 <output name="deseq_out" file="deseq_c_elegans_WS200-I-regions_deseq.txt" />
42 </test>
43 </tests>
44
45 <help>
46
47 .. class:: infomark
48
49 **What it does**
50
51 DESeq2_ estimates the variance-mean dependence in count data from high-throughput sequencing assays and tests for differential expression based on a model using the negative binomial distribution.
52
53 .. _DESeq2: http://bioconductor.org/packages/2.12/bioc/html/DESeq2.html
54
55 `DESeq2` requires:
56
57 A genome annotation in GFF format, containing the necessary information about the transcripts that are to be quantified.
58
59 The BAM alignment files, grouped into replicate groups, each containing several replicates. BAM files store the read alignments. The program will also work with only two groups containing a single replicate each; however, such an analysis has less statistical power and is therefore not recommended.
60
61 ------
62
63 **Licenses**
64
65 If **DESeq2** is used to obtain results for scientific publications it
66 should be cited as [1]_.
67
68 **References**
69
70 .. [1] Anders, S and Huber, W (2010): `Differential expression analysis for sequence count data`_.
71
72 .. _Differential expression analysis for sequence count data: http://dx.doi.org/10.1186/gb-2010-11-10-r106
73
74 ------
75
76 .. class:: infomark
77
78 **About formats**
79
80 **GFF/GTF format** General Feature Format/Gene Transfer Format is a format for describing genes and other features associated with DNA, RNA and protein sequences. GFF3 lines have nine tab-separated fields:
81
82 1. seqid - The name of a chromosome or scaffold.
83 2. source - The program that generated this feature.
84 3. type - The name of this type of feature. Some examples of standard feature types are "gene", "CDS", "protein", "mRNA", and "exon".
85 4. start - The starting position of the feature in the sequence. The first base is numbered 1.
86 5. end - The ending position of the feature (inclusive).
87 6. score - A floating-point score for the feature. If there is no score value, enter ".".
88 7. strand - Valid entries include '+', '-', or '.' (for don't know/care).
89 8. phase - If the feature is a coding exon (CDS), phase should be 0, 1, or 2, giving the number of bases to skip from the start of the feature to reach the first base of the next codon. If the feature is not a coding exon, the value should be '.'.
90 9. attributes - A semicolon-separated list of tag=value pairs. Lines that share ID and Parent values are linked together into a single item.
91
92 For more information see http://www.sequenceontology.org/gff3.shtml
93
94 **BAM format** The Sequence Alignment/Map (SAM) format is a
95 tab-delimited text format that stores large nucleotide sequence
96 alignments. BAM is the binary version of a SAM file that allows for
97 fast and intensive data processing. The format specification and the
98 description of SAMtools can be found on
99 http://samtools.sourceforge.net/.
100
101 ------
102
103 DESeq2-hts Wrapper Version 0.2 (Aug 2013)
104
105 </help>
106 </tool>
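
Note on the command template (lines 4-12 of the tool file): the nested Cheetah loops appear to emit every BAM path followed by ':' with the line break suppressed by '#slurp', so each replicate group becomes one colon-terminated argument. The Python sketch below only illustrates that expansion. The function name build_command and the example file names are hypothetical, and it assumes the rendered lines end up separated by single spaces; it is not the wrapper's actual code.

```python
def build_command(annotation, out_file, gene_map, dist_type,
                  replicate_groups, log_file):
    """Mimic how the template seems to assemble the shell command.

    replicate_groups is a list of lists of BAM paths, one inner list
    per replicate group (hypothetical input layout for illustration).
    """
    # '#slurp' suppresses the newline after each path, so the replicates
    # of one group are concatenated as 'a.bam:b.bam:'.
    group_args = ["".join(f"{bam}:" for bam in group)
                  for group in replicate_groups]

    parts = ["./../src/deseq2-hts.sh", annotation, out_file, gene_map,
             dist_type]
    parts += group_args          # one argument per replicate group
    parts += [">>", log_file]    # output is appended to the log file
    return " ".join(parts)


if __name__ == "__main__":
    print(build_command(
        "genes.gff3", "deseq_out.txt", "gene_map.mat", "mean",
        [["ctrl_1.bam", "ctrl_2.bam"], ["treat_1.bam"]],
        "run.log",
    ))
```

Under these assumptions, two groups with two and one replicates yield the arguments 'ctrl_1.bam:ctrl_2.bam:' and 'treat_1.bam:', which the wrapper script would then have to split on ':' again.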