annotate read_duplication.xml @ 32:580ee0c4bc4e

Fixes from Bjorn Gruning: create symlinks under $TMP and clean them up afterwards, replace R dependency with the Tool Shed R3 package, add --install-scripts, prepend tool-ids with rseqc
author lparsons
date Mon, 07 Oct 2013 15:01:13 -0400
parents cc5eaa9376d8
children
rev   line source
32
580ee0c4bc4e Fixes from Bjorn Gruning: create symlinks under $TMP and clean them up afterwards, replace R dependency with the Tool Shed R3 package, add --install-scripts, prepend tool-ids with rseqc
lparsons
parents: 31
diff changeset
1 <tool id="rseqc_read_duplication" name="Read Duplication" version="1.1">
2 <description>determines the read duplication rate using sequence-based and mapping-based strategies</description>
3 <requirements>
4 <requirement type="package" version="3.0.1">R</requirement>
5 <requirement type="package" version="1.7.1">numpy</requirement>
6 <requirement type="package" version="2.3.7">rseqc</requirement>
7 </requirements>
8 <command>
9 read_duplication.py -i $input -o output -u $upLimit
10 </command>
31
cc5eaa9376d8 Lance's updates
nilesh
parents: 24
diff changeset
11 <stdio>
12 <exit_code range="1:" level="fatal" description="An error occurred during execution; see stderr and stdout for more information" />
13 <regex match="[Ee]rror" source="both" description="An error occurred during execution; see stderr and stdout for more information" />
14 </stdio>
32
15 <inputs>
16 <param name="input" type="data" format="bam,sam" label="Input BAM/SAM file" />
17 <param name="upLimit" type="integer" label="Upper Limit of Plotted Duplicated Times (default=500)" value="500" />
18 </inputs>
19 <outputs>
20 <data format="xls" name="outputxls" from_work_dir="output.dup.pos.DupRate.xls" label="${tool.name} on ${on_string} (Position XLS)"/>
21 <data format="xls" name="outputseqxls" from_work_dir="output.dup.seq.DupRate.xls" label="${tool.name} on ${on_string} (Sequence XLS)"/>
22 <data format="txt" name="outputr" from_work_dir="output.DupRate_plot.r" label="${tool.name} on ${on_string} (R Script)" />
23 <data format="pdf" name="outputpdf" from_work_dir="output.DupRate_plot.pdf" label="${tool.name} on ${on_string} (PDF)" />
24 </outputs>
25 <help>
31
26 read_duplication.py
27 +++++++++++++++++++
24
80f857718ca0 Uploaded
nilesh
parents:
diff changeset
28
31
29 Two strategies are used to determine the read duplication rate:
24
30
31
31 * Sequence based: reads with exactly the same sequence content are regarded as duplicated reads.
32 * Mapping based: reads mapped to the same genomic location are regarded as duplicated reads. For spliced reads, reads that map to the same starting position and are spliced the same way are regarded as duplicated reads.
24
33
34 Inputs
35 ++++++++++++++
36
37 Input BAM/SAM file
32
38 Alignment file in BAM/SAM format.
24
39
40 Upper Limit of Plotted Duplicated Times (default=500)
32
41 Only used for plotting.
24
42
43 Output
44 ++++++++++++++
45
46 1. output.dup.pos.DupRate.xls: Read duplication rate determined from the mapping position of each read. The first column is the "occurrence" (duplication times); the second column is the number of uniquely mapped reads.
47 2. output.dup.seq.DupRate.xls: Read duplication rate determined from the sequence of each read. The first column is the "occurrence" (duplication times); the second column is the number of uniquely mapped reads.
48 3. output.DupRate_plot.r: R script that generates the PDF file
49 4. output.DupRate_plot.pdf: graphical output generated by the R script
50
31
51 .. image:: http://rseqc.sourceforge.net/_images/duplicate.png
52 :height: 600 px
53 :width: 600 px
54 :scale: 80 %
55
56 -----
57
58 About RSeQC
59 +++++++++++
60
61 The RSeQC_ package provides a number of useful modules that can comprehensively evaluate high-throughput sequence data, especially RNA-seq data. "Basic modules" quickly inspect sequence quality, nucleotide composition bias, PCR bias and GC bias, while "RNA-seq specific modules" investigate the sequencing saturation status of both splicing junction detection and expression estimation, the mapped reads clipping profile, mapped reads distribution, coverage uniformity over the gene body, reproducibility, strand specificity and splice junction annotation.
62
63 The RSeQC package is licensed under the GNU GPL v3.
64
65 .. image:: http://rseqc.sourceforge.net/_static/logo.png
66
67 .. _RSeQC: http://rseqc.sourceforge.net/
68
24
69
32
70 </help>
24
71 </tool>
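
The two strategies described in the tool's help text can be illustrated with a short Python sketch. This is not RSeQC's implementation: the `duplication_histogram` helper, the toy `reads` tuples, and the choice of (chromosome, start, splice signature) as the mapping key are all assumptions made for illustration. The resulting dictionaries loosely mirror the two-column layout of the DupRate tables, with the occurrence as the key and the number of distinct reads as the value.

```python
from collections import Counter


def duplication_histogram(keys):
    """Return {occurrence: number of distinct keys seen that many times}."""
    per_key = Counter(keys)                    # key -> number of reads sharing it
    by_occurrence = Counter(per_key.values())  # occurrence -> number of distinct keys
    return dict(sorted(by_occurrence.items()))


# Hypothetical reads: (sequence, chromosome, start, splice signature).
reads = [
    ("ACGT", "chr1", 100, ""),
    ("ACGT", "chr1", 100, ""),
    ("ACGT", "chr2", 500, ""),
    ("TTGA", "chr1", 200, ""),
    ("GGCC", "chr3", 42, "50-90"),
]

# Sequence based: reads with identical sequence content count as duplicates.
seq_hist = duplication_histogram(seq for seq, _, _, _ in reads)

# Mapping based: reads with the same location and splicing pattern count as duplicates.
pos_hist = duplication_histogram(
    (chrom, start, splice) for _, chrom, start, splice in reads
)

print(seq_hist)
print(pos_hist)
```

In this toy input the sequence "ACGT" appears three times but at two different locations, so the two strategies report different duplication levels for the same reads.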