comparison inner_distance.xml @ 31:cc5eaa9376d8

Lance's updates
author nilesh
date Wed, 02 Oct 2013 02:20:04 -0400
parents adc934fb9a76
children 580ee0c4bc4e
30:b5d2f575ccb6 31:cc5eaa9376d8
1 <tool id="inner_distance" name="Inner Distance"> 1 <tool id="inner_distance" name="Inner Distance" version="1.1">
2 <description>calculate the inner distance (or insert size) between two paired RNA reads</description> 2 <description>calculate the inner distance (or insert size) between two paired RNA reads</description>
3 <requirements> 3 <requirements>
4 <requirement type="package" version="2.15.1">R</requirement> 4 <requirement type="package" version="2.11.0">R</requirement>
5 <requirement type="package" version="1.7.1">numpy</requirement>
5 <requirement type="package" version="2.3.7">rseqc</requirement> 6 <requirement type="package" version="2.3.7">rseqc</requirement>
6 </requirements> 7 </requirements>
7 <command interpreter="python"> inner_distance.py -i $input -o output -r $refgene 8 <command> inner_distance.py -i $input -o output -r $refgene
8 9
9 #if $bounds.hasLowerBound 10 #if $bounds.hasLowerBound
10 -l $bounds.lowerBound 11 -l $bounds.lowerBound
11 #end if 12 #end if
12 13
39 <param name="stepSize" type="integer" value="5" label="Step size (bp, default=5)" /> 40 <param name="stepSize" type="integer" value="5" label="Step size (bp, default=5)" />
40 </when> 41 </when>
41 </conditional> 42 </conditional>
42 </inputs> 43 </inputs>
43 <outputs> 44 <outputs>
44 <data format="txt" name="outputtxt" from_work_dir="output.inner_distance.txt"/> 45 <data format="txt" name="outputtxt" from_work_dir="output.inner_distance.txt" label="${tool.name} on ${on_string} (Text)"/>
45 <data format="txt" name="outputfreqtxt" from_work_dir="output.inner_distance_freq.txt" /> 46 <data format="txt" name="outputfreqtxt" from_work_dir="output.inner_distance_freq.txt" label="${tool.name} on ${on_string} (Freq Text)" />
46 <data format="pdf" name="outputpdf" from_work_dir="output.inner_distance_plot.pdf" /> 47 <data format="pdf" name="outputpdf" from_work_dir="output.inner_distance_plot.pdf" label="${tool.name} on ${on_string} (PDF)" />
47 <data format="r" name="outputr" from_work_dir="output.inner_distance_plot.r" /> 48 <data format="r" name="outputr" from_work_dir="output.inner_distance_plot.r" label="${tool.name} on ${on_string} (R Script)" />
48 </outputs> 49 </outputs>
50 <stdio>
51 <exit_code range="1:" level="fatal" description="An error occured during execution, see stderr and stdout for more information" />
52 <regex match="[Ee]rror" source="both" description="An error occured during execution, see stderr and stdout for more information" />
53 </stdio>
49 <help> 54 <help>
50 .. image:: https://code.google.com/p/rseqc/logo?cct=1336721062 55 inner_distance.py
56 +++++++++++++++++
51 57
52 ----- 58 This module is used to calculate the inner distance (or insert size) between two paired RNA
59 reads. The distance is the mRNA length between two paired fragments. We first determine the
60 genomic (DNA) size between two paired reads: D_size = read2_start - read1_end, then
53 61
54 About RSeQC 62 * if two paired reads map to the same exon: inner distance = D_size
55 +++++++++++ 63 * if two paired reads map to different exons:inner distance = D_size - intron_size
64 * if two paired reads map non-exonic region (such as intron and intergenic region): inner distance = D_size
65 * The inner_distance might be a negative value if two fragments were overlapped.
56 66
57 The RSeQC package provides a number of useful modules that can comprehensively evaluate high throughput sequence data especially RNA-seq data. “Basic modules” quickly inspect sequence quality, nucleotide composition bias, PCR bias and GC bias, while “RNA-seq specific modules” investigate sequencing saturation status of both splicing junction detection and expression estimation, mapped reads clipping profile, mapped reads distribution, coverage uniformity over gene body, reproducibility, strand specificity and splice junction annotation. 67 NOTE: Not all read pairs were used to estimate the inner distance distribution. Those low
58 68 quality, PCR duplication, multiple mapped reads were skipped.
59 The RSeQC package is licensed under the GNU GPL v3 license.
60 69
61 Inputs 70 Inputs
62 ++++++++++++++ 71 ++++++++++++++
63 72
64 Input BAM/SAM file 73 Input BAM/SAM file
76 85
77 Output 86 Output
78 ++++++++++++++ 87 ++++++++++++++
79 88
80 1. output.inner_distance.txt: 89 1. output.inner_distance.txt:
81 - first column is read ID 90 - first column is read ID
82 -second column is inner distance. Could be negative value if PE reads were overlapped or mapping error (e.g. Read1_start < Read2_start, while Read1_end >> Read2_end due to spliced mapping of read1) 91 -second column is inner distance. Could be negative value if PE reads were overlapped or mapping error (e.g. Read1_start &lt; Read2_start, while Read1_end >> Read2_end due to spliced mapping of read1)
83 - third column indicates how paired reads were mapped: PE_within_same_exon, PE_within_diff_exon,PE_reads_overlap 92 - third column indicates how paired reads were mapped: PE_within_same_exon, PE_within_diff_exon,PE_reads_overlap
84 2. output..inner_distance_freq.txt: 93 2. output..inner_distance_freq.txt:
85 - inner distance starts 94 - inner distance starts
86 - inner distance ends 95 - inner distance ends
87 - number of read pairs 96 - number of read pairs
88 - note the first 2 columns are left side half open interval 97 - note the first 2 columns are left side half open interval
89 3. output.inner_distance_plot.r: R script to generate histogram 98 3. output.inner_distance_plot.r: R script to generate histogram
90 4. output.inner_distance_plot.pdf: histogram plot 99 4. output.inner_distance_plot.pdf: histogram plot
91 100
92 .. image:: http://dldcc-web.brc.bcm.edu/lilab/liguow/RSeQC/figure/inner_distance.png 101 .. image:: http://rseqc.sourceforge.net/_images/inner_distance.png
102 :height: 600 px
103 :width: 600 px
104 :scale: 80 %
105
106
107 -----
108
109 About RSeQC
110 +++++++++++
111
112 The RSeQC_ package provides a number of useful modules that can comprehensively evaluate high throughput sequence data especially RNA-seq data. "Basic modules" quickly inspect sequence quality, nucleotide composition bias, PCR bias and GC bias, while "RNA-seq specific modules" investigate sequencing saturation status of both splicing junction detection and expression estimation, mapped reads clipping profile, mapped reads distribution, coverage uniformity over gene body, reproducibility, strand specificity and splice junction annotation.
113
114 The RSeQC package is licensed under the GNU GPL v3 license.
115
116 .. image:: http://rseqc.sourceforge.net/_static/logo.png
117
118 .. _RSeQC: http://rseqc.sourceforge.net/
119
93 120
94 </help> 121 </help>
95 </tool> 122 </tool>
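
Note on the help text added in this revision: the three inner-distance rules (same exon, different exons, non-exonic) can be illustrated with a minimal Python sketch. This is not the RSeQC implementation. The function names, the half-open `[start, end)` coordinate convention, and the sorted, non-overlapping exon list are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: an exon is a half-open interval [start, end).
Exon = Tuple[int, int]


def find_exon(pos: int, exons: List[Exon]) -> Optional[int]:
    """Return the index of the exon containing pos, or None if pos is non-exonic."""
    for index, (start, end) in enumerate(exons):
        if start <= pos < end:
            return index
    return None


def inner_distance(read1_end: int, read2_start: int, exons: List[Exon]) -> int:
    """Apply the rules from the help text to one read pair.

    read1_end is treated as an exclusive end coordinate (assumption), so the
    last base of read 1 sits at read1_end - 1. exons must be sorted and
    non-overlapping (assumption).
    """
    # Genomic (DNA) size between the two reads, as defined in the help text.
    d_size = read2_start - read1_end

    exon1 = find_exon(read1_end - 1, exons)
    exon2 = find_exon(read2_start, exons)

    # Non-exonic reads, reads in the same exon, and overlapping reads
    # (negative d_size) all report D_size unchanged in this sketch.
    if exon1 is None or exon2 is None or exon2 <= exon1:
        return d_size

    # Different exons: subtract every intron lying between the two exons.
    intron_size = sum(
        exons[k + 1][0] - exons[k][1] for k in range(exon1, exon2)
    )
    return d_size - intron_size


if __name__ == "__main__":
    example_exons = [(100, 200), (300, 400), (500, 600)]
    print(inner_distance(150, 180, example_exons))  # same exon
    print(inner_distance(180, 320, example_exons))  # different exons
    print(inner_distance(220, 250, example_exons))  # non-exonic (intron)
    print(inner_distance(180, 170, example_exons))  # overlapping reads
```

The sketch returns a negative value for overlapping reads without further adjustment, which matches the help text's remark that the inner distance may be negative. It does not filter low-quality, duplicate, or multi-mapped reads.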