diff tin.xml @ 51:09846d5169fa draft
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/rseqc commit 37fb1988971807c6a072e1afd98eeea02329ee83
author: iuc
date: Tue, 14 Mar 2017 10:23:21 -0400
children: 5873cd7afb67
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tin.xml	Tue Mar 14 10:23:21 2017 -0400
@@ -0,0 +1,144 @@
+<tool id="rseqc_tin" name="Transcript Integrity Number" version="@WRAPPER_VERSION@">
+    <description>
+        evaluates RNA integrity at a transcript level
+    </description>
+
+    <macros>
+        <import>rseqc_macros.xml</import>
+    </macros>
+
+    <expand macro="requirements" />
+
+    <expand macro="stdio" />
+
+    <version_command><![CDATA[tin.py --version]]></version_command>
+
+    <!-- Generate output files here because tin.py removes all instances of "bam"
+    in the filename -->
+    <command><![CDATA[
+        #import re
+        ln -sf '${input}' 'input.bam' &&
+        ln -sf '${input.metadata.bam_index}' 'input.bam.bai' &&
+        tin.py -i 'input.bam' --refgene='${refgene}' --minCov=${minCov}
+        --sample-size=${samplesize} ${subtractbackground}
+        ]]>
+    </command>
+
+    <inputs>
+        <expand macro="bam_param" />
+        <expand macro="refgene_param" />
+        <param name="minCov" type="integer" value="10" label="Minimum coverage (default=10)"
+            help="Minimum number of reads mapped to a transcript (--minCov)." />
+        <param name="samplesize" type="integer" value="100" label="Sample size (default=100)"
+            help="Number of equal-spaced nucleotide positions picked from mRNA.
+            Note: if this number is larger than the length of mRNA (L), it will
+            be halved until it's smaller than L. (--sample-size)." />
+        <param name="subtractbackground" type="boolean" value="false" falsevalue=""
+            truevalue="--subtract-background" label="Subtract background noise
+            (default=No)" help="Subtract background noise (estimated from
+            intronic reads). Only use this option if there are substantial
+            intronic reads (--subtract-background)." />
+    </inputs>
+
+    <outputs>
+        <data name="outputsummary" format="tabular" from_work_dir="input.summary.txt" label="TIN on ${on_string} (summary)" />
+        <data name="outputxls" format="xls" from_work_dir="input.tin.xls" label="TIN on ${on_string} (tin)" />
+    </outputs>
+
+    <!-- PDF Files contain R version, must avoid checking for diff -->
+    <tests>
+        <test>
+            <param name="input" value="pairend_strandspecific_51mer_hg19_chr1_1-100000.bam"/>
+            <param name="refgene" value="hg19_RefSeq_chr1_1-100000.bed"/>
+            <output name="outputsummary" file="output.tin.summary.txt"/>
+            <output name="outputxls" file="output.tin.xls"/>
+        </test>
+    </tests>
+
+    <help><![CDATA[
+## tin.py
+
+This program is designed to evaluate RNA integrity at the transcript level.
+TIN (transcript integrity number) is named by analogy to RIN (RNA integrity
+number), the most widely used metric to evaluate RNA integrity at the
+sample (or transcriptome) level. RIN is a very useful preventive measure
+to ensure good RNA quality and robust, reproducible RNA sequencing.
+However, it has several weaknesses:
+
+* RIN score (1 <= RIN <= 10) is not a direct measurement of mRNA quality.
+  RIN score relies heavily on the amount of 18S and 28S ribosomal RNAs, as
+  shown by the four features used by the RIN algorithm: the “total RNA
+  ratio” (i.e. the fraction of the area in the region of 18S and 28S
+  compared to the total area under the curve), 28S-region height, 28S area
+  ratio and the 18S:28S ratio. To a large extent, RIN score is a measure of
+  ribosomal RNA integrity. However, in most RNA-seq experiments, ribosomal
+  RNAs are depleted from the library to enrich mRNA through either a
+  ribo-minus or a polyA selection procedure.
+
+* RIN only measures the overall RNA quality of an RNA sample. However, in
+  practice, the degradation rate may differ significantly among
+  transcripts, depending on factors such as “AU-rich sequence”, “transcript
+  length”, “GC content”, “secondary structure” and the “RNA-protein
+  complex”. Therefore, RIN is of limited practical use in downstream
+  analysis such as adjusting the gene expression count.
+
+* RIN has very limited sensitivity to measure substantially degraded RNA
+  samples such as preserved clinical tissues. (ref:
+  http://www.illumina.com/documents/products/technotes/technote-truseq-rna-access.pdf).
+
+To overcome these limitations, we developed TIN, an algorithm that is able
+to measure RNA integrity at the transcript level. TIN calculates a score
+(0 <= TIN <= 100) for each expressed transcript; in addition, the medTIN
+(i.e. the median TIN score across all the transcripts) can be used to
+measure RNA integrity at the sample level. TIN is a useful metric to
+measure RNA integrity both transcriptome-wide and transcript-wise, as
+demonstrated by its high concordance with both RIN and RNA fragment size
+(estimated from RNA-seq read pairs).
+
+
+## Inputs
+
+Input BAM/SAM file
+    Alignment file in BAM/SAM format.
+
+Reference gene model
+    Gene Model in BED format. Must be standard 12-column BED file.
+
+Minimum coverage
+    Minimum number of reads mapped to a transcript (default is 10).
+
+Sample size
+    Number of equal-spaced nucleotide positions picked from mRNA. Note: if
+    this number is larger than the length of mRNA (L), it will be halved until
+    it’s smaller than L (default is 100).
+
+Subtract background
+    Subtract background noise (estimated from intronic reads). Only use this
+    option if there are substantial intronic reads.
+
+
+## Outputs
+
+Text
+    Table that includes the gene identifier (geneID), chromosome (chrom),
+    transcript start (tx_start), transcript end (tx_end), and transcript
+    integrity number (TIN).
+
+Example output:
+
+------  -----  ---------  ---------  -------------
+geneID  chrom  tx_start   tx_end     TIN
+------  -----  ---------  ---------  -------------
+ABCC2   chr10  101542354  101611949  67.6446525761
+IPMK    chr10  59951277   60027694   86.383618429
+RUFY2   chr10  70100863   70167051   43.8967503948
+------  -----  ---------  ---------  -------------
+
+@ABOUT@
+
+]]>
+    </help>
+
+    <expand macro="citations" />
+
+</tool>
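
Note on the help text in this change: two rules it describes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not tin.py's actual implementation. The function names `effective_sample_size` and `median_tin` are hypothetical. Integer halving and the handling of the case where the sample size equals L are assumptions, since the help text only says "halved until it's smaller than L". The table is assumed to be tab-separated with the header names shown in the example output, and the median is taken over every row supplied; whether tin.py excludes any transcripts from its own summary is not stated here.

```python
import csv
import io
import statistics


def effective_sample_size(sample_size: int, mrna_length: int) -> int:
    """Apply the '--sample-size' rule as worded in the help text.

    If sample_size is larger than the mRNA length L, halve it until it is
    smaller than L. Integer halving and the floor of 1 are assumptions.
    """
    if sample_size < 1 or mrna_length < 1:
        raise ValueError("sample_size and mrna_length must be positive")
    if sample_size > mrna_length:
        # Keep halving while the size is not yet smaller than L.
        while sample_size >= mrna_length and sample_size > 1:
            sample_size //= 2
    return sample_size


def median_tin(tin_table: str) -> float:
    """Return the median of the TIN column (medTIN) of a tab-separated table.

    Assumes a header row containing a 'TIN' column, as in the example output.
    """
    reader = csv.DictReader(io.StringIO(tin_table), delimiter="\t")
    scores = [float(row["TIN"]) for row in reader]
    return statistics.median(scores)


# Rows copied from the example output in the help text.
EXAMPLE_TABLE = (
    "geneID\tchrom\ttx_start\ttx_end\tTIN\n"
    "ABCC2\tchr10\t101542354\t101611949\t67.6446525761\n"
    "IPMK\tchr10\t59951277\t60027694\t86.383618429\n"
    "RUFY2\tchr10\t70100863\t70167051\t43.8967503948\n"
)

if __name__ == "__main__":
    print(effective_sample_size(100, 60))
    print(median_tin(EXAMPLE_TABLE))
```

With the default sample size of 100, a 60-nt transcript would be sampled at 50 positions under these assumptions, while any transcript longer than 100 nt keeps all 100.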