Mercurial > repos > nilesh > rseqc
diff tin.xml @ 51:09846d5169fa draft
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/rseqc commit 37fb1988971807c6a072e1afd98eeea02329ee83
author | iuc |
---|---|
date | Tue, 14 Mar 2017 10:23:21 -0400 |
parents | |
children | 5873cd7afb67 |
line wrap: on
line diff
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tin.xml Tue Mar 14 10:23:21 2017 -0400 @@ -0,0 +1,144 @@ +<tool id="rseqc_tin" name="Transcript Integrity Number" version="@WRAPPER_VERSION@"> + <description> + evaluates RNA integrity at a transcript level + </description> + + <macros> + <import>rseqc_macros.xml</import> + </macros> + + <expand macro="requirements" /> + + <expand macro="stdio" /> + + <version_command><![CDATA[tin.py --version]]></version_command> + + <!-- Generate output files here because tin.py removes all instances of "bam" + in the filename --> + <command><![CDATA[ + #import re + ln -sf '${input}' 'input.bam' && + ln -sf '${input.metadata.bam_index}' 'input.bam.bai' && + tin.py -i 'input.bam' --refgene='${refgene}' --minCov=${minCov} + --sample-size=${samplesize} ${subtractbackground} + ]]> + </command> + + <inputs> + <expand macro="bam_param" /> + <expand macro="refgene_param" /> + <param name="minCov" type="integer" value="10" label="Minimum coverage (default=10)" + help="Minimum number of reads mapped to a transcript (--minCov)." /> + <param name="samplesize" type="integer" value="100" label="Sample size (default=100)" + help="Number of equal-spaced nucleotide positions picked from mRNA. + Note: if this number is larger than the length of mRNA (L), it will + be halved until is's smaller than L. (--sample-size)." /> + <param name="subtractbackground" type="boolean" value="false" falsevalue="" + truevalue="--subtract-background" label="Subtract background noise + (default=No)" help="Subtract background noise (estimated from + intronic reads). Only use this option if there are substantial + intronic reads (--subtract-background)." /> + </inputs> + + <outputs> + <data name="outputsummary" format="tabular" from_work_dir="input.summary.txt" label="TIN on ${on_string} (summary)" /> + <data name="outputxls" format="xls" from_work_dir="input.tin.xls" label="TIN on ${on_string} (tin)" /> + </outputs> + + <!-- PDF Files contain R version, must avoid checking for diff --> + <tests> + <test> + <param name="input" value="pairend_strandspecific_51mer_hg19_chr1_1-100000.bam"/> + <param name="refgene" value="hg19_RefSeq_chr1_1-100000.bed"/> + <output name="outputsummary" file="output.tin.summary.txt"/> + <output name="outputxls" file="output.tin.xls"/> + </test> + </tests> + + <help><![CDATA[ +## tin.py + +This program is designed to evaluate RNA integrity at transcript level. TIN +(transcript integrity number) is named in analogous to RIN (RNA integrity +number). RIN (RNA integrity number) is the most widely used metric to +evaluate RNA integrity at sample (or transcriptome) level. It is a very +useful preventive measure to ensure good RNA quality and robust, +reproducible RNA sequencing. However, it has several weaknesses: + +* RIN score (1 <= RIN <= 10) is not a direct measurement of mRNA quality. + RIN score heavily relies on the amount of 18S and 28S ribosome RNAs, which + was demonstrated by the four features used by the RIN algorithm: the + “total RNA ratio” (i.e. the fraction of the area in the region of 18S and + 28S compared to the total area under the curve), 28S-region height, 28S + area ratio and the 18S:28S ratio24. To a large extent, RIN score was a + measure of ribosome RNA integrity. However, in most RNA-seq experiments, + ribosome RNAs were depleted from the library to enrich mRNA through either + ribo-minus or polyA selection procedure. + +* RIN only measures the overall RNA quality of an RNA sample. However, in real + situation, the degradation rate may differs significantly among + transcripts, depending on factors such as “AU-rich sequence”, “transcript + length”, “GC content”, “secondary structure” and the “RNA-protein + complex”. Therefore, RIN is practically not very useful in downstream + analysis such as adjusting the gene expression count. + +* RIN has very limited sensitivity to measure substantially degraded RNA + samples such as preserved clinical tissues. (ref: + http://www.illumina.com/documents/products/technotes/technote-truseq-rna-access.pdf). + +To overcome these limitations, we developed TIN, an algorithm that is able +to measure RNA integrity at transcript level. TIN calculates a score (0 <= +TIN <= 100) for each expressed transcript, however, the medTIN (i.e. +meidan TIN score across all the transcripts) can also be used to measure +the RNA integrity at sample level. Below plots demonstrated TIN is a +useful metric to measure RNA integrity in both transcriptome-wise and +transcript-wise, as demonstrated by the high concordance with both RIN and +RNA fragment size (estimated from RNA-seq read pairs). + + +## Inputs + +Input BAM/SAM file + Alignment file in BAM/SAM format. + +Reference gene model + Gene Model in BED format. Must be standard 12-column BED file. + +Minimum coverage + Minimum number of reads mapped to a tracript (default is 10). + +Sample size + Number of equal-spaced nucleotide positions picked from mRNA. Note: if + this number is larger than the length of mRNA (L), it will be halved until + it’s smaller than L (default is 100). + +Subtract background + Subtract background noise (estimated from intronic reads). Only use this + option if there are substantial intronic reads. + + +## Outputs + +Text + Table that includes the gene identifier (geneID), chromosome (chrom), + transcript start (tx_start), transcript end (tx_end), and transcript + integrity number (TIN). + +Example output: + +------ ----- ---------- --------- ------------- +geneID chrom tx_start tx_end TIN +------ ----- ---------- --------- ------------- +ABCC2 chr10 101542354 101611949 67.6446525761 +IPMK chr10 59951277 60027694 86.383618429 +RUFY2 chr10 70100863 70167051 43.8967503948 +------ ----- ---------- --------- ------------- + +@ABOUT@ + +]]> + </help> + + <expand macro="citations" /> + +</tool>