comparison tin.xml @ 51:09846d5169fa draft
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/rseqc commit 37fb1988971807c6a072e1afd98eeea02329ee83
author | iuc |
---|---|
date | Tue, 14 Mar 2017 10:23:21 -0400 |
parents | |
children | 5873cd7afb67 |
50:f242ee103277 | 51:09846d5169fa |
---|---|
1 <tool id="rseqc_tin" name="Transcript Integrity Number" version="@WRAPPER_VERSION@"> | |
2 <description> | |
3 evaluates RNA integrity at a transcript level | |
4 </description> | |
5 | |
6 <macros> | |
7 <import>rseqc_macros.xml</import> | |
8 </macros> | |
9 | |
10 <expand macro="requirements" /> | |
11 | |
12 <expand macro="stdio" /> | |
13 | |
14 <version_command><![CDATA[tin.py --version]]></version_command> | |
15 | |
16 <!-- Generate output files here because tin.py removes all instances of "bam" | |
17 in the filename --> | |
18 <command><![CDATA[ | |
19 #import re | |
20 ln -sf '${input}' 'input.bam' && | |
21 ln -sf '${input.metadata.bam_index}' 'input.bam.bai' && | |
22 tin.py -i 'input.bam' --refgene='${refgene}' --minCov=${minCov} | |
23 --sample-size=${samplesize} ${subtractbackground} | |
24 ]]> | |
25 </command> | |
26 | |
27 <inputs> | |
28 <expand macro="bam_param" /> | |
29 <expand macro="refgene_param" /> | |
30 <param name="minCov" type="integer" value="10" label="Minimum coverage (default=10)" | |
31 help="Minimum number of reads mapped to a transcript (--minCov)." /> | |
32 <param name="samplesize" type="integer" value="100" label="Sample size (default=100)" | |
33 help="Number of equal-spaced nucleotide positions picked from mRNA. | |
34 Note: if this number is larger than the length of mRNA (L), it will | |
35 be halved until it is smaller than L. (--sample-size)." /> | |
36 <param name="subtractbackground" type="boolean" value="false" falsevalue="" | |
37 truevalue="--subtract-background" label="Subtract background noise | |
38 (default=No)" help="Subtract background noise (estimated from | |
39 intronic reads). Only use this option if there are substantial | |
40 intronic reads (--subtract-background)." /> | |
41 </inputs> | |
42 | |
43 <outputs> | |
44 <data name="outputsummary" format="tabular" from_work_dir="input.summary.txt" label="TIN on ${on_string} (summary)" /> | |
45 <data name="outputxls" format="xls" from_work_dir="input.tin.xls" label="TIN on ${on_string} (tin)" /> | |
46 </outputs> | |
47 | |
48 <!-- PDF Files contain R version, must avoid checking for diff --> | |
49 <tests> | |
50 <test> | |
51 <param name="input" value="pairend_strandspecific_51mer_hg19_chr1_1-100000.bam"/> | |
52 <param name="refgene" value="hg19_RefSeq_chr1_1-100000.bed"/> | |
53 <output name="outputsummary" file="output.tin.summary.txt"/> | |
54 <output name="outputxls" file="output.tin.xls"/> | |
55 </test> | |
56 </tests> | |
57 | |
58 <help><![CDATA[ | |
59 ## tin.py | |
60 | |
61 This program is designed to evaluate RNA integrity at the transcript level. TIN | |
62 (transcript integrity number) is named by analogy with RIN (RNA integrity | |
63 number). RIN is the most widely used metric for evaluating RNA integrity | |
64 at the sample (or transcriptome) level. It is a very | |
65 useful preventive measure to ensure good RNA quality and robust, | |
66 reproducible RNA sequencing. However, it has several weaknesses: | |
67 | |
68 * The RIN score (1 <= RIN <= 10) is not a direct measurement of mRNA quality. | |
69 It relies heavily on the amount of 18S and 28S ribosomal RNA, as | |
70 shown by the four features used by the RIN algorithm: the | |
71 “total RNA ratio” (i.e. the fraction of the area in the region of 18S and | |
72 28S compared to the total area under the curve), 28S-region height, 28S | |
73 area ratio and the 18S:28S ratio. To a large extent, the RIN score is a | |
74 measure of ribosomal RNA integrity. However, in most RNA-seq experiments, | |
75 ribosomal RNAs are depleted from the library to enrich mRNA through either | |
76 a ribo-minus or a polyA selection procedure. | |
77 | |
78 * RIN only measures the overall RNA quality of an RNA sample. However, in | |
79 practice, the degradation rate may differ significantly among | |
80 transcripts, depending on factors such as “AU-rich sequence”, “transcript | |
81 length”, “GC content”, “secondary structure” and the “RNA-protein | |
82 complex”. Therefore, RIN is of limited practical use in downstream | |
83 analysis such as adjusting the gene expression count. | |
84 | |
85 * RIN has very limited sensitivity to measure substantially degraded RNA | |
86 samples such as preserved clinical tissues. (ref: | |
87 http://www.illumina.com/documents/products/technotes/technote-truseq-rna-access.pdf). | |
88 | |
89 To overcome these limitations, we developed TIN, an algorithm that | |
90 measures RNA integrity at the transcript level. TIN calculates a score (0 <= | |
91 TIN <= 100) for each expressed transcript; the medTIN (i.e. the | |
92 median TIN score across all transcripts) can also be used to measure | |
93 RNA integrity at the sample level. TIN is a | |
94 useful metric for measuring RNA integrity both transcriptome-wide and | |
95 per transcript, as demonstrated by its high concordance with both RIN and | |
96 RNA fragment size (estimated from RNA-seq read pairs). | |
97 | |
98 | |
99 ## Inputs | |
100 | |
101 Input BAM/SAM file | |
102 Alignment file in BAM/SAM format. | |
103 | |
104 Reference gene model | |
105 Gene model in BED format. Must be a standard 12-column BED file. | |
106 | |
107 Minimum coverage | |
108 Minimum number of reads mapped to a transcript (default is 10). | |
109 | |
110 Sample size | |
111 Number of equal-spaced nucleotide positions picked from mRNA. Note: if | |
112 this number is larger than the length of mRNA (L), it will be halved until | |
113 it’s smaller than L (default is 100). | |
114 | |
115 Subtract background | |
116 Subtract background noise (estimated from intronic reads). Only use this | |
117 option if there are substantial intronic reads. | |
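
The sample-size rule described above (halve the requested number of positions until it fits the mRNA) can be sketched as follows. This is an illustration only, not the code used by tin.py: the function name `effective_sample_size` is hypothetical, and integer halving plus stopping as soon as the value no longer exceeds L are assumptions.

```python
def effective_sample_size(sample_size, mrna_length):
    """Halve sample_size until it no longer exceeds the mRNA length L.

    Hypothetical helper for illustration; integer (floor) halving and the
    exact stopping condition are assumptions, not taken from tin.py.
    """
    if sample_size < 1 or mrna_length < 1:
        raise ValueError("sample_size and mrna_length must be positive")
    n = sample_size
    while n > mrna_length:
        n //= 2  # floor division keeps the count an integer
    return n


if __name__ == "__main__":
    # Default sample size of 100 applied to mRNAs of different lengths.
    for length in (1000, 60, 30, 10):
        print(length, effective_sample_size(100, length))
```

With the default of 100 positions, an mRNA of length 60 would be sampled at 50 positions under these assumptions.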
118 | |
119 | |
120 ## Outputs | |
121 | |
122 Text | |
123 Table that includes the gene identifier (geneID), chromosome (chrom), | |
124 transcript start (tx_start), transcript end (tx_end), and transcript | |
125 integrity number (TIN). | |
126 | |
127 Example output: | |
128 | |
129 ------ ----- ---------- --------- ------------- | |
130 geneID chrom tx_start tx_end TIN | |
131 ------ ----- ---------- --------- ------------- | |
132 ABCC2 chr10 101542354 101611949 67.6446525761 | |
133 IPMK chr10 59951277 60027694 86.383618429 | |
134 RUFY2 chr10 70100863 70167051 43.8967503948 | |
135 ------ ----- ---------- --------- ------------- | |
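
To illustrate how the per-transcript table relates to a sample-level medTIN, the sketch below takes the median of the TIN column from the example rows above. It is a minimal illustration, assuming whitespace-separated columns and a header row containing `TIN`; the helper name `median_tin` is hypothetical, and which transcripts tin.py itself includes in its own summary is not specified here.

```python
import statistics

# Example rows copied from the table above (whitespace-separated).
EXAMPLE_TABLE = """\
geneID chrom tx_start tx_end TIN
ABCC2 chr10 101542354 101611949 67.6446525761
IPMK chr10 59951277 60027694 86.383618429
RUFY2 chr10 70100863 70167051 43.8967503948
"""


def median_tin(table_text):
    """Return the median of the TIN column of a whitespace-separated table.

    Hypothetical helper for illustration; it uses every data row and does
    not attempt to reproduce any filtering that tin.py may apply.
    """
    lines = [ln.split() for ln in table_text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    tin_col = header.index("TIN")
    return statistics.median(float(row[tin_col]) for row in rows)


if __name__ == "__main__":
    print(median_tin(EXAMPLE_TABLE))
```

For the three example transcripts, the median is the TIN of ABCC2, since it lies between the other two values.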
136 | |
137 @ABOUT@ | |
138 | |
139 ]]> | |
140 </help> | |
141 | |
142 <expand macro="citations" /> | |
143 | |
144 </tool> |