comparison mmquant.xml @ 1:87c5fa8651c1 draft

planemo upload commit fb76aa0a938a2498d3206e6039bc1d9906e6c2ce-dirty
author m-zytnicki
date Wed, 15 Feb 2017 06:03:00 -0500
parents 60abb6540004
children fc9d40c697e8
0:60abb6540004 1:87c5fa8651c1
27 -l "$overlap" 27 -l "$overlap"
28 "$gene_name" 28 "$gene_name"
29 -c "$count" 29 -c "$count"
30 -m "$merge" 30 -m "$merge"
31 -o "$output" 31 -o "$output"
32 -d "$n_overlap"
33 -D "$pc_overlap"
32 ]]></command> 34 ]]></command>
33 <inputs> 35 <inputs>
34 <param name="annotation" type="data" label="Annotation" format="gtf" /> 36 <param name="annotation" type="data" label="Annotation" format="gtf" />
35 <repeat name="reads_info" title="Reads" min="1" default="1"> 37 <repeat name="reads_info" title="Reads" min="1" default="1">
36 <param name="reads" type="data" label="Reads" multiple="false" format="sam,bam" /> 38 <param name="reads" type="data" label="Reads" multiple="false" format="sam,bam" />
45 </repeat> 47 </repeat>
46 <param name="overlap" type="float" value="-1" label="Overlap type" help="&lt;0: read is included, &lt;1: overlap, otherwise: # nt" /> 48 <param name="overlap" type="float" value="-1" label="Overlap type" help="&lt;0: read is included, &lt;1: overlap, otherwise: # nt" />
47 <param name="gene_name" type="boolean" label="Print gene name instead of IDs" truevalue="-g" falsevalue="" help="use gene name instead of gene ID in the output file" /> 49 <param name="gene_name" type="boolean" label="Print gene name instead of IDs" truevalue="-g" falsevalue="" help="use gene name instead of gene ID in the output file" />
48 <param name="count" type="integer" value="0" min="0" label="Count threshold" help="Do not display genes with less than N reads" /> 50 <param name="count" type="integer" value="0" min="0" label="Count threshold" help="Do not display genes with less than N reads" />
49 <param name="merge" type="float" value="0.0" min="0.0" max="1.0" label="Merge threshold" help="Merge gene aggregate count with parent aggregate if count is low" /> 51 <param name="merge" type="float" value="0.0" min="0.0" max="1.0" label="Merge threshold" help="Merge gene aggregate count with parent aggregate if count is low" />
52 <param name="n_overlap" type="integer" value="30" min="1" label="Difference of overlapping" help="Number of overlapping bp between the best matches and the other matches" />
53 <param name="pc_overlap" type="float" value="0.5" min="0.0" max="1.0" label="Ratio of overlapping" help="Ratio of overlapping bp between the best matches and the other matches" />
50 </inputs> 54 </inputs>
51 <outputs> 55 <outputs>
52 <data name="output" format="txt" label="${tool.name} on ${on_string}" /> 56 <data name="output" format="txt" label="${tool.name} on ${on_string}" />
53 </outputs> 57 </outputs>
54 <tests> 58 <tests>
99 .. _samtools: http://www.htslib.org/ 103 .. _samtools: http://www.htslib.org/
100 .. _specification: https://samtools.github.io/hts-specs/SAMv1.pdf 104 .. _specification: https://samtools.github.io/hts-specs/SAMv1.pdf
101 .. _TopHat2: http://ccb.jhu.edu/software/tophat/index.shtml 105 .. _TopHat2: http://ccb.jhu.edu/software/tophat/index.shtml
102 .. _STAR: https://github.com/alexdobin/STAR/releases 106 .. _STAR: https://github.com/alexdobin/STAR/releases
103 107
108 **Read mapping to several genes**
109
110 We will suppose here that the ``-l 1`` strategy is used (i.e. a read is attributed to a gene as soon as at least 1 nucleotide overlaps). The example can be extended to other strategies as well.
111
112 If a read (say, of size 100) maps unambiguously and overlaps with genes A and B, it will be counted as 1 for the new "gene" gene_A--gene_B. However, suppose that only 1 nucleotide overlaps with gene A, whereas 100 nucleotides overlap with gene B (yes, genes A and B overlap). You would probably like to attribute the read to gene B.
113
114 The options ``Difference of overlapping`` and ``Ratio of overlapping`` control this. We compute the number of nucleotides shared between the read and each overlapping gene. If a read overlaps "significantly" more with one gene than with all the other genes, the read is attributed to that gene only.
115
116 The option ``Difference of overlapping`` *n* compares the difference in overlapping nucleotides. Let *N_A* and *N_B* be the numbers of nucleotides overlapping with genes A and B respectively. If *N_A >= N_B + n*, then the read will be attributed to gene A only.
117
118 The option ``Ratio of overlapping`` *m* compares the ratio of overlapping nucleotides. If *N_A / N_B >= m*, then the read will be attributed to gene A only.
119
120 If both options ``Difference of overlapping`` *n* and ``Ratio of overlapping`` *m* are used, then the read will be attributed to gene A only if and only if both *N_A >= N_B + n* and *N_A / N_B >= m* hold.
121
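The attribution rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as worded in this help text, not the tool's actual implementation: the function name ``attribute_read`` and the dictionary of overlap sizes are hypothetical, and the thresholds in the usage lines are the defaults of the ``n_overlap`` and ``pc_overlap`` parameters (30 and 0.5).

```python
def attribute_read(overlaps, n=None, m=None):
    """Return the sorted list of genes a read is attributed to.

    overlaps: dict mapping gene name -> number of overlapping nucleotides
              (assumed > 0 for every gene listed).
    n: "Difference of overlapping" threshold, or None if unused.
    m: "Ratio of overlapping" threshold, or None if unused.
    """
    # With a single gene, or with no threshold set, nothing has to be resolved.
    if len(overlaps) < 2 or (n is None and m is None):
        return sorted(overlaps)

    # Comparing the best gene with the runner-up is enough: if the test holds
    # against the second largest overlap, it holds against all smaller ones.
    ranked = sorted(overlaps.items(), key=lambda item: item[1], reverse=True)
    best_gene, best = ranked[0]
    second = ranked[1][1]

    diff_ok = n is None or best >= second + n    # N_A >= N_B + n
    ratio_ok = m is None or best / second >= m   # N_A / N_B >= m

    # When both thresholds are given, both tests must pass.
    if diff_ok and ratio_ok:
        return [best_gene]
    return sorted(overlaps)


# The read from the example: 1 nt on gene A, 100 nt on gene B.
clear_case = attribute_read({"gene_A": 1, "gene_B": 100}, n=30, m=0.5)
# A read that overlaps both genes to a similar extent stays ambiguous.
close_case = attribute_read({"gene_A": 60, "gene_B": 70}, n=30, m=0.5)
```

In the first call the read goes to gene B only, since 100 >= 1 + 30 and 100 / 1 >= 0.5. In the second call the difference test fails (70 < 60 + 30), so the read keeps both genes and would be counted for gene_A--gene_B.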
104 122
105 **Output file** 123 **Output file**
106 124
107 The output is a tab-separated file, to be used in EdgeR or DESeq, for instance. If the user provided *n* reads files, the output will contain *n+1* columns: 125 The output is a tab-separated file, to be used in EdgeR or DESeq, for instance. If the user provided *n* reads files, the output will contain *n+1* columns:
108 126