diff mmquant.xml @ 1:87c5fa8651c1 draft
planemo upload commit fb76aa0a938a2498d3206e6039bc1d9906e6c2ce-dirty
author | m-zytnicki |
---|---|
date | Wed, 15 Feb 2017 06:03:00 -0500 |
parents | 60abb6540004 |
children | fc9d40c697e8 |
--- a/mmquant.xml Thu Aug 11 03:26:32 2016 -0400
+++ b/mmquant.xml Wed Feb 15 06:03:00 2017 -0500
@@ -29,6 +29,8 @@
 -c "$count"
 -m "$merge"
 -o "$output"
+ -d "$n_overlap"
+ -D "$pc_overlap"
 ]]></command>
 <inputs>
 <param name="annotation" type="data" label="Annotation" format="gtf" />
@@ -47,6 +49,8 @@
 <param name="gene_name" type="boolean" label="Print gene name instead of IDs" truevalue="-g" falsevalue="" help="use gene name instead of gene ID in the output file" />
 <param name="count" type="integer" value="0" min="0" label="Count threshold" help="Do not display genes with less than N reads" />
 <param name="merge" type="float" value="0.0" min="0.0" max="1.0" label="Merge threshold" help="Merge gene aggregate count with parent aggregate if count is low" />
+ <param name="n_overlap" type="integer" value="30" min="1" label="Difference of overlapping" help="Number of overlapping bp between the best matches and the other matches" />
+ <param name="pc_overlap" type="float" value="0.5" min="0.0" max="1.0" label="Ratio of overlapping" help="Ratio of overlapping bp between the best matches and the other matches" />
 </inputs>
 <outputs>
 <data name="output" format="txt" label="${tool.name} on ${on_string}" />
@@ -101,6 +105,20 @@
 .. _TopHat2: http://ccb.jhu.edu/software/tophat/index.shtml
 .. _STAR: https://github.com/alexdobin/STAR/releases

+**Read mapping to several genes**
+
+We will suppose here that the ``-l 1`` strategy is used (i.e. a read is attributed to a gene as soon as at least 1 nucleotide overlap). The example can be extended to other strategies as well.
+
+If a read (say, of size 100), maps unambiguously and overlaps with gene A and B, it will be counted as 1 for the new "gene" gene_A--gene_B. However, suppose that only 1 nucleotide overlaps with gene A, whereas 100 nucleotides overlap with gene B (yes, genes A and B overlap). You probably would like to attribute the read to gene B.
+
+The options ``Difference of overlapping`` and ``Ratio of overlapping`` control this. We compute the number of overlapping nucleotides between a read and the overlapping genes. If a read overlaps "significantly" more with one gene than with all the other genes, they will attribute the read to the former gene only.
+
+The option ``Difference of overlapping`` *n* computes the differences of overlapping nucleotides. Let us name *N_A* and *N_B* the number of overlapping nucleotides with genes A and B respectively. If *N_A >= N_B + n*, then the read will be attributed to gene A only.
+
+The option ``Ratio of overlapping`` *m* compares the ratio of overlapping nucleotides. If *N_A / N_B >= m*, then the read will be attributed to gene A only.
+
+If both option ``Difference of overlapping`` *n* and ``Ratio of overlapping`` *m* are used, then the read will be attributed to gene A only iff both *N_A >= N_B + n* and *N_A / N_B >= m*.
+
 **Output file**
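The attribution rule described in the added "Read mapping to several genes" help text can be illustrated with a small sketch. This is not mmquant's implementation: the function name `attribute_read`, its arguments, and the choice to compare the best gene against every other gene are assumptions made for illustration, following the wording of the help text literally (*N_A >= N_B + n* and *N_A / N_B >= m*).

```python
from typing import Dict, List, Optional


def attribute_read(overlaps: Dict[str, int],
                   n: Optional[int] = None,
                   m: Optional[float] = None) -> List[str]:
    """Return the gene(s) a read is attributed to (hypothetical helper).

    overlaps: gene name -> number of read nucleotides overlapping that gene.
    n: "Difference of overlapping" threshold, or None if the option is unused.
    m: "Ratio of overlapping" threshold, or None if the option is unused.
    """
    genes = sorted(overlaps)
    # Nothing to disambiguate: zero or one gene, or neither option is used.
    if len(genes) <= 1 or (n is None and m is None):
        return genes

    # Candidate "gene A": the gene with the largest overlap.
    best = max(genes, key=lambda g: overlaps[g])
    n_best = overlaps[best]

    for gene in genes:
        if gene == best:
            continue
        n_other = overlaps[gene]
        # Difference test from the help text: N_A >= N_B + n.
        if n is not None and n_best < n_other + n:
            return genes  # stays ambiguous, e.g. counted as gene_A--gene_B
        # Ratio test from the help text: N_A / N_B >= m,
        # written as N_A >= m * N_B to avoid dividing by zero.
        if m is not None and n_best < m * n_other:
            return genes
    return [best]


# The example from the help text: 1 nt overlaps gene A, 100 nt overlap gene B.
print(attribute_read({"gene_A": 1, "gene_B": 100}, n=30, m=0.5))
# A closer call: the difference (10) is below n = 30, so both genes are kept.
print(attribute_read({"gene_A": 60, "gene_B": 50}, n=30, m=0.5))
```

With the default values from the parameter definitions (`n_overlap` = 30, `pc_overlap` = 0.5), the first call attributes the read to gene B only and the second keeps both genes. Read literally, a ratio threshold *m* of at most 1 is always met by the gene with the largest overlap, so in this sketch the difference test does all the work; the tool itself may define the ratio differently.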