# HG changeset patch
# User m-zytnicki
# Date 1487156580 18000
# Node ID 87c5fa8651c1ddab7b0fc0157ef2581817ac4a2f
# Parent 60abb65400044ca860faad22bee85d4abe42a918
planemo upload commit fb76aa0a938a2498d3206e6039bc1d9906e6c2ce-dirty

diff -r 60abb6540004 -r 87c5fa8651c1 mmquant.xml
--- a/mmquant.xml Thu Aug 11 03:26:32 2016 -0400
+++ b/mmquant.xml Wed Feb 15 06:03:00 2017 -0500
@@ -29,6 +29,8 @@
  -c "$count"
  -m "$merge"
  -o "$output"
+ -d "$n_overlap"
+ -D "$pc_overlap"
  ]]>
@@ -47,6 +49,8 @@
+
+
@@ -101,6 +105,20 @@
 .. _TopHat2: http://ccb.jhu.edu/software/tophat/index.shtml
 .. _STAR: https://github.com/alexdobin/STAR/releases
 
+**Read mapping to several genes**
+
+We will suppose here that the ``-l 1`` strategy is used (i.e. a read is attributed to a gene as soon as at least 1 nucleotide overlaps). The example can be extended to other strategies as well.
+
+If a read (say, of size 100) maps unambiguously and overlaps with genes A and B, it will be counted as 1 for the new "gene" gene_A--gene_B. However, suppose that only 1 nucleotide overlaps with gene A, whereas 100 nucleotides overlap with gene B (yes, genes A and B overlap). You probably would like to attribute the read to gene B.
+
+The options ``Difference of overlapping`` and ``Ratio of overlapping`` control this. We compute the number of overlapping nucleotides between a read and the overlapping genes. If a read overlaps "significantly" more with one gene than with all the other genes, the read will be attributed to the former gene only.
+
+The option ``Difference of overlapping`` *n* compares the difference in overlapping nucleotides. Let *N_A* and *N_B* be the numbers of overlapping nucleotides with genes A and B respectively. If *N_A >= N_B + n*, then the read will be attributed to gene A only.
+
+The option ``Ratio of overlapping`` *m* compares the ratio of overlapping nucleotides. If *N_A / N_B >= m*, then the read will be attributed to gene A only.
+
+If both options ``Difference of overlapping`` *n* and ``Ratio of overlapping`` *m* are used, then the read will be attributed to gene A only if and only if both *N_A >= N_B + n* and *N_A / N_B >= m*.
+
 **Output file**
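The attribution rule described in the added help text can be illustrated with a short Python sketch. This is not mmquant's implementation: the function name `attribute_read`, the dictionary input, and the use of `None` for an unset option are assumptions made for illustration. It assumes the ``-l 1`` strategy and compares the best-overlapping gene against the second best, which is enough to check it against all the other genes.

```python
from typing import Dict, List, Optional


def attribute_read(
    overlaps: Dict[str, int],
    n: Optional[int] = None,    # hypothetical stand-in for "Difference of overlapping"
    m: Optional[float] = None,  # hypothetical stand-in for "Ratio of overlapping"
) -> List[str]:
    """Return the gene(s) a read is attributed to.

    `overlaps` maps a gene name to the number of read nucleotides
    overlapping that gene.
    """
    # Assumed -l 1 strategy: a gene is a candidate with at least 1 overlapping nucleotide.
    hits = {gene: count for gene, count in overlaps.items() if count >= 1}

    # Nothing to disambiguate, or neither option is set: keep every candidate.
    if len(hits) <= 1 or (n is None and m is None):
        return sorted(hits)

    # Rank candidates by overlap, largest first.
    ranked = sorted(hits.items(), key=lambda item: item[1], reverse=True)
    (best, n_best), (_, n_second) = ranked[0], ranked[1]

    # An unset option imposes no constraint; set options must all hold.
    diff_ok = n is None or n_best >= n_second + n
    ratio_ok = m is None or n_best / n_second >= m

    return [best] if diff_ok and ratio_ok else sorted(hits)
```

With the example from the help text, `attribute_read({"gene_A": 1, "gene_B": 100}, n=50)` keeps only `gene_B`, whereas calling it without `n` or `m` returns both genes, which corresponds to the merged `gene_A--gene_B` feature.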