diff test-data/td_output.tab @ 0:3e56058d9552 draft default tip
planemo upload for repository https://github.com/monikaheinzl/duplexanalysis_galaxy/tree/master/tools/hd commit 9bae9043a53f1e07b502acd1082450adcb6d9e31-dirty
author | mheinzl |
---|---|
date | Wed, 16 Oct 2019 04:17:59 -0400 |
parents | |
children | |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/td_output.tab Wed Oct 16 04:17:59 2019 -0400
@@ -0,0 +1,92 @@
+td_data.tab
+nr of tags	20
+sample size	20
+
+Tag distance separated by family size
+	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum
+TD=1	5	1	1	1	1	0	9
+TD=6	3	0	0	0	0	0	3
+TD=7	4	0	0	0	1	0	5
+TD=8	2	0	0	1	0	0	3
+sum	14	1	1	2	2	0	20
+
+Family size distribution separated by Tag distance
+	TD=1	TD=2	TD=3	TD=4	TD=5-8	TD>8	sum
+FS=1	5	0	0	0	9	0	14
+FS=2	1	0	0	0	0	0	1
+FS=3	1	0	0	0	0	0	1
+FS=4	1	0	0	0	1	0	2
+FS=6	1	0	0	0	0	0	1
+FS=7	0	0	0	0	1	0	1
+sum	9	0	0	0	11	0	20
+
+
+max. family size in sample:	7
+absolute frequency:	1
+relative frequency:	0.05
+
+Chimera Analysis:
+The tags are splitted into two halves (part a and b) for which the Tag distances (TD) are calculated seperately.
+The tag distance of the first half (part a) is calculated by comparing part a of the tag in the sample against all a parts in the dataset and by selecting the minimum value (TD a.min).
+In the next step, we select those tags that showed the minimum TD and estimate the TD for the second half (part b) of the tag by comparing part b against the previously selected subset.
+The maximum value represents then TD b.max. Finally, these process is repeated but starting with part b instead and TD b.min and TD a.max are calculated.
+Next, the absolute differences between TD a.min & TD b.max and TD b.min & TD a.max are estimated (delta HD).
+These are then divided by the sum of both parts (TD a.min + TD b.max or TD b.min + TD a.max, respectively) which give the relative differences between the partial HDs (rel. delta HD).
+For simplicity, we used the maximum value of the relative differences and the respective delta HD.
+Note that when only tags that can form a DCS are included in the analysis, the family sizes for both directions (ab and ba) of the strand will be included in the plots.
+
+length of one half of the tag	12
+
+Tag distance of each half in the tag
+	TD a.min	TD b.max	TD b.min	TD a.max	TD a.min + b.max, TD a.max + b.min	sum
+TD=0	20	0	8	1	0	29
+TD=1	0	0	1	19	8	28
+TD=2	0	0	0	0	1	1
+TD=5	0	0	3	0	0	3
+TD=6	0	0	2	0	3	5
+TD=7	0	1	6	0	4	11
+TD=8	0	2	0	0	7	9
+TD=9	0	1	0	0	1	2
+TD=10	0	2	0	0	2	4
+TD=11	0	7	0	0	7	14
+TD=12	0	7	0	0	7	14
+sum	20	20	20	20	40	120
+
+Absolute delta Tag distance within the tag
+	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum
+diff=7	1	0	0	0	0	0	1
+diff=8	1	0	0	0	1	0	2
+diff=9	1	0	0	0	0	0	1
+diff=10	2	0	0	0	0	0	2
+diff=11	4	0	1	1	1	0	7
+diff=12	5	1	0	1	0	0	7
+sum	14	1	1	2	2	0	20
+
+Chimera analysis: relative delta Tag distance
+	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum
+diff=1.0	14	1	1	2	2	0	20
+sum	14	1	1	2	2	0	20
+
+All tags are filtered and only those tags where one half is identical (TD=0) and therefore, have a relative delta TD of 1, are kept.
+These tags are considered as chimeras.
+Tag distance of chimeric families separated after FS
+	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum
+TD=7	1	0	0	0	0	0	1
+TD=8	1	0	0	0	1	0	2
+TD=9	1	0	0	0	0	0	1
+TD=10	2	0	0	0	0	0	2
+TD=11	4	0	1	1	1	0	7
+TD=12	5	1	0	1	0	0	7
+sum	14	1	1	2	2	0	20
+
+Tag distance of chimeric families separated after DCS and single SSCS (ab, ba)
+	DCS	SSCS ab	SSCS ba	sum
+TD=7.0	0	0	1	1
+TD=8.0	0	1	1	2
+TD=9.0	0	1	0	1
+TD=10.0	0	1	1	2
+TD=11.0	0	3	4	7
+TD=12.0	0	2	5	7
+sum	0	8	12	20
+
+
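The "Chimera Analysis" text in the test output describes the half-tag procedure only in prose. The following is a minimal Python sketch of that description, not the tool's actual code. It assumes the tag distance is a Hamming distance (the text itself refers to "partial HDs"), that a tag is not compared against identical copies of itself, and that a zero denominator yields a relative difference of 0. The function names `hamming`, `half_distances` and `chimera_stats` are hypothetical.

```python
def hamming(s, t):
    """Number of mismatching positions between two equal-length strings."""
    if len(s) != len(t):
        raise ValueError("tags must have equal length")
    return sum(c1 != c2 for c1, c2 in zip(s, t))


def half_distances(tag, others, first="a"):
    """Return (min distance of the first half, max distance of the other half).

    The maximum is taken only over those tags whose first half reached the
    minimum distance, as described in the output text. `others` must be
    non-empty (assumption of this sketch).
    """
    half = len(tag) // 2

    def split(t):
        a, b = t[:half], t[half:]
        return (a, b) if first == "a" else (b, a)

    p, q = split(tag)
    dists = [(hamming(p, split(o)[0]), hamming(q, split(o)[1])) for o in others]
    d_min = min(d for d, _ in dists)
    d_max = max(e for d, e in dists if d == d_min)
    return d_min, d_max


def chimera_stats(tag, dataset):
    """Return (relative delta, absolute delta) for one tag.

    Assumption: identical copies of `tag` in `dataset` are skipped.
    """
    others = [t for t in dataset if t != tag]
    a_min, b_max = half_distances(tag, others, "a")
    b_min, a_max = half_distances(tag, others, "b")
    candidates = []
    for low, high in ((a_min, b_max), (b_min, a_max)):
        delta = abs(low - high)
        total = low + high
        rel = delta / total if total else 0.0  # assumption: 0 when both are 0
        candidates.append((rel, delta))
    # Largest relative difference with its delta; ties fall to the larger delta.
    return max(candidates)


# Toy tags of length 8 (the test data above uses halves of length 12).
dataset = ["AAAACCCC", "AAAAGGGG", "TTTTCCCC"]
print(chimera_stats("AAAACCCC", dataset))  # (1.0, 4)
```

In this toy example one half of the tag matches another tag exactly (TD=0) while the other half differs in every position, so the relative delta is 1. That is the condition the output text uses to call a tag a chimera.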