view test-data/td_output.tab @ 0:3e56058d9552 draft default tip
planemo upload for repository https://github.com/monikaheinzl/duplexanalysis_galaxy/tree/master/tools/hd commit 9bae9043a53f1e07b502acd1082450adcb6d9e31-dirty
author | mheinzl |
---|---|
date | Wed, 16 Oct 2019 04:17:59 -0400 |
parents | |
children |
td_data.tab
nr of tags	20
sample size	20

Tag distance separated by family size
	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum
TD=1	5	1	1	1	1	0	9
TD=6	3	0	0	0	0	0	3
TD=7	4	0	0	0	1	0	5
TD=8	2	0	0	1	0	0	3
sum	14	1	1	2	2	0	20

Family size distribution separated by Tag distance
	TD=1	TD=2	TD=3	TD=4	TD=5-8	TD>8	sum
FS=1	5	0	0	0	9	0	14
FS=2	1	0	0	0	0	0	1
FS=3	1	0	0	0	0	0	1
FS=4	1	0	0	0	1	0	2
FS=6	1	0	0	0	0	0	1
FS=7	0	0	0	0	1	0	1
sum	9	0	0	0	11	0	20

max. family size in sample:	7
absolute frequency:	1
relative frequency:	0.05

Chimera Analysis:
The tags are split into two halves (part a and part b), for which the tag distances (TD) are calculated separately. The tag distance of the first half (part a) is calculated by comparing part a of the tag in the sample against all a parts in the dataset and selecting the minimum value (TD a.min). In the next step, we select those tags that showed the minimum TD and estimate the TD for the second half (part b) of the tag by comparing part b against the previously selected subset. The maximum value then represents TD b.max. Finally, this process is repeated starting with part b instead, and TD b.min and TD a.max are calculated. Next, the absolute differences between TD a.min and TD b.max and between TD b.min and TD a.max are calculated (delta TD). These are then divided by the sum of both parts (TD a.min + TD b.max or TD b.min + TD a.max, respectively), which gives the relative differences between the partial TDs (rel. delta TD). For simplicity, we use the maximum of the relative differences and the corresponding delta TD. Note that when only tags that can form a DCS are included in the analysis, the family sizes for both directions (ab and ba) of the strand will be included in the plots.
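
The chimera analysis steps described above can be sketched in Python as follows. This is a minimal illustration, not the tool's actual code: the function names (`hamming`, `half_distances`, `chimera_stats`) and the example tags are hypothetical, the distance is assumed to be a Hamming distance, the tag itself is assumed to be excluded from the comparison, and a relative difference of 0 is assumed when both partial distances are 0.

```python
def hamming(x, y):
    """Count mismatching positions between two equal-length strings."""
    return sum(c1 != c2 for c1, c2 in zip(x, y))


def half_distances(tag, others, start_with_a=True):
    """Return (min TD of the first half, max TD of the second half)."""
    half = len(tag) // 2

    def split(t):
        a, b = t[:half], t[half:]
        return (a, b) if start_with_a else (b, a)

    first, second = split(tag)
    parts = [split(t) for t in others]
    dists = [hamming(first, f) for f, _ in parts]
    td_min = min(dists)
    # Keep the second halves of only those tags whose first half hit the minimum.
    subset = [s for (_, s), d in zip(parts, dists) if d == td_min]
    td_max = max(hamming(second, s) for s in subset)
    return td_min, td_max


def chimera_stats(tag, dataset):
    """Compute TD a.min, b.max, b.min, a.max, delta TD and rel. delta TD."""
    # Assumption: the tag is not compared against itself.
    others = [t for t in dataset if t != tag]
    a_min, b_max = half_distances(tag, others, start_with_a=True)
    b_min, a_max = half_distances(tag, others, start_with_a=False)

    candidates = []
    for low, high in ((a_min, b_max), (b_min, a_max)):
        delta = abs(low - high)
        total = low + high
        # Assumption: define the relative difference as 0 when both halves match.
        rel = delta / total if total else 0.0
        candidates.append((rel, delta))

    # Use the larger relative difference and the delta that belongs to it.
    rel_delta, delta = max(candidates)
    return {"a_min": a_min, "b_max": b_max, "b_min": b_min,
            "a_max": a_max, "delta": delta, "rel_delta": rel_delta}


# Hypothetical 8-nt tags (halves of 4 nt), for illustration only.
dataset = ["AAAACCCC", "AAAAGGGG", "TTTTGGGA"]
stats = chimera_stats(dataset[0], dataset)
print(stats)
```

In this example the first tag shares part a exactly with the second tag while part b differs at every position, so one half has TD=0 and the relative delta TD is 1.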
length of one half of the tag	12

Tag distance of each half in the tag
	TD a.min	TD b.max	TD b.min	TD a.max	TD a.min + b.max, TD a.max + b.min	sum
TD=0	20	0	8	1	0	29
TD=1	0	0	1	19	8	28
TD=2	0	0	0	0	1	1
TD=5	0	0	3	0	0	3
TD=6	0	0	2	0	3	5
TD=7	0	1	6	0	4	11
TD=8	0	2	0	0	7	9
TD=9	0	1	0	0	1	2
TD=10	0	2	0	0	2	4
TD=11	0	7	0	0	7	14
TD=12	0	7	0	0	7	14
sum	20	20	20	20	40	120

Absolute delta Tag distance within the tag
	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum
diff=7	1	0	0	0	0	0	1
diff=8	1	0	0	0	1	0	2
diff=9	1	0	0	0	0	0	1
diff=10	2	0	0	0	0	0	2
diff=11	4	0	1	1	1	0	7
diff=12	5	1	0	1	0	0	7
sum	14	1	1	2	2	0	20

Chimera analysis: relative delta Tag distance
	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum
diff=1.0	14	1	1	2	2	0	20
sum	14	1	1	2	2	0	20

All tags are filtered, and only those in which one half is identical (TD=0), and which therefore have a relative delta TD of 1, are kept. These tags are considered chimeras.

Tag distance of chimeric families separated by FS
	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum
TD=7	1	0	0	0	0	0	1
TD=8	1	0	0	0	1	0	2
TD=9	1	0	0	0	0	0	1
TD=10	2	0	0	0	0	0	2
TD=11	4	0	1	1	1	0	7
TD=12	5	1	0	1	0	0	7
sum	14	1	1	2	2	0	20

Tag distance of chimeric families separated by DCS and single SSCS (ab, ba)
	DCS	SSCS ab	SSCS ba	sum
TD=7.0	0	0	1	1
TD=8.0	0	1	1	2
TD=9.0	0	1	0	1
TD=10.0	0	1	1	2
TD=11.0	0	3	4	7
TD=12.0	0	2	5	7
sum	0	8	12	20
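
The chimera filter described above (keeping only tags whose relative delta TD is 1) and the tabulation by family size could look roughly like this in Python. It is a sketch under stated assumptions: the helper names `fs_bin` and `chimera_table` and the input records are hypothetical, the family-size bins are taken from the table headers in this file, and the TD of a chimeric tag is assumed to equal its delta TD because one half has TD=0.

```python
from collections import Counter


def fs_bin(family_size):
    """Map a family size to the column labels used in the tables."""
    if family_size <= 4:
        return "FS=%d" % family_size
    return "FS=5-10" if family_size <= 10 else "FS>10"


def chimera_table(records):
    """Count chimeric tags per (TD, family-size bin).

    records: iterable of (family_size, delta_td, rel_delta_td) tuples.
    """
    counts = Counter()
    for family_size, delta_td, rel_delta_td in records:
        # A relative delta TD of exactly 1 means one half had TD=0.
        if rel_delta_td == 1.0:
            counts[(delta_td, fs_bin(family_size))] += 1
    return counts


# Hypothetical records, for illustration only (not taken from td_data.tab).
records = [(1, 7, 1.0), (6, 8, 1.0), (1, 8, 1.0), (2, 3, 0.6)]
table = chimera_table(records)
for (td, fs), n in sorted(table.items()):
    print("TD=%d\t%s\t%d" % (td, fs, n))
```

The last record is dropped because its relative delta TD is below 1, so only three of the four hypothetical tags are counted as chimeras.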