comparison test-data/td_output.tab @ 0:3e56058d9552 draft default tip

planemo upload for repository https://github.com/monikaheinzl/duplexanalysis_galaxy/tree/master/tools/hd commit 9bae9043a53f1e07b502acd1082450adcb6d9e31-dirty
author mheinzl
date Wed, 16 Oct 2019 04:17:59 -0400
parents
children
-1:000000000000 0:3e56058d9552
1 td_data.tab
2 nr of tags 20
3 sample size 20
4
5 Tag distance separated by family size
6 FS=1 FS=2 FS=3 FS=4 FS=5-10 FS>10 sum
7 TD=1 5 1 1 1 1 0 9
8 TD=6 3 0 0 0 0 0 3
9 TD=7 4 0 0 0 1 0 5
10 TD=8 2 0 0 1 0 0 3
11 sum 14 1 1 2 2 0 20
12
13 Family size distribution separated by Tag distance
14 TD=1 TD=2 TD=3 TD=4 TD=5-8 TD>8 sum
15 FS=1 5 0 0 0 9 0 14
16 FS=2 1 0 0 0 0 0 1
17 FS=3 1 0 0 0 0 0 1
18 FS=4 1 0 0 0 1 0 2
19 FS=6 1 0 0 0 0 0 1
20 FS=7 0 0 0 0 1 0 1
21 sum 9 0 0 0 11 0 20
22
23
24 max. family size in sample: 7
25 absolute frequency: 1
26 relative frequency: 0.05
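
The three summary values above can be reproduced from the "sum" column of the family size table (14 tags with FS=1, one each with FS=2, 3, 6 and 7, and two with FS=4). The following is only a minimal sketch of that arithmetic; the variable names are illustrative and are not taken from the tool's source.

```python
from collections import Counter

# Family sizes of the 20 sampled tags, read off the "sum" column above.
family_sizes = [1] * 14 + [2, 3, 4, 4, 6, 7]

counts = Counter(family_sizes)
max_fs = max(counts)                      # largest family size in the sample
abs_freq = counts[max_fs]                 # number of tags with that family size
rel_freq = abs_freq / len(family_sizes)   # share of the sample

print(max_fs, abs_freq, rel_freq)
```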
27
28 Chimera Analysis:
29 The tags are split into two halves (part a and part b), for which the tag distances (TD) are calculated separately.
30 The tag distance of the first half (part a) is calculated by comparing part a of each tag in the sample against all a parts in the dataset and selecting the minimum value (TD a.min).
31 In the next step, we select those tags that showed the minimum TD and estimate the TD for the second half (part b) of the tag by comparing part b against the previously selected subset.
32 The maximum value then represents TD b.max. Finally, this process is repeated starting with part b instead, and TD b.min and TD a.max are calculated.
33 Next, the absolute differences between TD a.min & TD b.max and TD b.min & TD a.max are calculated (delta HD).
34 These are then divided by the sum of both parts (TD a.min + TD b.max or TD b.min + TD a.max, respectively), which gives the relative differences between the partial HDs (rel. delta HD).
35 For simplicity, we used the maximum value of the relative differences and the respective delta HD.
36 Note that when only tags that can form a DCS are included in the analysis, the family sizes for both directions (ab and ba) of the strand will be included in the plots.
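
The procedure described above can be sketched in a few lines of Python. This is an illustrative reading of the description, not the tool's actual code: the function names (`hamming`, `half_distances`, `chimera_stats`) are hypothetical, the tag itself is assumed to be excluded from the comparison set, and a relative difference of 0 is assumed when both partial distances are 0.

```python
def hamming(x, y):
    """Number of mismatching positions between two equal-length strings."""
    return sum(c1 != c2 for c1, c2 in zip(x, y))


def half_distances(first, second, others_first, others_second):
    """Minimum TD of one half, then maximum TD of the other half
    among the tags that reached that minimum."""
    dists = [hamming(first, o) for o in others_first]
    td_min = min(dists)
    td_max = max(hamming(second, s)
                 for d, s in zip(dists, others_second) if d == td_min)
    return td_min, td_max


def chimera_stats(tag, dataset):
    """Partial tag distances and delta values for one tag.
    Assumes the dataset contains at least one other tag."""
    half = len(tag) // 2
    a, b = tag[:half], tag[half:]
    others = [t for t in dataset if t != tag]  # assumption: exclude the tag itself
    others_a = [t[:half] for t in others]
    others_b = [t[half:] for t in others]

    a_min, b_max = half_distances(a, b, others_a, others_b)  # start with part a
    b_min, a_max = half_distances(b, a, others_b, others_a)  # start with part b

    candidates = []
    for low, high in ((a_min, b_max), (b_min, a_max)):
        delta = abs(low - high)
        total = low + high
        rel = delta / total if total else 0.0  # assumption: 0 when both are 0
        candidates.append((rel, delta))
    rel_delta, delta = max(candidates)  # largest relative difference and its delta

    return {"td_a_min": a_min, "td_b_max": b_max,
            "td_b_min": b_min, "td_a_max": a_max,
            "delta": delta, "rel_delta": rel_delta}


example = chimera_stats("AAAATTTT", ["AAAATTTT", "AAAACCCC", "GGGGTTTT"])
print(example)
```

In this toy dataset, each half of `AAAATTTT` reappears unchanged in another tag while the opposite half differs completely, so the relative delta is 1.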
37
38 length of one half of the tag 12
39
40 Tag distance of each half in the tag
41 TD a.min TD b.max TD b.min TD a.max TD a.min + b.max, TD a.max + b.min sum
42 TD=0 20 0 8 1 0 29
43 TD=1 0 0 1 19 8 28
44 TD=2 0 0 0 0 1 1
45 TD=5 0 0 3 0 0 3
46 TD=6 0 0 2 0 3 5
47 TD=7 0 1 6 0 4 11
48 TD=8 0 2 0 0 7 9
49 TD=9 0 1 0 0 1 2
50 TD=10 0 2 0 0 2 4
51 TD=11 0 7 0 0 7 14
52 TD=12 0 7 0 0 7 14
53 sum 20 20 20 20 40 120
54
55 Absolute delta Tag distance within the tag
56 FS=1 FS=2 FS=3 FS=4 FS=5-10 FS>10 sum
57 diff=7 1 0 0 0 0 0 1
58 diff=8 1 0 0 0 1 0 2
59 diff=9 1 0 0 0 0 0 1
60 diff=10 2 0 0 0 0 0 2
61 diff=11 4 0 1 1 1 0 7
62 diff=12 5 1 0 1 0 0 7
63 sum 14 1 1 2 2 0 20
64
65 Chimera analysis: relative delta Tag distance
66 FS=1 FS=2 FS=3 FS=4 FS=5-10 FS>10 sum
67 diff=1.0 14 1 1 2 2 0 20
68 sum 14 1 1 2 2 0 20
69
70 All tags are filtered, and only those tags in which one half is identical (TD=0), and which therefore have a relative delta TD of 1, are kept.
71 These tags are considered chimeras.
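
The filter criterion follows directly from the definition of the relative delta: if one partial distance is 0 and the other is not, the absolute difference equals the sum, so the ratio is 1. A minimal sketch, with the hypothetical helper name `is_chimera`:

```python
def is_chimera(td_min, td_max):
    """True when the relative delta TD, |td_min - td_max| / (td_min + td_max), equals 1.
    That happens exactly when one half is identical (TD=0) and the other is not."""
    total = td_min + td_max
    return total > 0 and abs(td_min - td_max) / total == 1.0


# Example pairs of partial distances: (TD of one half, TD of the other half)
pairs = [(0, 7), (0, 12), (1, 8), (0, 0)]
kept = [p for p in pairs if is_chimera(*p)]
print(kept)
```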
72 Tag distance of chimeric families separated by FS
73 FS=1 FS=2 FS=3 FS=4 FS=5-10 FS>10 sum
74 TD=7 1 0 0 0 0 0 1
75 TD=8 1 0 0 0 1 0 2
76 TD=9 1 0 0 0 0 0 1
77 TD=10 2 0 0 0 0 0 2
78 TD=11 4 0 1 1 1 0 7
79 TD=12 5 1 0 1 0 0 7
80 sum 14 1 1 2 2 0 20
81
82 Tag distance of chimeric families separated by DCS and single SSCS (ab, ba)
83 DCS SSCS ab SSCS ba sum
84 TD=7.0 0 0 1 1
85 TD=8.0 0 1 1 2
86 TD=9.0 0 1 0 1
87 TD=10.0 0 1 1 2
88 TD=11.0 0 3 4 7
89 TD=12.0 0 2 5 7
90 sum 0 8 12 20
91
92