diff test-data/td_output.tab @ 0:3e56058d9552 draft default tip

planemo upload for repository https://github.com/monikaheinzl/duplexanalysis_galaxy/tree/master/tools/hd commit 9bae9043a53f1e07b502acd1082450adcb6d9e31-dirty
author mheinzl
date Wed, 16 Oct 2019 04:17:59 -0400
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/td_output.tab	Wed Oct 16 04:17:59 2019 -0400
@@ -0,0 +1,92 @@
+td_data.tab
+nr of tags	20
+sample size	20
+
+Tag distance separated by family size
+	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum	
+TD=1	5	1	1	1	1	0	9	
+TD=6	3	0	0	0	0	0	3	
+TD=7	4	0	0	0	1	0	5	
+TD=8	2	0	0	1	0	0	3	
+sum	14	1	1	2	2	0	20	
+
+Family size distribution separated by Tag distance
+	TD=1	TD=2	TD=3	TD=4	TD=5-8	TD>8	sum	
+FS=1	5	0	0	0	9	0	14	
+FS=2	1	0	0	0	0	0	1	
+FS=3	1	0	0	0	0	0	1	
+FS=4	1	0	0	0	1	0	2	
+FS=6	1	0	0	0	0	0	1	
+FS=7	0	0	0	0	1	0	1	
+sum	9	0	0	0	11	0	20	
+
+
+max. family size in sample:	7
+absolute frequency:	1
+relative frequency:	0.05
+
+Chimera Analysis:
+The tags are split into two halves (part a and b) for which the Tag distances (TD) are calculated separately.
+The tag distance of the first half (part a) is calculated by comparing part a of the tag in the sample against all a parts in the dataset and by selecting the minimum value (TD a.min).
+In the next step, we select those tags that showed the minimum TD and estimate the TD for the second half (part b) of the tag by comparing part b against the previously selected subset.
+The maximum value then represents TD b.max. Finally, this process is repeated, starting with part b instead, to calculate TD b.min and TD a.max.
+Next, the absolute differences between TD a.min & TD b.max and between TD b.min & TD a.max are calculated (delta TD).
+These are then divided by the sum of both parts (TD a.min + TD b.max or TD b.min + TD a.max, respectively), which gives the relative differences between the partial TDs (rel. delta TD).
+For simplicity, we use the maximum value of the relative differences and the respective delta TD.
+Note that when only tags that can form a DCS are included in the analysis, the family sizes for both directions (ab and ba) of the strand will be included in the plots.
+
+length of one half of the tag	12
+
+Tag distance of each half in the tag
+	TD a.min	TD b.max	TD b.min	TD a.max	TD a.min + b.max, TD a.max + b.min	sum	
+TD=0	20	0	8	1	0	29	
+TD=1	0	0	1	19	8	28	
+TD=2	0	0	0	0	1	1	
+TD=5	0	0	3	0	0	3	
+TD=6	0	0	2	0	3	5	
+TD=7	0	1	6	0	4	11	
+TD=8	0	2	0	0	7	9	
+TD=9	0	1	0	0	1	2	
+TD=10	0	2	0	0	2	4	
+TD=11	0	7	0	0	7	14	
+TD=12	0	7	0	0	7	14	
+sum	20	20	20	20	40	120	
+
+Absolute delta Tag distance within the tag
+	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum	
+diff=7	1	0	0	0	0	0	1	
+diff=8	1	0	0	0	1	0	2	
+diff=9	1	0	0	0	0	0	1	
+diff=10	2	0	0	0	0	0	2	
+diff=11	4	0	1	1	1	0	7	
+diff=12	5	1	0	1	0	0	7	
+sum	14	1	1	2	2	0	20	
+
+Chimera analysis: relative delta Tag distance
+	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum	
+diff=1.0	14	1	1	2	2	0	20	
+sum	14	1	1	2	2	0	20	
+
+All tags are filtered, and only those tags in which one half is identical (TD=0), and which therefore have a relative delta TD of 1, are kept.
+These tags are considered chimeras.
+Tag distance of chimeric families separated by FS
+	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum	
+TD=7	1	0	0	0	0	0	1	
+TD=8	1	0	0	0	1	0	2	
+TD=9	1	0	0	0	0	0	1	
+TD=10	2	0	0	0	0	0	2	
+TD=11	4	0	1	1	1	0	7	
+TD=12	5	1	0	1	0	0	7	
+sum	14	1	1	2	2	0	20	
+
+Tag distance of chimeric families separated by DCS and single SSCS (ab, ba)
+	DCS	SSCS ab	SSCS ba	sum	
+TD=7.0	0	0	1	1	
+TD=8.0	0	1	1	2	
+TD=9.0	0	1	0	1	
+TD=10.0	0	1	1	2	
+TD=11.0	0	3	4	7	
+TD=12.0	0	2	5	7	
+sum	0	8	12	20	
+
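The "Chimera Analysis" text in the test output above describes the per-tag calculation only in prose. The following is a minimal Python sketch of those steps, not the tool's actual implementation. It assumes that the tag distance is a Hamming distance, that a tag is compared against every other distinct tag in the dataset, and that a relative difference is set to 0.0 when both partial distances are 0. The names `hamming`, `min_then_max` and `chimera_stats` are hypothetical.

```python
def hamming(s, t):
    """Number of positions at which two equal-length strings differ."""
    return sum(c1 != c2 for c1, c2 in zip(s, t))


def min_then_max(first, second, others):
    """Minimum distance over the first halves, then the maximum distance
    over the second halves of the tags that reached that minimum.

    `others` is a list of (first_half, second_half) pairs.
    """
    d_first = [hamming(first, o_first) for o_first, _ in others]
    d_min = min(d_first)
    closest = [o_second for (_, o_second), d in zip(others, d_first) if d == d_min]
    d_max = max(hamming(second, o_second) for o_second in closest)
    return d_min, d_max


def chimera_stats(tag, dataset):
    """Partial tag distances and their absolute/relative differences for one tag."""
    half = len(tag) // 2
    a, b = tag[:half], tag[half:]
    # Assumption: the tag itself (and exact duplicates of it) is excluded.
    others = [(t[:half], t[half:]) for t in dataset if t != tag]

    # Start with part a, then repeat starting with part b.
    a_min, b_max = min_then_max(a, b, others)
    b_min, a_max = min_then_max(b, a, [(ob, oa) for oa, ob in others])

    candidates = []
    for d_small, d_large in ((a_min, b_max), (b_min, a_max)):
        delta = abs(d_small - d_large)
        total = d_small + d_large
        # Assumption: avoid division by zero when both halves are identical.
        rel = delta / total if total else 0.0
        candidates.append((rel, delta))

    # Keep the maximum relative difference and its corresponding delta.
    rel_delta, delta = max(candidates)
    return {"a_min": a_min, "b_max": b_max, "b_min": b_min, "a_max": a_max,
            "delta": delta, "rel_delta": rel_delta}


if __name__ == "__main__":
    tags = ["AAAACCCC", "AAAAGGGG", "TTTTCCCC"]
    print(chimera_stats("AAAACCCC", tags))
```

In this toy dataset, each half of `AAAACCCC` matches another tag exactly while the opposite half differs at all four positions, so the relative delta TD is 1.0, which is the condition the output uses to flag a tag as a chimera.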
+