diff test-data/hd_output.tab @ 25:9e384b0741f1 draft

planemo upload for repository https://github.com/monikaheinzl/duplexanalysis_galaxy/tree/master/tools/hd commit b8a2f7b7615b2bcd3b602027af31f4e677da94f6-dirty
author mheinzl
date Tue, 14 May 2019 03:29:37 -0400
parents
children 6b15b3b6405c
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/hd_output.tab	Tue May 14 03:29:37 2019 -0400
@@ -0,0 +1,77 @@
+hd_data.tab
+number of tags per file	20 (from 20) against 20
+
+Hamming distance separated by family size
+	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum	
+HD=1	5	1	1	1	1	0	9	
+HD=6	3	0	0	0	0	0	3	
+HD=7	4	0	0	0	1	0	5	
+HD=8	2	0	0	1	0	0	3	
+sum	14	1	1	2	2	0	20	
+
+Family size distribution separated by Hamming distance
+	HD=1	HD=2	HD=3	HD=4	HD=5-8	HD>8	sum	
+FS=1	5	0	0	0	9	0	14	
+FS=2	1	0	0	0	0	0	1	
+FS=3	1	0	0	0	0	0	1	
+FS=4	1	0	0	0	1	0	2	
+FS=6	1	0	0	0	0	0	1	
+FS=7	0	0	0	0	1	0	1	
+sum	9	0	0	0	11	0	20	
+
+
+max. family size in sample:	7
+absolute frequency:	1
+relative frequency:	0.05
+
+The Hamming distances were calculated by comparing the first half of each tag against all halves and selecting the minimum value (HD a).
+The second half of each tag was then compared against all tags that yielded the minimum HD in the previous step, and the maximum value was selected (HD b').
+Finally, it was possible to calculate the absolute and relative differences between the HDs (absolute and relative delta HD).
+These calculations were repeated, but starting with the second half in the first step, to find all possible chimeras in the data (HD b and HD a'). For simplicity, the maximum of the delta values was used in the end.
+When only tags that can form a DCS are allowed in the analysis, family sizes for both the forward and reverse (ab and ba) are included in the plots.
+length of one part of the tag = 12
+
+Hamming distance of each half in the tag
+	HD a	HD b'	HD b	HD a'	HD a+b	sum	
+HD=0	20	0	8	1	0	29	
+HD=1	0	0	1	19	8	28	
+HD=2	0	0	0	0	1	1	
+HD=5	0	0	3	0	0	3	
+HD=6	0	0	2	0	3	5	
+HD=7	0	1	6	0	4	11	
+HD=8	0	2	0	0	7	9	
+HD=9	0	1	0	0	1	2	
+HD=10	0	2	0	0	2	4	
+HD=11	0	7	0	0	7	14	
+HD=12	0	7	0	0	7	14	
+sum	20	20	20	20	40	120	
+
+Absolute delta Hamming distances within the tag
+	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum	
+diff=7	1	0	0	0	0	0	1	
+diff=8	1	0	0	0	1	0	2	
+diff=9	1	0	0	0	0	0	1	
+diff=10	2	0	0	0	0	0	2	
+diff=11	4	0	1	1	1	0	7	
+diff=12	5	1	0	1	0	0	7	
+sum	14	1	1	2	2	0	20	
+
+Chimera analysis: relative delta Hamming distances
+	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum	
+diff=1.0	14	1	1	2	2	0	20	
+sum	14	1	1	2	2	0	20	
+
+Chimeras:
+All tags were filtered: only tags in which at least one half was identical (HD=0), and which therefore had a relative delta of 1, were kept. These tags are considered chimeric.
+The Hamming distances of these chimeric tags are shown.
+Hamming distances of chimeras
+	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum	
+HD=7	1	0	0	0	0	0	1	
+HD=8	1	0	0	0	1	0	2	
+HD=9	1	0	0	0	0	0	1	
+HD=10	2	0	0	0	0	0	2	
+HD=11	4	0	1	1	1	0	7	
+HD=12	5	1	0	1	0	0	7	
+sum	14	1	1	2	2	0	20	
+
+
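The explanatory text in this test file describes the per-half Hamming distance calculation and the chimera filter only in prose. The following is a minimal Python sketch of that procedure, not the tool's actual code. It assumes that each half is compared against the same half of every other tag, that the relative delta is the absolute delta divided by the sum of the two HDs (taken as 0 when both are 0), and that the maximum is taken separately for the absolute and relative values. The names `hamming`, `half_distances`, `delta_hd`, and `is_chimeric` are hypothetical.

```python
def hamming(s, t):
    """Number of positions at which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("strings must have equal length")
    return sum(c1 != c2 for c1, c2 in zip(s, t))


def half_distances(tags, index, first_half=True):
    """Return (minimum HD of the starting half, maximum HD of the other half).

    With first_half=True this corresponds to (HD a, HD b'),
    with first_half=False to (HD b, HD a').
    """
    mid = len(tags[index]) // 2

    def split(tag):
        a, b = tag[:mid], tag[mid:]
        return (a, b) if first_half else (b, a)

    start, other = split(tags[index])
    # Assumption: a tag is never compared against itself.
    candidates = [split(t) for i, t in enumerate(tags) if i != index]
    dists = [hamming(start, s) for s, _ in candidates]
    hd_min = min(dists)
    # Among the tags that gave the minimum, take the largest HD of the other half.
    hd_max = max(
        hamming(other, o) for (_, o), d in zip(candidates, dists) if d == hd_min
    )
    return hd_min, hd_max


def delta_hd(tags, index):
    """Return (absolute delta HD, relative delta HD) for one tag."""
    results = []
    for first_half in (True, False):
        hd_min, hd_max = half_distances(tags, index, first_half)
        absolute = abs(hd_max - hd_min)
        total = hd_min + hd_max
        # Assumption: relative delta is defined as 0 when both HDs are 0.
        relative = absolute / total if total else 0.0
        results.append((absolute, relative))
    # Assumption: the maximum over both directions is taken per value.
    return max(a for a, _ in results), max(r for _, r in results)


def is_chimeric(tags, index):
    """A tag counts as chimeric when its relative delta HD equals 1."""
    return delta_hd(tags, index)[1] == 1.0


if __name__ == "__main__":
    example = ["AAAACCCC", "AAAAGGGG", "TTTTCCCC"]  # toy tags, half length 4
    for i, tag in enumerate(example):
        print(tag, delta_hd(example, i), is_chimeric(example, i))
```

In the toy input, the first tag shares its first half with the second tag and its second half with the third, so one half has HD=0 while the other differs completely, which gives a relative delta of 1.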