diff test-data/output_file2.tabular @ 19:2e9f7ea7ae93 draft
planemo upload for repository https://github.com/monikaheinzl/duplexanalysis_galaxy/tree/master/tools/hd commit dfaab79252a858e8df16bbea3607ebf1b6962e5a-dirty
author | mheinzl |
---|---|
date | Mon, 08 Oct 2018 05:56:04 -0400 |
parents | |
children |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/output_file2.tabular Mon Oct 08 05:56:04 2018 -0400
@@ -0,0 +1,97 @@
+Test_data2
+number of tags per file 20 (from 20) against 20
+
+Hamming distance separated by family size
+          FS=1  FS=2  FS=3  FS=4  FS=5-10  FS>10  sum
+HD=1      2     0     0     0     1        0      3
+HD=6      0     0     0     1     0        1      2
+HD=7      2     0     1     1     2        1      7
+HD=8      1     0     1     0     2        1      5
+HD=9      1     0     0     0     0        1      2
+HD=10     1     0     0     0     0        0      1
+sum       7     0     2     2     5        4      20
+
+Family size distribution separated by Hamming distance
+          HD=1  HD=2  HD=3  HD=4  HD=5-8  HD>8  sum
+FS=1      2     0     0     0     3       2     7
+FS=3      0     0     0     0     2       0     2
+FS=4      0     0     0     0     2       0     2
+FS=5      0     0     0     0     1       0     1
+FS=6      0     0     0     0     1       0     1
+FS=7      1     0     0     0     0       0     1
+FS=8      0     0     0     0     1       0     1
+FS=9      0     0     0     0     1       0     1
+FS=12     0     0     0     0     2       0     2
+FS=13     0     0     0     0     1       1     2
+sum       3     0     0     0     14      3     20
+
+
+max. family size: 13
+absolute frequency: 2
+relative frequency: 0.1
+
+The hamming distances were calculated by comparing each half of all tags against the tag(s) with the minimum Hamming distance per half.
+It is possible that one tag can have the minimum HD from multiple tags, so the sample size in this calculation differs from the sample size entered by the user.
+actual number of tags with min HD = 79 (sample size by user = 20)
+length of one part of the tag = 12
+
+Hamming distance of each half in the tag
+          HD a  HD b'  HD b  HD a'  HD a+b  sum
+HD=0      20    0      0     5      0       25
+HD=1      22    4      4     3      8       41
+HD=2      9     2      0     9      2       22
+HD=3      0     0      0     10     0       10
+HD=4      0     0      2     1      0       3
+HD=5      0     0      5     0      0       5
+HD=6      0     5      7     0      3       15
+HD=7      0     7      10    0      10      27
+HD=8      0     6      0     0      10      16
+HD=9      0     7      0     0      17      24
+HD=10     0     11     0     0      13      24
+HD=11     0     8      0     0      7       15
+HD=12     0     1      0     0      5       6
+HD=13     0     0      0     0      4       4
+sum       51    51     28    28     79      237
+
+Absolute delta Hamming distances within the tag
+          FS=1  FS=2  FS=3  FS=4  FS=5-10  FS>10  sum
+diff=1    5     0     0     1     5        0      11
+diff=2    4     0     0     0     0        0      4
+diff=3    1     0     2     1     1        0      5
+diff=4    1     0     1     0     2        1      5
+diff=5    2     0     0     0     4        6      12
+diff=6    1     0     0     1     1        7      10
+diff=7    2     0     1     0     0        0      3
+diff=8    0     0     1     0     1        3      5
+diff=9    6     0     0     1     3        4      14
+diff=10   4     0     0     0     3        2      9
+diff=11   0     0     0     0     0        1      1
+sum       26    0     5     4     20       24     79
+
+Chimera analysis: relative delta Hamming distances
+          FS=1  FS=2  FS=3  FS=4  FS=5-10  FS>10  sum
+diff=0.1  1     0     0     1     1        0      3
+diff=0.3  3     0     2     0     0        0      5
+diff=0.4  1     0     0     1     3        0      5
+diff=0.5  0     0     1     0     0        1      2
+diff=0.6  1     0     0     0     3        7      11
+diff=0.7  1     0     0     0     1        5      7
+diff=0.8  10    0     0     0     2        9      21
+diff=1.0  9     0     2     2     10       2      25
+sum       26    0     5     4     20       24     79
+
+Chimeras:
+All tags were filtered: only those tags where at least one half is identical with the half of the min. tag are kept.
+So the hamming distance of the non-identical half is compared.
+Hamming distances of non-zero half
+          FS=1  FS=2  FS=3  FS=4  FS=5-10  FS>10  sum
+HD=1      4     0     0     0     4        0      8
+HD=2      2     0     0     0     0        0      2
+HD=6      0     0     0     1     0        2      3
+HD=7      1     0     1     0     0        0      2
+HD=8      0     0     1     0     1        0      2
+HD=9      1     0     0     1     2        0      4
+HD=10     1     0     0     0     3        0      4
+sum       9     0     2     2     10       2      25
+
+
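
The notes in the output file describe comparing each half of a tag separately and then looking at the absolute and relative difference between the two per-half Hamming distances. The following is a minimal Python sketch of that idea, not the tool's actual code. The function names (`hamming`, `half_distances`, `delta`) and the example tags are hypothetical, and the relative delta is assumed to be the absolute difference divided by the sum of the two per-half distances; the half length of 12 is taken from the output above.

```python
def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def half_distances(tag: str, other: str, half_len: int = 12) -> tuple:
    """Hamming distance of the first and second half of two tags.

    half_len=12 mirrors 'length of one part of the tag = 12' in the output.
    """
    hd_a = hamming(tag[:half_len], other[:half_len])
    hd_b = hamming(tag[half_len:], other[half_len:])
    return hd_a, hd_b


def delta(hd_a: int, hd_b: int) -> tuple:
    """Absolute and relative difference between the two per-half distances.

    Assumption: relative delta = |hd_a - hd_b| / (hd_a + hd_b),
    defined as 0.0 when both halves are identical.
    """
    absolute = abs(hd_a - hd_b)
    total = hd_a + hd_b
    relative = absolute / total if total else 0.0
    return absolute, relative


# Hypothetical 24-nt tags: first half identical, second half differs in 6 positions.
tag = "AAAAAAAAAAAA" + "CCCCCCCCCCCC"
other = "AAAAAAAAAAAA" + "CCCCCCGGGGGG"

hd_a, hd_b = half_distances(tag, other)
abs_delta, rel_delta = delta(hd_a, hd_b)
print(hd_a, hd_b, abs_delta, rel_delta)
```

Under this assumed definition, a pair with one identical half always gets a relative delta of 1.0, which is consistent with the diff=1.0 row and the "non-zero half" table both summing to 25 in the output above.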