What it does
This tool generates all possible combinations of observed STR length profiles for consecutive alleles from a given error profile. The range of observed read lengths can be restricted to those that occur frequently by using the "Minimum error rate to be considered" parameter.
The program collects the lists of valid observed lengths (those that pass the "Minimum error rate to be considered" threshold) from each combination of consecutive allele lengths. Lists that are equivalent to, or a subset of, another list are removed. For each depth and each list, length profiles are generated as combinations with replacement, in a way that is compatible with Python 2.7. Redundant profiles may be generated from different lists when more than one allele combination is produced and their ranges of observed microsatellite lengths overlap. The user needs to remove these duplicates, which is easily done with the sort | uniq command in Unix.
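For users who prefer to remove the redundant profiles in Python rather than with sort | uniq, the following is a minimal sketch. The function name unique_lines is hypothetical and is not part of the tool. Unlike sort | uniq, it keeps the lines in their original order.

```python
def unique_lines(lines):
    """Return the lines with duplicates removed, keeping first-seen order."""
    seen = set()
    result = []
    for line in lines:
        key = line.rstrip("\n")  # ignore trailing newlines when comparing
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


# Hypothetical rows; the tab separator is an assumption about the output format.
rows = ["chr\t9,10\tA", "chr\t10,10\tA", "chr\t9,10\tA"]
print(unique_lines(rows))
```

Because a set is used for the membership check, the cost stays roughly linear in the number of lines.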
Citation
When you use this tool, please cite Fungtammasan A, Ananda G, Hile SE, Su MS, Sun C, Harris R, Medvedev P, Eckert K, Makova KD. 2015. Accurate Typing of Short Tandem Repeats from Genome-wide Sequencing Data and its Applications. Genome Research.
Input
Output
Example
Suppose that we provide the following STR length profile
true  obs.  reads
9     9     100000
10    10    91456
10    9     1259
11    11    39657
11    10    1211
11    12    514
Using the default minimum probability (fraction of reads) of 0.00000001 and motif = A, all observed STR lengths are valid. The program will generate lists of observed lengths from consecutive allele lengths:
9:10 = [9,10]
10:11 = [9,10,11,12]
Lists that are subsets of other lists will be removed. In this example, [9,10] will not be considered.
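The list-building and subset-removal steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's actual code: the function names are hypothetical, the fraction of reads is assumed to be computed per true allele length, and "consecutive" is assumed to mean adjacent true lengths in sorted order.

```python
def observed_lengths(profile, min_rate=0.00000001):
    """Map each true length to the set of observed lengths passing min_rate.

    profile is a list of (true_length, observed_length, reads) tuples.
    Assumption: the rate is reads divided by total reads for that true length.
    """
    totals = {}
    for true_len, _obs_len, reads in profile:
        totals[true_len] = totals.get(true_len, 0) + reads
    valid = {}
    for true_len, obs_len, reads in profile:
        if reads / float(totals[true_len]) >= min_rate:
            valid.setdefault(true_len, set()).add(obs_len)
    return valid


def consecutive_lists(valid):
    """Union the observed lengths of each pair of adjacent true lengths."""
    alleles = sorted(valid)
    return [valid[a] | valid[b] for a, b in zip(alleles, alleles[1:])]


def drop_subsets(lists):
    """Remove lists that equal, or are a proper subset of, another list."""
    unique = []
    for s in lists:
        if s not in unique:  # drop equivalent lists
            unique.append(s)
    return [sorted(s) for s in unique
            if not any(s < other for other in unique)]  # s < other: proper subset


profile = [(9, 9, 100000), (10, 10, 91456), (10, 9, 1259),
           (11, 11, 39657), (11, 10, 1211), (11, 12, 514)]
print(drop_subsets(consecutive_lists(observed_lengths(profile))))
# prints [[9, 10, 11, 12]]
```

With the example profile, [9,10] is a proper subset of [9,10,11,12], so only the larger list remains.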
The program will then generate all combinations with replacement for each depth from each list. With maximum read depth levels = 3, we get the following output.
chr 9,9 A
chr 9,10 A
chr 9,11 A
chr 9,12 A
chr 10,10 A
chr 10,11 A
chr 10,12 A
chr 11,11 A
chr 11,12 A
chr 12,12 A
chr 9,9,9 A
chr 9,9,10 A
chr 9,9,11 A
chr 9,9,12 A
chr 9,10,10 A
chr 9,10,11 A
chr 9,10,12 A
chr 9,11,11 A
chr 9,11,12 A
chr 9,12,12 A
chr 10,10,10 A
chr 10,10,11 A
chr 10,10,12 A
chr 10,11,11 A
chr 10,11,12 A
chr 10,12,12 A
chr 11,11,11 A
chr 11,11,12 A
chr 11,12,12 A
chr 12,12,12 A
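The combination step can be reproduced with itertools.combinations_with_replacement, which is available from Python 2.7 onward. The sketch below is illustrative only: the function name length_profiles and its parameters are hypothetical, the tab separator is an assumption, and the starting depth of 2 is inferred from the example output, which contains no single-read profiles.

```python
from itertools import combinations_with_replacement


def length_profiles(lengths, max_depth, chrom="chr", motif="A", min_depth=2):
    """Build one output row per combination with replacement, for each depth."""
    rows = []
    for depth in range(min_depth, max_depth + 1):
        for combo in combinations_with_replacement(sorted(lengths), depth):
            profile = ",".join(str(length) for length in combo)
            rows.append("\t".join([chrom, profile, motif]))
    return rows


rows = length_profiles([9, 10, 11, 12], max_depth=3)
print(len(rows))  # 10 rows at depth 2 plus 20 rows at depth 3
print(rows[0])
print(rows[-1])
```

For a list of n lengths and depth k, the number of rows is C(n + k - 1, k), so the output grows quickly as the maximum depth increases.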