GECKO (GEnome Comparison with K-mers Out-of-core) is a fast, modular application designed to identify collections of high-scoring segment pairs (HSPs) in pairwise genome comparisons. By employing novel filtering and data storage strategies, it can compare genome-sized sequences in reduced time.
Manual
To use GECKO, upload two .fasta files and select one as Sequence X and the other as Sequence Y.
Then choose the parameters that best suit your comparison:
- Minimum length: The minimum length, in nucleotides, for an HSP (similarity fragment) to be kept. Any HSP shorter than this is filtered out of the comparison. Around 40 bp is recommended for small genomes (e.g. Mycoplasma bacteria or E. coli) and around 100 bp or more for larger sequences (e.g. human chromosomes).
- Minimum similarity: Analogous to the minimum length, but the threshold is applied to similarity instead of length. The similarity is calculated as the score attained by an HSP divided by the maximum possible score. Use values above 50 to filter out noise.
- Word length: The seed size used to find HSPs. A smaller seed increases sensitivity but slows the comparison, whereas a larger seed decreases sensitivity but speeds it up. Recommended values are 12 or 16 for smaller genomes and 32 for larger ones. The value must be a multiple of 4.
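
To make the three parameters concrete, the following Python sketch shows how the length and similarity thresholds could be applied to a list of HSPs, and how the word length constraint could be checked. It is an illustration only, not GECKO's actual code: the `HSP` class, the function names, and the per-base match score of 4 are all hypothetical, and the similarity is assumed to be expressed as a percentage (which is what a threshold of 50 suggests).

```python
from dataclasses import dataclass
from typing import List

# Assumed reward per matching base; a hypothetical value, not taken from GECKO.
MATCH_SCORE = 4


@dataclass
class HSP:
    """Hypothetical minimal record for a similarity fragment."""
    length: int  # fragment length in nucleotides
    score: int   # score attained by the fragment


def similarity(hsp: HSP) -> float:
    """Score divided by the maximum possible score, as a percentage (assumed)."""
    max_score = hsp.length * MATCH_SCORE  # every base matching
    return 100.0 * hsp.score / max_score


def validate_word_length(word_length: int) -> None:
    """Reject seed sizes that are not positive multiples of 4."""
    if word_length <= 0 or word_length % 4 != 0:
        raise ValueError(f"word length must be a positive multiple of 4, got {word_length}")


def filter_hsps(hsps: List[HSP], min_length: int, min_similarity: float) -> List[HSP]:
    """Keep HSPs that reach both the length and the similarity threshold."""
    return [
        h for h in hsps
        if h.length >= min_length and similarity(h) >= min_similarity
    ]


if __name__ == "__main__":
    validate_word_length(12)  # accepted: 12 is a multiple of 4
    candidates = [HSP(30, 120), HSP(120, 400), HSP(150, 200)]
    # The first HSP is too short; the third has a similarity of about 33.
    print(filter_hsps(candidates, min_length=40, min_similarity=50))
```

With a minimum length of 40 and a minimum similarity of 50, only the second fragment survives: the first is shorter than 40 bp and the third scores 200 out of a possible 600. Whether a fragment exactly at a threshold is kept is a choice made here for the sketch (it is kept).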