Notice: The Name and ID of the sequence will be generated automatically if left blank.
What it does
Takes a reference sequence database (a FastA file, optionally gzip-compressed) as input and produces an index that the MALT tool can use as input. If MALT is to be used for taxonomic and/or functional analysis as well as for alignment, this index builder must also be given mapping files. These mapping files are used to map reference sequences to taxonomic or functional classes, or to locate genes in DNA reference sequences.
Options
- Specify protein alphabet reduction - specify the alphabet reduction to apply when the reference sequences are protein sequences.
- Specify seed settings - specify the settings for controlling how MALT uses its seed-and-extend approach based on “spaced seeds”.
- Shapes - specify the seed shapes used. For DNA sequences, the default seed shape is: 111110111011110110111111. For protein sequences, by default MALT uses the following four shapes: 111101101110111, 1111000101011001111, 11101001001000100101111 and 11101001000010100010100111.
- Maximum hits per seed - specify the maximum number of hits per seed. MALT uses this value to calculate a maximum number of hits per hash value.
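
To illustrate what a seed shape means, the sketch below applies a shape to a sequence in the usual spaced-seed sense: positions marked 1 must match and are kept, positions marked 0 are ignored. It is a minimal illustration of the general technique, not MALT's actual implementation; the function names (apply_shape, seed_weight, spaced_seeds) are hypothetical and chosen only for this example.

```python
# Default DNA seed shape quoted in the Shapes option above.
DNA_SHAPE = "111110111011110110111111"


def seed_weight(shape):
    """Number of positions that must match (the '1' characters)."""
    return shape.count("1")


def apply_shape(sequence, shape, start=0):
    """Return the characters of `sequence` at the '1' positions of `shape`,
    reading a window of len(shape) characters beginning at `start`."""
    window = sequence[start:start + len(shape)]
    if len(window) < len(shape):
        raise ValueError("sequence too short for this shape at this offset")
    return "".join(c for c, s in zip(window, shape) if s == "1")


def spaced_seeds(sequence, shape):
    """Yield (offset, seed key) for every window of the sequence."""
    for start in range(len(sequence) - len(shape) + 1):
        yield start, apply_shape(sequence, shape, start)


if __name__ == "__main__":
    print(len(DNA_SHAPE), seed_weight(DNA_SHAPE))
    # A toy shape: the third position is a "don't care" position.
    for offset, key in spaced_seeds("ACGTAC", "1101"):
        print(offset, key)
```

Because the 0 positions are skipped, two windows that differ only at those positions produce the same seed key, which is what lets a spaced seed tolerate mismatches there. A real index would hash these keys rather than store them as strings.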