Galaxy | Tool Preview

LASTZ (version 1.04.22+galaxy0)
If your TARGET is in your history, choose the 'from your history' option
If your genome of interest is not listed, contact the Galaxy team
These are the sequences that you are aligning against TARGET
It is highly recommended to use lastz_32 instead of lastz if the reference genome size is greater than 2G

What it does

LASTZ is designed to preprocess one sequence or set of sequences (which we collectively call the TARGET) and then align several QUERY sequences to it. It was developed by Bob Harris in the lab of Webb Miller at Penn State.

Read the documentation before proceeding. LASTZ is a complex tool with many parameter options. Fortunately, its author maintains a thorough manual. The default parameters may be sufficient to get an initial idea of how similar your sequences are, but producing reliable alignments may require tuning them. Read the manual.

The Galaxy version of LASTZ sets --ambiguous=iupac by default (see the Scoring section). This prevents LASTZ from exiting with an error if one of the DNA inputs contains "non-standard" nucleotides.

About LASTZ parameters

Galaxy's version of LASTZ has nine parameter sections (Where to look, Scoring, Seeding, HSPs, Chaining, Gapped extension, Filtering, Interpolation, and Output). These sections closely follow the parameter descriptions in the manual.

Defaults

Here are the defaults for some of the most important parameters:

--seed=<pattern>       set seed pattern (12of19, 14of22, or general pattern)
                       (default is 1110100110010101111)
                       SEE "Seeding" SECTION -> "Select seed type"

--[no]transition       allow (or don't) one transition in a seed hit
                       (by default a transition is allowed)
                       SEE "Seeding" SECTION -> "Allow transitions"

--[no]chain            perform chaining
                       (by default no chaining is performed)
                       SEE "Chaining" SECTION

--[no]gapped           perform gapped alignment (instead of gap-free)
                       (by default gapped alignment is performed)
                       SEE "Gapped extension" SECTION

--strand=both          search both strands
--strand=plus          search + strand only (matching strand of query spec)
                       (by default both strands are searched)
                       SEE "Where to look" SECTION

--scores=<file>        read substitution and gap scores from a file
                       SEE "Scoring" SECTION

--xdrop=<score>        set x-drop threshold (default is 10*sub[A][A])
                       SEE "HSPs" SECTION

--ydrop=<score>        set y-drop threshold (default is open+300*extend)
                       SEE "Gapped extension" SECTION

--hspthresh=<score>    set threshold for high scoring pairs (default is 3000)
                       ungapped extensions scoring lower are discarded
                       <score> can also be a percentage or base count
                       SEE "HSPs" SECTION

--gappedthresh=<score> set threshold for gapped alignments
                       gapped extensions scoring lower are discarded
                       <score> can also be a percentage or base count
                       (default is to use same value as --hspthresh)
                       SEE "Gapped extension" SECTION
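
As a rough illustration of how the options above combine on a command line, the sketch below assembles a LASTZ invocation as a Python argument list. The helper name build_lastz_command and the file names target.fa and query.fa are hypothetical. The flags are the ones listed above plus --ambiguous=iupac, each set to the default stated in this help text, so the result should correspond to a plain default Galaxy-style run. Nothing is executed here.

```python
import shlex


def build_lastz_command(target, query, executable="lastz", strand="both",
                        transition=True, chain=False, gapped=True,
                        hspthresh=3000, ambiguous_iupac=True):
    """Return a LASTZ argument list (hypothetical helper, not part of LASTZ).

    target and query are paths to sequence files. Pass executable="lastz_32"
    for large reference genomes, as recommended in this help text.
    """
    if strand not in ("both", "plus"):
        raise ValueError("strand must be 'both' or 'plus'")

    cmd = [executable, target, query, f"--strand={strand}"]
    # Each --[no]option flag is written out explicitly for readability.
    cmd.append("--transition" if transition else "--notransition")
    cmd.append("--chain" if chain else "--nochain")
    cmd.append("--gapped" if gapped else "--nogapped")
    cmd.append(f"--hspthresh={hspthresh}")
    if ambiguous_iupac:
        # Galaxy's default, so non-standard nucleotides do not cause an error.
        cmd.append("--ambiguous=iupac")
    return cmd


# Example with placeholder file names; prints a shell-quoted command line.
cmd = build_lastz_command("target.fa", "query.fa")
print(shlex.join(cmd))
```

The list form can be passed to subprocess.run as-is if you run LASTZ outside Galaxy; inside Galaxy the tool form builds the command for you.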

Substitution matrix

By default the HOXD70 substitution scores are used (from Chiaromonte et al. 2002):

bad_score          = X:-1000  # used for sub['X'][*] and sub[*]['X']
fill_score         = -100     # used when sub[*][*] is not defined
gap_open_penalty   =  400
gap_extend_penalty =   30

     A     C     G     T
A   91  -114   -31  -123
C -114   100  -125   -31
G  -31  -125   100  -114
T -123   -31  -114    91

A matrix can be supplied as input to the "Read the substitution scores" parameter in the Scoring section. A substitution matrix can also be inferred from your data using another LASTZ-based tool (LASTZ_D: Infer substitution scores).
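
To make the numbers concrete, here is a small sketch that derives the default x-drop and y-drop thresholds from the HOXD70 values above, reading the listed defaults as 10 * sub[A][A] and open + 300 * extend. It also scores a fixed gap-free segment against the default --hspthresh of 3000. The function names are hypothetical, and ungapped_score is a deliberate simplification: it only sums substitution scores over an already chosen segment, handles only A, C, G and T, and does not reproduce LASTZ's actual extension procedure.

```python
# HOXD70 substitution scores and gap penalties, copied from the table above.
HOXD70 = {
    "A": {"A": 91, "C": -114, "G": -31, "T": -123},
    "C": {"A": -114, "C": 100, "G": -125, "T": -31},
    "G": {"A": -31, "C": -125, "G": 100, "T": -114},
    "T": {"A": -123, "C": -31, "G": -114, "T": 91},
}
GAP_OPEN = 400
GAP_EXTEND = 30
HSP_THRESHOLD = 3000  # default --hspthresh


def default_xdrop(sub):
    """Default x-drop threshold, read as 10 * sub[A][A]."""
    return 10 * sub["A"]["A"]


def default_ydrop(gap_open, gap_extend):
    """Default y-drop threshold, read as open + 300 * extend."""
    return gap_open + 300 * gap_extend


def ungapped_score(target_seg, query_seg, sub):
    """Sum substitution scores over two equal-length, gap-free segments."""
    if len(target_seg) != len(query_seg):
        raise ValueError("gap-free segments must have equal length")
    return sum(sub[t][q] for t, q in zip(target_seg.upper(), query_seg.upper()))


print(default_xdrop(HOXD70))                  # 910
print(default_ydrop(GAP_OPEN, GAP_EXTEND))    # 9400

# A perfect 40-base match clears the HSP threshold; a 20-base one does not.
long_match = "ACGT" * 10
short_match = "ACGT" * 5
print(ungapped_score(long_match, long_match, HOXD70) >= HSP_THRESHOLD)    # True
print(ungapped_score(short_match, short_match, HOXD70) >= HSP_THRESHOLD)  # False
```

Under these scores, identical segments need roughly 30 or more matching bases to reach 3000, which gives a feel for how short a match the default threshold will discard.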

Output

This version of LASTZ produces one output by default: a BAM alignment file. Other formats, as well as a dot plot, can be configured in the Output section. This version of LASTZ writes its outputs without a comment line starting with '#'. To learn what each column contains, consult the formats section of the LASTZ manual.