annotate lastz_d.xml @ 5:bd84ff27bc16 draft

planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/lastz commit a0a0480a8df511d23ed6101a489ca06337f5ed56
author devteam
date Mon, 26 Feb 2018 15:37:53 -0500
parents
children b6d7308c3728
<tool id="lastz_d_wrapper" name="LASTZ_D" version="1.3">
    <description>: estimate substitution scores matrix</description>
    <macros>
        <import>lastz_macros.xml</import>
    </macros>
    <requirements>
        <requirement type="package" version="@LASTZ_CONDA_VERSION@">lastz</requirement>
    </requirements>
    <command detect_errors="exit_code"><![CDATA[
        lastz_D
            @TARGET_INPUT_COMMAND_LINE@
            '${query}'
            #if $score_file:
                '--inferonly=${score_file}'
            #else:
                --inferonly
            #end if
            '--infscores=${output}'
    ]]>
    </command>
    <inputs>
        <expand macro="target_input"/>
        <param name="query" format="fasta,fasta.gz,fastq.gz" type="data" label="Select QUERY sequence(s)" help="These are the sequences that you are aligning against TARGET"/>
        <param name="score_file" type="data" format="txt" optional="true" label="Control file for inference" argument="--inferonly[=control_file]" help="Optional control file. If nothing is selected, LASTZ_D uses the defaults described in the manual"/>
    </inputs>
    <outputs>
        <data format="txt" name="output" label="${tool.name} on ${on_string}: substitution matrix"/>
    </outputs>
    <tests>
        <test>
            <param name="ref_source" value="history" />
            <param name="target" value="chrM_human.fa" />
            <param name="query" value="chrM_mouse.fa" />
            <output name="output" value="lastz_d_test1.out" />
        </test>
        <test>
            <param name="ref_source" value="history" />
            <param name="target" value="chrM_human.fa" />
            <param name="query" value="chrM_mouse.fa" />
            <param name="score_file" value="lastz_d_ctrl_file.txt" />
            <output name="output" value="lastz_d_test2.out" />
        </test>
    </tests>

    <help><![CDATA[

**What it does**

LASTZ_D is a non-integer (**D** stands for Double) version of LASTZ that can be used to estimate the substitution matrix used to score alignments. It was developed by `Bob Harris <http://www.bx.psu.edu/~rsharris/>`_ in the lab of Webb Miller at Penn State as a part of LASTZ. The matrix computed by this tool is intended to be used by LASTZ (see below).

.. class:: warningmark

**Read the documentation** before proceeding. LASTZ is a complex tool with many parameter options. Fortunately, its author maintains a `great manual <https://lastz.github.io/lastz/>`_. The two sections most relevant to inferring a substitution matrix are `Inferring Score Sets <http://www.bx.psu.edu/~rsharris/lastz/README.lastz-1.04.00.html#adv_inference>`_ and `Inference Control File <http://www.bx.psu.edu/~rsharris/lastz/README.lastz-1.04.00.html#fmt_inference>`_.
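
For orientation, the command line that this wrapper assembles can be sketched in Python as follows. This is a rough illustration, not part of the tool: the function and parameter names are hypothetical, and it assumes that the target input expands to a plain file path. Only the ``--inferonly`` and ``--infscores`` options are taken from the wrapper itself.

```python
import shlex


def build_lastz_d_command(target, query, output, control_file=None):
    """Assemble a lastz_D argument list for score inference.

    Hypothetical helper mirroring the wrapper's command template:
    the control file is optional, and the inferred scores are always
    written to `output`.
    """
    args = ["lastz_D", target, query]
    if control_file:
        # A control file was supplied: pass it to --inferonly.
        args.append(f"--inferonly={control_file}")
    else:
        # No control file: rely on the defaults described in the manual.
        args.append("--inferonly")
    args.append(f"--infscores={output}")
    return args


if __name__ == "__main__":
    # File names here are placeholders for illustration only.
    cmd = build_lastz_d_command("chrM_human.fa", "chrM_mouse.fa", "scores.txt")
    print(shlex.join(cmd))
```

Building an argument list rather than a single string avoids quoting problems when file names contain spaces.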

**Notes on the inference**

Inference is achieved by computing the probability of each of the 18 different alignment events (gap open, gap extend, and 16 substitutions). These probabilities are estimated from alignments of the sequences. Of course, at first there are no alignments, so the process begins by using a generic scoring set to create alignments, infers scores from those, realigns, and so on, until the scores stabilize or "converge". Ungapped alignments are performed until the substitution scores converge, then gapped alignments are performed (holding the substitution scores constant) until the gap penalties converge. In the end you get a matrix like this::

    # (a LASTZ scoring set, created by "LASTZ --infer")

    bad_score          = X:-1781  # used for sub[X][*] and sub[*][X]
    fill_score         = -178     # used when sub[*][*] not otherwise defined
    gap_open_penalty   = 400
    gap_extend_penalty = 30

          A     C     G     T
    A    72   -79   -49   -97
    C   -79   100  -178   -49
    G   -49  -178   100   -79
    T   -97   -49   -79    72

This dataset can then be used as an input to the **Read the substitution scores** parameter of LASTZ (Parameter section *Scoring*).
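
If you want to inspect the inferred scoring set programmatically before passing it to LASTZ, a minimal Python sketch such as the one below may help. It assumes the layout of the example above (``name = value`` lines, ``#`` comments, then a header row followed by one row per base); the function name is hypothetical, and real files may contain elements this sketch does not handle.

```python
def parse_score_set(text):
    """Parse a scoring set laid out like the example into (settings, matrix).

    settings maps names such as "gap_open_penalty" to their raw string
    values; matrix maps (row_base, column_base) to a numeric score.
    """
    settings, matrix, columns = {}, {}, None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        if "=" in line:
            name, value = (part.strip() for part in line.split("=", 1))
            settings[name] = value
        elif columns is None:
            columns = line.split()  # header row: column bases
        else:
            row, *scores = line.split()
            for col, score in zip(columns, scores):
                # float() accepts both integer and non-integer scores
                matrix[(row, col)] = float(score)
    return settings, matrix


if __name__ == "__main__":
    sample = """
    gap_open_penalty = 400
    gap_extend_penalty = 30
      A    C    G    T
    A 72  -79  -49  -97
    C -79  100 -178  -49
    """
    settings, matrix = parse_score_set(sample)
    print(settings["gap_open_penalty"], matrix[("C", "G")])
```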

The iterative process can fail if there is too little sequence to align. For example, if after the 4th iteration nothing remains in the central 50%, the denominators go to zero and the process fails.
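
To see why empty counts are fatal, consider a generic log-odds estimate of substitution scores from aligned-pair counts. The Python sketch below is a textbook-style illustration only: it is not LASTZ's actual procedure, and the function name, the ``scale`` parameter, and the input format are all assumptions made for the example.

```python
import math


def log_odds_scores(pair_counts, scale=100.0):
    """Toy log-odds substitution scores from aligned-pair counts.

    pair_counts maps (target_base, query_base) to the number of aligned
    columns with that pair. Raises ValueError when there are no counts,
    because every frequency estimate would divide by zero.
    """
    total = sum(pair_counts.values())
    if total == 0:
        raise ValueError("no aligned columns: cannot estimate probabilities")
    row_totals, col_totals = {}, {}
    for (a, b), n in pair_counts.items():
        row_totals[a] = row_totals.get(a, 0) + n
        col_totals[b] = col_totals.get(b, 0) + n
    scores = {}
    for (a, b), n in pair_counts.items():
        if n == 0:
            continue  # unobserved pair: left undefined instead of log(0)
        observed = n / total
        expected = (row_totals[a] / total) * (col_totals[b] / total)
        scores[(a, b)] = scale * math.log2(observed / expected)
    return scores


if __name__ == "__main__":
    counts = {("A", "A"): 3, ("A", "C"): 1, ("C", "A"): 1, ("C", "C"): 3}
    print(log_odds_scores(counts))
```

With very short or highly divergent sequences, the counts feeding such an estimate can be empty, which is the kind of failure described above.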

If the sequences you are aligning have a base composition that differs from the usual A-C-G-T 30-20-20-30 split, scoring inference should discover this and give you better alignments.
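
A quick way to check whether your sequences deviate from that split is to compute their base composition. The Python sketch below is one possible approach; the function name is hypothetical, and it simply ignores ambiguous bases such as N.

```python
from collections import Counter


def base_composition(fasta_text):
    """Return the fraction of A, C, G and T among the unambiguous bases."""
    counts = Counter()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            continue  # skip FASTA header lines
        counts.update(base for base in line.strip().upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A/C/G/T bases found")
    return {base: counts[base] / total for base in "ACGT"}


if __name__ == "__main__":
    example = ">seq1\nAACG\nTTAC\n"
    for base, fraction in base_composition(example).items():
        print(f"{base}: {fraction:.1%}")
```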

    ]]>
    </help>
    <expand macro="citations"/>
</tool>