comparison tools/mira4_0/mira4_bait.xml @ 3:a4f602cc3aa9 draft

v0.0.9, was missing mirabait. Adds tests for miraconvert
author peterjc
date Fri, 02 Oct 2015 06:12:23 -0400
parents
children 1713289d9908
2:4eb32a3d67d1 3:a4f602cc3aa9
1 <tool id="mira_4_0_bait" name="MIRA v4.0 mirabait" version="0.0.9">
2 <description>Filter reads using kmer matches</description>
3 <requirements>
4 <requirement type="binary">mirabait</requirement>
5 <requirement type="package" version="4.0.2">MIRA</requirement>
6 </requirements>
7 <stdio>
8 <!-- Assume anything other than zero is an error -->
9 <exit_code range="1:" />
10 <exit_code range=":-1" />
11 </stdio>
12 <version_command interpreter="python">mira4_bait.py --version</version_command>
13 <command interpreter="python">
14 mira4_bait.py $input_reads.ext $output_choice $strand_choice $kmer_length $min_occurence "$bait_file" "$input_reads" "$output_reads"
15 </command>
16 <inputs>
17 <param name="bait_file" type="data" format="fasta,fastq,mira" required="true" label="Bait file (what to look for)" />
18 <param name="input_reads" type="data" format="fasta,fastq,mira" required="true" label="Reads to search" />
19 <param name="output_choice" type="select" label="Output positive matches, or negative matches?">
20 <option value="pos">Just positive matches</option>
21 <option value="neg">Just negative matches</option>
22 </param>
23 <param name="strand_choice" type="select" label="Check for matches on both strands?">
24 <option value="both">Check both strands</option>
25 <option value="fwd">Just forward strand</option>
26 </param>
27 <param name="kmer_length" type="integer" value="31" min="1" max="32"
28 label="k-mer length" help="Maximum 32" />
29 <param name="min_occurence" type="integer" value="1" min="1"
30 label="Minimum k-mer occurrence"
31 help="How many k-mer matches do you want per read? Minimum one" />
32 </inputs>
33 <outputs>
34 <data name="output_reads" format_source="input_reads" metadata_source="input_reads"
35 label="$input_reads.name #if str($output_choice)=='pos' then 'matching' else 'excluding matches to' # $bait_file.name"/>
36 </outputs>
37 <tests>
38 <test>
39 <param name="bait_file" value="tvc_bait.fasta" ftype="fasta" />
40 <param name="input_reads" value="tvc_mini.fastq" ftype="fastqsanger" />
41 <output name="output_reads" file="tvc_mini_bait_pos.fastq" ftype="fastqsanger" />
42 </test>
43 <test>
44 <param name="bait_file" value="tvc_bait.fasta" ftype="fasta" />
45 <param name="input_reads" value="tvc_mini.fastq" ftype="fastqsanger" />
46 <param name="kmer_length" value="32" />
47 <param name="min_occurence" value="50" />
48 <output name="output_reads" file="tvc_mini_bait_strict.fastq" ftype="fastqsanger" />
49 </test>
50 <test>
51 <param name="bait_file" value="tvc_bait.fasta" ftype="fasta" />
52 <param name="input_reads" value="tvc_mini.fastq" ftype="fastqsanger" />
53 <param name="output_choice" value="neg" />
54 <output name="output_reads" file="tvc_mini_bait_neg.fastq" ftype="fastqsanger" />
55 </test>
56 </tests>
57 <help>
58 **What it does**
59
60 Runs the ``mirabait`` utility from MIRA v4.0 to filter your input reads
61 according to whether or not they contain perfect kmer matches to your
62 bait file. By default this looks for 31-mers (kmers or *k*-mers where
63 the fragment length *k* is 31), and only requires a single matching kmer.
64
65 The ``mirabait`` utility is useful in many applications and pipelines
66 outside of using the main MIRA tool for assembly or mapping.
67
68 .. class:: warningmark
69
70 Note ``mirabait`` cannot be used on protein (amino acid) sequences.
71
72 **Example Usage**
73
74 To remove over-abundant entries like rRNA sequences, run ``mirabait`` with
75 known rRNA sequences as the bait and select the *negative* matches.
76
77 To do targeted assembly by fishing out reads belonging to a gene and
78 assembling just these, run ``mirabait`` with the gene of interest as the bait and
79 select the *positive* matches.
80
81 To iteratively reconstruct mitochondria you could start by fishing out reads
82 matching any known mitochondrial sequence, assemble those, and repeat.
83
84
85 **Notes on paired reads**
86
87 .. class:: warningmark
88
89 While MIRA 4.0 is aware of many read naming conventions to identify paired read
90 partners, this version of the ``mirabait`` tool considers each read in isolation.
91 Applying it to paired read files may leave you with orphaned reads.
92
93 The version of ``mirabait`` included in MIRA 4.9.5 onwards is pair-aware.
94
95
96 **Citation**
97
98 If you use this Galaxy tool in work leading to a scientific publication please
99 cite the following papers:
100
101 Peter J.A. Cock, Björn A. Grüning, Konrad Paszkiewicz and Leighton Pritchard (2013).
102 Galaxy tools and workflows for sequence analysis with applications
103 in molecular plant pathology. PeerJ 1:e167
104 http://dx.doi.org/10.7717/peerj.167
105
106 Bastien Chevreux, Thomas Wetter and Sándor Suhai (1999).
107 Genome Sequence Assembly Using Trace Signals and Additional Sequence Information.
108 Computer Science and Biology: Proceedings of the German Conference on Bioinformatics (GCB) 99, pp. 45-56.
109 http://www.bioinfo.de/isb/gcb99/talks/chevreux/main.html
110
111 This wrapper is available to install into other Galaxy Instances via the Galaxy
112 Tool Shed at http://toolshed.g2.bx.psu.edu/view/peterjc/mira4_assembler
113 </help>
114 <citations>
115 <citation type="doi">10.7717/peerj.167</citation>
116 <citation type="bibtex">@ARTICLE{Chevreux1999-mira3,
117 author = {B. Chevreux and T. Wetter and S. Suhai},
118 year = {1999},
119 title = {Genome Sequence Assembly Using Trace Signals and Additional Sequence Information},
120 journal = {Computer Science and Biology: Proceedings of the German Conference on Bioinformatics (GCB)},
121 volume = {99},
122 pages = {45-56},
123 url = {http://www.bioinfo.de/isb/gcb99/talks/chevreux/main.html}
124 }</citation>
125 </citations>
126 </tool>
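
The help text above describes filtering reads by perfect k-mer matches against a bait file, with options for strand, k-mer length, minimum occurrence, and positive or negative output. The following is a minimal Python sketch of that idea, not the ``mirabait`` implementation. All function and parameter names here are hypothetical, and it assumes that every matching k-mer position in a read counts as one occurrence, which may differ from how ``mirabait`` counts.

```python
# Illustrative sketch only: this is NOT how mirabait is implemented.
# Assumption: each k-mer position in a read that is found in the bait
# counts as one occurrence.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Yield every overlapping k-mer in seq (upper-cased)."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_bait_index(bait_seqs, k, both_strands=True):
    """Collect the set of k-mers present in the bait sequences."""
    index = set()
    for seq in bait_seqs:
        index.update(kmers(seq, k))
        if both_strands:
            # Also index the opposite strand ("Check both strands").
            index.update(kmers(reverse_complement(seq), k))
    return index


def count_hits(read, index, k):
    """Count k-mer positions in the read that occur in the bait index."""
    return sum(1 for kmer in kmers(read, k) if kmer in index)


def bait(reads, bait_seqs, k=31, min_occurrence=1,
         both_strands=True, keep="pos"):
    """Return matching reads (keep='pos') or non-matching reads (keep='neg')."""
    index = build_bait_index(bait_seqs, k, both_strands)
    selected = []
    for read in reads:
        matched = count_hits(read, index, k) >= min_occurrence
        if matched == (keep == "pos"):
            selected.append(read)
    return selected
```

The defaults mirror the wrapper's form defaults (``k=31``, one occurrence, both strands, positive matches). Each read is judged in isolation, so, as with this version of the tool, applying the sketch to paired reads could leave orphaned reads.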