annotate tools/filters/seq_filter_by_id.xml @ 2:abdd608c869b draft

Uploaded v0.0.5, checked script return code for errors X copes with a FASTA entry missing an ID.
author peterjc
date Wed, 24 Apr 2013 11:34:12 -0400
parents 262f08104540
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
2
abdd608c869b Uploaded v0.0.5, checked script return code for errors X copes with a FASTA entry missing an ID.
peterjc
parents: 1
diff changeset
1 <tool id="seq_filter_by_id" name="Filter sequences by ID" version="0.0.5">
0
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
2 <description>from a tabular file</description>
1
262f08104540 Uploaded v0.0.4 which includes a unit test and is faster at filtering FASTA files with large records (e.g. whole chromosomes)
peterjc
parents: 0
diff changeset
3 <version_command interpreter="python">seq_filter_by_id.py --version</version_command>
0
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
4 <command interpreter="python">
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
5 seq_filter_by_id.py $input_tabular $columns $input_file $input_file.ext
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
6 #if $output_choice_cond.output_choice=="both"
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
7 $output_pos $output_neg
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
8 #elif $output_choice_cond.output_choice=="pos"
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
9 $output_pos -
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
10 #elif $output_choice_cond.output_choice=="neg"
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
11 - $output_neg
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
12 #end if
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
13 </command>
2
abdd608c869b Uploaded v0.0.5, checked script return code for errors X copes with a FASTA entry missing an ID.
peterjc
parents: 1
diff changeset
14 <stdio>
abdd608c869b Uploaded v0.0.5, checked script return code for errors X copes with a FASTA entry missing an ID.
peterjc
parents: 1
diff changeset
15 <!-- Anything other than zero is an error -->
abdd608c869b Uploaded v0.0.5, checked script return code for errors X copes with a FASTA entry missing an ID.
peterjc
parents: 1
diff changeset
16 <exit_code range="1:" />
abdd608c869b Uploaded v0.0.5, checked script return code for errors X copes with a FASTA entry missing an ID.
peterjc
parents: 1
diff changeset
17 <exit_code range=":-1" />
abdd608c869b Uploaded v0.0.5, checked script return code for errors X copes with a FASTA entry missing an ID.
peterjc
parents: 1
diff changeset
18 </stdio>
0
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
19 <inputs>
1
262f08104540 Uploaded v0.0.4 which includes a unit test and is faster at filtering FASTA files with large records (e.g. whole chromosomes)
peterjc
parents: 0
diff changeset
20 <param name="input_file" type="data" format="fasta,fastq,sff" label="Sequence file to filter on the identifiers" help="FASTA, FASTQ, or SFF format." />
0
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
21 <param name="input_tabular" type="data" format="tabular" label="Tabular file containing sequence identifiers"/>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
22 <param name="columns" type="data_column" data_ref="input_tabular" multiple="True" numerical="False" label="Column(s) containing sequence identifiers" help="Multi-select list - hold the appropriate key while clicking to select multiple columns">
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
23 <validator type="no_options" message="Pick at least one column"/>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
24 </param>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
25 <conditional name="output_choice_cond">
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
26 <param name="output_choice" type="select" label="Output positive matches, negative matches, or both?">
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
27 <option value="both">Both positive matches (ID on list) and negative matches (ID not on list), as two files</option>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
28 <option value="pos">Just positive matches (ID on list), as a single file</option>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
29 <option value="neg">Just negative matches (ID not on list), as a single file</option>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
30 </param>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
31 <!-- Seems need these dummy entries here, compare this to indels/indel_sam2interval.xml -->
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
32 <when value="both" />
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
33 <when value="pos" />
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
34 <when value="neg" />
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
35 </conditional>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
36 </inputs>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
37 <outputs>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
38 <data name="output_pos" format="fasta" label="With matched ID">
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
39 <!-- TODO - Replace this with format="input:input_fastq" if/when that works -->
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
40 <change_format>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
41 <when input_dataset="input_file" attribute="extension" value="sff" format="sff" />
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
42 <when input_dataset="input_file" attribute="extension" value="fastq" format="fastq" />
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
43 <when input_dataset="input_file" attribute="extension" value="fastqsanger" format="fastqsanger" />
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
44 <when input_dataset="input_file" attribute="extension" value="fastqsolexa" format="fastqsolexa" />
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
45 <when input_dataset="input_file" attribute="extension" value="fastqillumina" format="fastqillumina" />
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
46 <when input_dataset="input_file" attribute="extension" value="fastqcssanger" format="fastqcssanger" />
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
47 </change_format>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
48 <filter>output_choice_cond["output_choice"] != "neg"</filter>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
49 </data>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
50 <data name="output_neg" format="fasta" label="Without matched ID">
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
51 <!-- TODO - Replace this with format="input:input_fastq" if/when that works -->
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
52 <change_format>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
53 <when input_dataset="input_file" attribute="extension" value="sff" format="sff" />
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
54 <when input_dataset="input_file" attribute="extension" value="fastq" format="fastq" />
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
55 <when input_dataset="input_file" attribute="extension" value="fastqsanger" format="fastqsanger" />
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
56 <when input_dataset="input_file" attribute="extension" value="fastqsolexa" format="fastqsolexa" />
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
57 <when input_dataset="input_file" attribute="extension" value="fastqillumina" format="fastqillumina" />
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
58 <when input_dataset="input_file" attribute="extension" value="fastqcssanger" format="fastqcssanger" />
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
59 </change_format>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
60 <filter>output_choice_cond["output_choice"] != "pos"</filter>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
61 </data>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
62 </outputs>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
63 <tests>
1
262f08104540 Uploaded v0.0.4 which includes a unit test and is faster at filtering FASTA files with large records (e.g. whole chromosomes)
peterjc
parents: 0
diff changeset
64 <test>
262f08104540 Uploaded v0.0.4 which includes a unit test and is faster at filtering FASTA files with large records (e.g. whole chromosomes)
peterjc
parents: 0
diff changeset
65 <param name="input_file" value="k12_ten_proteins.fasta" ftype="fasta" />
262f08104540 Uploaded v0.0.4 which includes a unit test and is faster at filtering FASTA files with large records (e.g. whole chromosomes)
peterjc
parents: 0
diff changeset
66 <param name="input_tabular" value="k12_hypothetical.tabular" ftype="tabular" />
262f08104540 Uploaded v0.0.4 which includes a unit test and is faster at filtering FASTA files with large records (e.g. whole chromosomes)
peterjc
parents: 0
diff changeset
67 <param name="columns" value="1" />
262f08104540 Uploaded v0.0.4 which includes a unit test and is faster at filtering FASTA files with large records (e.g. whole chromosomes)
peterjc
parents: 0
diff changeset
68 <param name="output_choice" value="pos" />
262f08104540 Uploaded v0.0.4 which includes a unit test and is faster at filtering FASTA files with large records (e.g. whole chromosomes)
peterjc
parents: 0
diff changeset
69 <output name="output_pos" file="k12_hypothetical.fasta" ftype="fasta" />
262f08104540 Uploaded v0.0.4 which includes a unit test and is faster at filtering FASTA files with large records (e.g. whole chromosomes)
peterjc
parents: 0
diff changeset
70 </test>
0
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
71 </tests>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
72 <requirements>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
73 <requirement type="python-module">Bio</requirement>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
74 </requirements>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
75 <help>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
76
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
77 **What it does**
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
78
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
79 By default it divides a FASTA, FASTQ or Standard Flowgram Format (SFF) file in
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
80 two, those sequences with or without an ID present in the tabular file column(s)
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
81 specified. You can opt to have a single output file of just the matching records,
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
82 or just the non-matching ones.
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
83
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
84 Note that the order of sequences in the original sequence file is preserved, as
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
85 is any Roche XML Manifest in an SFF file. Also, if any sequences share an
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
86 identifier (which would be very unusual in SFF files), duplicates are not removed.
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
87
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
88 **Example Usage**
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
89
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
90 You may have performed some kind of contamination search, for example running
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
91 BLASTN against a database of cloning vectors or bacteria, giving you a tabular
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
92 file containing read identifiers. You could use this tool to extract only the
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
93 reads without BLAST matches (i.e. those which do not match your contaminant
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
94 database).
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
95
1
262f08104540 Uploaded v0.0.4 which includes a unit test and is faster at filtering FASTA files with large records (e.g. whole chromosomes)
peterjc
parents: 0
diff changeset
96 You may have a file of FASTA sequences which has been used with some analysis
0
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
97 tool giving tabular output, which has then been filtered on some criteria.
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
98 You can then use this tool to divide the original FASTA file into those entries
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
99 matching or not matching your criteria (those with or without their identifier
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
100 in the filtered tabular file).
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
101
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
102 **Citation**
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
103
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
104 This tool uses Biopython to read and write SFF files. If you use this tool in
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
105 scientific work leading to a publication, please cite the Biopython application
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
106 note (and Galaxy too of course):
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
107
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
108 Cock et al 2009. Biopython: freely available Python tools for computational
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
109 molecular biology and bioinformatics. Bioinformatics 25(11) 1422-3.
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
110 http://dx.doi.org/10.1093/bioinformatics/btp163 pmid:19304878.
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
111
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
112 </help>
5844f6a450ed Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
113 </tool>