annotate cutadapt.xml @ 5:1dada50cca8a

Support for cutadapt 0.9.5, added quality trimming and additional output options
author Lance Parsons <lparsons@princeton.edu>
date Fri, 22 Jul 2011 11:03:00 -0400
parents 0a872e59164c
children 2d6671b10919
changesets:
5  1dada50cca8a  Support for cutadapt 0.9.5, added quality trimming and additional output options  Lance Parsons <lparsons@princeton.edu>  parents: 4
4  0a872e59164c  Added discard_stderr_wrapper.sh script to catch report and redirect to stdout  Lance Parsons <lparsons@princeton.edu>  parents: 3
3  7ed26fc9fa8a  Updated for cutadapt 0.9.4, no longer need python wrapper  Lance Parsons <lparsons@princeton.edu>  parents: 2
0  8b064ea16722  Initial version with multiple adapter support  Lance Parsons <lparsons@princeton.edu>  parents: none

rev   line source
5        1 <tool id="cutadapt" name="Cutadapt" version="0.9.5.a">
5        2   <description>Remove adapter sequences from Fastq/Fasta</description>
0        3   <requirements>
0        4     <requirement type="python-module">cutadapt</requirement>
0        5   </requirements>
0        6
5        7   <command>cutadapt
0        8     #if $input.extension.startswith( "fastq"):
0        9       --format=fastq
0       10     #else
0       11       --format=$input.extension
0       12     #end if
0       13     #for $a in $adapters
5       14       --adapter='${a.adapter_source.adapter}'
0       15     #end for
0       16     #for $aa in $anywhere_adapters
5       17       --anywhere='${aa.anywhere_adapter_source.anywhere_adapter}'
0       18     #end for
5       19     --error-rate=$error_rate
5       20     --times=$count
5       21     --overlap=$overlap
0       22     #if str($min) != '0':
5       23       --minimum-length=$min
0       24     #end if
0       25     #if str($max) != '0':
5       26       --maximum-length=$max
5       27     #end if
5       28     #if str($quality_cutoff) != '0':
5       29       --quality-cutoff=$quality_cutoff
0       30     #end if
5       31     $discard
5       32     --output='$output'
5       33     #if str( $output_params.output_type ) == "additional":
5       34       #if $output_params.rest_file:
5       35         --rest-file=$rest_output
5       36       #end if
5       37       #if $output_params.too_short_file:
5       38         --too-short-output=$too_short_output
5       39       #end if
5       40       #if $output_params.untrimmed_file:
5       41         --untrimmed-output=$untrimmed_output
5       42       #end if
3       43     #end if
3       44     '$input'
4       45     > $report
0       46   </command>
0       47   <inputs>
0       48     <param format="fastqsanger, fasta" name="input" type="data" optional="false" label="Fastq file to trim" length="100"/>
0       49
0       50     <repeat name="adapters" title="3' Adapters">
0       51       <conditional name="adapter_source">
0       52         <param name="adapter_source_list" type="select" label="Source" >
0       53           <option value="prebuilt" selected="true">Standard (select from the list below)</option>
0       54           <option value="user">Enter custom sequence</option>
0       55         </param>
0       56
0       57         <when value="user">
0       58           <param name="adapter" size="30" label="Enter custom 3' adapter sequence" type="text" value="AATTGGCC" help="Sequence of an adapter that was ligated to the 3' end. The adapter itself and anything that follows is trimmed. If multiple adapters are specified, only the best matching adapter is trimmed."/>
0       59         </when>
0       60
0       61         <when value="prebuilt">
0       62           <param name="adapter" type="select" label="Choose 3' adapter" help="Sequence of an adapter that was ligated to the 3' end. The adapter itself and anything that follows is trimmed. If multiple adapters are specified, only the best matching adapter is trimmed.">
0       63             <options from_file="fastx_clipper_sequences.txt">
0       64               <column name="name" index="1"/>
0       65               <column name="value" index="0"/>
0       66             </options>
0       67           </param>
0       68         </when>
0       69       </conditional>
0       70     </repeat>
0       71
0       72     <repeat name="anywhere_adapters" title="5' or 3' (Anywhere) Adapters" help="Sequence of an adapter that was ligated to the 5' or 3' end. If the adapter is found within the read or overlapping the 3' end of the read, the behavior is the same as for the -a option. If the adapter overlaps the 5' end (beginning of the read), the initial portion of the read matching the adapter is trimmed, but anything that follows is kept. If multiple -a or -b options are given, only the best matching adapter is trimmed.">
0       73       <conditional name="anywhere_adapter_source">
0       74         <param name="anywhere_adapter_source_list" type="select" label="Source">
0       75           <option value="prebuilt" selected="true">Standard (select from the list below)</option>
0       76           <option value="user">Enter custom sequence</option>
0       77         </param>
0       78
0       79         <when value="user">
0       80           <param name="anywhere_adapter" size="30" label="Enter custom 5' or 3' adapter sequence" type="text" value="AATTGGCC" help="Sequence of an adapter that was ligated to the 5' or 3' end. If the adapter is found within the read or overlapping the 3' end of the read, the behavior is the same as for the -a option. If the adapter overlaps the 5' end (beginning of the read), the initial portion of the read matching the adapter is trimmed, but anything that follows is kept. If multiple -a or -b options are given, only the best matching adapter is trimmed."/>
0       81         </when>
0       82         <when value="prebuilt">
0       83           <param name="anywhere_adapter" type="select" label="Choose 5' or 3' adapter" help="Sequence of an adapter that was ligated to the 5' or 3' end. If the adapter is found within the read or overlapping the 3' end of the read, the behavior is the same as for the -a option. If the adapter overlaps the 5' end (beginning of the read), the initial portion of the read matching the adapter is trimmed, but anything that follows is kept. If multiple -a or -b options are given, only the best matching adapter is trimmed.">
0       84             <options from_file="fastx_clipper_sequences.txt">
0       85               <column name="name" index="1"/>
0       86               <column name="value" index="0"/>
0       87             </options>
0       88           </param>
0       89         </when>
0       90       </conditional>
0       91     </repeat>
0       92
0       93     <param name="error_rate" type="float" min="0" max="1" value="0.1" label="Maximum error rate" help="Maximum allowed error rate (no. of errors divided by the length of the matching region)." />
0       94     <param name="count" type="integer" min="1" value="1" label="Match times" help="Try to remove adapters at most COUNT times. Useful when an adapter gets appended multiple times." />
0       95     <param name="overlap" type="integer" min="1" value="3" label="Minimum overlap length" help="Minimum overlap length. If the overlap between the adapter and the sequence is shorter than LENGTH, the read is not modified." />
5       96     <param name="discard" type="boolean" value="false" truevalue="--discard" falsevalue="" label="Discard Trimmed Reads" help="Discard reads that contain the adapter instead of trimming them. Use the 'Minimum overlap length' option in order to avoid throwing away too many randomly matching reads!" />
0       97     <param name="min" type="integer" min="0" optional="true" value="0" label="Minimum length" help="Discard trimmed reads that are shorter than LENGTH. Reads that are too short even before adapter removal are also discarded. In colorspace, an initial primer is not counted. Value of 0 means no minimum length." />
0       98     <param name="max" type="integer" min="0" optional="true" value="0" label="Maximum length" help="Discard trimmed reads that are longer than LENGTH. Reads that are too long even before adapter removal are also discarded. In colorspace, an initial primer is not counted. Value of 0 means no maximum length." />
5       99     <param name="quality_cutoff" type="integer" min="0" optional="true" value="0" label="Quality cutoff" help="Trim low-quality ends from reads before adapter removal. The algorithm is the same as the one used by BWA (Subtract CUTOFF from all qualities; compute partial sums from all indices to the end of the sequence; cut sequence at the index at which the sum is minimal). Value of 0 means no quality trimming." />
5      100     <conditional name="output_params">
5      101       <param name="output_type" type="select" label="Additional output options" help="By default all reads will be put in the same file. However, reads with adapters matching in the middle, unmatched reads, and too-short reads can be saved in separate files.">
5      102         <option value="default">Default</option>
5      103         <option value="additional">Additional output files</option>
5      104       </param>
5      105       <when value="default" />
5      106       <when value="additional">
5      107         <param name="rest_file" type="boolean" value="false" label="Rest of Read" help="When the adapter matches in the middle of a read, write the rest (after the adapter) into a file."/>
5      108         <param name="too_short_file" type="boolean" value="false" label="Too Short Reads" help="Write reads that are too short (according to minimum length specified) to a file. (default: discard reads)"/>
5      109         <param name="untrimmed_file" type="boolean" value="false" label="Untrimmed Reads" help="Write reads that do not contain the adapter to a separate file, instead of writing them to the regular output file. (default: output to same file as trimmed)"/>
5      110       </when>
5      111     </conditional>
0      112   </inputs>
0      113   <outputs>
0      114     <data format="txt" name="report" label="${tool.name} on ${on_string} (Report)" />
0      115     <data format="input" name="output" metadata_source="input"/>
5      116     <data format="input" name="rest_output" metadata_source="input" label="${tool.name} on ${on_string} (Rest of Reads)" >
5      117       <filter>(output_params['output_type'] == "additional")</filter>
5      118       <filter>(output_params['rest_file'] is True)</filter>
5      119     </data>
5      120     <data format="input" name="too_short_output" metadata_source="input" label="${tool.name} on ${on_string} (Too Short Reads)" >
5      121       <filter>(output_params['output_type'] == "additional")</filter>
5      122       <filter>(output_params['too_short_file'] is True)</filter>
5      123     </data>
5      124     <data format="input" name="untrimmed_output" metadata_source="input" label="${tool.name} on ${on_string} (Untrimmed Reads)" >
5      125       <filter>(output_params['output_type'] == "additional")</filter>
5      126       <filter>(output_params['untrimmed_file'] is True)</filter>
5      127     </data>
0      128   </outputs>
0      129
0      130   <tests>
0      131     <test>
5      132       <param name="input" value="cutadapt_small.fastq" ftype="fastqsanger"/>
5      133       <param name="adapter_source_list" value="user"/>
5      134       <param name="adapter" value=""/>
5      135       <param name="anywhere_adapter_source_list" value="user"/>
5      136       <param name="anywhere_adapter" value="TTAGACATATCTCCGTCG"/>
5      137       <param name="output_type" value="default"/>
5      138       <output name="output" file="cutadapt_small.out"/>
5      139     </test>
5      140     <test>
5      141       <param name="input" value="cutadapt_small.fastq" ftype="fastqsanger"/>
5      142       <param name="adapter_source_list" value="user"/>
5      143       <param name="adapter" value="TTAGACATATCTCCGTCG"/>
5      144       <param name="anywhere_adapter_source_list" value="user"/>
5      145       <param name="anywhere_adapter" value=""/>
5      146       <param name="discard" value="true"/>
5      147       <param name="output_type" value="default"/>
5      148       <output name="output" file="cutadapt_discard.out"/>
5      149     </test>
5      150     <test>
5      151       <param name="input" value="cutadapt_rest.fa" ftype="fasta"/>
5      152       <param name="adapter_source_list" value="user"/>
5      153       <param name="adapter" value="ADAPTER"/>
5      154       <param name="anywhere_adapter_source_list" value="user"/>
5      155       <param name="anywhere_adapter" value=""/>
5      156       <param name="output_type" value="additional"/>
5      157       <param name="rest_file" value="true"/>
5      158       <output name="output" file="cutadapt_rest.out"/>
5      159       <output name="rest_output" file="cutadapt_rest2.out"/>
0      160     </test>
0      161   </tests>
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
162
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
163 <help>
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
164 This tool removes adapter sequences from DNA high-throughput
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
165 sequencing data. This is usually necessary when the read length of the
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
166 machine is longer than the molecule that is sequenced, such as in
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
167 microRNA data.
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
168
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
169 The tool is based on the open-source cutadapt_ tool.
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
170
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
171 -----
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
172
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
173 **Algorithm**
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
174
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
175 cutadapt uses a simple semi-global alignment algorithm, without any special optimizations.
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
176 For speed, the algorithm is implemented as a Python extension module in calignmodule.c.
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
177
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
178
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
179 **Partial adapter matches**
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
180
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
181 Cutadapt correctly deals with partial adapter matches. As an example, suppose
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
182 your adapter sequence is "ADAPTER" (specified via the *3' Adapters* parameter).
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
183 If you have these input sequences:
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
184
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
185 ::
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
186
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
187 MYSEQUENCEADAPTER
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
188 MYSEQUENCEADAP
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
189 MYSEQUENCEADAPTERSOMETHINGELSE
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
190
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
191 All of them will be trimmed to "MYSEQUENCE". If the sequence starts with an
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
192 adapter, like this:
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
193
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
194 ::
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
195
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
196 ADAPTERSOMETHING
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
197
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
198 It will be empty after trimming.
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
199
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
200 When the allowed error rate is sufficiently high, errors in
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
201 the adapter sequence are allowed. For example, ADABTER (1 mismatch), ADAPTR (1 deletion),
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
202 and ADAPPTER (1 insertion) will all be recognized if the error rate is set to 0.15.
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
203
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
204
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
205 **Allowing adapters anywhere**
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
206
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
207 Cutadapt assumes that any adapter specified via the *3' Adapters* parameter
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
208 was ligated to the 3' end of the sequence. This is the correct assumption for
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
209 at least the SOLiD and Illumina small RNA protocols and probably others.
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
210
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
211 If, on the other hand, your adapter can also be ligated to the 5' end (on
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
212 purpose or by accident), you should tell cutadapt so by using the *5' or 3' (Anywhere)
2
f6b94b76d16b Update help
Lance Parsons <lparsons@princeton.edu>
parents: 0
diff changeset
213 Adapters* parameter. It will then use a different alignment algorithm and
0
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
214 correctly trim adapters that appear at the beginning of a read. An adapter
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
215 specified this way will also be found if it appears only partially at the
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
216 beginning of a read. For example, these sequences
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
217
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
218 ::
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
219
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
220 ADAPTERMYSEQUENCE
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
221 PTERMYSEQUENCE
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
222
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
223 will be trimmed to "MYSEQUENCE". Note that the regular algorithm would trim
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
224 the first read to an empty sequence.
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
225
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
226 This parameter currently does not work with color space data.
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
227
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
228
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
229 .. _cutadapt: http://code.google.com/p/cutadapt/
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
230 </help>
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
231
8b064ea16722 Initial version with multiple adapter support
Lance Parsons <lparsons@princeton.edu>
parents:
diff changeset
232 </tool>
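
The trimming behaviour described in the help text can be illustrated with a short Python sketch. This is not cutadapt's implementation: cutadapt uses a semi-global alignment that tolerates mismatches and indels, whereas the sketch below only does exact string matching. The function names, the `min_overlap` parameter and its default of 3 are hypothetical and chosen for illustration. The `allowed_errors` helper assumes that the number of tolerated errors is the error rate times the adapter length, rounded down, which is consistent with the help text's example of one error being accepted for the 7-character adapter at a rate of 0.15.

```python
def allowed_errors(adapter_length, error_rate):
    """Errors tolerated for a full-length match.

    Assumption for illustration: error_rate * adapter_length, rounded down.
    """
    return int(error_rate * adapter_length)


def trim_3prime(read, adapter, min_overlap=3):
    """Exact-match sketch of 3' adapter trimming (no mismatches or indels)."""
    # Full adapter somewhere in the read: drop it and everything after it.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Otherwise look for the longest adapter prefix that ends the read.
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, max(min_overlap, 1) - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read


def trim_anywhere(read, adapter, min_overlap=3):
    """Exact-match sketch of the 'anywhere' mode (hypothetical helper)."""
    # Full or partial adapter at the start of the read: keep what follows it.
    longest = min(len(adapter), len(read))
    for k in range(longest, max(min_overlap, 1) - 1, -1):
        if read.startswith(adapter[-k:]):
            return read[k:]
    # No adapter at the start: fall back to ordinary 3' trimming.
    return trim_3prime(read, adapter, min_overlap)


if __name__ == "__main__":
    for seq in ("MYSEQUENCEADAPTER", "MYSEQUENCEADAP",
                "MYSEQUENCEADAPTERSOMETHINGELSE", "ADAPTERSOMETHING"):
        print(repr(trim_3prime(seq, "ADAPTER")))
    for seq in ("ADAPTERMYSEQUENCE", "PTERMYSEQUENCE"):
        print(repr(trim_anywhere(seq, "ADAPTER")))
```

With the example reads from the help text, `trim_3prime` reduces the first three reads to "MYSEQUENCE" and the read that starts with the adapter to an empty string, while `trim_anywhere` keeps "MYSEQUENCE" for reads that begin with a full or partial adapter.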