comparison tools/fastq/fastq_paired_unpaired.xml @ 3:6a14074bc810 draft

Uploaded v0.0.8, automated Biopython dependency handling via ToolShed; MIT license; reST markup for README file.
author peterjc
date Mon, 29 Jul 2013 09:28:55 -0400
2:324775a016ce 3:6a14074bc810
1 <tool id="fastq_paired_unpaired" name="Divide FASTQ file into paired and unpaired reads" version="0.0.7">
2 <description>using the read name suffixes</description>
3 <version_command interpreter="python">fastq_paired_unpaired.py --version</version_command>
4 <command interpreter="python">
5 fastq_paired_unpaired.py $input_fastq.extension $input_fastq
6 #if $output_choice_cond.output_choice=="separate"
7 $output_forward $output_reverse
8 #elif $output_choice_cond.output_choice=="interleaved"
9 $output_paired
10 #end if
11 $output_singles
12 </command>
13 <stdio>
14 <!-- Anything other than zero is an error -->
15 <exit_code range="1:" />
16 <exit_code range=":-1" />
17 </stdio>
18 <inputs>
19 <param name="input_fastq" type="data" format="fastq" label="FASTQ file to divide into paired and unpaired reads"/>
20 <conditional name="output_choice_cond">
21 <param name="output_choice" type="select" label="How to output paired reads?">
22 <option value="separate">Separate (two FASTQ files, for the forward and reverse reads, in matching order).</option>
23 <option value="interleaved">Interleaved (one FASTQ file, alternating forward read then partner reverse read).</option>
24 </param>
25 <!-- These empty "when" entries seem to be required; compare with indels/indel_sam2interval.xml -->
26 <when value="separate" />
27 <when value="interleaved" />
28 </conditional>
29 </inputs>
30 <outputs>
31 <data name="output_singles" format="input" label="Orphan or single reads"/>
32 <data name="output_forward" format="input" label="Forward paired reads">
33 <filter>output_choice_cond["output_choice"] == "separate"</filter>
34 </data>
35 <data name="output_reverse" format="input" label="Reverse paired reads">
36 <filter>output_choice_cond["output_choice"] == "separate"</filter>
37 </data>
38 <data name="output_paired" format="input" label="Interleaved paired reads">
39 <filter>output_choice_cond["output_choice"] == "interleaved"</filter>
40 </data>
41 </outputs>
42 <tests>
43 <test>
44 <param name="input_fastq" value="sanger-pairs-mixed.fastq" ftype="fastq"/>
45 <param name="output_choice" value="separate"/>
46 <output name="output_singles" file="sanger-pairs-singles.fastq" ftype="fastq"/>
47 <output name="output_forward" file="sanger-pairs-forward.fastq" ftype="fastq"/>
48 <output name="output_reverse" file="sanger-pairs-reverse.fastq" ftype="fastq"/>
49 </test>
50 <test>
51 <param name="input_fastq" value="sanger-pairs-mixed.fastq" ftype="fastq"/>
52 <param name="output_choice" value="interleaved"/>
53 <output name="output_singles" file="sanger-pairs-singles.fastq" ftype="fastq"/>
54 <output name="output_paired" file="sanger-pairs-interleaved.fastq" ftype="fastq"/>
55 </test>
56 </tests>
57 <help>
58
59 **What it does**
60
61 Using common read name suffix conventions, this tool divides a FASTQ file
62 into paired reads and orphan or single reads.
63
64 The input file should be a valid FASTQ file which has been sorted so that
65 any partner forward+reverse reads are consecutive. The output files all
66 preserve this sort order. Pairs are recognised based on standard name
67 suffixes. See below or run the tool with no arguments for more details.
68
69 Reads whose forward/reverse naming suffix is not recognised are treated
70 as orphan reads. The tool supports the /1 and /2 convention originally
71 used by Illumina, the .f and .r convention, the Sanger convention
72 (see http://staden.sourceforge.net/manual/pregap4_unix_50.html for details),
73 and the current Illumina convention where the reads get the same identifier
74 with the fragment number in the description, for example:
75
76 * @HWI-ST916:79:D04M5ACXX:1:1101:10000:100326 1:N:0:TGNCCA
77 * @HWI-ST916:79:D04M5ACXX:1:1101:10000:100326 2:N:0:TGNCCA
78
79 Note that the tool supports multiple forward and reverse reads per template
80 (which is quite common with Sanger sequencing), for example this input, which
81 is sorted alphabetically:
82
83 * WTSI_1055_4p17.p1kapIBF
84 * WTSI_1055_4p17.p1kpIBF
85 * WTSI_1055_4p17.q1kapIBR
86 * WTSI_1055_4p17.q1kpIBR
87
88 or this input, where the reads already come in pairs:
89
90 * WTSI_1055_4p17.p1kapIBF
91 * WTSI_1055_4p17.q1kapIBR
92 * WTSI_1055_4p17.p1kpIBF
93 * WTSI_1055_4p17.q1kpIBR
94
95 both become:
96
97 * WTSI_1055_4p17.p1kapIBF paired with WTSI_1055_4p17.q1kapIBR
98 * WTSI_1055_4p17.p1kpIBF paired with WTSI_1055_4p17.q1kpIBR
99
100 **Citation**
101
102 This tool is available to install into other Galaxy Instances via the Galaxy
103 Tool Shed at http://toolshed.g2.bx.psu.edu/view/peterjc/fastq_paired_unpaired
104 </help>
105 </tool>
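
The pairing rule described in the help text can be illustrated with a minimal Python sketch. This is not the code in fastq_paired_unpaired.py: the function names classify_read and split_pairs are hypothetical, the Sanger convention is left out, and the sketch assumes at most one forward and one reverse read per template, with partners adjacent in the input. It covers only the /1 and /2 suffixes, the .f and .r suffixes, and the current Illumina convention with the fragment number in the description.

```python
import re

# Assumption: a simplified stand-in for the suffix conventions named in the
# help text; the Sanger convention is deliberately not handled here.
FORWARD_SUFFIXES = ("/1", ".f")
REVERSE_SUFFIXES = ("/2", ".r")
# Current Illumina description, e.g. "1:N:0:TGNCCA" (read number comes first).
ILLUMINA_DESCR = re.compile(r"([12]):[YN]:\d+(?::|$)")


def classify_read(title):
    """Return (template, direction) for a FASTQ title line without the '@'.

    direction is "F", "R", or None when no known convention matches.
    """
    name, _, descr = title.partition(" ")
    match = ILLUMINA_DESCR.match(descr)
    if match:
        return name, "F" if match.group(1) == "1" else "R"
    for suffix in FORWARD_SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)], "F"
    for suffix in REVERSE_SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)], "R"
    return name, None


def split_pairs(titles):
    """Divide titles into (pairs, singles), assuming partners are adjacent."""
    pairs, singles = [], []
    pending = None  # (template, title) of a forward read awaiting its partner
    for title in titles:
        template, direction = classify_read(title)
        if direction == "F":
            if pending is not None:
                singles.append(pending[1])  # earlier forward read had no partner
            pending = (template, title)
        elif direction == "R" and pending is not None and pending[0] == template:
            pairs.append((pending[1], title))
            pending = None
        else:
            if pending is not None:
                singles.append(pending[1])
                pending = None
            singles.append(title)  # unrecognised name or unmatched reverse read
    if pending is not None:
        singles.append(pending[1])
    return pairs, singles


if __name__ == "__main__":
    demo = [
        "HWI-ST916:79:D04M5ACXX:1:1101:10000:100326 1:N:0:TGNCCA",
        "HWI-ST916:79:D04M5ACXX:1:1101:10000:100326 2:N:0:TGNCCA",
        "lonely/1",
    ]
    paired, orphans = split_pairs(demo)
    print(paired)
    print(orphans)
```

In this sketch a forward read is held back until the next title is seen, so both the paired and the orphan lists keep the input sort order, which mirrors the ordering guarantee stated in the help text. Supporting several forward and reverse reads per template, as in the WTSI example, would need a per-template buffer rather than the single pending slot used here.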