Galaxy | Tool Preview

pyFastqJoiner (version 1.0.0)
FastQ format

pyFastqJoiner

pyFastqJoiner is part of the pyCRAC package. It merges paired sequences from two FASTQ- or FASTA-formatted files.

Example:

Forward reaction:

@FCC102EACXX:3:1101:1343:2181#ATCACGAT/1##CAATAG
CAAATTAGAGTGTTCAAAGCAGGCGTATTGCTCGAAT
+
`efhYb][bdQQ`eeaeaYbeY^ceU__IXa[^ZYae
@FCC102EACXX:3:1101:1424:2248#ATCACGAT/1##CCAGGA
CTAACCATAAACTATGCCTACTAGGGATCCAGAGGTG
+
^_adddhJbaehbedd`dIb_^cXaRI^BBBBBBBBB
@FCC102EACXX:3:1101:1623:2036#ATCACGAN/1##CTCAGC
CAAAGTTAGGGGATCGAAGATGATCAGATACCGTCGT
+
bghfc^YbgbeadggfdffeaS^ac_X^cegaGZ_ef
@FCC102EACXX:3:1101:1574:2214#ATCACGAT/1##CGTTTT
CTAATGACCCACTCGGCACCTTACGAAATCAAAGTCT
+
cdfgYY`cefhhZef\eaggXaceeghfQaeghWNW

Reverse reaction:

@FCC102EACXX:3:1101:1343:2181#ATCACGAT/2
AGCCTTTAAGTTTCAGCCTTGCGACCATACTCCCCCCAGAACCCAAAGA
+
YJaSJ`Z`K`YbSb[[daeJRR[YeWd_I^I^ecgc]OV\bdeaegbXb
@FCC102EACXX:3:1101:1424:2248#ATCACGAT/2
AAGTCCTTTAAGTTACAGCCTTGCGACCATACTACACCCAGAACCCAAA
+
YJJ\`JQY\`KJ`gY[[QRYY[[`H[_ceI^e[PYO^IWOHW^eaefhh
@FCC102EACXX:3:1101:1623:2036#ATCACGAN/2
GGCCAATCCTTATTGTGTCTGGACCTGGTGAGTTTCCCCGTGTTGAGTC
+
PP\`ccQ`eY[bQQ[d`ghehaghfgdg[`gb^bd[ePbH^c_c\a_eg

In the result below, the ":" character separates the two joined sequences. It tells pyFastqSplitter where to split the sequences again, and it is ignored by pyFastqDuplicateRemover.

Result:

@FCC102EACXX:3:1101:1343:2181#ATCACGAT/1##CAATAG@FCC102EACXX:3:1101:1343:2181#ATCACGAT/2
CAAATTAGAGTGTTCAAAGCAGGCGTATTGCTCGAAT:AGCCTTTAAGTTTCAGCCTTGCGACCATACTCCCCCCAGAACCCAAAGA
+
`efhYb][bdQQ`eeaeaYbeY^ceU__IXa[^ZYaeYJaSJ`Z`K`YbSb[[daeJRR[YeWd_I^I^ecgc]OV\bdeaegbXb
@FCC102EACXX:3:1101:1424:2248#ATCACGAT/1##CCAGGA@FCC102EACXX:3:1101:1424:2248#ATCACGAT/2
CTAACCATAAACTATGCCTACTAGGGATCCAGAGGTG:AAGTCCTTTAAGTTACAGCCTTGCGACCATACTACACCCAGAACCCAAA
+
^_adddhJbaehbedd`dIb_^cXaRI^BBBBBBBBBYJJ\`JQY\`KJ`gY[[QRYY[[`H[_ceI^e[PYO^IWOHW^eaefhh
@FCC102EACXX:3:1101:1623:2036#ATCACGAN/1##CTCAGC@FCC102EACXX:3:1101:1623:2036#ATCACGAN/2
CAAAGTTAGGGGATCGAAGATGATCAGATACCGTCGT:GGCCAATCCTTATTGTGTCTGGACCTGGTGAGTTTCCCCGTGTTGAGTC
+
bghfc^YbgbeadggfdffeaS^ac_X^cegaGZ_efPP\`ccQ`eY[bQQ[d`ghehaghfgdg[`gb^bd[ePbH^c_c\a_eg
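
The joining shown in the example can be sketched in Python. This is only a minimal illustration derived from the example records, not pyFastqJoiner's actual code: the function name join_fastq_records and the tuple layout are hypothetical. It assumes that the two headers are concatenated, that the sequences are joined with the separator character, and that the quality strings are concatenated without a separator, as the Result block suggests.

```python
def join_fastq_records(forward, reverse, separator=":"):
    """Join one forward and one reverse record, each a (header, sequence, quality) tuple.

    Hypothetical helper that mirrors the example output; not part of pyCRAC.
    """
    f_header, f_seq, f_qual = forward
    r_header, r_seq, r_qual = reverse
    # Headers are concatenated as-is, so the reverse header keeps its leading "@".
    header = f_header + r_header
    # Only the sequence line receives the separator character.
    sequence = f_seq + separator + r_seq
    # Quality strings are concatenated directly (assumption taken from the example).
    quality = f_qual + r_qual
    return header, sequence, quality


# First read pair from the example (raw strings keep the backslashes intact).
forward = (
    "@FCC102EACXX:3:1101:1343:2181#ATCACGAT/1##CAATAG",
    "CAAATTAGAGTGTTCAAAGCAGGCGTATTGCTCGAAT",
    r"`efhYb][bdQQ`eeaeaYbeY^ceU__IXa[^ZYae",
)
reverse = (
    "@FCC102EACXX:3:1101:1343:2181#ATCACGAT/2",
    "AGCCTTTAAGTTTCAGCCTTGCGACCATACTCCCCCCAGAACCCAAAGA",
    r"YJaSJ`Z`K`YbSb[[daeJRR[YeWd_I^I^ecgc]OV\bdeaegbXb",
)

joined = join_fastq_records(forward, reverse)
print("\n".join([joined[0], joined[1], "+", joined[2]]))
```

Because only the sequence line gains a separator, the joined sequence is one character longer than the joined quality string.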

Parameter list

Options:

-f fastq_file1 fastq_file2
                    Provide the names of two raw data files separated by a single space.
                    Make sure the first file is the data file of the forward (/1) sequencing reaction.

--file_type=FASTQ
                    The tool can join FASTA and FASTQ files. FASTQ is the default.

-o mergedfastq.fastq, --outfile=mergedfastq.fastq
                    Provide the name of the output file. By default, the
                    output is printed to standard output.

-c :
                    This option adds a separator character (':' in the
                    example above) between the DNA sequences so that it is
                    much easier to split the data again later on.
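
To illustrate why a separator makes later splitting easy, here is a minimal Python sketch of the reverse operation. It is not pyFastqSplitter's implementation: the function name split_joined_record is hypothetical and the input record is made up. It assumes the quality line carries no separator (as in the Result example), so the quality string is cut at the length of the forward sequence.

```python
def split_joined_record(sequence, quality, separator=":"):
    """Split a joined sequence/quality pair back into forward and reverse parts.

    Hypothetical helper for illustration; not part of pyCRAC.
    """
    # Split only at the first separator, which marks the end of the forward read.
    f_seq, r_seq = sequence.split(separator, 1)
    # The quality string has no separator (assumption), so cut it by length.
    f_qual = quality[:len(f_seq)]
    r_qual = quality[len(f_seq):]
    return (f_seq, f_qual), (r_seq, r_qual)


# Made-up joined record: a 4-base forward read and a 5-base reverse read.
forward_part, reverse_part = split_joined_record("ACGT:TTGCA", "IIIIHHHHH")
print(forward_part)  # ('ACGT', 'IIII')
print(reverse_part)  # ('TTGCA', 'HHHHH')
```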