pyFastqJoiner
pyFastqJoiner is part of the pyCRAC package. It merges paired sequences from two fastq or fasta formatted files into a single file.
Example:
Forward reaction:

    @FCC102EACXX:3:1101:1343:2181#ATCACGAT/1##CAATAG
    CAAATTAGAGTGTTCAAAGCAGGCGTATTGCTCGAAT
    +
    `efhYb][bdQQ`eeaeaYbeY^ceU__IXa[^ZYae
    @FCC102EACXX:3:1101:1424:2248#ATCACGAT/1##CCAGGA
    CTAACCATAAACTATGCCTACTAGGGATCCAGAGGTG
    +
    ^_adddhJbaehbedd`dIb_^cXaRI^BBBBBBBBB
    @FCC102EACXX:3:1101:1623:2036#ATCACGAN/1##CTCAGC
    CAAAGTTAGGGGATCGAAGATGATCAGATACCGTCGT
    +
    bghfc^YbgbeadggfdffeaS^ac_X^cegaGZ_ef
    @FCC102EACXX:3:1101:1574:2214#ATCACGAT/1##CGTTTT
    CTAATGACCCACTCGGCACCTTACGAAATCAAAGTCT
    +
    cdfgYY`cefhhZef\eaggXaceeghfQaeghWNW

Reverse reaction:

    @FCC102EACXX:3:1101:1343:2181#ATCACGAT/2
    AGCCTTTAAGTTTCAGCCTTGCGACCATACTCCCCCCAGAACCCAAAGA
    +
    YJaSJ`Z`K`YbSb[[daeJRR[YeWd_I^I^ecgc]OV\bdeaegbXb
    @FCC102EACXX:3:1101:1424:2248#ATCACGAT/2
    AAGTCCTTTAAGTTACAGCCTTGCGACCATACTACACCCAGAACCCAAA
    +
    YJJ\`JQY\`KJ`gY[[QRYY[[`H[_ceI^e[PYO^IWOHW^eaefhh
    @FCC102EACXX:3:1101:1623:2036#ATCACGAN/2
    GGCCAATCCTTATTGTGTCTGGACCTGGTGAGTTTCCCCGTGTTGAGTC
    +
    PP\`ccQ`eY[bQQ[d`ghehaghfgdg[`gb^bd[ePbH^c_c\a_eg

Here the ":" character is used to separate the two sequences. It tells pyFastqSplitter where to split the sequences again, and it is ignored by pyFastqDuplicateRemover.

Result:

    @FCC102EACXX:3:1101:1343:2181#ATCACGAT/1##CAATAG@FCC102EACXX:3:1101:1343:2181#ATCACGAT/2
    CAAATTAGAGTGTTCAAAGCAGGCGTATTGCTCGAAT:AGCCTTTAAGTTTCAGCCTTGCGACCATACTCCCCCCAGAACCCAAAGA
    +
    `efhYb][bdQQ`eeaeaYbeY^ceU__IXa[^ZYaeYJaSJ`Z`K`YbSb[[daeJRR[YeWd_I^I^ecgc]OV\bdeaegbXb
    @FCC102EACXX:3:1101:1424:2248#ATCACGAT/1##CCAGGA@FCC102EACXX:3:1101:1424:2248#ATCACGAT/2
    CTAACCATAAACTATGCCTACTAGGGATCCAGAGGTG:AAGTCCTTTAAGTTACAGCCTTGCGACCATACTACACCCAGAACCCAAA
    +
    ^_adddhJbaehbedd`dIb_^cXaRI^BBBBBBBBBYJJ\`JQY\`KJ`gY[[QRYY[[`H[_ceI^e[PYO^IWOHW^eaefhh
    @FCC102EACXX:3:1101:1623:2036#ATCACGAN/1##CTCAGC@FCC102EACXX:3:1101:1623:2036#ATCACGAN/2
    CAAAGTTAGGGGATCGAAGATGATCAGATACCGTCGT:GGCCAATCCTTATTGTGTCTGGACCTGGTGAGTTTCCCCGTGTTGAGTC
    +
    bghfc^YbgbeadggfdffeaS^ac_X^cegaGZ_efPP\`ccQ`eY[bQQ[d`ghehaghfgdg[`gb^bd[ePbH^c_c\a_eg
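
The joining shown in the example above can be sketched in a few lines of Python. This is only an illustration of the record layout seen in the Result, not pyFastqJoiner's actual code: the function name `join_fastq_records` and the tuple representation of a record are assumptions made for this sketch. As in the example, the headers are concatenated, the separator is placed between the two sequences only, and the quality strings are concatenated without a separator.

```python
def join_fastq_records(forward, reverse, separator=":"):
    """Join one forward and one reverse FASTQ record (hypothetical helper).

    Each record is a (header, sequence, quality) tuple. The layout of the
    returned record mirrors the Result shown in the example above.
    """
    f_header, f_seq, f_qual = forward
    r_header, r_seq, r_qual = reverse
    header = f_header + r_header            # both read names are kept
    sequence = f_seq + separator + r_seq    # separator only in the sequence line
    quality = f_qual + r_qual               # no separator in the quality line
    return header, sequence, quality


# First read pair from the example. Raw strings are used because the
# quality strings contain backslashes.
forward = (
    "@FCC102EACXX:3:1101:1343:2181#ATCACGAT/1##CAATAG",
    "CAAATTAGAGTGTTCAAAGCAGGCGTATTGCTCGAAT",
    r"`efhYb][bdQQ`eeaeaYbeY^ceU__IXa[^ZYae",
)
reverse = (
    "@FCC102EACXX:3:1101:1343:2181#ATCACGAT/2",
    "AGCCTTTAAGTTTCAGCCTTGCGACCATACTCCCCCCAGAACCCAAAGA",
    r"YJaSJ`Z`K`YbSb[[daeJRR[YeWd_I^I^ecgc]OV\bdeaegbXb",
)

joined = join_fastq_records(forward, reverse)
print("\n".join([joined[0], joined[1], "+", joined[2]]))
```

Note that, with this layout, the joined quality string is one character shorter than the joined sequence line, because the separator has no quality value of its own.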
Parameter list
Options:
-f fastq_file1 fastq_file2
        Provide the names of two raw data files separated by a single space. Make sure the first file is the data file of the forward (/1) sequencing reaction.
--file_type=FASTQ
        Can join fasta and fastq files. Fastq is the default.
-o mergedfastq.fastq, --outfile=mergedfastq.fastq
        Provide the name of the output file. By default the output is printed to the standard output.
-c :
        Adds the ':' character between the DNA sequences so that it is much easier to split the data again later on.
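
To illustrate why the separator makes later splitting easy, here is a minimal Python sketch that takes one joined record apart again. It is not the pyFastqSplitter implementation: the function name `split_joined_record` is hypothetical, and the sketch assumes the layout of the Result example earlier on this page, where the separator appears in the sequence line but not in the quality line.

```python
def split_joined_record(sequence, quality, separator=":"):
    """Split a joined sequence/quality pair at the separator (hypothetical helper).

    Assumes the separator occurs once in the sequence line and is absent
    from the quality line, as in the Result example on this page.
    """
    forward_seq, reverse_seq = sequence.split(separator, 1)
    # The quality string has no separator, so the forward sequence length
    # tells us where the forward quality values end.
    forward_qual = quality[:len(forward_seq)]
    reverse_qual = quality[len(forward_seq):]
    return (forward_seq, forward_qual), (reverse_seq, reverse_qual)


# Third joined record from the Result example (raw string because the
# quality values contain backslashes).
sequence = ("CAAAGTTAGGGGATCGAAGATGATCAGATACCGTCGT:"
            "GGCCAATCCTTATTGTGTCTGGACCTGGTGAGTTTCCCCGTGTTGAGTC")
quality = (r"bghfc^YbgbeadggfdffeaS^ac_X^cegaGZ_ef"
           r"PP\`ccQ`eY[bQQ[d`ghehaghfgdg[`gb^bd[ePbH^c_c\a_eg")

forward_part, reverse_part = split_joined_record(sequence, quality)
print(forward_part)
print(reverse_part)
```

Without the separator, the boundary between the two reads could only be recovered if the forward read length were known in advance.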