Repository 'umi_tools_extract'
hg clone https://toolshed.g2.bx.psu.edu/repos/iuc/umi_tools_extract

Changeset 0:418b961e0576 (2017-08-10)
Next changeset 1:79436b3019e9 (2017-08-29)
Commit message:
planemo upload commit 453bb3b44d9f27908cbe2677378da88b9f77b5cf
added:
test-data/out_R1.fastq.gz
test-data/out_R2.fastq.gz
test-data/out_SE.fastq
test-data/out_paired.log
test-data/out_single.log
test-data/t_R1.fastq
test-data/t_R1.fastq.gz
test-data/t_R2.fastq.gz
umi-tools_extract.xml
diff -r 000000000000 -r 418b961e0576 test-data/out_R1.fastq.gz
Binary file test-data/out_R1.fastq.gz has changed
diff -r 000000000000 -r 418b961e0576 test-data/out_R2.fastq.gz
Binary file test-data/out_R2.fastq.gz has changed
diff -r 000000000000 -r 418b961e0576 test-data/out_SE.fastq
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/out_SE.fastq Thu Aug 10 06:37:09 2017 -0400
b"@@ -0,0 +1,288 @@\n+@HISEQ:105:C2UE1ACXX:3:1101:11160:2245_TTA 1:N:0:CAGATC\n+AAAAGTAGTTAATATATTAGATTTGTTTGATAGTGGTAGTATATATTTTTTATTTTAGTATTTAGGAGGTAGAGGTAGATGAATTTTTGAGTTTAAAG\n++\n+BBBFFFFBFFFFFIIIIIIBIIIIBIFFBFIIBBFFFFFFFFIIIIIIIIIIIIIIIBFIIIIIB7BBBFBBFFFF77<F7BFFFFFF7B7BBFFFF7\n+@HISEQ:105:C2UE1ACXX:3:1101:19467:2281_TTT 1:N:0:CATATC\n+TTGGTTAGGGTGAGATGTATAGTTTGGATTTTAGTGATTTTTGTAAAGGGGGAAAAGAATGGAGTTTTGGGTGTAGTGAGAGGTTATAGGAGTAGGGA\n++\n+<B<<FFFBBFFFFFFFFFIFIBFFI<<BFFFIIBBBFFIIIIBFFIIBBF7BBFFFBFFF77BBFBFF777BBBBBB<<<B<7<7BBF77<7<70000\n+@HISEQ:105:C2UE1ACXX:3:1101:7009:2740_TAG 1:N:0:CAGATC\n+AAGTTTTGTTTTTTATTTGGAGGTTATGGAATGTTAAGTAAGGTTTTTTTGGGTTTTGTTATTTATTTGATAATTGTGATTGTAATGTTAATAAGGGA\n++\n+BBBFFFFFFFFFFIIIIIB<FFFFFFIBBFIIFFFIIFIIIFFIFFIIII<0<BBFFBFFFFFFFFFF<BBFFFF<BBBBF7BFFF<BBFFFFF<00<\n+@HISEQ:105:C2UE1ACXX:3:1101:19067:2707_TTT 1:N:0:CAGATC\n+GTTTTTTTATTTGATATTTTAAAGGTTTTTTTTTTTTTTTTAGAAAATTTTTTTTAGTAAGATAGATTTTAAAGGGTTTGTTTTTTTTTTTTTTTTTT\n++\n+BBBFFFFFFFFFBBFFFFIIIFIFFIFFIIIIIIIFFFFFB7'0<B0<BBFF'7<0'0<<''0<'0'0<BBB<<B'7'0'0'0<BFFFFFFFFFFFFF\n+@HISEQ:105:C2UE1ACXX:3:1101:4999:3182_AGG 1:N:0:CAGATC\n+GTTTTATGAGGATTTTAGGGGAGTGATTGTTTAAAGTTTATAAGATTTATGATTTATATATAGTTAGAATAGTATGTGTTAAATAAATATAAAGGGAG\n++\n+<BBFFFFBFFFFFFIIFBBBFFFIIFFIBFFIFIIBFFIIIIIFFIIIIIBFFIIIIIIIIIBFFIBFIII<FFFBFBFFFFFFFFFFFFFFF<07<7\n+@HISEQ:105:C2UE1ACXX:3:1101:2300:3263_GTT 1:N:0:TAGATC\n+TTATTTTTAATAAAATTTTTATTATTTAATTTATTAGTTAATATTTAGGAGTTTTATGTTGTGGTAAAATTTTGTTAGAGAGATAGAGAAAGTATTTA\n++\n+BBBFFFFFFFFBFFIIIIIIBIBBFFI0<FFFBFF70B00<'0<FFFBBBBFF''0B0BB000'7BBFFFFII'<BF0<'<''07'70BBF'7BBBFB\n+@HISEQ:105:C2UE1ACXX:3:1101:5605:3427_AGA 1:N:0:CAGATC\n+AGATGAGAGGTATAGGATGTGGGGAGTTTTAGTAAGATTTATAGATAAGAAGTGGTTCGGTTATAGGATTTGTTTTAGATTTTTAGATTTTTTTGTGT\n++\n+BBBFBFFFFFFFFFFBFFFFIIIIIIIBFFFBFFIBFFIIIIIFFIIIFFIBF7BFBFBBFFIFF<<FFFFBBFFBF0<BFFFFF7<BFFFFFF0B7B\n+@HISEQ:105:C2UE1ACXX:3:1101:8129:3589_GAT 
1:N:0:CAGATC\n+TTTTAGTTTTTAGTTAGGATTATACGTTTATTGTGATAAAAGAGTTTTTTGATTTATCGGGTTATGTTAGGGTTTATTGATATTAGGGAATTTGAAGA\n++\n+BBBFFBFFFFFFBFFFBBFIIFIIFBFFFFII<BBFFFIIIBFFIFIIIIBBFFIIII<7<FBFF<FBF<7<BBFFFF7<BFFFF700<BBFF0<B'<\n+@HISEQ:105:C2UE1ACXX:3:1101:14304:3866_TAA 1:N:0:CAGATC\n+GTGTTTATATAGGGGATTTTTGAGTTTGATAGGTTGTTTTTGTAGAGGGTAGAATTTTGTGGAAATGTTGGTATTGGTTAAGGGGTTTTAGTGAAGAA\n++\n+<BBFFFFFFFFFFFFFFFFII<BBFFFBFFIFBFFBFFFII7FFFFFFFIF<BFIIII<F<BBFIFBFB7<BFFFBBBBBF777<B7BBF7B<BB7BB\n+@HISEQ:105:C2UE1ACXX:3:1101:12720:4398_AGT 1:N:0:CAGATC\n+GTATGTGTGTGTGTGTGTGTATTTAATTGAAGTTGGGTTTGGTGATATATATGTTTAATTTTAGTATTTTAGTGGTAGAGGTAGGTTAATTTTTGTTG\n++\n+<BBFBFFFFFFFFIFIFIFFFFFIIFII0BFBFFB7BFFF<BFFFFIIIIIIBFIIIIFIIII<FIFIIIFBFBBFB7B<<BB7<B<BFFFFFF0<B0\n+@HISEQ:105:C2UE1ACXX:3:1101:14945:4439_ATT 1:N:0:CAGATC\n+AGTGTTGAGTGGAGTATTAGAGAAGAGAAATAAGATAATAAAGTAATAGTTGTGATTAGGAGGTTTTTATAAGTTGATGGTTTATGTTAAGTAAGTTT\n++\n+BBBBFFBFFFFFFFFFFIIBIFFIFFBFFIIIIBFFIIIIIIBFFIIIBFFBFFFIIIFBFFBFBFFIIIIIFFFBBF77BBFFF7BBFF<BBF<<BF\n+@HISEQ:105:C2UE1ACXX:3:1101:8616:4508_AGG 1:N:0:CAGATC\n+AGAAATTTTGGGGGTGTAGGAGTGGTAGGATAGGAGTGTTGTTTTGTAATAGTTTTTTTTGAGGTTTAATAGGTAGGGTAGTTATTTTTAGTATTGTA\n++\n+BBBFFFFFFBBBFFFIIFFFIFFFIIFFIIFIFBFFFIIF<FBFI<FFIIIBFFFFFFFF077<B<BBBFF<<BBBBBB<7B<BFFFFFF0BBFF0<B\n+@HISEQ:105:C2UE1ACXX:3:1101:18975:4834_GAG 1:N:0:CAGATC\n+TATAAATGGTAATTTTGTAATTTAAAGATTTAAAAGTAATTATTGGTAATAGTTATTTGTGGGAGGTTGAGGTAGGGGGATTTTTGTAGAGATCGGAA\n++\n+BBBFFFFBBFFFFIIIBFIIIIIFIIFFFFIFIIIBFIIIIIIIB<FFIFIBFFFIIIBFFBFFFII<<BFBFFFFIFF<BBBFF0BB7<<<BF707B\n+@HISEQ:105:C2UE1ACXX:3:1101:4984:5374_TTG 1:N:0:CAGATC\n+TGTATCGAGGTTTGAATGAGAGTGGTATTTTTGTTATTTGTTAGTTAATGGTTTTGAGTATTAGTTTGGAAAATGATAATAAGTATTAGTTGAGGTGT\n++\n+B<BFFFBFFFFBF<FFFBFFFFFFFIFFFIII<BFFIIIBFFFFFFIIIBBFFFIBBBFFFIIBFFF<<BFFFFBBFFFFFF<BBFFF<BB7<7<BBB\n+@HISEQ:105:C2UE1ACXX:3:1101:12336:6058_TTT 
1:N:0:CAGATC\n+ATTTAGATGATGGTTTTTTTATTTGATATTTTAAAGGTTTTTTTTTTTTTTTTTTAAAAAATTTTTTTTATAAAAATAAATTTTAAAGGGTTTTTTTT\n++\n+BBBFFBFFBFFBBFFFIIIIIIIIBFFIIIIIIIIBFFFFIIIIIFFFFFFFFF700<BF0BBFFF0<70''00'0'000'<<'<B<00'0'00007<\n+@HISEQ:105:C2UE1ACXX:3:1101:5999:6265_ATG 1:N:0:CAGATC\n+TATATTGTTATATATTTGTGTTTTTTTTGAA"..b"FFFIBBFFIIIIFIBFFIBBFFFIFIFIIFFIFFIBFIIIFBFFFIIIIIBFBBFFFBFFF<<FFFFI70<7BFF<BFFFFFFBBBF7\n+@HISEQ:105:C2UE1ACXX:3:1101:13060:22287_TGT 1:N:0:CAGATC\n+TAGTGAAGAATAAATTTTTATGTTGTATATTATTTTTTTTTAGTTCGTATATATCGGTATACGTGTTAGGATTTATAAAGATAGTTATTATTTTTTGT\n++\n+BBBFFFFFFFFFFIIIIIIIIBFIFFIIIIIIIIIIIIIIII<FFIBFFIIIIIIB<BFFFFBBBBBF<<BBFFFBFFF<BBF7<BFFFFFFFFFF7B\n+@HISEQ:105:C2UE1ACXX:3:1101:7272:22581_TTA 1:N:0:CAGATC\n+TTAATGATATTAAGAATTTTTTAAAGAATTTTATTTTTTTTAGGAATAGAAGGAGGAGGAGTATTTTGATCGATTTTTTAGGTTTTTTATAGGTGGAG\n++\n+BBBFFBFFFFFFF<BFIIIIIIFIIBFFFFIIFIIIIIIIII<<FFFIBFIBBFFBFFFFFFBBFFF<BBF<BBFFFFFF00<BBFFFFFF000<777\n+@HISEQ:105:C2UE1ACXX:3:1101:10060:23020_TTT 1:N:0:CAGATC\n+TTTTAGATTATTTAAGAAGGTATTAGGTTTTTAAGAGGAAAGGGTAGTCTTATAGTTTTGAGTATTTTTTTTAAAAGGAAGTAAGGATGGTGTTTTTA\n++\n+BBBFFBFFFFFFFIIBFIFFFFFIBBBFFFFIIIBFFFIFIBFFIFFFFIIFFFBFFII7B<FFFIIIIIFFFFBB77<B<BBF<7BB<<BBBBBFFF\n+@HISEQ:105:C2UE1ACXX:3:1101:14440:23104_TTT 1:N:0:CAGATC\n+TTTAAATTTAAGTTAAGGTTTGGGGAGTTGATTTTTGTTTTGTGGGTTGTTTTTTTTGTAGGAGTTGGTTTTTAGAGGTTTTTAGGAATTTTTGGTGT\n++\n+BBBFFFFFFFFFFFIIBBIFIBBBFFFFFBFFIIIIBFFIIBFBFFFFFFIIIIFFF7BB<<BBFB<BBBBFFB7<00B<BBFF00<BBFFFF''77B\n+@HISEQ:105:C2UE1ACXX:3:1101:6941:23338_GTT 1:N:0:CAGATC\n+AGAAAGGTTTTAAGTTGGTTGGGAATATAGGGGTTTTTTAGAGTTTTTATTAGGAGTTATAGTGTGTTGAATTTGGTTTTGGGTGTTGATTATAGGTT\n++\n+BBBFFFFFFFFFFBFFBBFFBBBFFFFIIBFFFIFFIIIIBFFFFFIIIIIIB<FFFFFFFBFBFFFF<BBFFF<<BBBB700<BBB0BBFFFF00<B\n+@HISEQ:105:C2UE1ACXX:3:1101:10069:23622_TTT 
1:N:0:CAGATC\n+TTTTGTTTTAGGGTTTTATTTTTGTGTTTTATTTTTATTTTCGTATTATTAGTTTTTTTTATACGTTATTTGTAGAAGGTTAGTTTTTTTAATTTAGG\n++\n+BBBFBFFFFFBBBFFFIFIIIIIBFBFFIIBFIIIIIIIIIIBFFFIIIIIFFIIIIIIFFFFF<BFFFFF<BF<<B77BBB7BBFFFFFFFFFFF00\n+@HISEQ:105:C2UE1ACXX:3:1101:14079:24078_AGT 1:N:0:CAGATC\n+TTTAAAGTTTTTAGTTTTGAGTGGAATTTTAAGAATATTAGTGCGTTTTAAGCTTAGGTAGTTTTGGTAGTTTGAAAGTAATAGGGTGTATTTTGTAA\n++\n+BBBFFFFFFFFFFBFFFI<BBFFFFFFIIIIIBFFFIIIIBFFFFFFIIIIBFFIIBBFFFFFFI<7BFBFFF<BBF<BBFFF<07BBBBBFFF<BFF\n+@HISEQ:105:C2UE1ACXX:3:1101:12064:24631_TTT 1:N:0:CCGATC\n+TTATAGTGTATTTATATATATGAAATGAATTAATGAATTTTAAAAAAAAAGAAAGTAAGTTGTTTTTAGGATTGATATTTAGAGTTAATTTTTTGAGT\n++\n+BBBFFBFFFFFFFIIIIIIIIBFIFF0BFIIIIIBFFIIIIIIIIIIIIIBFIIFFFIBFFBFFFFFFB<BFF7BBFFFFF<B<BBBFFFFFFF'70<\n+@HISEQ:105:C2UE1ACXX:3:1101:11630:24964_GAT 1:N:0:TAGATC\n+TTAGTTTTTTTAGTGTTTTTTATTTATTTCGTTTTATTATTGGAGTTTGTTAAGAAAATTAGGGTTTGATTTGGATGTTAAGGATTGGTTTTTTTTTT\n++\n+BBBBFFFFFFFFBFFFFIIIIIIIIIIIIIBFFFIFIIIIIB7BFFFFBFFIIBFIIIIIFB<BFBFBFFFF<7BB<BBFF<7<BB77BBFFFFFFFF\n+@HISEQ:105:C2UE1ACXX:3:1101:12594:24878_AAT 1:N:0:CCGATC\n+TTTAATAGGATATGATATTATTTAATTTATAGATTATGGAAATTTTTTATATTTAATGAAGAAAGTTGGAATGTTTTGGGAGGTGTTTAGAATAAATA\n++\n+BBBFFFFB0BFFF<FFIIIFIIFIIIFIFIF<BFFFF<<BFFFIIIIIIIIIIIIIIBFFBFFIBFB7'<FF<<BBF0''77<BFB<BB0<BBBBBFF\n+@HISEQ:105:C2UE1ACXX:3:1101:4483:25030_ATT 1:N:0:CAGATC\n+AGGATGGTGTTTTTATTTTTAGATTTATATTATTTTGTTATATTTGTATTTGAGTAAGTTTATGGGTTTTTTAAAGAGGTAGGAGGAAGTTTTTTGTT\n++\n+B<BFFFFFFFFFFIIIIIIIIBBFIIFIIIIFIIII<BFFIFFIIBFIIII<BBFFIFIFIIIB7<FFFIIIFBF<<77BB00<70<B7BBBFFF0BB\n+@HISEQ:105:C2UE1ACXX:3:1101:12198:25235_TTA 1:N:0:CAGATC\n+ATATATGTAGTTTGTATTATTTTTGTTATAGTATATAAAGGTTAAAGAGTAGTTGTTTTAATTTTAGAGGTGGAGATTGGGTTGTATAGTTTTGGTTT\n++\n+BBBFFFBFFFFFFBFFIIFIIIII0BF<FF0BFFFIFFIBBFBFII<FBFBBFF7FFFIIIIIIII<F7<B'<BB7<B''0<<0<BBB77<BB'0<<B\n+@HISEQ:105:C2UE1ACXX:3:1101:20477:25084_TGT 
1:N:0:CATATC\n+AGAGTTTATTGAGAAGTAAAGTATTAATTTTATGGGAGAAATGGGATAGAGGTAGTAGAAGTTGTTATGGAATGGGATTAATTAGGAAGTTAATTAAG\n++\n+BBBFFFFFFF0BBFFBFFIIFFIIIIIIIIIFIBB7FFFFIIFFBFFIFIBFFFFIIFIIFFFBFFFF77BFF<<B<FFFFFFF70<B7BBFFFFFF7\n+@HISEQ:105:C2UE1ACXX:3:1101:5725:25359_TAG 1:N:0:CAGATC\n+GAGAAATAAGATAATAAAGTAATAGTTGTGATTAGGAGGTTTTTTATAAGTTGATGGTTTATGTTAAGTAAGTTTATTAAGAAGTATAGTATTATATA\n++\n+BBBFFFFFFBFFFIIIIIIIIIIIBIFBFFFFIIBBFFFFFFFIIIIIIFFFBFFBBFFFIIBFFIIBFIIFIFIFFFFF<BF<BBFF7<<FFFFFFF\n+@HISEQ:105:C2UE1ACXX:3:1101:5502:25591_AGA 1:N:0:CAGATC\n+ATATGATTTTATTTTTAGGGATAATATTTTTTAAGTGAATTTTGATTTTTTGGTTAGTTATTTTGATGATGTGTAGAGGGTGTATAGTTTTTGGATAT\n++\n+BBBFBFFFFFFFFIIIIBBBFFFIIIIIIIIIIIBFFFIIIIIBFFIIIII<7BFFBFFFIIII<FF<FFBFFFFFB<<<BFBBBF<<BFFF00<BFB\n"
diff -r 000000000000 -r 418b961e0576 test-data/out_paired.log
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/out_paired.log Thu Aug 10 06:37:09 2017 -0400
@@ -0,0 +1,116 @@
+# output generated by extract --bc-pattern=NNNXXX --stdin=input_read1.gz --read2-in=input_read2.gz --stdout out1.gz --read2-out=out2.gz --log=/home/ubuntu/mount/git/temp/galaxy/database/files/000/dataset_863.dat
+# job started at Tue Aug  8 15:14:12 2017 on packer-test -- cba9bf6d-f8cd-49c4-8184-3f4a090cbc3e
+# pid: 32709, system: Linux 4.4.0-83-generic #106-Ubuntu SMP Mon Jun 26 17:54:43 UTC 2017 x86_64
+# log2stderr                              : False
+# loglevel                                : 1
+# pattern                                 : NNNXXX
+# pattern2                                : None
+# prime3                                  : False
+# quality_encoding                        : None
+# quality_filter_threshold                : None
+# random_seed                             : None
+# read2_in                                : input_read2.gz
+# read2_out                               : out2.gz
+# short_help                              : None
+# split                                   : False
+# stats                                   : True
+# stderr                                  : <_io.TextIOWrapper name='<stderr>' mode='w' encoding='UTF-8'>
+# stdin                                   : <_io.TextIOWrapper name='input_read1.gz' encoding='ascii'>
+# stdlog                                  : <_io.TextIOWrapper name='/home/ubuntu/mount/git/temp/galaxy/database/files/000/dataset_863.dat' mode='a' encoding='UTF-8'>
+# stdout                                  : <_io.TextIOWrapper name='out1.gz' encoding='ascii'>
+# timeit_file                             : None
+# timeit_header                           : None
+# timeit_name                             : all
+Barcode UMI Sample Count
+AAAAGT AAA AGT 1
+TTTTTT TTT TTT 1
+TTGGTT TTG GTT 1
+AAGTTT AAG TTT 1
+ATATAA ATA TAA 1
+GTTTTT GTT TTT 4
+GTTTTA GTT TTA 1
+TATAGA TAT AGA 1
+AAGTAT AAG TAT 1
+TTATTT TTA TTT 1
+AGATGA AGA TGA 1
+TTTTAG TTT TAG 2
+GTGTTT GTG TTT 1
+GTATGT GTA TGT 1
+AGTGTT AGT GTT 1
+AGAAAT AGA AAT 1
+TATAAA TAT AAA 1
+TGTATC TGT ATC 1
+TAAAAT TAA AAT 1
+ATTATT ATT ATT 1
+ATTTAG ATT TAG 1
+TATATT TAT ATT 1
+ATTTGG ATT TGG 1
+AAAATA AAA ATA 1
+AAGTTA AAG TTA 1
+GTTGTA GTT GTA 1
+TTTGAA TTT GAA 1
+ATTTGA ATT TGA 1
+TTTATA TTT ATA 2
+ATACGT ATA CGT 1
+GGTTAG GGT TAG 1
+GGTGTT GGT GTT 1
+TGGTTG TGG TTG 1
+AGTTAT AGT TAT 1
+GAGTAG GAG TAG 1
+TGTGTT TGT GTT 1
+TGAATT TGA ATT 1
+TTTAGG TTT AGG 2
+TAGATG TAG ATG 1
+TTGGAT TTG GAT 1
+TTGGAC TTG GAC 1
+TGTATG TGT ATG 1
+AGAGTG AGA GTG 1
+AGAATA AGA ATA 1
+TTGTTT TTG TTT 1
+AATATA AAT ATA 1
+TTATTA TTA TTA 1
+TTAAAG TTA AAG 2
+AGGAGT AGG AGT 1
+ATAAGT ATA AGT 1
+TTGTGG TTG TGG 1
+GTATTA GTA TTA 1
+AGTGGA AGT GGA 1
+TTTGGG TTT GGG 1
+TAAGAG TAA GAG 1
+TTGGGA TTG GGA 1
+TATAGG TAT AGG 1
+TTTTTG TTT TTG 1
+AAGAGG AAG AGG 1
+GAGAGA GAG AGA 1
+TTGGTG TTG GTG 1
+TAGAGG TAG AGG 1
+ATTTTG ATT TTG 1
+TGGATG TGG ATG 1
+ATTTTT ATT TTT 1
+TTATAT TTA TAT 1
+TGTTTG TGT TTG 1
+GTTGGT GTT GGT 1
+TCTTTA TCT TTA 1
+ATTGTT ATT GTT 2
+ATAAGA ATA AGA 1
+TATCGA TAT CGA 1
+TTTGTT TTT GTT 1
+GTGTAT GTG TAT 1
+AGTGAA AGT GAA 1
+GTTGTG GTT GTG 1
+TAGTGA TAG TGA 1
+TATTGA TAT TGA 1
+TTAATG TTA ATG 1
+TTTAAA TTT AAA 2
+AGAAAG AGA AAG 1
+TTTTGT TTT TGT 1
+TTATAG TTA TAG 1
+AGGTGT AGG TGT 1
+TTAGTT TTA GTT 1
+TTTAAT TTT AAT 1
+AGGATG AGG ATG 1
+ATATAT ATA TAT 1
+AGAGTT AGA GTT 1
+GAGAAA GAG AAA 1
+ATATGA ATA TGA 1
+# job finished in 0 seconds at Tue Aug  8 15:14:12 2017 --  0.41  0.02  0.00  0.00 -- cba9bf6d-f8cd-49c4-8184-3f4a090cbc3e
diff -r 000000000000 -r 418b961e0576 test-data/out_single.log
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/out_single.log Thu Aug 10 06:37:09 2017 -0400
@@ -0,0 +1,92 @@
+# output generated by extract --bc-pattern=XXXNNN --stdin=/home/ubuntu/mount/git/temp/galaxy/database/files/000/dataset_867.dat --stdout /home/ubuntu/mount/git/temp/galaxy/database/files/000/dataset_886.dat --3prime --quality-filter-threshold 10 --quality-encoding phred33 --log=/home/ubuntu/mount/git/temp/galaxy/database/files/000/dataset_887.dat
+# job started at Wed Aug  9 09:51:05 2017 on packer-test -- e5896848-87b1-4bf4-a96c-b7a4a83a0b8b
+# pid: 7652, system: Linux 4.4.0-83-generic #106-Ubuntu SMP Mon Jun 26 17:54:43 UTC 2017 x86_64
+# log2stderr                              : False
+# loglevel                                : 1
+# pattern                                 : XXXNNN
+# pattern2                                : None
+# prime3                                  : True
+# quality_encoding                        : phred33
+# quality_filter_threshold                : 10
+# random_seed                             : None
+# read2_in                                : None
+# read2_out                               : None
+# short_help                              : None
+# split                                   : False
+# stats                                   : True
+# stderr                                  : <_io.TextIOWrapper name='<stderr>' mode='w' encoding='UTF-8'>
+# stdin                                   : <_io.TextIOWrapper name='/home/ubuntu/mount/git/temp/galaxy/database/files/000/dataset_867.dat' mode='r' encoding='UTF-8'>
+# stdlog                                  : <_io.TextIOWrapper name='/home/ubuntu/mount/git/temp/galaxy/database/files/000/dataset_887.dat' mode='a' encoding='UTF-8'>
+# stdout                                  : <_io.TextIOWrapper name='/home/ubuntu/mount/git/temp/galaxy/database/files/000/dataset_886.dat' mode='w' encoding='UTF-8'>
+# timeit_file                             : None
+# timeit_header                           : None
+# timeit_name                             : all
+Barcode UMI Sample Count
+AAGTTA TTA AAG 1
+GGATTT TTT GGA 1
+GGATAG TAG GGA 1
+TTTTTT TTT TTT 4
+GAGAGG AGG GAG 1
+TTAGTT GTT TTA 1
+TGTAGA AGA TGT 1
+AGAGAT GAT AGA 1
+GAATAA TAA GAA 1
+TTGAGT AGT TTG 1
+TTTATT ATT TTT 1
+GTAAGG AGG GTA 1
+GAAGAG GAG GAA 1
+TGTTTG TTG TGT 1
+ATAATG ATG ATA 1
+TTAGAT GAT TTA 1
+GTCTGA TGA GTC 1
+ATCATC ATC ATC 1
+GTCACC ACC GTC 1
+AACTCC TCC AAC 1
+TTATTA TTA TTA 1
+TGTTGT TGT TGT 2
+GTTTAT TAT GTT 1
+ATGAGT AGT ATG 1
+ATAATT ATT ATA 1
+TTGGTT GTT TTG 1
+GTTTTG TTG GTT 1
+CACGTC GTC CAC 1
+TTGTGT TGT TTG 1
+TGAATT ATT TGA 1
+GGATGT TGT GGA 1
+AGTCAC CAC AGT 1
+GTTTTT TTT GTT 1
+TATAGT AGT TAT 1
+TTTTAT TAT TTT 1
+TAAAAA AAA TAA 1
+ATAGGT GGT ATA 1
+ATTTTA TTA ATT 1
+TGTATA ATA TGT 1
+TGTTTT TTT TGT 2
+AGATTT TTT AGA 1
+TAGAAT AAT TAG 1
+GAAGGA GGA GAA 1
+GTTTGA TGA GTT 1
+TTTTGT TGT TTT 1
+ATTTTT TTT ATT 1
+TGTTAT TAT TGT 1
+CAGGAT GAT CAG 1
+GTATTT TTT GTA 1
+TGGGTA GTA TGG 1
+AAAGTT GTT AAA 1
+GGTGTG GTG GGT 1
+AAGATC ATC AAG 1
+TAGGTT GTT TAG 1
+GAGTTA TTA GAG 1
+TTATTT TTT TTA 1
+GTTGTT GTT GTT 1
+AGGTTT TTT AGG 1
+TAAAGT AGT TAA 1
+AGTTTT TTT AGT 1
+TTTGAT GAT TTT 1
+ATAAAT AAT ATA 1
+GTTATT ATT GTT 1
+TTTTTA TTA TTT 1
+AAGTGT TGT AAG 1
+ATATAG TAG ATA 1
+TATAGA AGA TAT 1
+# job finished in 0 seconds at Wed Aug  9 09:51:05 2017 --  0.36  0.02  0.00  0.00 -- e5896848-87b1-4bf4-a96c-b7a4a83a0b8b
diff -r 000000000000 -r 418b961e0576 test-data/t_R1.fastq
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/t_R1.fastq Thu Aug 10 06:37:09 2017 -0400
b"@@ -0,0 +1,400 @@\n+@HISEQ:105:C2UE1ACXX:3:1101:11160:2245 1:N:0:CAGATC\n+AAAAGTAGTTAATATATTAGATTTGTTTGATAGTGGTAGTATATATTTTTTATTTTAGTATTTAGGAGGTAGAGGTAGATGAATTTTTGAGTTTAAAGTTA\n++\n+BBBFFFFBFFFFFIIIIIIBIIIIBIFFBFIIBBFFFFFFFFIIIIIIIIIIIIIIIBFIIIIIB7BBBFBBFFFF77<F7BFFFFFF7B7BBFFFF7<BF\n+@HISEQ:105:C2UE1ACXX:3:1101:19338:2197 1:N:0:CAGATC\n+TTTTTTTTTAGAGGGATTAGTTTTTTTTATTGAGGTTTTTGAAAGTTGTTGTATGTTAATTGTTTTTAGAATGTTGGGTATAAGTAGGATTTAGGTCTATT\n++\n+BBBFFFFFFB0<BF7BBBF7BFFIIIII7BF'0<0BBFFF'<BB'<B7<B07<B7<BFBBF0<BBBBB0<<B0BB<<000<BF00<<'0<BBB0'00BBF#\n+@HISEQ:105:C2UE1ACXX:3:1101:19467:2281 1:N:0:CATATC\n+TTGGTTAGGGTGAGATGTATAGTTTGGATTTTAGTGATTTTTGTAAAGGGGGAAAAGAATGGAGTTTTGGGTGTAGTGAGAGGTTATAGGAGTAGGGATTT\n++\n+<B<<FFFBBFFFFFFFFFIFIBFFI<<BFFFIIBBBFFIIIIBFFIIBBF7BBFFFBFFF77BBFBFF777BBBBBB<<<B<7<7BBF77<7<700007BB\n+@HISEQ:105:C2UE1ACXX:3:1101:7009:2740 1:N:0:CAGATC\n+AAGTTTTGTTTTTTATTTGGAGGTTATGGAATGTTAAGTAAGGTTTTTTTGGGTTTTGTTATTTATTTGATAATTGTGATTGTAATGTTAATAAGGGATAG\n++\n+BBBFFFFFFFFFFIIIIIB<FFFFFFIBBFIIFFFIIFIIIFFIFFIIII<0<BBFFBFFFFFFFFFF<BBFFFF<BBBBF7BFFF<BBFFFFF<00<BB0\n+@HISEQ:105:C2UE1ACXX:3:1101:13708:2613 1:N:0:CAGATC\n+ATATAATAGATTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTGAAAGATAATTAATTTTTAAAATTTTTTTTTTTTTAATAAAA\n++\n+BBBFFFFFBFFFFIIIIIIIIIIIFFFFFB7B007BB0''''''0077BFF7'077BF0''0''''''00''00''''00'<BBBFBBFFFFF########\n+@HISEQ:105:C2UE1ACXX:3:1101:19067:2707 1:N:0:CAGATC\n+GTTTTTTTATTTGATATTTTAAAGGTTTTTTTTTTTTTTTTAGAAAATTTTTTTTAGTAAGATAGATTTTAAAGGGTTTGTTTTTTTTTTTTTTTTTTTTT\n++\n+BBBFFFFFFFFFBBFFFFIIIFIFFIFFIIIIIIIFFFFFB7'0<B0<BBFF'7<0'0<<''0<'0'0<BBB<<B'7'0'0'0<BFFFFFFFFFFFFFFFF\n+@HISEQ:105:C2UE1ACXX:3:1101:4999:3182 1:N:0:CAGATC\n+GTTTTATGAGGATTTTAGGGGAGTGATTGTTTAAAGTTTATAAGATTTATGATTTATATATAGTTAGAATAGTATGTGTTAAATAAATATAAAGGGAGAGG\n++\n+<BBFFFFBFFFFFFIIFBBBFFFIIFFIBFFIFIIBFFIIIIIFFIIIIIBFFIIIIIIIIIBFFIBFIII<FFFBFBFFFFFFFFFFFFFFF<07<7<<7\n+@HISEQ:105:C2UE1ACXX:3:1101:16790:3145 
1:N:0:CAGATC\n+TATAGAGGTATTTTGTTATTTTGTTTTAGTTATTGCGGGTTAGAGTAGATGGTTATTTTTAGTAGAGTATTGTTTGTTGTTTTTTATATGTGGTATAGAGG\n++\n+BBBF<BBFFFFFFI<FFFIFIIBBFFIIBFFFFI<BBBFFFF<FFFF7FF70<BFFIFIIF7BFBB<FFBF'BBF'<<0<BFFBBBBBF'<70<<<B####\n+@HISEQ:105:C2UE1ACXX:3:1101:18065:3106 1:N:0:CAGATC\n+AAGTATTTGTTATATATATTTTAAAGTTTTTTTTTTTTTTAGGAATTTTTTTTTATAATTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTGGTTTTTTT\n++\n+BBBFFFFFBFFFFIIIIFIIIIIIIFIFIIIIIIIIIIFF''00<<BB'<<'7''000<BFFF77<77BFF77BFBB07<7<BFFFFFFFF##########\n+@HISEQ:105:C2UE1ACXX:3:1101:2300:3263 1:N:0:TAGATC\n+TTATTTTTAATAAAATTTTTATTATTTAATTTATTAGTTAATATTTAGGAGTTTTATGTTGTGGTAAAATTTTGTTAGAGAGATAGAGAAAGTATTTAGTT\n++\n+BBBFFFFFFFFBFFIIIIIIBIBBFFI0<FFFBFF70B00<'0<FFFBBBBFF''0B0BB000'7BBFFFFII'<BF0<'<''07'70BBF'7BBBFB7<B\n+@HISEQ:105:C2UE1ACXX:3:1101:5605:3427 1:N:0:CAGATC\n+AGATGAGAGGTATAGGATGTGGGGAGTTTTAGTAAGATTTATAGATAAGAAGTGGTTCGGTTATAGGATTTGTTTTAGATTTTTAGATTTTTTTGTGTAGA\n++\n+BBBFBFFFFFFFFFFBFFFFIIIIIIIBFFFBFFIBFFIIIIIFFIIIFFIBF7BFBFBBFFIFF<<FFFFBBFFBF0<BFFFFF7<BFFFFFF0B7BF0<\n+@HISEQ:105:C2UE1ACXX:3:1101:8129:3589 1:N:0:CAGATC\n+TTTTAGTTTTTAGTTAGGATTATACGTTTATTGTGATAAAAGAGTTTTTTGATTTATCGGGTTATGTTAGGGTTTATTGATATTAGGGAATTTGAAGAGAT\n++\n+BBBFFBFFFFFFBFFFBBFIIFIIFBFFFFII<BBFFFIIIBFFIFIIIIBBFFIIII<7<FBFF<FBF<7<BBFFFF7<BFFFF700<BBFF0<B'<07B\n+@HISEQ:105:C2UE1ACXX:3:1101:14304:3866 1:N:0:CAGATC\n+GTGTTTATATAGGGGATTTTTGAGTTTGATAGGTTGTTTTTGTAGAGGGTAGAATTTTGTGGAAATGTTGGTATTGGTTAAGGGGTTTTAGTGAAGAATAA\n++\n+<BBFFFFFFFFFFFFFFFFII<BBFFFBFFIFBFFBFFFII7FFFFFFFIF<BFIIII<F<BBFIFBFB7<BFFFBBBBBF777<B7BBF7B<BB7BBFFF\n+@HISEQ:105:C2UE1ACXX:3:1101:12720:4398 1:N:0:CAGATC\n+GTATGTGTGTGTGTGTGTGTATTTAATTGAAGTTGGGTTTGGTGATATATATGTTTAATTTTAGTATTTTAGTGGTAGAGGTAGGTTAATTTTTGTTGAGT\n++\n+<BBFBFFFFFFFFIFIFIFFFFFIIFII0BFBFFB7BFFF<BFFFFIIIIIIBFIIIIFIIII<FIFIIIFBFBBFB7B<<BB7<B<BFFFFFF0<B0<7B\n+@HISEQ:105:C2UE1ACXX:3:1101:14945:4439 
1:N:0:CAGATC\n+AGTGTTGAGTGGAGTATTAGAGAAGAGAAATAAGATAATAAAGTAATAGTTGTGATTAGGAGGTTTTTATAAGTTGATGGTTTATGTTAAGTAAGTTTATT\n++\n+BBBBFFBFFFFFFFFFFIIBIFFIFFBFFIIIIBFFIIIIIIBFFIIIBFFBFFFIIIFBFFBFBFFIIIIIFFFBBF77BBFFF7BBFF<BBF<<BFFFB\n+@HISEQ:105:C2UE1ACXX:3:1101:8616:4508 1:N:0:CAGATC\n+AGA"..b"0<BFFBFFFBF7FBFIIFIF7BFFI7<7B<BBFB'<BF'7BBFF7BBBBBBB#######\n+@HISEQ:105:C2UE1ACXX:3:1101:7272:22581 1:N:0:CAGATC\n+TTAATGATATTAAGAATTTTTTAAAGAATTTTATTTTTTTTAGGAATAGAAGGAGGAGGAGTATTTTGATCGATTTTTTAGGTTTTTTATAGGTGGAGTTA\n++\n+BBBFFBFFFFFFF<BFIIIIIIFIIBFFFFIIFIIIIIIIII<<FFFIBFIBBFFBFFFFFFBBFFF<BBF<BBFFFFFF00<BBFFFFFF000<777B<B\n+@HISEQ:105:C2UE1ACXX:3:1101:10060:23020 1:N:0:CAGATC\n+TTTTAGATTATTTAAGAAGGTATTAGGTTTTTAAGAGGAAAGGGTAGTCTTATAGTTTTGAGTATTTTTTTTAAAAGGAAGTAAGGATGGTGTTTTTATTT\n++\n+BBBFFBFFFFFFFIIBFIFFFFFIBBBFFFFIIIBFFFIFIBFFIFFFFIIFFFBFFII7B<FFFIIIIIFFFFBB77<B<BBF<7BB<<BBBBBFFFFFF\n+@HISEQ:105:C2UE1ACXX:3:1101:14440:23104 1:N:0:CAGATC\n+TTTAAATTTAAGTTAAGGTTTGGGGAGTTGATTTTTGTTTTGTGGGTTGTTTTTTTTGTAGGAGTTGGTTTTTAGAGGTTTTTAGGAATTTTTGGTGTTTT\n++\n+BBBFFFFFFFFFFFIIBBIFIBBBFFFFFBFFIIIIBFFIIBFBFFFFFFIIIIFFF7BB<<BBFB<BBBBFFB7<00B<BBFF00<BBFFFF''77BBBF\n+@HISEQ:105:C2UE1ACXX:3:1101:6941:23338 1:N:0:CAGATC\n+AGAAAGGTTTTAAGTTGGTTGGGAATATAGGGGTTTTTTAGAGTTTTTATTAGGAGTTATAGTGTGTTGAATTTGGTTTTGGGTGTTGATTATAGGTTGTT\n++\n+BBBFFFFFFFFFFBFFBBFFBBBFFFFIIBFFFIFFIIIIBFFFFFIIIIIIB<FFFFFFFBFBFFFF<BBFFF<<BBBB700<BBB0BBFFFF00<B7<B\n+@HISEQ:105:C2UE1ACXX:3:1101:10069:23622 1:N:0:CAGATC\n+TTTTGTTTTAGGGTTTTATTTTTGTGTTTTATTTTTATTTTCGTATTATTAGTTTTTTTTATACGTTATTTGTAGAAGGTTAGTTTTTTTAATTTAGGTTT\n++\n+BBBFBFFFFFBBBFFFIFIIIIIBFBFFIIBFIIIIIIIIIIBFFFIIIIIFFIIIIIIFFFFF<BFFFFF<BF<<B77BBB7BBFFFFFFFFFFF00<BB\n+@HISEQ:105:C2UE1ACXX:3:1101:14079:24078 1:N:0:CAGATC\n+TTTAAAGTTTTTAGTTTTGAGTGGAATTTTAAGAATATTAGTGCGTTTTAAGCTTAGGTAGTTTTGGTAGTTTGAAAGTAATAGGGTGTATTTTGTAAAGT\n++\n+BBBFFFFFFFFFFBFFFI<BBFFFFFFIIIIIBFFFIIIIBFFFFFFIIIIBFFIIBBFFFFFFI<7BFBFFF<BBF<BBFFF<07BBBBBFFF<BFFF<B\n+@HISEQ:105:C2UE1ACXX:3:1101:12064:24631 
1:N:0:CCGATC\n+TTATAGTGTATTTATATATATGAAATGAATTAATGAATTTTAAAAAAAAAGAAAGTAAGTTGTTTTTAGGATTGATATTTAGAGTTAATTTTTTGAGTTTT\n++\n+BBBFFBFFFFFFFIIIIIIIIBFIFF0BFIIIIIBFFIIIIIIIIIIIIIBFIIFFFIBFFBFFFFFFB<BFF7BBFFFFF<B<BBBFFFFFFF'70<<BF\n+@HISEQ:105:C2UE1ACXX:3:1101:6662:24968 1:N:0:CAGATC\n+AGGTGTCGTTTAATTGTTTAGGTTTATGGTATTGTGTTTCGTTTTTTTGGTATTTGTGAGGGTAGAATTGTTTTTGGGTTTTAATTTTTTTAAGTATGGGA\n++\n+BBBFFFFFFFFFFIIBFFIFBBFFFFIB<FFFI<FBFFFFBFFIIIII77BFFIIBFBFBBBFBBFFFF<BBFFF'07BBBBFFFFFFFFFFB0<BF####\n+@HISEQ:105:C2UE1ACXX:3:1101:11630:24964 1:N:0:TAGATC\n+TTAGTTTTTTTAGTGTTTTTTATTTATTTCGTTTTATTATTGGAGTTTGTTAAGAAAATTAGGGTTTGATTTGGATGTTAAGGATTGGTTTTTTTTTTGAT\n++\n+BBBBFFFFFFFFBFFFFIIIIIIIIIIIIIBFFFIFIIIIIB7BFFFFBFFIIBFIIIIIFB<BFBFBFFFF<7BB<BBFF<7<BB77BBFFFFFFFF0<B\n+@HISEQ:105:C2UE1ACXX:3:1101:12594:24878 1:N:0:CCGATC\n+TTTAATAGGATATGATATTATTTAATTTATAGATTATGGAAATTTTTTATATTTAATGAAGAAAGTTGGAATGTTTTGGGAGGTGTTTAGAATAAATAAAT\n++\n+BBBFFFFB0BFFF<FFIIIFIIFIIIFIFIF<BFFFF<<BFFFIIIIIIIIIIIIIIBFFBFFIBFB7'<FF<<BBF0''77<BFB<BB0<BBBBBFFBBF\n+@HISEQ:105:C2UE1ACXX:3:1101:4483:25030 1:N:0:CAGATC\n+AGGATGGTGTTTTTATTTTTAGATTTATATTATTTTGTTATATTTGTATTTGAGTAAGTTTATGGGTTTTTTAAAGAGGTAGGAGGAAGTTTTTTGTTATT\n++\n+B<BFFFFFFFFFFIIIIIIIIBBFIIFIIIIFIIII<BFFIFFIIBFIIII<BBFFIFIFIIIB7<FFFIIIFBF<<77BB00<70<B7BBBFFF0BBFFF\n+@HISEQ:105:C2UE1ACXX:3:1101:12198:25235 1:N:0:CAGATC\n+ATATATGTAGTTTGTATTATTTTTGTTATAGTATATAAAGGTTAAAGAGTAGTTGTTTTAATTTTAGAGGTGGAGATTGGGTTGTATAGTTTTGGTTTTTA\n++\n+BBBFFFBFFFFFFBFFIIFIIIII0BF<FF0BFFFIFFIBBFBFII<FBFBBFF7FFFIIIIIIII<F7<B'<BB7<B''0<<0<BBB77<BB'0<<BBFB\n+@HISEQ:105:C2UE1ACXX:3:1101:20477:25084 1:N:0:CATATC\n+AGAGTTTATTGAGAAGTAAAGTATTAATTTTATGGGAGAAATGGGATAGAGGTAGTAGAAGTTGTTATGGAATGGGATTAATTAGGAAGTTAATTAAGTGT\n++\n+BBBFFFFFFF0BBFFBFFIIFFIIIIIIIIIFIBB7FFFFIIFFBFFIFIBFFFFIIFIIFFFBFFFF77BFF<<B<FFFFFFF70<B7BBFFFFFF7B<B\n+@HISEQ:105:C2UE1ACXX:3:1101:5725:25359 
1:N:0:CAGATC\n+GAGAAATAAGATAATAAAGTAATAGTTGTGATTAGGAGGTTTTTTATAAGTTGATGGTTTATGTTAAGTAAGTTTATTAAGAAGTATAGTATTATATATAG\n++\n+BBBFFFFFFBFFFIIIIIIIIIIIBIFBFFFFIIBBFFFFFFFIIIIIIFFFBFFBBFFFIIBFFIIBFIIFIFIFFFFF<BF<BBFF7<<FFFFFFFFF0\n+@HISEQ:105:C2UE1ACXX:3:1101:5502:25591 1:N:0:CAGATC\n+ATATGATTTTATTTTTAGGGATAATATTTTTTAAGTGAATTTTGATTTTTTGGTTAGTTATTTTGATGATGTGTAGAGGGTGTATAGTTTTTGGATATAGA\n++\n+BBBFBFFFFFFFFIIIIBBBFFFIIIIIIIIIIIBFFFIIIIIBFFIIIII<7BFFBFFFIIII<FF<FFBFFFFFB<<<BFBBBF<<BFFF00<BFBF7<\n"
diff -r 000000000000 -r 418b961e0576 test-data/t_R1.fastq.gz
Binary file test-data/t_R1.fastq.gz has changed
diff -r 000000000000 -r 418b961e0576 test-data/t_R2.fastq.gz
Binary file test-data/t_R2.fastq.gz has changed
diff -r 000000000000 -r 418b961e0576 umi-tools_extract.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/umi-tools_extract.xml Thu Aug 10 06:37:09 2017 -0400
b'@@ -0,0 +1,255 @@\n+<tool id="umi_tools_extract" name="UMI-tools extract" version="0.4.4.0">\n+    <description>Extract UMI from fastq files</description>\n+    <requirements>\n+        <requirement type="package" version="0.4.4">umi_tools</requirement>\n+    </requirements>\n+    <command detect_errors="exit_code"><![CDATA[\n+        #set $gz = False\n+        #if $input_type.type == \'single\':\n+            #if $input_type.input_single.is_of_type("fastq.gz", "fastqsanger.gz"):\n+                ln -s \'$input_type.input_single\' input_single.gz &&\n+                #set $gz = True\n+            #end if\n+        #else\n+            #if $input_type.input_read1.is_of_type("fastq.gz", "fastqsanger.gz"):\n+                ln -s \'$input_type.input_read1\' input_read1.gz &&\n+                ln -s \'$input_type.input_read2\' input_read2.gz &&\n+                #set $gz = True\n+            #end if\n+        #end if\n+        umi_tools extract\n+            --bc-pattern=\'$bc_pattern\'\n+            #if $input_type.type == \'single\':\n+                #if $gz:\n+                    --stdin=input_single.gz\n+                    --stdout out.gz\n+                #else\n+                    --stdin=\'$input_type.input_single\'\n+                    --stdout \'$out\'\n+                #end if\n+            #else:\n+                #if $gz:\n+                    --stdin=input_read1.gz\n+                    --read2-in=input_read2.gz\n+                    --stdout out1.gz\n+                    --read2-out=out2.gz\n+                #else:\n+                    --stdin=\'$input_type.input_read1\'\n+                    --read2-in=\'$input_type.input_read2\'\n+                    --stdout \'$out1\'\n+                    --read2-out=\'$out2\'\n+                #end if\n+                #if $input_type.barcode.split == "1":\n+                    --split-barcode\n+                    --bc-pattern2=\'$input_type.barcode.bc_pattern2\'\n+                #end if\n+            #end 
if\n+            #if not $prime3:\n+                --3prime\n+            #end if\n+            #if $quality.quality_selector ==\'true\':\n+                --quality-filter-threshold \'$quality.quality_filter_threshold\'\n+                --quality-encoding \'$quality.quality_encoding\'\n+            #end if\n+            #if $print_log == "1":\n+                --log=\'$out_log\'\n+            #else\n+                --supress-stats\n+            #end if\n+        #if $gz:\n+            #if $input_type.type == \'single\':\n+                && mv out.gz \'$out\'\n+            #else\n+                && mv out1.gz \'$out1\'\n+                && mv out2.gz \'$out2\'\n+            #end if\n+        #end if\n+    ]]></command>\n+    <inputs>\n+        <conditional name="input_type">\n+            <param name="type" type="select" label="Library type">\n+                <option value="single">Single-end</option>\n+                <option value="paired">Paired-end</option>\n+            </param>\n+            <when value="single">\n+                <param name="input_single" type="data" format="fastq,fastq.gz" label="Reads in FASTQ format" />\n+            </when>\n+            <when value="paired">\n+                <param name="input_read1" type="data" format="fastq,fastq.gz" label="Reads in FASTQ format" />\n+                <param name="input_read2" type="data" format="fastq,fastq.gz" label="Reads in FASTQ format" />\n+                <conditional name="barcode">\n+                    <param name="split" argument="--split-barcode" type="select" label="Barcode on both reads?">\n+                        <option value="0">Barcode on first read only</option>\n+                        <option value="1">Barcode on both reads</option>\n+                    </param>\n+                    <when value="0">\n+                    </when>\n+                    <when value="1">\n+                        <param name="bc_pattern2" argument="--bc-pattern2" type="text" value="" 
label="Barcode pattern for second read"\n+                            help="Use this option to specify the format of the UMI/barcode for\n+   '..b'   </data>\n+    </outputs>\n+    <tests>\n+        <test>\n+            <param name="type" value="single" />\n+            <param name="input_single" value="t_R1.fastq" ftype="fastq" />\n+            <param name="bc_pattern" value="XXXNNN" />\n+            <param name="prime3" value="0" />\n+            <param name="quality_selector" value="true" />\n+            <param name="quality_filter_threshold" value="10" />\n+            <param name="quality_encoding" value="phred33" />\n+            <output name="out" file="out_SE.fastq" />\n+            <output name="out_log" file="out_single.log" lines_diff="15"/>\n+        </test>\n+        <test>\n+            <param name="type" value="paired" />\n+            <param name="input_read1" value="t_R1.fastq.gz" ftype="fastq.gz" />\n+            <param name="input_read2" value="t_R2.fastq.gz" ftype="fastq.gz" />\n+            <param name="bc_pattern" value="NNNXXX" />\n+            <output name="out1" file="out_R1.fastq.gz" decompress="true" />\n+            <output name="out2" file="out_R2.fastq.gz" decompress="true" />\n+            <output name="out_log" file="out_paired.log" lines_diff="10"/>\n+        </test>\n+    </tests>\n+    <help><![CDATA[\n+\n+\n+UMI-tools extract.py - Extract UMI from fastq\n+=============================================\n+\n+Purpose\n+-------\n+\n+Extract UMI barcode from a read and add it to the read name, leaving\n+any sample barcode in place. Can deal with paired end reads and UMIs\n+split across the paired ends\n+\n+Options\n+-------\n+\n+--split-barcode\n+       By default the UMI is assumed to be on the first read. 
Use this\n+       option if the UMI is contained on both reads and specify the\n+       pattern of the barcode/UMI on the second read using the option\n+       ``--bc-pattern2``\n+\n+--bc-pattern\n+       Use this option to specify the format of the UMI/barcode. Use Ns to\n+       represent the random positions and Xs to indicate the bc positions.\n+       Bases with Ns will be extracted and added to the read name. Remaining\n+       bases, marked with an X will be reattached to the read.\n+\n+       E.g. If the pattern is NNXXNN,\n+       Then the read:\n+\n+       @HISEQ:87:00000000 read1\n+       AAGGTTGCTGATTGGATGGGCTAG\n+       DA1AEBFGGCG01DFH00B1FF0B\n+       +\n+\n+       will become:\n+       @HISEQ:87:00000000_AATT read1\n+       GGGCTGATTGGATGGGCTAG\n+       1AFGGCG01DFH00B1FF0B\n+       +\n+\n+--bc-pattern2\n+       Use this option to specify the format of the UMI/barcode for\n+       the second read pair if required. If --bc-pattern2 is not\n+       supplied, this defaults to the same pattern as --bc-pattern\n+\n+--3prime\n+       By default the barcode is assumed to be on the 5\' end of the read, but\n+       use this option to sepecify that it is on the 3\' end instead\n+\n+-L\n+       Specify a log file to retain logging information and final statistics\n+\n+--split-barcode\n+       barcode is split across read pair\n+\n+--quality-filter-threshold=QUALITY_FILTER_THRESHOLD\n+       Remove reads where any UMI base quality score falls\n+       below this threshold\n+--quality-encoding=QUALITY_ENCODING\n+       Quality score encoding. 
Choose from phred33[33-77]\n+       phred64 [64-106] or solexa [59-106]\n+\n+Usage:\n+------\n+\n+For single ended reads:\n+        umi_tools extract --bc-pattern=[PATTERN] -L extract.log [OPTIONS]\n+\n+reads from stdin and outputs to stdout.\n+\n+For paired end reads:\n+        umi_tools extract --bc-pattern=[PATTERN] --read2-in=[FASTQIN] --read2-out=[FASTQOUT] -L extract.log [OPTIONS]\n+\n+reads end one from stdin and end two from FASTQIN and outputs end one to stdin\n+and end two to FASTQOUT.\n+\n+    ]]></help>\n+    <citations>\n+        <citation type="doi">10.1101/gr.209601.116</citation>\n+        <citation type="bibtex">\n+            @misc{githubUMI-tools,\n+            title = {UMI-tools},\n+            publisher = {GitHub},\n+            journal = {GitHub repository},\n+            url = {https://github.com/CGATOxford/UMI-tools},\n+            }\n+        </citation>\n+    </citations>\n+</tool>\n'
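
Note on the --bc-pattern behaviour described in the help text above: the following is a minimal Python sketch of the N/X pattern logic, not the umi_tools implementation. The function name extract_umi and its signature are hypothetical. It assumes the pattern is applied left to right to the first len(pattern) bases (or to the last len(pattern) bases when the barcode is on the 3' end), which is consistent with the "Barcode UMI Sample" columns in out_paired.log and out_single.log. It does not model quality filtering, paired reads, or --split-barcode.

```python
def extract_umi(seq, qual, pattern, prime3=False):
    """Return (umi, trimmed_seq, trimmed_qual) for one read.

    pattern uses N for UMI positions (removed from the read and returned)
    and X for sample-barcode positions (left in the read).
    This is an illustrative helper, not part of umi_tools.
    """
    n = len(pattern)
    if n == 0 or set(pattern) - {"N", "X"}:
        raise ValueError("pattern must be a non-empty string of N and X")
    if len(seq) != len(qual) or len(seq) < n:
        raise ValueError("read is shorter than the pattern or seq/qual differ in length")

    # Split the read into the barcode window and the remainder.
    if prime3:
        bc_seq, bc_qual = seq[-n:], qual[-n:]
        rest_seq, rest_qual = seq[:-n], qual[:-n]
    else:
        bc_seq, bc_qual = seq[:n], qual[:n]
        rest_seq, rest_qual = seq[n:], qual[n:]

    # N positions form the UMI; X positions are reattached to the read.
    umi = "".join(b for b, p in zip(bc_seq, pattern) if p == "N")
    keep_seq = "".join(b for b, p in zip(bc_seq, pattern) if p == "X")
    keep_qual = "".join(q for q, p in zip(bc_qual, pattern) if p == "X")

    if prime3:
        return umi, rest_seq + keep_seq, rest_qual + keep_qual
    return umi, keep_seq + rest_seq, keep_qual + rest_qual


# The NNXXNN example from the help text.
umi, seq, qual = extract_umi(
    "AAGGTTGCTGATTGGATGGGCTAG", "DA1AEBFGGCG01DFH00B1FF0B", "NNXXNN"
)
print(umi, seq, qual)

# A 3' barcode with XXXNNN, using a made-up read that ends in AAGTTA.
print(extract_umi("ACGTAAGTTA", "IIIIIIIIII", "XXXNNN", prime3=True))
```

In the output FASTQ the UMI is appended to the read name with an underscore, as in the out_SE.fastq records above (for example the suffix _TTA); the sketch returns the UMI separately and leaves that renaming to the caller.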