Create a temporary index with the offered files from the user. Utilizing the script: bismark_genome_preparation Generating index with: 'bismark_genome_preparation --bowtie2 /tmp/tmpAHSx4i' Writing bisulfite genomes out into a single MFA (multi FastA) file Bisulfite Genome Indexer version v0.22.1 (last modified: 14 April 2019) Step I - Prepare genome folders - completed Total number of conversions performed: C->T: 146875 G->A: 150504 Step II - Genome bisulfite conversions - completed Bismark Genome Preparation - Step III: Launching the Bowtie 2 indexer Please be aware that this process can - depending on genome size - take several hours! Settings: Output files: "BS_CT.*.bt2" Line rate: 6 (line is 64 bytes) Lines per side: 1 (side is 64 bytes) Offset rate: 4 (one in 16) FTable chars: 10 Strings: unpacked Max bucket size: default Max bucket size, sqrt multiplier: default Max bucket size, len divisor: 4 Difference-cover sample period: 1024 Endianness: little Actual local endianness: little Sanity checking: disabled Assertions: disabled Random seed: 0 Sizeofs: void*:8, int:4, long:8, size_t:8 Input files DNA, FASTA: genome_mfa.CT_conversion.fa Building a SMALL index Reading reference sizes Time reading reference sizes: 00:00:00 Calculating joined length Writing header Reserving space for joined string Joining reference sequences Time to join reference sequences: 00:00:00 bmax according to bmaxDivN setting: 189039 Using parameters --bmax 141780 --dcv 1024 Doing ahead-of-time memory usage test Passed! Constructing with these parameters: --bmax 141780 --dcv 1024 Constructing suffix-array element generator Building DifferenceCoverSample Building sPrime Building sPrimeOrder V-Sorting samples V-Sorting samples time: 00:00:00 Allocating rank array Ranking v-sort output Ranking v-sort output time: 00:00:00 Invoking Larsson-Sadakane on ranks Invoking Larsson-Sadakane on ranks time: 00:00:00 Sanity-checking and returning Building samples Reserving space for 12 sample suffixes Generating random suffixes QSorting 12 sample offsets, eliminating duplicates QSorting sample offsets, eliminating duplicates time: 00:00:00 Multikey QSorting 12 samples (Using difference cover) Multikey QSorting samples time: 00:00:00 Calculating bucket sizes Splitting and merging Splitting and merging time: 00:00:00 Avg bucket size: 756159 (target: 141779) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering Ebwt loop Getting block 1 of 1 No samples; assembling all-inclusive block Sorting block of length 756159 for bucket 1 (Using difference cover) Sorting block time: xxxx Returning block of 756160 for bucket 1 Exited Ebwt loop fchr[A]: 0 fchr[C]: 235897 fchr[G]: 235897 fchr[T]: 386401 fchr[$]: 756159 Exiting Ebwt::buildToDisk() Returning from initFromVector Wrote 4446745 bytes to primary EBWT file: BS_CT.1.bt2 Wrote 189044 bytes to secondary EBWT file: BS_CT.2.bt2 Re-opening _in1 and _in2 as input streams Returning from Ebwt constructor Headers: len: 756159 bwtLen: 756160 sz: 189040 bwtSz: 189040 lineRate: 6 offRate: 4 offMask: 0xfffffff0 ftabChars: 10 eftabLen: 20 eftabSz: 80 ftabLen: 1048577 ftabSz: 4194308 offsLen: 47260 offsSz: 189040 lineSz: 64 sideSz: 64 sideBwtSz: 48 sideBwtLen: 192 numSides: 3939 numLines: 3939 ebwtTotLen: 252096 ebwtTotSz: 252096 color: 0 reverse: 0 Total time for call to driver() for forward index: xxxx Reading reference sizes Time reading reference sizes: 00:00:00 Calculating joined length Writing header Reserving space for joined string Joining reference sequences Time to join reference sequences: 00:00:00 Time to reverse reference sequence: 00:00:00 bmax according to bmaxDivN setting: 189039 Using parameters --bmax 141780 --dcv 1024 Doing ahead-of-time memory usage test Passed! Constructing with these parameters: --bmax 141780 --dcv 1024 Constructing suffix-array element generator Building DifferenceCoverSample Building sPrime Building sPrimeOrder V-Sorting samples V-Sorting samples time: 00:00:00 Allocating rank array Ranking v-sort output Ranking v-sort output time: 00:00:00 Invoking Larsson-Sadakane on ranks Invoking Larsson-Sadakane on ranks time: 00:00:00 Sanity-checking and returning Building samples Reserving space for 12 sample suffixes Generating random suffixes QSorting 12 sample offsets, eliminating duplicates QSorting sample offsets, eliminating duplicates time: 00:00:00 Multikey QSorting 12 samples (Using difference cover) Multikey QSorting samples time: 00:00:00 Calculating bucket sizes Splitting and merging Splitting and merging time: 00:00:00 Avg bucket size: 756159 (target: 141779) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering Ebwt loop Getting block 1 of 1 No samples; assembling all-inclusive block Sorting block of length 756159 for bucket 1 (Using difference cover) Sorting block time: xxxx Returning block of 756160 for bucket 1 Exited Ebwt loop fchr[A]: 0 fchr[C]: 235897 fchr[G]: 235897 fchr[T]: 386401 fchr[$]: 756159 Exiting Ebwt::buildToDisk() Returning from initFromVector Wrote 4446745 bytes to primary EBWT file: BS_CT.rev.1.bt2 Wrote 189044 bytes to secondary EBWT file: BS_CT.rev.2.bt2 Re-opening _in1 and _in2 as input streams Returning from Ebwt constructor Headers: len: 756159 bwtLen: 756160 sz: 189040 bwtSz: 189040 lineRate: 6 offRate: 4 offMask: 0xfffffff0 ftabChars: 10 eftabLen: 20 eftabSz: 80 ftabLen: 1048577 ftabSz: 4194308 offsLen: 47260 offsSz: 189040 lineSz: 64 sideSz: 64 sideBwtSz: 48 sideBwtLen: 192 numSides: 3939 numLines: 3939 ebwtTotLen: 252096 ebwtTotSz: 252096 color: 0 reverse: 1 Total time for backward call to driver() for mirror index: 00:00:01 Settings: Output files: "BS_GA.*.bt2" Line rate: 6 (line is 64 bytes) Lines per side: 1 (side is 64 bytes) Offset rate: 4 (one in 16) FTable chars: 10 Strings: unpacked Max bucket size: default Max bucket size, sqrt multiplier: default Max bucket size, len divisor: 4 Difference-cover sample period: 1024 Endianness: little Actual local endianness: little Sanity checking: disabled Assertions: disabled Random seed: 0 Sizeofs: void*:8, int:4, long:8, size_t:8 Input files DNA, FASTA: genome_mfa.GA_conversion.fa Building a SMALL index Reading reference sizes Time reading reference sizes: 00:00:00 Calculating joined length Writing header Reserving space for joined string Joining reference sequences Time to join reference sequences: 00:00:00 bmax according to bmaxDivN setting: 189039 Using parameters --bmax 141780 --dcv 1024 Doing ahead-of-time memory usage test Passed! Constructing with these parameters: --bmax 141780 --dcv 1024 Constructing suffix-array element generator Building DifferenceCoverSample Building sPrime Building sPrimeOrder V-Sorting samples V-Sorting samples time: 00:00:00 Allocating rank array Ranking v-sort output Ranking v-sort output time: 00:00:00 Invoking Larsson-Sadakane on ranks Invoking Larsson-Sadakane on ranks time: 00:00:00 Sanity-checking and returning Building samples Reserving space for 12 sample suffixes Generating random suffixes QSorting 12 sample offsets, eliminating duplicates QSorting sample offsets, eliminating duplicates time: 00:00:00 Multikey QSorting 12 samples (Using difference cover) Multikey QSorting samples time: 00:00:00 Calculating bucket sizes Splitting and merging Splitting and merging time: 00:00:00 Avg bucket size: 756159 (target: 141779) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering Ebwt loop Getting block 1 of 1 No samples; assembling all-inclusive block Sorting block of length 756159 for bucket 1 (Using difference cover) Sorting block time: xxxx Returning block of 756160 for bucket 1 Exited Ebwt loop fchr[A]: 0 fchr[C]: 386401 fchr[G]: 533276 fchr[T]: 533276 fchr[$]: 756159 Exiting Ebwt::buildToDisk() Returning from initFromVector Wrote 4446745 bytes to primary EBWT file: BS_GA.1.bt2 Wrote 189044 bytes to secondary EBWT file: BS_GA.2.bt2 Re-opening _in1 and _in2 as input streams Returning from Ebwt constructor Headers: len: 756159 bwtLen: 756160 sz: 189040 bwtSz: 189040 lineRate: 6 offRate: 4 offMask: 0xfffffff0 ftabChars: 10 eftabLen: 20 eftabSz: 80 ftabLen: 1048577 ftabSz: 4194308 offsLen: 47260 offsSz: 189040 lineSz: 64 sideSz: 64 sideBwtSz: 48 sideBwtLen: 192 numSides: 3939 numLines: 3939 ebwtTotLen: 252096 ebwtTotSz: 252096 color: 0 reverse: 0 Total time for call to driver() for forward index: xxxx Reading reference sizes Time reading reference sizes: 00:00:00 Calculating joined length Writing header Reserving space for joined string Joining reference sequences Time to join reference sequences: 00:00:00 Time to reverse reference sequence: 00:00:00 bmax according to bmaxDivN setting: 189039 Using parameters --bmax 141780 --dcv 1024 Doing ahead-of-time memory usage test Passed! Constructing with these parameters: --bmax 141780 --dcv 1024 Constructing suffix-array element generator Building DifferenceCoverSample Building sPrime Building sPrimeOrder V-Sorting samples V-Sorting samples time: 00:00:00 Allocating rank array Ranking v-sort output Ranking v-sort output time: 00:00:00 Invoking Larsson-Sadakane on ranks Invoking Larsson-Sadakane on ranks time: 00:00:00 Sanity-checking and returning Building samples Reserving space for 12 sample suffixes Generating random suffixes QSorting 12 sample offsets, eliminating duplicates QSorting sample offsets, eliminating duplicates time: 00:00:00 Multikey QSorting 12 samples (Using difference cover) Multikey QSorting samples time: 00:00:00 Calculating bucket sizes Splitting and merging Splitting and merging time: 00:00:00 Avg bucket size: 756159 (target: 141779) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering Ebwt loop Getting block 1 of 1 No samples; assembling all-inclusive block Sorting block of length 756159 for bucket 1 (Using difference cover) Sorting block time: xxxx Returning block of 756160 for bucket 1 Exited Ebwt loop fchr[A]: 0 fchr[C]: 386401 fchr[G]: 533276 fchr[T]: 533276 fchr[$]: 756159 Exiting Ebwt::buildToDisk() Returning from initFromVector Wrote 4446745 bytes to primary EBWT file: BS_GA.rev.1.bt2 Wrote 189044 bytes to secondary EBWT file: BS_GA.rev.2.bt2 Re-opening _in1 and _in2 as input streams Returning from Ebwt constructor Headers: len: 756159 bwtLen: 756160 sz: 189040 bwtSz: 189040 lineRate: 6 offRate: 4 offMask: 0xfffffff0 ftabChars: 10 eftabLen: 20 eftabSz: 80 ftabLen: 1048577 ftabSz: 4194308 offsLen: 47260 offsSz: 189040 lineSz: 64 sideSz: 64 sideBwtSz: 48 sideBwtLen: 192 numSides: 3939 numLines: 3939 ebwtTotLen: 252096 ebwtTotSz: 252096 color: 0 reverse: 1 Total time for backward call to driver() for mirror index: 00:00:01 Running bismark with: 'bismark --bam --gzip --temp_dir /tmp/tmp86syD7 -o /tmp/tmp86syD7/results --quiet --fastq -L 20 -D 15 -R 2 --un --ambiguous /tmp/tmpAHSx4i -1 input1_fq_1.fq -2 input1_fq_2.fq -I 0 -X 500' Bowtie 2 seems to be working fine (tested command 'bowtie2 --version' [2.3.5]) Output format is BAM (default) Alignments will be written out in BAM format. Samtools found here: '/home/abretaud/miniconda3/envs/mulled-v1-9f2317dbfb405ed6926c55752e5c11678eee3256a6ea680d1c0f912251153030/bin/samtools' Reference genome folder provided is /tmp/tmpAHSx4i/ (absolute path is '/tmp/tmpAHSx4i/)' FastQ format specified Input files to be analysed (in current folder '/tmp/tmpFC2FCZ/job_working_directory/000/4/working'): input1_fq_1.fq input1_fq_2.fq Library is assumed to be strand-specific (directional), alignments to strands complementary to the original top or bottom strands will be ignored (i.e. not performed!) Created output directory /tmp/tmp86syD7/results/! Output will be written into the directory: /tmp/tmp86syD7/results/ Using temp directory: /tmp/tmp86syD7 Temporary files will be written into the directory: /tmp/tmp86syD7/ Setting parallelization to single-threaded (default) Summary of all aligner options: -q -L 20 -D 15 -R 2 --score-min L,0,-0.2 --ignore-quals --no-mixed --no-discordant --dovetail --minins 0 --maxins 500 --quiet Current working directory is: /tmp/tmpFC2FCZ/job_working_directory/000/4/working Now reading in and storing sequence information of the genome specified in: /tmp/tmpAHSx4i/ chr chrY_JH584300_random (182347 bp) chr chrY_JH584301_random (259875 bp) chr chrY_JH584302_random (155838 bp) chr chrY_JH584303_random (158099 bp) Single-core mode: setting pid to 1 Paired-end alignments will be performed ======================================= The provided filenames for paired-end alignments are input1_fq_1.fq and input1_fq_2.fq Input files are in FastQ format Writing a C -> T converted version of the input file input1_fq_1.fq to /tmp/tmp86syD7/input1_fq_1.fq_C_to_T.fastq.gz Created C -> T converted version of the FastQ file input1_fq_1.fq (1000 sequences in total) Writing a G -> A converted version of the input file input1_fq_2.fq to /tmp/tmp86syD7/input1_fq_2.fq_G_to_A.fastq.gz Created G -> A converted version of the FastQ file input1_fq_2.fq (1000 sequences in total) Input files are input1_fq_1.fq_C_to_T.fastq.gz and input1_fq_2.fq_G_to_A.fastq.gz (FastQ) Now running 2 instances of Bowtie 2 against the bisulfite genome of /tmp/tmpAHSx4i/ with the specified options: -q -L 20 -D 15 -R 2 --score-min L,0,-0.2 --ignore-quals --no-mixed --no-discordant --dovetail --minins 0 --maxins 500 --quiet Now starting a Bowtie 2 paired-end alignment for CTread1GAread2CTgenome (reading in sequences from /tmp/tmp86syD7/input1_fq_1.fq_C_to_T.fastq.gz and /tmp/tmp86syD7/input1_fq_2.fq_G_to_A.fastq.gz, with the options: -q -L 20 -D 15 -R 2 --score-min L,0,-0.2 --ignore-quals --no-mixed --no-discordant --dovetail --minins 0 --maxins 500 --quiet --norc)) Found first alignment: 1_1/1 77 * 0 0 * * 0 0 TTGTATATATTAGATAAATTAATTTTTTTTGTTTGTATGTTAAATTTTTTAATTAATTTATTAATATTTTGTGAATTTTTAGATA AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEAEEEEEE YT:Z:UP 1_1/2 141 * 0 0 * * 0 0 TTATATATATTAAATAAATTAATTTTTTTTATTTATATATTAAATTTTTTAATTAATTTATTAATATTTTATAAATTTTTAAATA AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEAEEEEEE YT:Z:UP Now starting a Bowtie 2 paired-end alignment for CTread1GAread2GAgenome (reading in sequences from /tmp/tmp86syD7/input1_fq_1.fq_C_to_T.fastq.gz and /tmp/tmp86syD7/input1_fq_2.fq_G_to_A.fastq.gz, with the options: -q -L 20 -D 15 -R 2 --score-min L,0,-0.2 --ignore-quals --no-mixed --no-discordant --dovetail --minins 0 --maxins 500 --quiet --nofw)) Found first alignment: 1_1/1 77 * 0 0 * * 0 0 TTGTATATATTAGATAAATTAATTTTTTTTGTTTGTATGTTAAATTTTTTAATTAATTTATTAATATTTTGTGAATTTTTAGATA AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEAEEEEEE YT:Z:UP 1_1/2 141 * 0 0 * * 0 0 TTATATATATTAAATAAATTAATTTTTTTTATTTATATATTAAATTTTTTAATTAATTTATTAATATTTTATAAATTTTTAAATA AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEAEEEEEE YT:Z:UP >>> Writing bisulfite mapping results to input1_fq_1_bismark_bt2_pe.bam <<< Unmapped sequences will be written to input1_fq_1.fq_unmapped_reads_1.fq.gz and input1_fq_2.fq_unmapped_reads_2.fq.gz Ambiguously mapping sequences will be written to input1_fq_1.fq_ambiguous_reads_1.fq.gz and input1_fq_2.fq_ambiguous_reads_2.fq.gz Reading in the sequence files input1_fq_1.fq and input1_fq_2.fq Processed 1000 sequences in total Successfully deleted the temporary files /tmp/tmp86syD7/input1_fq_1.fq_C_to_T.fastq.gz and /tmp/tmp86syD7/input1_fq_2.fq_G_to_A.fastq.gz Final Alignment report ====================== Sequence pairs analysed in total: 1000 Number of paired-end alignments with a unique best hit: 0 Mapping efficiency: 0.0% Sequence pairs with no alignments under any condition: 1000 Sequence pairs did not map uniquely: 0 Sequence pairs which were discarded because genomic sequence could not be extracted: 0 Number of sequence pairs with unique best (first) alignment came from the bowtie output: CT/GA/CT: 0 ((converted) top strand) GA/CT/CT: 0 (complementary to (converted) top strand) GA/CT/GA: 0 (complementary to (converted) bottom strand) CT/GA/GA: 0 ((converted) bottom strand) Number of alignments to (merely theoretical) complementary strands being rejected in total: 0 Final Cytosine Methylation Report ================================= Total number of C's analysed: 0 Total methylated C's in CpG context: 0 Total methylated C's in CHG context: 0 Total methylated C's in CHH context: 0 Total methylated C's in Unknown context: 0 Total unmethylated C's in CpG context: 0 Total unmethylated C's in CHG context: 0 Total unmethylated C's in CHH context: 0 Total unmethylated C's in Unknown context: 0 Can't determine percentage of methylated Cs in CpG context if value was 0 Can't determine percentage of methylated Cs in CHG context if value was 0 Can't determine percentage of methylated Cs in CHH context if value was 0 Can't determine percentage of methylated Cs in unknown context (CN or CHN) if value was 0 Bismark completed in xxxx ==================== Bismark run complete ====================