annotate RepEnrich.py @ 2:15e3e29f310e draft

planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit c89c33e5ea8fc63f3ea5c0f66ebc5fa822ac734b
author artbio
date Tue, 19 Sep 2017 17:23:15 -0400
parents f6f0f1e5e940
children d1f7ab78f7b5
rev   line source
0
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
#!/usr/bin/env python

import argparse
import csv
import os
import shlex
import shutil
import subprocess
import sys

import numpy


parser = argparse.ArgumentParser(description='''
    Part II: Conducting the alignments to the pseudogenomes. Before\
    doing this step you will require 1) a bamfile of the unique\
    alignments with index 2) a fastq file of the reads mapping to\
    more than one location. These files can be obtained using the\
    following bowtie options [EXAMPLE: bowtie -S -m 1\
    --max multimap.fastq mm9 mate1_reads.fastq] Once you have the\
    unique alignment bamfile and the reads mapping to more than one\
    location in a fastq file you can run this step. EXAMPLE: python\
    master_output.py\
    /users/nneretti/data/annotation/hg19/hg19_repeatmasker.txt\
    /users/nneretti/datasets/repeatmapping/POL3/Pol3_human/
    HeLa_InputChIPseq_Rep1 HeLa_InputChIPseq_Rep1\
    /users/nneretti/data/annotation/hg19/setup_folder\
    HeLa_InputChIPseq_Rep1_multimap.fastq\
    HeLa_InputChIPseq_Rep1.bam''')
parser.add_argument('--version', action='version', version='%(prog)s 0.1')
parser.add_argument('annotation_file', action='store',
                    metavar='annotation_file',
                    help='List RepeatMasker.org annotation file for your\
                    organism. The file may be downloaded from the\
                    RepeatMasker.org website. Example:\
                    /data/annotation/hg19/hg19_repeatmasker.txt')
parser.add_argument('outputfolder', action='store', metavar='outputfolder',
                    help='List folder to contain results.\
                    Example: /outputfolder')
parser.add_argument('outputprefix', action='store', metavar='outputprefix',
                    help='Enter prefix name for data.\
                    Example: HeLa_InputChIPseq_Rep1')
parser.add_argument('setup_folder', action='store', metavar='setup_folder',
                    help='List folder that contains the repeat element\
                    pseudogenomes.\
                    Example: /data/annotation/hg19/setup_folder')
parser.add_argument('fastqfile', action='store', metavar='fastqfile',
                    help='Enter file for the fastq reads that map to multiple\
                    locations. Example: /data/multimap.fastq')
parser.add_argument('alignment_bam', action='store', metavar='alignment_bam',
                    help='Enter bamfile output for reads that map uniquely.\
                    Example: /bamfiles/old.bam')
parser.add_argument('--pairedend', action='store', dest='pairedend',
                    default='FALSE',
                    help='Designate this option for paired-end sequencing.\
                    Default FALSE change to TRUE')
parser.add_argument('--collapserepeat', action='store', dest='collapserepeat',
                    metavar='collapserepeat', default='Simple_repeat',
                    help='Designate this option to generate a collapsed repeat\
                    type. Uncollapsed output is generated in addition to\
                    collapsed repeat type. Simple_repeat is default to\
                    simplify downstream analysis. You can change the\
                    default to another repeat name to collapse a\
                    separate specific repeat instead or if the name of\
                    Simple_repeat is different for your organism.\
                    Default Simple_repeat')
parser.add_argument('--fastqfile2', action='store', dest='fastqfile2',
                    metavar='fastqfile2', default='none',
                    help='Enter fastqfile2 when using paired-end option.\
                    Default none')
parser.add_argument('--cpus', action='store', dest='cpus', metavar='cpus',
                    default="1", type=int,
                    help='Enter available cpus per node. The more cpus the\
                    faster RepEnrich performs. RepEnrich is designed to\
                    only work on one node. Default: "1"')
parser.add_argument('--allcountmethod', action='store', dest='allcountmethod',
                    metavar='allcountmethod', default="FALSE",
                    help='By default the pipeline only outputs the fraction\
                    count method. Considered to be the best way to\
                    count multimapped reads. Changing this option will\
                    include the unique count method, a conservative\
                    count, and the total count method, a liberal\
                    counting strategy. Our evaluation of simulated data\
                    indicated fraction counting is best.\
                    Default = FALSE, change to TRUE')
parser.add_argument('--is_bed', action='store', dest='is_bed',
                    metavar='is_bed', default='FALSE',
                    help='Is the annotation file a bed file.\
                    This is also a compatible format. The file needs to\
                    be a tab-separated bed with optional fields.\
                    Ex. format: chr\tstart\tend\tName_element\tclass\
                    \tfamily. The class and family should be identical to\
                    name_element if not applicable.\
                    Default FALSE change to TRUE')
args = parser.parse_args()
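
The `--allcountmethod` help text names three ways of counting multimapped reads: fraction, unique, and total. The sketch below is illustrative only and is not taken from RepEnrich.py. It assumes a hypothetical `read_hits` mapping from read name to the set of repeats the read aligns to. It also assumes the usual meaning of the three terms: fraction counting gives each of the N repeats hit by a read a share of 1/N, unique counting keeps only reads that hit a single repeat, and total counting credits a full count to every repeat hit.

```python
from collections import defaultdict


def count_reads(read_hits):
    """Tally reads per repeat under three counting strategies.

    read_hits maps a read name to the set of repeat names it aligns to
    (a hypothetical input structure, not the one used by RepEnrich.py).
    """
    unique = defaultdict(int)
    total = defaultdict(int)
    fraction = defaultdict(float)
    for repeats in read_hits.values():
        if not repeats:
            continue
        for name in repeats:
            # total: every repeat hit by the read gets a full count
            total[name] += 1
            # fraction: the read is shared equally among the repeats it hits
            fraction[name] += 1.0 / len(repeats)
        if len(repeats) == 1:
            # unique: only reads hitting exactly one repeat are counted
            unique[next(iter(repeats))] += 1
    return dict(unique), dict(total), dict(fraction)


# Made-up reads and repeat names, for illustration only.
hits = {
    "read1": {"L1HS"},
    "read2": {"L1HS", "AluY"},
    "read3": {"AluY"},
    "read4": {"L1HS", "AluY"},
}
unique, total, fraction = count_reads(hits)
```

Under these assumptions the fraction counts always sum to the number of reads, whereas the total counts can exceed it and the unique counts can fall short of it.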

# parameters
annotation_file = args.annotation_file
outputfolder = args.outputfolder
outputfile_prefix = args.outputprefix
setup_folder = args.setup_folder
repeat_bed = setup_folder + os.path.sep + 'repnames.bed'
unique_mapper_bam = args.alignment_bam
fastqfile_1 = args.fastqfile
fastqfile_2 = args.fastqfile2
cpus = args.cpus
b_opt = "-k1 -p " + str(1) + " --quiet"
simple_repeat = args.collapserepeat
paired_end = args.pairedend
allcountmethod = args.allcountmethod
is_bed = args.is_bed

##############################################################################
# check that the programs we need are available
try:
    # open the null device once and close it when both checks are done
    with open(os.devnull, 'wb') as devnull:
        subprocess.call(shlex.split("coverageBed -h"),
                        stdout=devnull, stderr=devnull)
        subprocess.call(shlex.split("bowtie --version"),
                        stdout=devnull, stderr=devnull)
except OSError:
    print("Error: Bowtie or BEDTools not loaded")
    raise
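
The check above launches each program and relies on `OSError` when one cannot be started. A possible alternative, sketched here under the assumption of Python 3.3 or later (where `shutil.which` exists), looks the names up on `PATH` without running them and reports which one is missing. The helper name `missing_programs` is hypothetical and does not appear in RepEnrich.py.

```python
import shutil


def missing_programs(programs):
    """Return the names in `programs` that cannot be found on PATH."""
    return [name for name in programs if shutil.which(name) is None]


# The names mirror the two commands probed by the original check.
missing = missing_programs(["coverageBed", "bowtie"])
if missing:
    print("Error: not found on PATH: " + ", ".join(missing))
```

Unlike the original check, this only confirms that an executable with that name exists; it does not confirm that the program actually starts.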

##############################################################################
# define a csv reader that reads space-delimited files
print('Preparing for analysis using RepEnrich...')
csv.field_size_limit(sys.maxsize)


def import_text(filename, separator):
    # yield each non-empty row as a list of fields; the file is closed
    # once the generator is exhausted
    with open(filename) as handle:
        for line in csv.reader(handle, delimiter=separator,
                               skipinitialspace=True):
            if line:
                yield line
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
138
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
139
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
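# A minimal sketch of how a reader like import_text treats padded,
# space-separated rows: with a space delimiter and skipinitialspace=True,
# runs of spaces collapse and blank lines are dropped. The function name
# read_delimited_rows and the sample rows are made up for illustration.

```python
import csv
import os
import tempfile


def read_delimited_rows(path, separator):
    """Yield the non-empty rows of a delimited text file."""
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter=separator,
                              skipinitialspace=True):
            if row:
                yield row


# Illustrative input only: two padded rows around a blank line.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("  100   1.0  chrA\n\n 2000  12.5  chrB\n")
    sample_path = tmp.name

rows = list(read_delimited_rows(sample_path, " "))
os.remove(sample_path)
print(rows)
```

# Trailing spaces on a row would still produce an extra empty field, so the
# sketch assumes rows that end right after their last value.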
##############################################################################
# build dictionaries to convert repeat names to repeat classes and families
if is_bed == "FALSE":
    repeatclass = {}
    repeatfamily = {}
    fin = import_text(annotation_file, ' ')
    x = 0
    for line in fin:
        # skip the first three rows (the annotation file's header)
        if x > 2:
            # the class/family field is written with "/", so split on it
            # rather than on the platform-specific os.path.sep
            classfamily = line[10].split("/")
            line9 = line[9].replace("(", "_").replace(
                ")", "_").replace("/", "_")
            repeatclass[line9] = classfamily[0]
            if len(classfamily) == 2:
                repeatfamily[line9] = classfamily[1]
            else:
                repeatfamily[line9] = classfamily[0]
        x += 1
if is_bed == "TRUE":
    repeatclass = {}
    repeatfamily = {}
    fin = open(annotation_file, 'r')
    for line in fin:
        line = line.strip('\n')
        line = line.split('\t')
        theclass = line[4]
        thefamily = line[5]
        line3 = line[3].replace("(", "_").replace(")", "_").replace("/", "_")
        repeatclass[line3] = theclass
        repeatfamily[line3] = thefamily
fin.close()

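# The name clean-up and class/family split above can be sketched as one
# small helper. classify_repeat is a hypothetical name, and the two rows are
# made-up values arranged like the non-BED annotation columns the script
# reads (repeat name at index 9, class/family at index 10); they are not
# real annotation data.

```python
def classify_repeat(fields):
    """Return (sanitized_name, repeat_class, repeat_family) for one row."""
    name = fields[9]
    for char in "()/":
        name = name.replace(char, "_")
    parts = fields[10].split("/")
    repeat_class = parts[0]
    # rows without an explicit family reuse the class as the family
    repeat_family = parts[1] if len(parts) == 2 else parts[0]
    return name, repeat_class, repeat_family


# Made-up rows, already split into fields.
example_rows = [
    "100 1.0 0.0 0.0 chrA 1 50 (950) + (CA)n Simple_repeat 1 50 (0) 1".split(),
    "200 2.0 0.0 0.0 chrA 60 160 (840) C L1_example LINE/L1 (0) 100 1 2".split(),
]
results = [classify_repeat(row) for row in example_rows]
for result in results:
    print(result)
```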
##############################################################################
# build list of repeats initializing dictionaries for downstream analysis
fin = import_text(setup_folder + os.path.sep + 'repgenomes_key.txt', '\t')
repeat_key = {}
rev_repeat_key = {}
repeat_list = []
reptotalcounts = {}
classfractionalcounts = {}
familyfractionalcounts = {}
classtotalcounts = {}
familytotalcounts = {}
reptotalcounts_simple = {}
fractionalcounts = {}
i = 0
for line in fin:
    reptotalcounts[line[0]] = 0
    fractionalcounts[line[0]] = 0
    if line[0] in repeatclass:
        classtotalcounts[repeatclass[line[0]]] = 0
        classfractionalcounts[repeatclass[line[0]]] = 0
    if line[0] in repeatfamily:
        familytotalcounts[repeatfamily[line[0]]] = 0
        familyfractionalcounts[repeatfamily[line[0]]] = 0
    if line[0] in repeatfamily:
        if repeatfamily[line[0]] == simple_repeat:
            reptotalcounts_simple[simple_repeat] = 0
    else:
        reptotalcounts_simple[line[0]] = 0
    repeat_list.append(line[0])
    repeat_key[line[0]] = int(line[1])
    rev_repeat_key[int(line[1])] = line[0]
fin.close()
##############################################################################
# create the output folder used when mapping reads to the pseudogenomes:
if not os.path.exists(outputfolder):
    os.mkdir(outputfolder)
##############################################################################
# Conduct the regions sorting
print('Conducting region sorting on unique mapping reads....')
fileout = outputfolder + os.path.sep + outputfile_prefix + '_regionsorter.txt'
with open(fileout, 'w') as stdout:
    command = shlex.split("coverageBed -abam " + unique_mapper_bam + " -b " +
                          setup_folder + os.path.sep + 'repnames.bed')
    p = subprocess.Popen(command, stdout=stdout)
    # wait for coverageBed before the with block closes the output file
    p.communicate()
filein = open(outputfolder + os.path.sep + outputfile_prefix
              + '_regionsorter.txt', 'r')
counts = {}
sumofrepeatreads = 0
for line in filein:
    line = line.split('\t')
    if not str(repeat_key[line[3]]) in counts:
        counts[str(repeat_key[line[3]])] = 0
    counts[str(repeat_key[line[3]])] += int(line[4])
    sumofrepeatreads += int(line[4])
print('Identified ' + str(sumofrepeatreads) +
      ' unique reads that mapped to repeats.')
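# The counting loop above sums the fifth tab-separated column per repeat
# name taken from the fourth column. A minimal sketch of the same tally with
# collections.defaultdict follows. sum_reads_per_repeat is a hypothetical
# helper, the rows are made up, and the sketch keys the totals by repeat
# name rather than by the numeric repeat_key used in the script.

```python
from collections import defaultdict


def sum_reads_per_repeat(lines, name_column=3, count_column=4):
    """Sum the per-region read counts for each repeat name."""
    counts = defaultdict(int)
    total = 0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        reads = int(fields[count_column])
        counts[fields[name_column]] += reads
        total += reads
    return dict(counts), total


# Made-up rows: chromosome, start, end, repeat name, read count.
example = [
    "chrA\t1\t50\tL1_example\t4\n",
    "chrA\t60\t160\tAlu_example\t2\n",
    "chrB\t5\t90\tL1_example\t3\n",
]
per_repeat, total_reads = sum_reads_per_repeat(example)
print(per_repeat, total_reads)
```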
##############################################################################
if paired_end == 'TRUE':
    if not os.path.exists(outputfolder + os.path.sep + 'pair1_bowtie'):
        os.mkdir(outputfolder + os.path.sep + 'pair1_bowtie')
    if not os.path.exists(outputfolder + os.path.sep + 'pair2_bowtie'):
        os.mkdir(outputfolder + os.path.sep + 'pair2_bowtie')
    folder_pair1 = outputfolder + os.path.sep + 'pair1_bowtie'
    folder_pair2 = outputfolder + os.path.sep + 'pair2_bowtie'
##############################################################################
    print("Processing repeat pseudogenomes...")
    ps = []
    psb = []
    ticker = 0
    for metagenome in repeat_list:
        metagenomepath = setup_folder + os.path.sep + metagenome
        file1 = folder_pair1 + os.path.sep + metagenome + '.bowtie'
        file2 = folder_pair2 + os.path.sep + metagenome + '.bowtie'
        with open(file1, 'w') as stdout:
            command = shlex.split("bowtie " + b_opt + " " +
                                  metagenomepath + " " + fastqfile_1)
            p = subprocess.Popen(command, stdout=stdout)
        with open(file2, 'w') as stdout:
            command = shlex.split("bowtie " + b_opt + " " +
                                  metagenomepath + " " + fastqfile_2)
            pp = subprocess.Popen(command, stdout=stdout)
        ps.append(p)
        ticker += 1
        psb.append(pp)
        ticker += 1
        # ticker grows by two per repeat, so compare with >=; an exact match
        # is never reached when cpus is odd and the batch would never drain
        if ticker >= cpus:
            for p in ps:
                p.communicate()
            for p in psb:
                p.communicate()
            ticker = 0
            psb = []
            ps = []
    # wait for the processes still running for both read pairs
    for p in ps + psb:
        p.communicate()

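# The batching above (start some child processes, wait for all of them,
# start the next batch) can be sketched in isolation. run_in_batches is a
# hypothetical helper, not part of RepEnrich, and the commands are stand-ins
# that only exit with a chosen status instead of running bowtie. It assumes
# max_parallel is at least 1.

```python
import subprocess
import sys


def run_in_batches(commands, max_parallel):
    """Run commands with at most max_parallel child processes at a time."""
    return_codes = []
    for start in range(0, len(commands), max_parallel):
        batch = [subprocess.Popen(cmd, stdout=subprocess.DEVNULL)
                 for cmd in commands[start:start + max_parallel]]
        # wait() is enough here because no pipes are attached to the children
        return_codes.extend(proc.wait() for proc in batch)
    return return_codes


# Stand-in commands: each child just exits with the given status.
jobs = [[sys.executable, "-c", "import sys; sys.exit(%d)" % code]
        for code in (0, 0, 3, 0, 0)]
codes = run_in_batches(jobs, max_parallel=2)
print(codes)
```

# Collecting the return codes is an addition to what the script does; it
# makes a failed child visible instead of passing silently.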
##############################################################################
# combine the output from both read pairs:
    print('sorting and combining the output for both read pairs...')
    if not os.path.exists(outputfolder + os.path.sep + 'sorted_bowtie'):
        os.mkdir(outputfolder + os.path.sep + 'sorted_bowtie')
    sorted_bowtie = outputfolder + os.path.sep + 'sorted_bowtie'
    for metagenome in repeat_list:
        file1 = folder_pair1 + os.path.sep + metagenome + '.bowtie'
        file2 = folder_pair2 + os.path.sep + metagenome + '.bowtie'
        fileout = sorted_bowtie + os.path.sep + metagenome + '.bowtie'
        with open(fileout, 'w') as stdout:
            p1 = subprocess.Popen(['cat', file1, file2],
                                  stdout=subprocess.PIPE)
            p2 = subprocess.Popen(['cut', '-f1', "-d "], stdin=p1.stdout,
                                  stdout=subprocess.PIPE)
artbio
parents:
diff changeset
288 p3 = subprocess.Popen(['cut', '-f1', "-d/"], stdin=p2.stdout,
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
289 stdout=subprocess.PIPE)
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
290 p4 = subprocess.Popen(['sort'], stdin=p3.stdout,
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
291 stdout=subprocess.PIPE)
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
292 p5 = subprocess.Popen(['uniq'], stdin=p4.stdout, stdout=stdout)
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
293 p5.communicate()
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
294 stdout.close()
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
295 print('completed ...')
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
296 ##############################################################################
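The five-process pipeline above (cat, cut, cut, sort, uniq) reduces the bowtie output of both mates to a sorted list of unique read IDs. A minimal pure-Python sketch of the same idea follows. `unique_read_ids` is a hypothetical helper, not part of RepEnrich.py, and it assumes the read ID is whatever precedes the first space or slash on each line. Python's `sorted` orders by code point, which may differ from the locale-dependent order of `sort`.

```python
def unique_read_ids(paths):
    """Return the sorted, de-duplicated read IDs found in the given files.

    Hypothetical helper: mirrors cat | cut -f1 -d' ' | cut -f1 -d/ | sort | uniq.
    """
    read_ids = set()
    for path in paths:
        with open(path) as handle:
            for line in handle:
                line = line.rstrip('\n')
                # keep the text before the first space, then before the
                # first slash (strips mate suffixes such as "/1" and "/2")
                read_ids.add(line.split(' ', 1)[0].split('/', 1)[0])
    return sorted(read_ids)
```

Called as `unique_read_ids([file1, file2])`, this would yield the lines that the pipeline writes to `fileout`, without spawning external processes.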
if paired_end == 'FALSE':
    if not os.path.exists(outputfolder + os.path.sep + 'pair1_bowtie'):
        os.mkdir(outputfolder + os.path.sep + 'pair1_bowtie')
    folder_pair1 = outputfolder + os.path.sep + 'pair1_bowtie'
##############################################################################
    ps = []
    ticker = 0
    print("Processing repeat pseudogenomes...")
    for metagenome in repeat_list:
        metagenomepath = setup_folder + os.path.sep + metagenome
        file1 = folder_pair1 + os.path.sep + metagenome + '.bowtie'
        with open(file1, 'w') as stdout:
            command = shlex.split("bowtie " + b_opt + " " +
                                  metagenomepath + " " + fastqfile_1)
            p = subprocess.Popen(command, stdout=stdout)
        ps.append(p)
        ticker += 1
        # once `cpus` processes are running, wait for the whole batch
        if ticker == cpus:
            for p in ps:
                p.communicate()
            ticker = 0
            ps = []
    # wait for the final, possibly smaller, batch
    if len(ps) > 0:
        for p in ps:
            p.communicate()
    # no-op in practice: the with block has already closed this handle
    stdout.close()

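The loop above starts one bowtie process per pseudogenome and waits whenever `cpus` of them are running. The batching pattern on its own can be sketched as below. `run_in_batches` is a hypothetical name, and the demo commands are placeholders that launch the Python interpreter instead of bowtie.

```python
import subprocess
import sys


def run_in_batches(commands, batch_size):
    """Start commands in groups of batch_size and wait for each group.

    Returns the exit codes in the order the commands were given.
    """
    return_codes = []
    running = []
    for command in commands:
        running.append(subprocess.Popen(command))
        if len(running) == batch_size:
            return_codes.extend(p.wait() for p in running)
            running = []
    # wait for the final, possibly smaller, batch
    return_codes.extend(p.wait() for p in running)
    return return_codes


if __name__ == '__main__':
    # placeholder commands standing in for the bowtie calls
    demo = [[sys.executable, '-c', 'pass'] for _ in range(5)]
    print(run_in_batches(demo, batch_size=2))
```

Collecting the exit codes is an addition of this sketch; it would let a caller notice a failed alignment, which the loop above does not check.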
##############################################################################
# sort and de-duplicate the read IDs of the single-end output:
    print('Sorting and de-duplicating the output for the single-end reads...')
    if not os.path.exists(outputfolder + os.path.sep + 'sorted_bowtie'):
        os.mkdir(outputfolder + os.path.sep + 'sorted_bowtie')
    sorted_bowtie = outputfolder + os.path.sep + 'sorted_bowtie'
    for metagenome in repeat_list:
        file1 = folder_pair1 + os.path.sep + metagenome + '.bowtie'
        fileout = sorted_bowtie + os.path.sep + metagenome + '.bowtie'
        # cat file1 | cut -f1 | cut -f1 -d/ | sort | uniq
        with open(fileout, 'w') as stdout:
            p1 = subprocess.Popen(['cat', file1], stdout=subprocess.PIPE)
            p2 = subprocess.Popen(['cut', '-f1'], stdin=p1.stdout,
                                  stdout=subprocess.PIPE)
            p3 = subprocess.Popen(['cut', '-f1', "-d/"], stdin=p2.stdout,
                                  stdout=subprocess.PIPE)
            p4 = subprocess.Popen(['sort'], stdin=p3.stdout,
                                  stdout=subprocess.PIPE)
            p5 = subprocess.Popen(['uniq'], stdin=p4.stdout, stdout=stdout)
            p5.communicate()
        # no-op: the with block above has already closed the file
        stdout.close()
    print('completed ...')

##############################################################################
# build a file of repeat keys for all reads
print('Writing and processing intermediate files...')
sorted_bowtie = outputfolder + os.path.sep + 'sorted_bowtie'
readid = {}
sumofrepeatreads = 0
for rep in repeat_list:
    for data in import_text(sorted_bowtie + os.path.sep +
                            rep + '.bowtie', '\t'):
        readid[data[0]] = ''
for rep in repeat_list:
    for data in import_text(sorted_bowtie + os.path.sep
                            + rep + '.bowtie', '\t'):
        readid[data[0]] += str(repeat_key[rep]) + ','
for subfamilies in readid.values():
    if subfamilies not in counts:
        counts[subfamilies] = 0
    counts[subfamilies] += 1
    sumofrepeatreads += 1
del readid
print('Identified ' + str(sumofrepeatreads) +
      ' reads that mapped to repeats for unique and multimappers.')

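The block above gives every read a signature, a comma-terminated string of the numeric keys of all repeats it aligned to, and then counts how many reads share each signature. A small sketch of that idea on in-memory toy data follows. `count_signatures`, the repeat names, and the read names are made up for illustration, and the sketch relies on dictionaries preserving insertion order (Python 3.7+) to stand in for the order of `repeat_list`.

```python
from collections import Counter, defaultdict


def count_signatures(hits_by_repeat, repeat_key):
    """Count reads per signature (comma-terminated string of repeat keys).

    hits_by_repeat maps a repeat name to the read IDs that aligned to it.
    """
    signature = defaultdict(str)
    for rep, reads in hits_by_repeat.items():
        for read in reads:
            signature[read] += str(repeat_key[rep]) + ','
    return Counter(signature.values())


if __name__ == '__main__':
    # toy data: read r2 aligns to both repeats
    toy_key = {'L1': 0, 'AluY': 1}
    toy_hits = {'L1': ['r1', 'r2'], 'AluY': ['r2', 'r3']}
    toy_counts = count_signatures(toy_hits, toy_key)
    print(dict(toy_counts), sum(toy_counts.values()))
```

The sum of the counter values plays the role of `sumofrepeatreads`: each read is counted once, however many repeats it maps to.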
##############################################################################
print("Conducting final calculations...")


def convert(x):
    '''
    Convert a comma-separated string of numeric repeat keys into repeat
    names and store them in the global `repname`, each name prefixed by
    os.path.sep (a forward slash on POSIX systems).
    '''
    x = x.strip(',')
    x = x.split(',')
    global repname
    repname = ""
    for i in x:
        repname = repname + os.path.sep + rev_repeat_key[int(i)]


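`convert` communicates its result through the global `repname`. A variant that returns the string instead could look like the sketch below. `repeat_names` is a hypothetical name, and the lookup table is passed in explicitly as an assumption of the sketch instead of being read from module scope.

```python
import os


def repeat_names(signature, rev_repeat_key):
    """Return the repeat names for a signature, each prefixed by os.path.sep."""
    keys = signature.strip(',').split(',')
    return ''.join(os.path.sep + rev_repeat_key[int(k)] for k in keys)
```

With such a helper, a caller would write `repcounts[repeat_names(key, rev_repeat_key)] = counts[key]` and would not depend on shared state.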
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
386 # building the total counts for repeat element enrichment...
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
387 for x in counts.keys():
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
388 count = counts[x]
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
389 x = x.strip(',')
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
390 x = x.split(',')
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
391 for i in x:
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
392 reptotalcounts[rev_repeat_key[int(i)]] += int(count)
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
393 # building the fractional counts for repeat element enrichment...
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
394 for x in counts.keys():
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
395 count = counts[x]
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
396 x = x.strip(',')
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
397 x = x.split(',')
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
398 splits = len(x)
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
399 for i in x:
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
400 fractionalcounts[rev_repeat_key[int(i)]] += float(
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
401 numpy.divide(float(count), float(splits)))
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
402 # building categorized table of repeat element enrichment...
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
403 repcounts = {}
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
404 repcounts['other'] = 0
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
405 for key in counts.keys():
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
406 convert(key)
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
407 repcounts[repname] = counts[key]
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
408 # building the total counts for class enrichment...
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
409 for key in reptotalcounts.keys():
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
410 classtotalcounts[repeatclass[key]] += reptotalcounts[key]
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
411 # building total counts for family enrichment...
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
412 for key in reptotalcounts.keys():
f6f0f1e5e940 planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/repenrich commit 61e203df0be5ed877ff92b917c7cde6eeeab8310
artbio
parents:
diff changeset
413 familytotalcounts[repeatfamily[key]] += reptotalcounts[key]
# building unique counts table
repcounts2 = {}
for rep in repeat_list:
    if "/" + rep in repcounts:
        repcounts2[rep] = repcounts["/" + rep]
    else:
        repcounts2[rep] = 0
# building fractional counts for class enrichment...
for key in fractionalcounts.keys():
    classfractionalcounts[repeatclass[key]] += fractionalcounts[key]
# building fractional counts for family enrichment...
for key in fractionalcounts.keys():
    familyfractionalcounts[repeatfamily[key]] += fractionalcounts[key]

##############################################################################
print('Writing final output and removing intermediate files...')
# print output to file of the categorized counts and total overlapping counts:
if allcountmethod == "TRUE":
    fout1 = open(outputfolder + os.path.sep + outputfile_prefix
                 + '_total_counts.txt', 'w')
    for key in sorted(reptotalcounts.keys()):
        fout1.write(str(key) + '\t' + repeatclass[key] + '\t' +
                    repeatfamily[key] + '\t' + str(reptotalcounts[key])
                    + '\n')
    fout2 = open(outputfolder + os.path.sep + outputfile_prefix
                 + '_class_total_counts.txt', 'w')
    for key in sorted(classtotalcounts.keys()):
        fout2.write(str(key) + '\t' + str(classtotalcounts[key]) + '\n')
    fout3 = open(outputfolder + os.path.sep + outputfile_prefix
                 + '_family_total_counts.txt', 'w')
    for key in sorted(familytotalcounts.keys()):
        fout3.write(str(key) + '\t' + str(familytotalcounts[key]) + '\n')
    fout4 = open(outputfolder + os.path.sep + outputfile_prefix +
                 '_unique_counts.txt', 'w')
    for key in sorted(repcounts2.keys()):
        fout4.write(str(key) + '\t' + repeatclass[key] + '\t' +
                    repeatfamily[key] + '\t' + str(repcounts2[key]) + '\n')
    fout5 = open(outputfolder + os.path.sep + outputfile_prefix
                 + '_class_fraction_counts.txt', 'w')
    for key in sorted(classfractionalcounts.keys()):
        fout5.write(str(key) + '\t' + str(classfractionalcounts[key]) + '\n')
    fout6 = open(outputfolder + os.path.sep + outputfile_prefix +
                 '_family_fraction_counts.txt', 'w')
    for key in sorted(familyfractionalcounts.keys()):
        fout6.write(str(key) + '\t' + str(familyfractionalcounts[key]) + '\n')
    fout7 = open(outputfolder + os.path.sep + outputfile_prefix
                 + '_fraction_counts.txt', 'w')
    for key in sorted(fractionalcounts.keys()):
        fout7.write(str(key) + '\t' + repeatclass[key] + '\t' +
                    repeatfamily[key] + '\t' + str(int(fractionalcounts[key]))
                    + '\n')
    fout1.close()
    fout2.close()
    fout3.close()
    fout4.close()
    fout5.close()
    fout6.close()
    fout7.close()
else:
    fout1 = open(outputfolder + os.path.sep + outputfile_prefix +
                 '_class_fraction_counts.txt', 'w')
    for key in sorted(classfractionalcounts.keys()):
        fout1.write(str(key) + '\t' + str(classfractionalcounts[key]) + '\n')
    fout2 = open(outputfolder + os.path.sep + outputfile_prefix +
                 '_family_fraction_counts.txt', 'w')
    for key in sorted(familyfractionalcounts.keys()):
        fout2.write(str(key) + '\t' + str(familyfractionalcounts[key]) + '\n')
    fout3 = open(outputfolder + os.path.sep + outputfile_prefix +
                 '_fraction_counts.txt', 'w')
    for key in sorted(fractionalcounts.keys()):
        fout3.write(str(key) + '\t' + repeatclass[key] + '\t' +
                    repeatfamily[key] + '\t' + str(int(fractionalcounts[key]))
                    + '\n')
    fout1.close()
    fout2.close()
    fout3.close()
##############################################################################
# Remove large intermediate files
if os.path.exists(outputfolder + os.path.sep + outputfile_prefix +
                  '_regionsorter.txt'):
    os.remove(outputfolder + os.path.sep + outputfile_prefix +
              '_regionsorter.txt')
if os.path.exists(outputfolder + os.path.sep + 'pair1_bowtie'):
    shutil.rmtree(outputfolder + os.path.sep + 'pair1_bowtie')
if os.path.exists(outputfolder + os.path.sep + 'pair2_bowtie'):
    shutil.rmtree(outputfolder + os.path.sep + 'pair2_bowtie')
if os.path.exists(outputfolder + os.path.sep + 'sorted_bowtie'):
    shutil.rmtree(outputfolder + os.path.sep + 'sorted_bowtie')
print("... Done")
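The output stage of the script repeats one pattern for every table: open a file under the output folder, write one tab-separated line per key in sorted order, then close the handle at the end of the branch. The following is a minimal sketch of how that pattern could be factored into one function. The helper name `write_counts_table` and its parameters are hypothetical and are not part of RepEnrich.py; the toy dictionaries only stand in for the script's `repeatclass`, `repeatfamily` and count tables.

```python
import os
import tempfile


def write_counts_table(folder, prefix, suffix, counts, extra_columns=()):
    """Write one tab-separated line per key of `counts`, in sorted key order.

    `extra_columns` is a sequence of dicts (for example class and family
    lookups) whose values are placed between the key and its count.
    Hypothetical helper, not part of RepEnrich.py.
    """
    path = os.path.join(folder, prefix + suffix)
    # The with-block closes the file even if a write raises an exception.
    with open(path, 'w') as handle:
        for key in sorted(counts):
            fields = [str(key)]
            fields.extend(str(column[key]) for column in extra_columns)
            fields.append(str(counts[key]))
            handle.write('\t'.join(fields) + '\n')
    return path


if __name__ == '__main__':
    # Toy data; the real script fills these dictionaries from its inputs.
    toy_class = {'L1': 'LINE', 'AluY': 'SINE'}
    toy_family = {'L1': 'L1', 'AluY': 'Alu'}
    toy_counts = {'L1': 3, 'AluY': 5}
    out_dir = tempfile.mkdtemp()
    written = write_counts_table(out_dir, 'sample', '_fraction_counts.txt',
                                 toy_counts, [toy_class, toy_family])
    with open(written) as handle:
        print(handle.read())
```

The helper writes counts as given, so a caller wanting the script's integer truncation of fractional counts would pass values already converted with `int()`.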