annotate mut2read.py @ 0:8d29173d49a9 draft

"planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/variant_analyzer commit 5a438f76d0ecb6478f82dae6b9596bc7f5a4f4e8"
author iuc
date Wed, 20 Nov 2019 17:47:35 -0500
parents
children 3556001ff2db
#!/usr/bin/env python

"""mut2read.py

Author -- Gundula Povysil
Contact -- povysil@bioinf.jku.at

Takes a tabular file with mutations and a BAM file as input and writes
the tags of all reads that carry a mutation to a user-specified JSON file.
Also creates a FASTQ file of the reads that belong to tags with a mutation.

=======    ==========    =================    ================================
Version    Date          Author               Description
0.2.1      2019-10-27    Gundula Povysil      -
=======    ==========    =================    ================================

USAGE: python mut2read.py --mutFile DCS_Mutations.tabular --bamFile DCS.bam
                          --familiesFile Aligned_Families.tabular
                          --outputFastq Interesting_Reads.fastq
                          --outputJson tag_count_dict.json
"""
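
The JSON file named in the docstring is written further down with `json.dump((tag_dict, cvrg_dict), f)`. The following is a minimal sketch of that layout, assuming one mutated read; the tag, position and counts are made-up illustrative values, not data from the tool.

```python
import json

# Illustrative values only: the tag, position and counts are invented.
# tag_dict maps a read tag to {"chrom#pos": alternate base}.
tag_dict = {"AAAACCCCGGGGTTTT": {"chr1#12345": "T"}}
# cvrg_dict maps "chrom#pos" to (ref count, alt count, median DCS length).
cvrg_dict = {"chr1#12345": (10, 2, 150.0)}

# The script dumps both dictionaries as one tuple.
serialized = json.dumps((tag_dict, cvrg_dict))

# JSON has no tuple type, so the pair and the inner tuple come back as lists.
loaded_tags, loaded_cvrg = json.loads(serialized)
print(loaded_tags["AAAACCCCGGGGTTTT"]["chr1#12345"])
print(loaded_cvrg["chr1#12345"])
```

A consumer of the file should therefore index the coverage entry by position (0 for ref, 1 for alt, 2 for the median length) rather than expect a tuple.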

import argparse
import json
import os
import sys

import numpy as np
import pysam


def make_argparser():
    parser = argparse.ArgumentParser(description='Takes a tabular file with mutations and a BAM file as input and prints all tags of reads that carry the mutation to a user specified output file and creates a fastq file of reads of tags with mutation.')
    parser.add_argument('--mutFile',
                        help='TABULAR file with DCS mutations.')
    parser.add_argument('--bamFile',
                        help='BAM file with aligned DCS reads.')
    parser.add_argument('--familiesFile',
                        help='TABULAR file with aligned families.')
    parser.add_argument('--outputFastq',
                        help='Output FASTQ file of reads with mutations.')
    parser.add_argument('--outputJson',
                        help='Output JSON file to store collected data.')
    return parser
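
As a rough illustration of how a parser like this behaves, the sketch below rebuilds the five options under a hypothetical name (`build_sketch_parser`, not part of the script) and parses an example command line. The file names are placeholders borrowed from the docstring.

```python
import argparse


def build_sketch_parser():
    # Hypothetical stand-in that registers the same five option names.
    parser = argparse.ArgumentParser(description='mut2read option sketch')
    for flag in ('--mutFile', '--bamFile', '--familiesFile',
                 '--outputFastq', '--outputJson'):
        parser.add_argument(flag)
    return parser


# Placeholder command line; argv[0] is the script name and is skipped.
argv = ['mut2read.py',
        '--mutFile', 'DCS_Mutations.tabular',
        '--bamFile', 'DCS.bam',
        '--familiesFile', 'Aligned_Families.tabular',
        '--outputFastq', 'Interesting_Reads.fastq']
args = build_sketch_parser().parse_args(argv[1:])

print(args.mutFile)
# No option is marked required, so an omitted one is simply None.
print(args.outputJson)
```

Because an omitted option yields `None` instead of a parser error, a caller may want to check for missing values before passing them on to file operations.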


def mut2read(argv):
    parser = make_argparser()
    args = parser.parse_args(argv[1:])

    file1 = args.mutFile
    file2 = args.bamFile
    file3 = args.familiesFile
    outfile = args.outputFastq
    json_file = args.outputJson

    if not os.path.isfile(file1):
        sys.exit("Error: Could not find '{}'".format(file1))

    if not os.path.isfile(file2):
        sys.exit("Error: Could not find '{}'".format(file2))

    if not os.path.isfile(file3):
        sys.exit("Error: Could not find '{}'".format(file3))

    # read mut file
    with open(file1, 'r') as mut:
        # dtype=str reads every column as text
        mut_array = np.genfromtxt(mut, skip_header=1, delimiter='\t', comments='#', dtype=str)

    # read dcs bam file
    # pysam.index(file2)
    bam = pysam.AlignmentFile(file2, "rb")

    # get tags
    tag_dict = {}
    cvrg_dict = {}

    # a file with a single mutation is read as a 1-D array; turn it into one row
    if mut_array.ndim == 1:
        mut_array = mut_array.reshape((1, len(mut_array)))

    for m in range(len(mut_array[:, 0])):
        print(str(m + 1) + " of " + str(len(mut_array[:, 0])))
        chrom = mut_array[m, 1]
        stop_pos = int(mut_array[m, 2])
        chrom_stop_pos = str(chrom) + "#" + str(stop_pos)
        ref = mut_array[m, 9]
        alt = mut_array[m, 10]

        dcs_len = []

        # stop_pos is 1-based; pileup columns report 0-based reference positions
        for pileupcolumn in bam.pileup(str(chrom), stop_pos - 2, stop_pos, max_depth=100000000):

            if pileupcolumn.reference_pos == stop_pos - 1:
                count_alt = 0
                count_ref = 0
                count_indel = 0
                count_n = 0
                count_other = 0
                count_lowq = 0
                print("unfiltered reads=", pileupcolumn.n, "filtered reads=", len(pileupcolumn.pileups),
                      "difference= ", len(pileupcolumn.pileups) - pileupcolumn.n)
                for pileupread in pileupcolumn.pileups:
                    if not pileupread.is_del and not pileupread.is_refskip:
                        # query position is None if is_del or is_refskip is set.
                        nuc = pileupread.alignment.query_sequence[pileupread.query_position]
                        dcs_len.append(len(pileupread.alignment.query_sequence))
                        if nuc == alt:
                            count_alt += 1
                            tag = pileupread.alignment.query_name
                            if tag in tag_dict:
                                tag_dict[tag][chrom_stop_pos] = alt
                            else:
                                tag_dict[tag] = {}
                                tag_dict[tag][chrom_stop_pos] = alt
                        elif nuc == ref:
                            count_ref += 1
                        elif nuc == "N":
                            count_n += 1
                        elif nuc == "lowQ":
                            # nuc is a single base, so this branch is never taken
                            count_lowq += 1
                        else:
                            count_other += 1
                    else:
                        count_indel += 1
                dcs_median = np.median(np.array(dcs_len))
                cvrg_dict[chrom_stop_pos] = (count_ref, count_alt, dcs_median)

                print("coverage at pos %s = %s, ref = %s, alt = %s, other bases = %s, N = %s, indel = %s, low quality = %s, median length of DCS = %s\n" %
                      (pileupcolumn.reference_pos, count_ref + count_alt, count_ref, count_alt, count_other, count_n,
                       count_indel, count_lowq, dcs_median))
    bam.close()

    with open(json_file, "w") as f:
        json.dump((tag_dict, cvrg_dict), f)

    # create fastq from aligned reads
    with open(outfile, 'w') as out:
        with open(file3, 'r') as families:
            for line in families:
                line = line.rstrip('\n')
                splits = line.split('\t')
                tag = splits[0]

                if tag in tag_dict:
                    str1 = splits[4]
                    curr_seq = str1.replace("-", "")
                    str2 = splits[5]
iuc
parents:
diff changeset
146 curr_qual = str2.replace(" ", "")
8d29173d49a9 "planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/variant_analyzer commit 5a438f76d0ecb6478f82dae6b9596bc7f5a4f4e8"
iuc
parents:
diff changeset
147
8d29173d49a9 "planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/variant_analyzer commit 5a438f76d0ecb6478f82dae6b9596bc7f5a4f4e8"
iuc
parents:
diff changeset
148 out.write("@" + splits[0] + "." + splits[1] + "." + splits[2] + "\n")
8d29173d49a9 "planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/variant_analyzer commit 5a438f76d0ecb6478f82dae6b9596bc7f5a4f4e8"
iuc
parents:
diff changeset
149 out.write(curr_seq + "\n")
8d29173d49a9 "planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/variant_analyzer commit 5a438f76d0ecb6478f82dae6b9596bc7f5a4f4e8"
iuc
parents:
diff changeset
150 out.write("+" + "\n")
8d29173d49a9 "planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/variant_analyzer commit 5a438f76d0ecb6478f82dae6b9596bc7f5a4f4e8"
iuc
parents:
diff changeset
151 out.write(curr_qual + "\n")
8d29173d49a9 "planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/variant_analyzer commit 5a438f76d0ecb6478f82dae6b9596bc7f5a4f4e8"
iuc
parents:
diff changeset
152
8d29173d49a9 "planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/variant_analyzer commit 5a438f76d0ecb6478f82dae6b9596bc7f5a4f4e8"
iuc
parents:
diff changeset
153
8d29173d49a9 "planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/variant_analyzer commit 5a438f76d0ecb6478f82dae6b9596bc7f5a4f4e8"
iuc
parents:
diff changeset
154 if __name__ == '__main__':
8d29173d49a9 "planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/variant_analyzer commit 5a438f76d0ecb6478f82dae6b9596bc7f5a4f4e8"
iuc
parents:
diff changeset
155 sys.exit(mut2read(sys.argv))