annotate bigwig_outlier_bed.py @ 7:c8e22efcaeda draft
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/bigwig_outlier_bed commit 9fa87e27ea819badb876e6d89807a789119b9f53
author | fubar |
---|---|
date | Wed, 24 Jul 2024 08:49:37 +0000 |
parents | eb17eb8a3658 |
children | 032e930ef6a1 |
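rev | changeset | parents | description |
---|---|---|---|
0 | c71db540eb38 | | planemo upload for repository https://github.com/jackh726/bigtools commit ce6b9f638ebcebcad5a5b10219f252962f30e5cc-dirty |
6 | eb17eb8a3658 | 0 | planemo upload commit 1baff96e75def9248afdcf21edec9bdc7ed42b1f-dirty |
7 | c8e22efcaeda | 6 | planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/bigwig_outlier_bed commit 9fa87e27ea819badb876e6d89807a789119b9f53 |
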
"""
Ross Lazarus June 2024 for VGP
Bigwigs are great, but it is hard to reliably "see" small low-coverage or small very-high-coverage regions.
Colouring in JB2 tracks will need a new plugin, so this code will find bigwig regions above and below a chosen percentile point.
0.99 and 0.01 work well in testing with a minimum span of 10 bp.
Multiple bigwigs **with the same reference** can be combined - bed segments will be named appropriately.
Combining multiple references works but is silly because the display will rely on only one reference, so the others will not be shown.
Tricksy numpy method from http://gregoryzynda.com/python/numpy/contiguous/interval/2019/11/29/contiguous-regions.html
Takes about 95 seconds for a 17MB test wiggle.
JBrowse2 bed displays normally ignore the score, so separate low/high bed file outputs could be provided as an option.
Update June 30 2024: wrote a 'no-build' plugin for beds to display red/blue if >0/<0, so those are used for scores.
Bed interval naming must be short for JB2 but needs input bigwig name and (lo or hi).
"""

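The percentile idea in the docstring above can be sketched as follows. This is only an illustration, assuming the cutoffs are plain quantiles of the per-base values; the array contents and the names `coverage`, `top_cut` and `bot_cut` are made up for the example and are not taken from the tool.

```python
import numpy as np

# Hypothetical per-base coverage values for one contig (illustrative only).
coverage = np.array([5, 5, 6, 5, 0, 0, 5, 6, 40, 42, 5, 5], dtype=float)

qhi, qlo = 0.99, 0.01  # quantile points mentioned in the docstring
top_cut = np.quantile(coverage, qhi)  # upper cutoff value
bot_cut = np.quantile(coverage, qlo)  # lower cutoff value

is_high = coverage >= top_cut  # bases at or above the upper cutoff
is_low = coverage <= bot_cut   # bases at or below the lower cutoff
print(top_cut, bot_cut, int(is_high.sum()), int(is_low.sum()))
```

The two boolean arrays are the kind of indicator that a run-finding step can then turn into intervals.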
import argparse
import os
import sys
from pathlib import Path

import numpy as np
import pybigtools


class asciihist:

    def __init__(
        self,
        data,
        bins=10,
        minmax=None,
        str_tag="",
        scale_output=80,
        generate_only=True,
    ):
        """
        https://gist.github.com/bgbg/608d9ef4fd75032731651257fe67fc81
        Create an ASCII histogram from an iterable of numbers.
        Author: Boris Gorelik boris@gorelik.net. Based on http://econpy.googlecode.com/svn/trunk/pytrix/pytrix.py
        License: MIT
        """
        self.data = data
        self.minmax = minmax
        self.str_tag = str_tag
        self.bins = bins
        self.generate_only = generate_only
        self.scale_output = scale_output
        self.itarray = np.asanyarray(self.data)
        if self.minmax == "auto":
            self.minmax = np.percentile(data, [5, 95])
            if self.minmax[0] == self.minmax[1]:
                # for very ugly distributions
                self.minmax = None
        if self.minmax is not None:
            # discard values that are outside minmax range
            mn = self.minmax[0]
            mx = self.minmax[1]
            self.itarray = self.itarray[self.itarray >= mn]
            self.itarray = self.itarray[self.itarray <= mx]

    def draw(self):
        values, counts = np.unique(self.data, return_counts=True)
        if len(values) <= 20:
            self.bins = len(values)
        ret = []
        if self.itarray.size:
            total = len(self.itarray)
            counts, cutoffs = np.histogram(self.itarray, bins=self.bins)
            cutoffs = cutoffs[1:]
            if self.str_tag:
                self.str_tag = "%s " % self.str_tag
            else:
                self.str_tag = ""
            if self.scale_output is not None:
                scaled_counts = counts.astype(float) / counts.sum() * self.scale_output
            else:
                scaled_counts = counts
            footerbar = "{:s}{:s} |{:s} |".format(self.str_tag, "-" * 12, "-" * 12,)
            if self.minmax is not None:
                ret.append(
                    "Trimmed to range (%s - %s)"
                    % (str(self.minmax[0]), str(self.minmax[1]))
                )
            for cutoff, original_count, scaled_count in zip(
                cutoffs, counts, scaled_counts
            ):
                ret.append(
                    "{:s}{:>12.2f} |{:>12,d} | {:s}".format(
                        self.str_tag, cutoff, original_count, "*" * int(scaled_count)
                    )
                )
            ret.append(footerbar)
            ret.append("{:s}{:>12s} |{:>12,d} |".format(self.str_tag, "N=", total))
            ret.append(footerbar)
            ret.append('')
        else:
            ret = []
        if not self.generate_only:
            for line in ret:
                print(line)
        ret = "\n".join(ret)
        return ret


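As a reduced illustration of what the histogram class above does, the sketch below bins values with `np.histogram` and scales the counts into rows of asterisks. The helper name `ascii_bars`, its `width` parameter and the sample data are hypothetical; trimming, tags and footers are left out.

```python
import numpy as np


def ascii_bars(data, bins=4, width=20):
    """Return one text line per histogram bin (hypothetical helper)."""
    counts, edges = np.histogram(np.asanyarray(data), bins=bins)
    # Scale counts so that all bars together span `width` characters.
    scaled = counts.astype(float) / counts.sum() * width
    lines = []
    for upper, count, bar in zip(edges[1:], counts, scaled):
        # Upper bin edge, raw count, then the scaled bar.
        lines.append(
            "{:>8.2f} |{:>6,d} | {:s}".format(upper, int(count), "*" * int(bar))
        )
    return lines


for line in ascii_bars([1, 1, 1, 2, 2, 3, 4, 4, 4, 4]):
    print(line)
```

Each row is labelled with the upper edge of its bin, which mirrors the `cutoffs[1:]` step in `draw`.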
class findOut:

    def __init__(self, args):
        self.bwnames = args.bigwig
        self.bwlabels = args.bigwiglabels
        self.bedwin = args.minwin
        self.outbeds = args.outbeds
        self.bedouthi = args.bedouthi
        self.bedoutlo = args.bedoutlo
        self.bedouthilo = args.bedouthilo
        self.tableoutfile = args.tableoutfile
        self.qlo = None
        self.qhi = None
        if args.outbeds != "outtab":
            self.qhi = args.qhi
            if args.qlo:
                try:
                    f = float(args.qlo)
                    self.qlo = f
                except Exception:
                    print('qlo not provided')
        nbw = len(args.bigwig)
        nlab = len(args.bigwiglabels)
        if nlab < nbw:
            # pad missing labels so every bigwig has one
            self.bwlabels += ["Nolabel"] * (nbw - nlab)
        self.makeBed()

    def processVals(self, bw, isTop):
        """
        Idea from http://gregoryzynda.com/python/numpy/contiguous/interval/2019/11/29/contiguous-regions.html
        Fast segmentation into regions by taking np.diff on the boolean array of over (under) cutpoint indicators in bwex.
        This only gives non-zero values at the segment boundaries where there's a change, so the zeros are all removed in bwexdnz,
        leaving an array of segment start/end positions. That's reshaped into an array of (start, end) coordinate pairs.
        Magical. Fast. Could do the same for means or medians over windows for sparse bigwigs like repeat regions.
        """
        if isTop:
            bwex = np.r_[False, bw >= self.bwtop, False]  # pad with False at both ends
        else:
            bwex = np.r_[False, bw <= self.bwbot, False]
        bwexd = np.diff(bwex)
        bwexdnz = bwexd.nonzero()[0]
        bwregions = np.reshape(bwexdnz, (-1, 2))
        return bwregions

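The segmentation trick described in the docstring above can be tried on a toy array. This is a standalone sketch: the function name `contiguous_regions`, the sample values and the `minwin` filter are illustrative assumptions, not part of the tool.

```python
import numpy as np


def contiguous_regions(mask):
    """Return an (n, 2) array of [start, end) index pairs for runs of True."""
    padded = np.r_[False, mask, False]      # every run now has both edges
    changes = np.diff(padded).nonzero()[0]  # indices where the flag flips
    return changes.reshape(-1, 2)


values = np.array([1, 9, 9, 1, 1, 9, 1, 9])
regions = contiguous_regions(values >= 9)
print(regions.tolist())

# Optional filter: keep only runs at least `minwin` long (hypothetical setting).
minwin = 2
long_enough = regions[(regions[:, 1] - regions[:, 0]) >= minwin]
print(long_enough.tolist())
```

Padding with `False` at both ends guarantees an even number of flips, so the reshape into pairs always succeeds, and the end coordinate is exclusive, which matches BED conventions.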
    def writeBed(self, bed, bedfname):
        """
        Write the sorted bed rows, potentially from multiple bigwigs, to bedfname.
        """
        bed.sort()
        beds = ["%s\t%d\t%d\t%s\t%d" % x for x in bed]
        with open(bedfname, "w") as bedf:
            bedf.write("\n".join(beds))
            bedf.write("\n")

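To make the row layout used by `writeBed` concrete, here is a small sketch that formats 5-tuples into tab-separated BED lines. The interval values, the `sample_hi`/`sample_lo` names and the +1/-1 scores are invented for illustration, and an in-memory buffer stands in for the output file.

```python
import io

# Hypothetical intervals: (contig, start, end, name, score).
bed = [
    ("chr2", 100, 120, "sample_hi", 1),
    ("chr1", 500, 530, "sample_lo", -1),
    ("chr1", 10, 25, "sample_hi", 1),
]
bed.sort()  # tuples sort by contig, then start, then end
lines = ["%s\t%d\t%d\t%s\t%d" % row for row in bed]

out = io.StringIO()  # stands in for the opened bed file
out.write("\n".join(lines) + "\n")
print(out.getvalue(), end="")
```

Note that sorting tuples orders contig names as strings, so a name such as `chr10` would sort before `chr2`.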
    def makeTableRow(self, bw, bwlabel, chr):
        """
        Called for every contig; kept as a separate method because it is messy inline.
        """
        bwmean = np.mean(bw)
        bwstd = np.std(bw)
        bwmax = np.max(bw)
        nrow = np.size(bw)
        bwmin = np.min(bw)
        row = "%s\t%s\t%d\t%f\t%f\t%f\t%f" % (
            bwlabel,
            chr,
            nrow,
            bwmean,
            bwstd,
            bwmin,
            bwmax,
        )
        if self.qhi is not None:
            row += "\t%.2f" % self.bwtop
        else:
            row += "\tnoqhi"
        if self.qlo is not None:
            row += "\t%.2f" % self.bwbot
        else:
            row += "\tnoqlo"
        return row

0
c71db540eb38
planemo upload for repository https://github.com/jackh726/bigtools commit ce6b9f638ebcebcad5a5b10219f252962f30e5cc-dirty
fubar
parents:
diff
changeset
|
187 def makeBed(self): |
c71db540eb38
planemo upload for repository https://github.com/jackh726/bigtools commit ce6b9f638ebcebcad5a5b10219f252962f30e5cc-dirty
fubar
parents:
diff
changeset
|
188 bedhi = [] |
c71db540eb38
planemo upload for repository https://github.com/jackh726/bigtools commit ce6b9f638ebcebcad5a5b10219f252962f30e5cc-dirty
fubar
parents:
diff
changeset
|
189 bedlo = [] |
6
eb17eb8a3658
planemo upload commit 1baff96e75def9248afdcf21edec9bdc7ed42b1f-dirty
fubar
parents:
0
diff
changeset
|
190 restab = [] |
0
c71db540eb38
planemo upload for repository https://github.com/jackh726/bigtools commit ce6b9f638ebcebcad5a5b10219f252962f30e5cc-dirty
fubar
parents:
diff
changeset
|
191 bwlabels = self.bwlabels |
c71db540eb38
planemo upload for repository https://github.com/jackh726/bigtools commit ce6b9f638ebcebcad5a5b10219f252962f30e5cc-dirty
fubar
parents:
diff
changeset
|
192 bwnames = self.bwnames |
7
c8e22efcaeda
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/bigwig_outlier_bed commit 9fa87e27ea819badb876e6d89807a789119b9f53
fubar
parents:
6
diff
changeset
|
193 bwnames.sort() |
6
eb17eb8a3658
planemo upload commit 1baff96e75def9248afdcf21edec9bdc7ed42b1f-dirty
fubar
parents:
0
diff
changeset
|
194 reshead = "bigwig\tcontig\tn\tmean\tstd\tmin\tmax\tqtop\tqbot" |
        for i, bwname in enumerate(bwnames):
            bwlabel = bwlabels[i].replace(" ", "")
            fakepath = "in%d.bw" % i
            if os.path.isfile(fakepath):
                os.remove(fakepath)
            p = Path(fakepath)
            p.symlink_to(bwname) # required by pybigtools (!)
            bwf = pybigtools.open(fakepath)
            chrlist = bwf.chroms()
            chrs = list(chrlist.keys())
            for chr in chrs:
                first_few = None
                bw = bwf.values(chr)
                values, counts = np.unique(bw, return_counts=True)
                nvalues = len(values)
                if nvalues <= 20:
                    histo = '\n'.join(['%s: %f occurs %d times' % (chr, values[x], counts[x]) for x in range(len(values))])
                else:
                    last10 = range(nvalues-10, nvalues)
                    first_few = ['%.2f\t%d' % (values[x],counts[x]) for x in range(10)]
                    first_few += ['%.2f\t%d' % (values[x],counts[x]) for x in last10]
                    first_few.insert(0,'First/Last 10 value counts\nValue\tCount')
                    ha = asciihist(data=bw, bins=20, str_tag=chr)
                    histo = ha.draw()
                    histo = '\n'.join(first_few) + '\nHistogram of bigwig values\n' + histo
                bw = bw[~np.isnan(bw)] # some have NaN if parts of a contig not covered
                if self.qhi is not None:
                    self.bwtop = np.quantile(bw, self.qhi)
                    bwhi = self.processVals(bw, isTop=True)
                    for j, seg in enumerate(bwhi):
                        seglen = seg[1] - seg[0]
                        if seglen >= self.bedwin:
                            score = np.sum(bw[seg[0]:seg[1]])/float(seglen)
                            bedhi.append(
                                (
                                    chr,
                                    seg[0],
                                    seg[1],
                                    "%s_%d" % (bwlabel, score),
                                    score,
                                )
                            )
                if self.qlo is not None:
                    self.bwbot = np.quantile(bw, self.qlo)
                    bwlo = self.processVals(bw, isTop=False)
                    for j, seg in enumerate(bwlo):
                        # seglen must be computed for each low segment; it was previously
                        # left over from the high-side loop, or undefined when qhi is None
                        seglen = seg[1] - seg[0]
                        if seglen >= self.bedwin:
                            score = -1 * np.sum(bw[seg[0]:seg[1]])/float(seglen)
                            bedlo.append(
                                (
                                    chr,
                                    seg[0],
                                    seg[1],
                                    "%s_%d" % (bwlabel, score),
                                    score,
                                )
                            )
                if self.tableoutfile:
                    row = self.makeTableRow(bw, bwlabel, chr)
                    resheadl = reshead.split('\t')
                    rowl = row.split()
                    desc = ['%s\t%s' % (resheadl[x], rowl[x]) for x in range(len(rowl))]
                    desc.insert(0, 'Descriptive measures')
                    descn = '\n'.join(desc)
                    restab.append(descn)
                    restab.append(histo)
            if os.path.isfile(fakepath):
                os.remove(fakepath)
        if self.tableoutfile:
            stable = "\n".join(restab)
            with open(self.tableoutfile, "w") as t:
                t.write(stable)
                t.write("\n")
        some = False
        if self.qlo:
            if self.outbeds in ["outall", "outlo", "outlohi"]:
                self.writeBed(bedlo, self.bedoutlo)
                some = True
        if self.qhi:
            if self.outbeds in ["outall", "outlohi", "outhi"]:
                self.writeBed(bedhi, self.bedouthi)
                some = True
        if self.outbeds in ["outall", "outhilo"]:
            allbed = bedlo + bedhi
            self.writeBed(allbed, self.bedouthilo)
            some = True
        if not ((self.outbeds == 'outtab') or some):
            sys.stderr.write(
                "Invalid configuration - no output could be created. Was qlo missing and only low output requested for example?"
            )
            sys.exit(2)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    a = parser.add_argument
    a("-m", "--minwin", default=10, type=int)
    a("-l", "--qlo", default=None)
    a("-i", "--qhi", default=None, type=float)
    a("--bedouthi", default=None)
    a("--bedoutlo", default=None)
    a("--bedouthilo", default=None)
    a("-w", "--bigwig", nargs="+")
    a("-n", "--bigwiglabels", nargs="+")
    a("-o", "--outbeds", default="outhilo", help="optional high and low combined bed")
    a("-t", "--tableoutfile", default=None)
    args = parser.parse_args()
    findOut(args)
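
The core of makeBed is: take a quantile cutoff per contig, find contiguous runs beyond it, keep runs at least the minimum span long, and score each run by its mean value. The sketch below illustrates that idea only. It is not the tool's processVals, which is not shown in this section. The names quantile and outlier_segments are hypothetical, the strict comparisons (> and <) are an assumption, and it uses only the standard library so it runs without NumPy or pybigtools. The quantile helper uses linear interpolation, which is also the default method of np.quantile.

```python
import math


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def outlier_segments(values, cutoff, min_span, is_top=True):
    """Return (start, end, mean) for runs beyond cutoff that span >= min_span.

    Coordinates are 0-based and half-open, as in BED.
    Assumes NaN positions were removed beforehand, as the script does.
    """
    values = list(values)
    segments = []
    start = None
    # A trailing None sentinel closes any run still open at the last position.
    for pos, v in enumerate(values + [None]):
        inside = v is not None and (v > cutoff if is_top else v < cutoff)
        if inside and start is None:
            start = pos
        elif not inside and start is not None:
            seglen = pos - start  # computed for every segment, never reused
            if seglen >= min_span:
                mean = sum(values[start:pos]) / seglen
                segments.append((start, pos, mean))
            start = None
    return segments


coverage = [5, 5, 5, 50, 60, 70, 5, 5, 0, 0, 0, 0, 5, 5, 90, 5]
hi_cut = quantile(coverage, 0.75)
print(outlier_segments(coverage, hi_cut, min_span=3, is_top=True))
print(outlier_segments(coverage, 1, min_span=3, is_top=False))
```

In this toy input the single-base spike of 90 is dropped by the minimum span filter, which is the same role self.bedwin plays in the script.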