comparison README.md @ 0:c71db540eb38 draft

planemo upload for repository https://github.com/jackh726/bigtools commit ce6b9f638ebcebcad5a5b10219f252962f30e5cc-dirty
author fubar
date Mon, 01 Jul 2024 02:48:46 +0000
parents
children eb17eb8a3658
comparison
-1:000000000000 0:c71db540eb38
1 ## bigwig peak outlier to bed
2
3 ### July 30 2024 for the VGP
4
5 This code will soon become a Galaxy tool, for building some of the [NIH MARBL T2T assembly polishing](https://github.com/marbl/training) tools as Galaxy workflows.
6
7 The next JBrowse2 tool release will include a plugin for optional colours to distinguish bed features, shown being tested in the screenshots below.
8
9 ### Find and mark BigWig peaks to a bed file for display
10
11 In the spirit of DeepTools, this finds contiguous regions where the bigwig value is either above or below a given quantile,
12 for example 0.99 and 0.01. These quantile cut points are found and applied separately for each chromosome using some [cunning numpy code](http://gregoryzynda.com/python/numpy/contiguous/interval/2019/11/29/contiguous-regions.html).
13
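As a minimal sketch of the idea described above (not the tool's actual code), the following finds runs of values above or below per-chromosome quantile cut points. It assumes the values for one chromosome have already been decoded into a numpy array with one value per base; the function names `contiguous_regions` and `outlier_regions` are hypothetical.

```python
import numpy as np


def contiguous_regions(mask):
    """Return an (n, 2) array of half-open [start, end) index pairs,
    one row per run of True values in a boolean mask."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with False at both ends so runs touching the array edges are closed.
    padded = np.concatenate(([False], mask, [False]))
    # Every change of state is either the start or the end of a run.
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return edges.reshape(-1, 2)


def outlier_regions(values, q_low=0.01, q_high=0.99):
    """Find runs strictly below the low cut point or strictly above the
    high cut point for a single chromosome's values."""
    values = np.asarray(values, dtype=float)
    # nanquantile ignores NaN, which may mark positions with no data.
    low_cut, high_cut = np.nanquantile(values, [q_low, q_high])
    above = contiguous_regions(values > high_cut)
    below = contiguous_regions(values < low_cut)
    return low_cut, high_cut, above, below
```

The start and end indices are 0-based and half-open, which matches bed coordinates when the array holds one value per base starting at position 0.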
14 ![image](https://github.com/fubar2/bigwig_peak_bed/assets/6016266/cdee3a2b-ae31-4282-b744-992c15fb49db)
15
16 ![image](https://github.com/fubar2/bigwig_peak_bed/assets/6016266/59d1564b-0c34-42a3-b437-44332cf1b2f0)
17
18 There are big differences between chromosomes 14, 15, 21, 22 and Y in this "all contigs" view; explanations are welcome:
19
20 ![image](https://github.com/fubar2/bigwig_peak_bed/assets/6016266/162bf681-2977-4eb8-8d6f-9dad5b3931f8)
21
22
23 [pybigtools](https://github.com/jackh726/bigtools) is used for the bigwig interface. Multiple bigwigs can optionally
24 be processed into a single bed file; each bed feature carries its source bigwig name in the label for viewing.
25
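To illustrate how features from several bigwigs might end up in one bed file, here is a small sketch. The label format (`<bigwig name>_<kind>`) and the function names are assumptions made for illustration, not the tool's actual output format.

```python
def regions_to_bed(chrom, regions, bigwig_name, kind):
    """Turn (start, end) pairs into bed feature tuples whose label
    records the source bigwig. The label layout is hypothetical."""
    label = f"{bigwig_name}_{kind}"
    return [(chrom, int(start), int(end), label) for start, end in regions]


def merge_bed(feature_lists):
    """Combine features from several bigwigs and return tab-separated
    bed lines sorted by chromosome name, then start, then end."""
    merged = [feat for feats in feature_lists for feat in feats]
    merged.sort(key=lambda feat: (feat[0], feat[1], feat[2]))
    return ["\t".join(str(field) for field in feat) for feat in merged]
```

Because the label carries the bigwig name, features from different inputs remain distinguishable after they are merged into one track.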
26 ### Note on quantiles per chromosome rather than quantiles for the whole bigwig
27
28 It is not feasible to hold every contig of a decoded bigwig in RAM at once to estimate whole-genome quantiles, so quantiles are estimated per chromosome.
29 Unfortunately, this hides any systematic differences between chromosomes. Sampling across all chromosomes might
30 preserve those differences and may be possible. The actual quantile values across a couple of test bigwigs suggest that
31 there is not much variation between chromosomes, and there is now a tabular report to check them for each input bigwig.
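The sampling idea mentioned above could be sketched with reservoir sampling, which keeps a fixed-size uniform sample while streaming values one chromosome at a time. This is not what the tool currently does; the function names, the sample size, and the choice of Algorithm R are all assumptions for illustration.

```python
import random
import statistics


def reservoir_sample(chunks, k, seed=0):
    """Keep a uniform random sample of at most k values from an iterable
    of chunks (for example one chunk per chromosome), using Algorithm R."""
    rng = random.Random(seed)
    sample = []
    seen = 0
    for chunk in chunks:
        for value in chunk:
            seen += 1
            if len(sample) < k:
                sample.append(value)
            else:
                # Replace a stored value with probability k / seen.
                j = rng.randrange(seen)
                if j < k:
                    sample[j] = value
    return sample


def approx_centiles(chunks, k=10000, seed=0):
    """Estimate the 1st and 99th centiles over all chunks from the sample."""
    sample = reservoir_sample(chunks, k, seed)
    cuts = statistics.quantiles(sample, n=100, method="inclusive")
    return cuts[0], cuts[98]
```

Only `k` values are held in memory at any time, so whole-genome cut points could be estimated without decoding every contig at once. A pure Python loop like this would be slow at genome scale, so a real implementation would probably sample with numpy instead.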