Toplogical domains (TADs) are large mainly self-interacting domains. Chromatin interactions occur with higher frequency within a TAD as between TADs. More information.
This tool must be used on unmerged matrices (restiction enzyme resolution) produced by hicBuildMatrix and corrected by hicCorrectMatrix.
hicFindTADs computes the TAD regions in two steps: in a first step it computes a TAD-separation score based on a z-score matrix for all bins. The z-score is defined as:
“The absolute value of z represents the distance between the raw score and the population mean in units of the standard deviation. z is negative when the raw score is below the mean, positive when above.” [Source].
In our case the distribution describes the counts per bin of a genomic distance. In a second step the local minima of the TAD-separation score is evaluated with respect to the surrounding bins to assign a p-value. Two multiple testing corrections can be applied to filter the results: Bonferroni or the false discovery rate.
hicFindTADs produces multiple outputs:
It is mandatory to test multiple parameters of TAD calling with hicFindTADs before making conclusions about the number of TADs in a given sample or before comparing TAD calling between multiple conditions. In order to compare numerous TAD calling parameters at once, it is recommended to use pyGenomeTracks, below you can find a plot where multiple TAD calling parameters are displayed for Drosophila melanogaster embryos:
We can see that the fourth set of hicFindTADs parameters with a threshold of 0.001 gives the best results in terms of TAD calling compared to the corrected Hi-C counts distribution and compared to the enrichment of H3K36me3, which is known to be enriched at TAD boundaries in Drosophila melanogaster.
For more information about HiCExplorer please consider our documentation on readthedocs.io