diff hicMergeMatrixBins.xml @ 7:c14ba4629233 draft

planemo upload for repository https://github.com/maxplanck-ie/HiCExplorer/tree/master/galaxy/wrapper/ commit eec0a4d5a7c5ba4ec0fbd2ead8280c3d143bb9d8
author iuc
date Fri, 27 Apr 2018 03:38:09 -0400
parents 8926550196ce
children 49c8055f6830
line wrap: on
line diff
--- a/hicMergeMatrixBins.xml	Wed Mar 07 03:45:11 2018 -0500
+++ b/hicMergeMatrixBins.xml	Fri Apr 27 03:38:09 2018 -0400
@@ -1,5 +1,5 @@
 <tool id="hicexplorer_hicmergematrixbins" name="@BINARY@" version="@WRAPPER_VERSION@.0">
-    <description>Merges bins from a Hi-C matrix</description>
+    <description>merge adjacent bins from a Hi-C contact matrix to reduce its resolution</description>
     <macros>
         <token name="@BINARY@">hicMergeMatrixBins</token>
         <import>macros.xml</import>
@@ -19,7 +19,7 @@
         <expand macro='matrix_h5_cooler_macro' />
 
         <param argument="--numBins" type="integer" min="1" value="3" label="Number of bins to merge" />
-        <param argument="--runningWindow" type="boolean" falsevalue="" truevalue="--runningWindow" label="Merge using a running window of length --numBins" />
+        <param argument="--runningWindow" type="boolean" falsevalue="" truevalue="--runningWindow" label="Set to merge for using a running window of length --numBins. Usually not set." />
         <param name='outputFormat' type='select' label="Output file format">
             <option value='h5' selected='true'>HiCExplorer format</option>
             <option value="cool">cool</option>
@@ -44,31 +44,51 @@
 Change matrix resolution
 ========================
 
-``hicMergeMatrixBins`` is used to decrease the resolution of a matrix. With this tool you can create out of a 100 kb 
+**hicMergeMatrixBins** is used to decrease the resolution of a matrix. With this tool, you can for example create out of a 100 kb
 contact matrix a 1000 kb one:
 
 Number of bins to merge = 10
 
 100 kb * 10 = 1000 kb = 1 Mb
 
-This functionality is useful especially for plotting. The higher the resolution of an contact matrix is, the more likely it is
-to run out of memory while plotting. This is caused by the circumstances that we compute internally with a sparse matrices but 
-to plot we need a dense one. Furthermore, the higher the resolution of a matrix the more detailed it is which can make it 
-difficult to interpret, especially if the read depth of the Hi-C data is not high.
+Depending on the downstream analyses to perform on a Hi-C matrix generated with HiCExplorer, one might need different bin resolutions. For example using ``hicPlotMatrix`` to display chromatin interactions of a whole chromosome will not produce any meaningful vizualisation if it is performed on a matrix at restriction sites resolution (unmerged). Furthermore, the higher the resolution of a matrix, the more detailed it is, which can make it
+difficult to interpret, especially if the read depth of the Hi-C data is not high enough. **hicMergeMatrixBins** address these issues by merging a given number of adjacent bins to reduce Hi-C matrices resolution.
 
-Input
+_________________
+
+Usage
 -----
 
-Parameters
-__________
-- contact matrix to change the resolution on
-- Number of bins to merge
-- running window
+To limit the loss of information, it is mandatory to perform **hicMergeMatrixBins** on matrices prior to any correction and any other bin merging (direct output from ``hicBuildMatrix``). After bin merging, ``hicCorrectMatrix`` must be used for downstream analyses requiring corrected matrices.
+
+_________________
 
 Output
 ------
 
-A contact matrix with the resolution ``original resolution * number of bins``.
+**hicMergeMatrixBins** outputs a Hi-C matrix with reduced resolution.
+
+Below, we will develop the example of a Hi-C matrix in *Drosophila melanogaster* that we want to display at the whole X-chromosome scale and at the scale of a 1Mb region of the X chromosome. To do this, we performed two different bin merging using **hicMergeMatrixBins** on an uncorrected matrix built at the restiction sites resolution using ``hicBuildMatrix``.
+
+Starting from a matrix with bins of a median length of 529bp (restriction enzyme resolution, here DpnII), running **hicMergeMatrixBins** with a number of bins to merge of 3 produced a matrix with bins of a median length of 1661bp, while **hicMergeMatrixBins** with a number of bins to merge of 50 produced a matrix with bins of a median length of 29798bp.
+
+After the correction of these three matrices using ``hicCorrectMatrix``, we plotted them using ``hicPlotMatrix`` at the scale of the whole X-chromosome and at the scale of the X:2000000-3000000 region to see the effect of bin merging on the interactions visualization.
+
+- **Effect of bins merging at the scale of a chromosome:**
+
+.. image:: $PATH_TO_IMAGES/hicMergeMatrixBins_Xchr.png
+   :width: 60 %
+
+When observed altogether, the plots above show that the merging of bins by 50 is the most adequate way to plot interactions for a whole chromosome in *Drosophila melanogaster* when starting from a matrix with bins of a median length of 529bp.
+
+- **Effect of bins merging at the scale of a specific region:**
+
+.. image:: $PATH_TO_IMAGES/hicMergeMatrixBins_Xregion.png
+   :width: 60 %
+
+When observed altogether, the plots above show that the merging of bins by 3 is the most adequate way to plot interactions for a region of 1Mb in Drosophila melanogaster when starting from a matrix with bins of a median length of 529bp.
+
+_________________
 
 | For more information about HiCExplorer please consider our documentation on readthedocs.io_