changeset 24:d7c9fd76e41e draft

Uploaded
author bgruening
date Tue, 04 Feb 2014 09:12:07 -0500
parents 8c452f37c896
children d2898b81b912
files bamCompare.xml bamCorrelate.xml bamCoverage.xml bamFingerprint.xml computeGCBias.xml computeMatrix.xml correctGCBias.xml deepTools_macros.xml heatmapper.xml profiler.xml
diffstat 10 files changed, 48 insertions(+), 60 deletions(-)
--- a/bamCompare.xml	Tue Feb 04 03:38:20 2014 -0500
+++ b/bamCompare.xml	Tue Feb 04 09:12:07 2014 -0500
@@ -74,10 +74,10 @@
     </command>
 
     <inputs>
-        <param name="bamFile1" format="bam" type="data" label="Treatment BAM file"
+        <param name="bamFile1" format="bam" type="data" label="First BAM file (e.g. treated sample)"
             help="The BAM file must be sorted."/>
 
-        <param name="bamFile2" format="bam" type="data" label="BAM file"
+        <param name="bamFile2" format="bam" type="data" label="Second BAM file (e.g. control sample)"
             help="The BAM file must be sorted."/>
 
         <param name="fragmentLength" type="integer" value="300" min="1"
@@ -92,7 +92,7 @@
             <param name="method" type="select" 
                 label="Method to use for scaling the largest sample to the smallest">
                 <option value="readCount" selected="true">read count</option>
-                <option value="SES">signal extraction scaling (SES)</option>
+                <option value="SES">signal extraction scaling (SES), check the bamFingerprint plot before using it!</option>
                 <option value="own">enter own scaling factors</option>
             </param>
             <when value="SES">
@@ -205,9 +205,7 @@
 .. image:: $PATH_TO_IMAGES/norm_IGVsnapshot_indFiles.png
 
 
-You can find more details in the `bamCompare wiki`_.
-
-.. _bamCompare wiki: https://github.com/fidelram/deepTools/wiki/Normalizations#wiki-bamCompare
+You can find more details on the bamCompare wiki page: https://github.com/fidelram/deepTools/wiki/Normalizations#wiki-bamCompare
 
 
 **Output files**:
--- a/bamCorrelate.xml	Tue Feb 04 03:38:20 2014 -0500
+++ b/bamCorrelate.xml	Tue Feb 04 09:12:07 2014 -0500
@@ -142,23 +142,21 @@
 is to check the correlation between replicates or published data sets.
 
 The tool splits the genomes into bins of given length. For each bin, the number of reads
-found in each BAM file is counted and a correlation is computed for all
-pairs of BAM files.
+found in each BAM file is counted and a correlation (either Pearson or Spearman) is computed for all
+pairs of BAM files. Finally, a heatmap is drawn based on the similarity of the samples.
 
 
 .. image:: $PATH_TO_IMAGES/QC_bamCorrelate_humanSamples.png
    :alt: Heatmap of RNA Polymerase II ChIP-seq
 
 
-You can find more details in the `bamCorrelate wiki`_.
-
-.. _bamCorrelate wiki: https://github.com/fidelram/deepTools/wiki/Normalizations#wiki-bamCompare
+You can find more details on the bamCorrelate wiki page: https://github.com/fidelram/deepTools/wiki/QC#wiki-bamCorrelate
 
 
 **Output files**:
 
-- diagnostic plot produced by bamCorrelate is a clustered heatmap displaying the values for each pair-wise correlation, see below for an example
-- data matrix (optional) in case you want to plot the correlation values using a different program, e.g. R, this matrix can be used
+- **diagnostic plot**: clustered heatmap displaying the values for each pair-wise correlation, see below for an example
+- **data matrix** (optional): if you want to plot the correlation values using a different program, e.g. R, this matrix can be used
 
 
 -----
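Reviewer sketch (not part of the changeset): the bamCorrelate help text above describes counting reads per bin and computing a Pearson or Spearman correlation for every pair of BAM files. The following is a minimal pure-Python illustration of that idea, assuming the per-bin read counts are already available as plain lists. The function names and the sample data are hypothetical and are not taken from deepTools; the tool's own implementation may differ.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long lists of bin counts.

    Raises ZeroDivisionError if either list is constant (zero variance).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def correlation_matrix(samples, method=pearson):
    """Correlate every pair of samples; keys are (name_a, name_b) tuples."""
    names = list(samples)
    return {(a, b): method(samples[a], samples[b]) for a in names for b in names}


# Hypothetical per-bin read counts for three BAM files.
counts = {
    "rep1": [12, 0, 30, 7],
    "rep2": [10, 1, 28, 9],
    "input": [5, 6, 5, 6],
}
matrix = correlation_matrix(counts, method=spearman)
```

A heatmap like the one mentioned in the help text would then be drawn from the values in `matrix`; the plotting step is left out here.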
--- a/bamCoverage.xml	Tue Feb 04 03:38:20 2014 -0500
+++ b/bamCoverage.xml	Tue Feb 04 09:12:07 2014 -0500
@@ -133,18 +133,16 @@
 
 **What it does**
 
-Given a BAM file, this tool generates a bigWig or bedGraph file of fragment or read coverages. 
+Given a BAM file, this tool generates a bigWig or bedGraph file with genome-wide fragment or read coverage. 
 The way the method works is by first calculating all the number of reads (either extended to match the fragment length or not) 
-that overlap each bin in the genome. Bins with zero counts are skipped, i.e. not added to the output file. 
+that overlap each bin (a region of fixed length, e.g. 25 bp) in the genome. Bins with zero counts are skipped, i.e. not added to the output file. 
 The resulting read counts can be normalized using either a given scaling factor, the RPKM formula or to get a 1x depth of coverage (RPGC).
 
 
 .. image:: $PATH_TO_IMAGES/norm_IGVsnapshot_indFiles.png
 
 
-You can find more details in the `bamCoverage wiki`_.
-
-.. _bamCoverage wiki: https://github.com/fidelram/deepTools/wiki/Normalizations#wiki-bamCoverage
+You can find more details on the bamCoverage wiki page: https://github.com/fidelram/deepTools/wiki/Normalizations#wiki-bamCoverage
 
 
 **Output files**:
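Reviewer sketch (not part of the changeset): the bamCoverage help text above mentions per-bin read counts, skipping of empty bins, and RPKM normalization. The snippet below illustrates those three points under stated assumptions: the counts are given as a plain list for consecutive bins on one chromosome, and RPKM is taken as reads per bin divided by (mapped reads in millions times bin length in kb). The function name `rpkm_bins` is hypothetical and is not a deepTools API.

```python
def rpkm_bins(bin_counts, total_mapped_reads, bin_length=25):
    """Return (start, end, rpkm) tuples for every bin with at least one read.

    Assumed formula: count / (total_mapped_reads / 1e6 * bin_length / 1e3).
    Bins with zero counts are skipped, mirroring the help text.
    """
    per_million = total_mapped_reads / 1_000_000
    per_kb = bin_length / 1_000
    records = []
    for index, count in enumerate(bin_counts):
        if count == 0:
            continue  # empty bins are not written to the output
        start = index * bin_length
        records.append((start, start + bin_length, count / (per_million * per_kb)))
    return records


# Hypothetical counts for four consecutive 1 kb bins, 1 million mapped reads.
coverage = rpkm_bins([0, 5, 0, 10], total_mapped_reads=1_000_000, bin_length=1_000)
```

Each tuple corresponds to one bedGraph-style interval; the two empty bins produce no record.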
--- a/bamFingerprint.xml	Tue Feb 04 03:38:20 2014 -0500
+++ b/bamFingerprint.xml	Tue Feb 04 09:12:07 2014 -0500
@@ -120,11 +120,13 @@
 
 **What it does**
 
-This tool is based on a method developed by Diaz et al. (2012). Stat Appl Genet Mol Biol 11(3).
-The resulting plot can be used to assess the strength of a ChIP (for factors that bind to narrow regions).
+This tool is useful to assess the strength of a ChIP (i.e. how clearly the enrichment signal can be separated from the background signal)
+and it is based on a method developed by Diaz et al. (2012) Stat Appl Genet Mol Biol 11(3).
+
 The tool first samples indexed BAM files and counts all reads overlapping a window (bin) of specified length.
-These counts are then sorted according to their rank and the cumulative sum of read counts are plotted. An ideal input
-with perfect uniform distribution of reads along the genome (i.e. without enrichments in open chromatin etc.) should
+These counts are then sorted according to their rank (the bin with the highest number of reads has the highest rank)
+and the cumulative sum of read counts is plotted. An ideal input (control sample) with perfect uniform distribution of reads
+along the genome (i.e. without enrichments in open chromatin etc.) should
 generate a straight diagonal line. A very specific and strong ChIP enrichment will be indicated by a prominent and steep
 rise of the cumulative sum towards the highest rank. This means that a big chunk of reads from the ChIP sample is located in
 few bins which corresponds to high, narrow enrichments seen for transcription factors.
@@ -133,9 +135,7 @@
 .. image:: $PATH_TO_IMAGES/QC_fingerprint.png
 
 
-You can find more details in the `bamFingerprint wiki`_.
-
-.. _bamFingerprint wiki: https://github.com/fidelram/deepTools/wiki/QC#wiki-bamFingerprint
+You can find more details on the bamFingerprint wiki page: https://github.com/fidelram/deepTools/wiki/QC#wiki-bamFingerprint
 
 
 **Output files**:
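Reviewer sketch (not part of the changeset): the bamFingerprint help text in this file describes sorting per-bin read counts by rank and plotting their cumulative sum. A minimal illustration follows, assuming the bin counts have already been sampled and are given as a list. The function name `fingerprint_curve` and the example counts are hypothetical; the sampling step of the real tool is not reproduced here.

```python
from itertools import accumulate


def fingerprint_curve(bin_counts):
    """Cumulative fraction of reads over bins sorted from lowest to highest count."""
    ranked = sorted(bin_counts)  # the bin with the most reads gets the highest rank
    total = sum(ranked)
    if total == 0:
        raise ValueError("no reads in any bin")
    return [running / total for running in accumulate(ranked)]


# Hypothetical counts: a perfectly uniform input and a sharply enriched ChIP.
uniform_input = fingerprint_curve([10, 10, 10, 10])
sharp_chip = fingerprint_curve([1, 1, 1, 37])
```

With the uniform counts the curve grows by the same amount at every rank (the straight diagonal from the help text), while in the enriched example almost all of the rise happens at the highest rank.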
--- a/computeGCBias.xml	Tue Feb 04 03:38:20 2014 -0500
+++ b/computeGCBias.xml	Tue Feb 04 09:12:07 2014 -0500
@@ -112,7 +112,7 @@
 
 **What it does**
 
-This tool computes the GC bias using the method proposed by Benjamini and Speed (2012). Nucleic Acids Res. (see below for more explanations)
+This tool computes the GC bias using the method proposed by Benjamini and Speed (2012) Nucleic Acids Res. (see below for more explanations)
 The output is used to plot the bias and can also be used later on to correct the bias with the tool correctGCbias.
 There are two plots produced by the tool: a boxplot showing the absolute read numbers per genomic-GC bin and an x-y plot
 depicting the ratio of observed/expected reads per genomic GC content bin.
@@ -132,9 +132,7 @@
 .. image:: $PATH_TO_IMAGES/QC_GCplots_input.png
 
 
-You can find more details in the `computeGCBias wiki`_.
-
-.. _computeGCBias wiki: https://github.com/fidelram/deepTools/wiki/QC#wiki-computeGCbias
+You can find more details on the computeGCBias wiki page: https://github.com/fidelram/deepTools/wiki/QC#wiki-computeGCbias
 
 
 **Output files**:
--- a/computeMatrix.xml	Tue Feb 04 03:38:20 2014 -0500
+++ b/computeMatrix.xml	Tue Feb 04 09:12:07 2014 -0500
@@ -194,23 +194,24 @@
 
 **What it does**
 
-This tool summarizes and prepares an intermediary file
-containing scores associated with genomic regions that can be used
+This tool prepares an intermediary file (a gzipped table of values)
+that contains scores associated with genomic regions that can be used
 afterwards to plot a heatmap or a profile.
 
 Genomic regions can really be anything - genes, parts of genes, ChIP-seq
 peaks, favorite genome regions... as long as you provide a proper file
-in BED or INTERVAL format. This tool can also be used to filter and sort
-regions according to their score.
+in BED or INTERVAL format. If you would like to compare different groups of regions
+(e.g. genes from chromosomes 2 and 3), you can supply more than one BED file, one for each group.
+
+computeMatrix can also be used to filter and sort
+regions according to their score by making use of its advanced output options.
 
 
 .. image:: $PATH_TO_IMAGES/flowChart_computeMatrixetc.png
    :alt: Relationship between computeMatrix, heatmapper and profiler
 
 
-You can find more details in the `computeMatrix wiki`_.
-
-.. _computeMatrix wiki: https://github.com/fidelram/deepTools/wiki/Visualizations#wiki-computeMatrix
+You can find more details on the computeMatrix wiki page: https://github.com/fidelram/deepTools/wiki/Visualizations#wiki-computeMatrix
 
 
 -----
--- a/correctGCBias.xml	Tue Feb 04 03:38:20 2014 -0500
+++ b/correctGCBias.xml	Tue Feb 04 09:12:07 2014 -0500
@@ -45,7 +45,7 @@
         ##--correctedFile $newoutFileName;
         --correctedFile "corrected.bam";
 
-        mv $newoutFileName $outFileName
+        mv "corrected.bam" $outFileName
     </command>
     <inputs>
         <param name="GCbiasFrequenciesFile" type="data" format="tabular" label="Output of computeGCBias" />
@@ -88,13 +88,11 @@
 
 **What it does**
 
-This tool requires the output from computeGCBias to correct the given BAM files according to the method proposed by Benjamini and Speed (2012). Nucleic Acids Res.
-The resulting BAM files can be used in any downstream analyses, but be aware that you should not filter out duplicates from here on.
-
+This tool requires the output from computeGCBias to correct a given BAM file according to the method proposed by
+Benjamini and Speed (2012) Nucleic Acids Res.
+The resulting BAM file can be used in any downstream analyses, but be aware that you should not filter out duplicates from here on.
 
-You can find more details in the `correctGCBias wiki`_.
-
-.. _correctGCBias wiki: https://github.com/fidelram/deepTools/wiki/QC#wiki-correctGCbias
+You can find more details on the correctGCBias wiki page: https://github.com/fidelram/deepTools/wiki/QC#wiki-correctGCbias
 
 
 **Output files**:
--- a/deepTools_macros.xml	Tue Feb 04 03:38:20 2014 -0500
+++ b/deepTools_macros.xml	Tue Feb 04 09:12:07 2014 -0500
@@ -67,7 +67,7 @@
 
         <conditional name="used_multiple_regions">
             <param name="used_multiple_regions_options" type="select" 
-                label="Did you used multiple regions in ComputeMatrix?"
+                label="Did you use multiple regions in computeMatrix?"
                 help="Would you like to cluster the regions according to the similarity of the signal distribution? This is only possible if you used computeMatrix on only one group of regions.">
                 <option value="yes">Yes, I used multiple regions.</option>
                 <option value="no">No, I used only one region.</option>
--- a/heatmapper.xml	Tue Feb 04 03:38:20 2014 -0500
+++ b/heatmapper.xml	Tue Feb 04 09:12:07 2014 -0500
@@ -185,22 +185,21 @@
 
 **What it does**
 
-The heatmapper visualizes scores associated with genomic regions, for example ChIP enrichment values around the TSS of genes. 
-Those values can be visualized individually along each of the regions provided by the user in INTERVAL or BED format. 
-In addition to the heatmap, an average profile plot is plotted on top of the heatmap (can be turned off by the user; 
-it can also be generated separately by the tool profiler). 
-We implemented vast optional parameters and we encourage you to play around with the min/max values displayed in the heatmap as well as 
-with the different coloring options. If you would like to plot heatmaps for different groups of genomic regions individually, 
-e.g. one plot per chromosome, simply supply each group as an individual BED file.
+The heatmapper visualizes scores associated with genomic regions, for example ChIP enrichment values around the TSS of genes.
+Like profiler, it requires that computeMatrix was run first to calculate the values.
+ 
+We implemented a wide range of optional parameters to optimize the visual output and we encourage you to play around with the min/max values displayed in the heatmap as well as 
+with the different coloring options. The most powerful option is the k-means clustering where you simply need to indicate the number of 
+groups with similar read distributions that you expect and the algorithm will do the sorting for you.
+
+Do check the examples on our help page with step-by-step protocols: https://github.com/fidelram/deepTools/wiki/Example-workflows
 
 
 .. image:: $PATH_TO_IMAGES/visual_hm_DmelPolII.png
    :alt: Heatmap of RNA Polymerase II ChIP-seq
 
 
-You can find more details in the `heatmapper wiki`_.
-
-.. _heatmapper wiki: https://github.com/fidelram/deepTools/wiki/Visualizations#wiki-heatmapper
+You can find more details on the tool itself on the heatmapper wiki page: https://github.com/fidelram/deepTools/wiki/Visualizations#wiki-heatmapper
 
 
 -----
--- a/profiler.xml	Tue Feb 04 03:38:20 2014 -0500
+++ b/profiler.xml	Tue Feb 04 09:12:07 2014 -0500
@@ -141,9 +141,9 @@
 **What it does**
 
 This tool plots the average enrichments over all genomic
-regions supplied to computeMarix. It is a very useful complement to the
-heatmapper, especially in cases when you want to compare the scores for
-many different groups. Like heatmapper, profiler does not change the
+regions supplied to computeMatrix. It requires that computeMatrix was successfully run.
+It is a very useful complement to the heatmapper, especially in cases when you want to
+compare the scores for many different groups. Like heatmapper, profiler does not change the
 values that were compute by computeMatrix, but you can choose between
 many different ways to color and display the plots.
 
@@ -152,9 +152,7 @@
    :alt: Meta-gene profile of Rna Polymerase II
 
 
-You can find more details in the `profiler wiki`_.
-
-.. _profiler wiki: https://github.com/fidelram/deepTools/wiki/Visualizations#wiki-profiler
+You can find more details on the profiler wiki page: https://github.com/fidelram/deepTools/wiki/Visualizations#wiki-profiler
 
 
 -----