# HG changeset patch # User dongjun # Date 1357851417 18000 # Node ID 95a657f15ba7cba956b65d4d75f47521a61f5b2e # Parent b6d0c6ceda2cb9c78a6eea914f9f8841d587d378 Uploaded diff -r b6d0c6ceda2c -r 95a657f15ba7 mosaics.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/mosaics.xml Thu Jan 10 15:56:57 2013 -0500 @@ -0,0 +1,313 @@ + + + + + + + + R + + + + mosaics_wrapper.pl + ## ChIP file info + $readFileType.chipParams.chip + $readFileType.chipParams.chipFileFormat + ## control file info + $readFileType.controlParams.control + $readFileType.controlParams.controlFileFormat + ## peak file info + $out_peak + $OutfileFormat + ## analysis type + IO + ## optional output + $report_summary + $report_gof + $report_exploratory + ## settings for model fitting and peak calling: required (FALSE, FALSE, 0.05, 200, 50, 0) + $readFileType.pet + $by_chr + $fdrLevel + $fragLen + $binSize + $capping + #if $fitParams.fSettingsType == "preSet" + ## settings for model fitting and peak calling: optional + BIC + automatic + 0.25 + 200 + 50 + 10 + ## setting for parallel computing + TRUE + 8 + #else + $fitParams.signalModel + $fitParams.bgEst + $fitParams.d + $fitParams.maxgap + $fitParams.minsize + $fitParams.thres + $fitParams.parallel + $fitParams.nCore + #end if + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + summary == 1 + + + gof == 1 + + + exploratory == 1 + + + + + +**What it does** + +MOSAiCS is a statistical framework for the analysis of ChIP-seq data and it stands for MOdel-based one and two Sample Analysis and Inference for ChIP-Seq Data. MOSAiCS is based on a flexible parametric mixture modeling approach for detecting peaks (i.e., enriched regions). +MOSAiCS is also available in Bioconductor_ as a R package. 
+We encourage questions or requests regarding MOSAiCS to be posted on our `Google group`_.
+
+Please cite: Kuan PF, Chung D, Pan G, Thomson JA, Stewart R, and Keles S (2011), "`A statistical framework for the analysis of ChIP-Seq data`_," *Journal of the American Statistical Association*, Vol. 106, pp. 891--903.
+
+.. _Bioconductor: http://www.bioconductor.org/help/bioc-views/2.11/bioc/html/mosaics.html
+.. _Google group: http://groups.google.com/group/mosaics_user_group
+.. _A statistical framework for the analysis of ChIP-Seq data: http://pubs.amstat.org/doi/abs/10.1198/jasa.2011.ap09706
+
+------
+
+**Input formats**
+
+MOSAiCS accepts aligned read files of ChIP and control samples as input. Currently, MOSAiCS accepts Eland result, Eland extended, Eland export, Bowtie default, SAM, BED, and CSEM formats for single-end tag (SET) data. For paired-end tag (PET) data, MOSAiCS accepts Eland result and SAM formats.
+
+------
+
+**Outputs**
+
+Peak calling results of MOSAiCS can be exported into BED or GFF file formats, or as a table. Each line of the output file specifies a single peak.
+
+If the output is a table, it has the following columns::
+
+    Column    Description
+    --------  --------------------------------------------------------
+    1         Chromosome of the peak
+    2         Start position of the peak
+    3         End position of the peak
+    4         Width of the peak
+    5         Averaged posterior probability of the peak
+    6         Minimum posterior probability of the peak
+    7         Averaged ChIP tag counts of the peak
+    8         Maximum ChIP tag counts of the peak
+    9         Averaged control tag counts of the peak
+    10        Averaged control tag counts of the peak, scaled by sequencing depth
+    11        Averaged log base 2 ratio of ChIP over input tag counts
+
+If the output is in BED format, it has the following columns::
+
+    Column        Description
+    ------------  --------------------------------------------------------
+    1 chrom       Chromosome of the peak
+    2 chromStart  Start position of the peak
+    3 chromEnd    End position of the peak
+    4 name        Always "MOSAiCS_peak"
+    5 score       Averaged ChIP tag counts of the peak
+
+If the output is in GFF format, it has the following columns::
+
+    Column     Description
+    ---------  --------------------------------------------------------
+    1 seqname  Chromosome of the peak
+    2 source   Always "MOSAiCS"
+    3 feature  Always "MOSAiCS_peak"
+    4 start    Start position of the peak
+    5 end      End position of the peak
+    6 score    Averaged ChIP tag counts of the peak
+    7 strand   Always "."
+    8 frame    Always "."
+    9 group    Always "."
+
+------
+
+**Reports for diagnostics**
+
+*Summary of model fitting and peak calling*: This report provides information about input and output files, parameter settings used for model fitting and peak calling, and a brief summary of peak calling results.
+
+*Goodness of fit (GOF) plots*: This report allows visual comparisons of the fits of the background, one-signal-component, and two-signal-component models with the actual data.
+
+*Plots of exploratory analysis*: This report provides the histograms of ChIP and control samples and the scatter plots of ChIP versus control tag counts.
+
+More details regarding these reports can be found here_.
+
+------
+
+**Settings for model fitting and peak calling**
+
+More details about the tuning of these parameters can be found here_.
+
+.. _here: http://www.bioconductor.org/packages/2.11/bioc/vignettes/mosaics/inst/doc/mosaics-example.pdf
+
+
+
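The column layouts listed under **Outputs** can be illustrated with a short Python sketch that reads the 11-column table output and re-emits each peak in the 5-column BED layout. This is not part of the patch or of MOSAiCS itself. It assumes the table is tab-delimited and may or may not carry a header row; neither is stated in the help text. The field names in `COLUMNS` and the helpers `parse_peak_table` and `to_bed_line` are hypothetical names chosen for illustration, and coordinates are copied through without any adjustment.

```python
import csv
import io

# Hypothetical field names: the help text gives only column descriptions.
COLUMNS = [
    "chrom", "start", "end", "width",
    "ave_post_prob", "min_post_prob",
    "ave_chip_count", "max_chip_count",
    "ave_control_count", "ave_control_count_scaled",
    "ave_log2_ratio",
]


def parse_peak_table(handle):
    """Parse an 11-column peak table (assumed tab-delimited) into dicts."""
    peaks = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != len(COLUMNS):
            continue  # skip blank or malformed lines
        try:
            numbers = [float(value) for value in row[1:]]
        except ValueError:
            continue  # skip a header row, if the file has one
        peak = {"chrom": row[0]}
        peak.update(zip(COLUMNS[1:], numbers))
        # Positions and width are integers in the description above.
        for key in ("start", "end", "width"):
            peak[key] = int(peak[key])
        peaks.append(peak)
    return peaks


def to_bed_line(peak):
    """Format one peak using the 5-column BED layout described above."""
    return "\t".join([
        peak["chrom"],
        str(peak["start"]),
        str(peak["end"]),
        "MOSAiCS_peak",                   # name column is always this string
        format(peak["ave_chip_count"], "g"),  # score: averaged ChIP tag count
    ])


if __name__ == "__main__":
    # Made-up example row, for illustration only.
    sample = "chr1\t100\t299\t200\t0.01\t0.001\t12.5\t30\t2.0\t1.5\t3.1\n"
    for peak in parse_peak_table(io.StringIO(sample)):
        print(to_bed_line(peak))
```

Rows that do not have exactly 11 fields, or whose numeric fields fail to parse, are skipped silently; a stricter reader might raise an error instead.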