Dataset formats
The input is in GFF format, and the output is tabular.
What it does
PASS (Poisson Approximation for Statistical Significance) detects significant transcription factor binding sites in the genome from ChIP data. It is probably the only peak-calling method that accurately controls both the false-positive rate and the false discovery rate (FDR) in ChIP data, which matters because different peak-calling algorithms can produce widely discrepant results. At the same time, its power is similar to or better than that of previous methods.
Hints
ChIP-Seq data:
If the data come from ChIP-Seq, convert the ChIP-Seq values into z-scores before using this program. It is also recommended to group read counts within a neighborhood, e.g. in tiled windows of 30 bp, so that the ChIP-Seq data resemble ChIP-chip data in format.
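The preprocessing described above can be sketched in Python. This is a minimal illustration, not part of PASS itself: the function names, the "ChIPSeq" source label, and the choice of a plain standardization (subtract the mean, divide by the standard deviation) are all assumptions, since the hint does not say which z-score transformation to use. Read positions are assumed to be 1-based.

```python
from statistics import mean, pstdev


def bin_reads(positions, start, end, window=30):
    """Count reads in fixed-width windows tiling the region [start, end)."""
    n_windows = (end - start + window - 1) // window
    counts = [0] * n_windows
    for pos in positions:
        if start <= pos < end:
            counts[(pos - start) // window] += 1
    return counts


def to_z_scores(counts):
    """Standardize counts (assumed transformation, not specified by PASS)."""
    mu = mean(counts)
    sigma = pstdev(counts)
    if sigma == 0:
        # All windows have the same count; no window stands out.
        return [0.0] * len(counts)
    return [(c - mu) / sigma for c in counts]


def to_gff_lines(chrom, start, z_scores, window=30, source="ChIPSeq"):
    """Write one 9-column, tab-separated GFF line per window.

    The feature name "ID" mirrors the example input; the source label
    is a hypothetical placeholder.
    """
    lines = []
    for i, z in enumerate(z_scores):
        w_start = start + i * window
        w_end = w_start + window - 1  # GFF end coordinates are inclusive
        fields = [chrom, source, "ID", str(w_start), str(w_end),
                  f"{z:.6f}", ".", ".", "."]
        lines.append("\t".join(fields))
    return lines


if __name__ == "__main__":
    reads = [100, 105, 131, 160, 161, 162, 163]  # made-up read positions
    counts = bin_reads(reads, start=100, end=190)
    for line in to_gff_lines("chr7", 100, to_z_scores(counts)):
        print(line)
```

Each output line then has the same nine columns as ChIP-chip probe data, with the window's z-score in the score column.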
Choosing window size options:
The window size, measured in probes, is related to the probe tiling density. For example, if probes are tiled every 100 bp, then setting the smallest window = 2 and largest window = 6 is appropriate, because the DNA fragment size is around 300-500 bp.
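One way to turn this rule of thumb into arithmetic is sketched below. The function and its one-probe margin on each side are hypothetical: the hint only gives the single 100 bp example, and this heuristic was chosen because it reproduces that example, not because PASS prescribes it.

```python
import math


def suggest_window_range(probe_spacing_bp, min_fragment_bp=300, max_fragment_bp=500):
    """Suggest (smallest, largest) window sizes in probes.

    Assumed heuristic: count how many probes a fragment spans at the
    given tiling density, then widen the range by one probe each way.
    """
    probes_low = min_fragment_bp // probe_spacing_bp
    probes_high = math.ceil(max_fragment_bp / probe_spacing_bp)
    smallest = max(1, probes_low - 1)
    largest = probes_high + 1
    return smallest, largest


if __name__ == "__main__":
    # Probes every 100 bp, fragments of about 300-500 bp.
    print(suggest_window_range(100))  # (2, 6)
```

For other tiling densities, treat the result as a starting point and adjust it to the fragment sizes of the actual experiment.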
Example
input file:
chr7  Nimblegen  ID  40307603  40307652  1.668944     .  .  .
chr7  Nimblegen  ID  40307703  40307752  0.8041307    .  .  .
chr7  Nimblegen  ID  40307808  40307865  -1.089931    .  .  .
chr7  Nimblegen  ID  40307920  40307969  1.055044     .  .  .
chr7  Nimblegen  ID  40308005  40308068  2.447853     .  .  .
chr7  Nimblegen  ID  40308125  40308174  0.1638694    .  .  .
chr7  Nimblegen  ID  40308223  40308275  -0.04796628  .  .  .
chr7  Nimblegen  ID  40308318  40308367  0.9335709    .  .  .
chr7  Nimblegen  ID  40308526  40308584  0.5143972    .  .  .
chr7  Nimblegen  ID  40308611  40308660  -1.089931    .  .  .
etc.
In GFF, a value of dot '.' is used to mean "not applicable".
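To make the nine-column layout concrete, here is a minimal Python sketch that parses one input line and maps '.' to None. The GffRecord type and parse_gff_line function are illustrative names, not part of PASS. GFF files are tab-separated; the fallback to whitespace splitting is an assumption that only holds when no field contains spaces, as in the example above.

```python
from typing import NamedTuple, Optional


class GffRecord(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: Optional[float]
    strand: Optional[str]
    frame: Optional[str]
    attribute: Optional[str]


def parse_gff_line(line):
    """Parse a single 9-column GFF line into a GffRecord."""
    line = line.rstrip("\n")
    # Prefer tabs (the GFF delimiter); fall back to any whitespace.
    fields = line.split("\t") if "\t" in line else line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 GFF columns, got {len(fields)}")

    def na(value):
        # A dot means "not applicable" in GFF.
        return None if value == "." else value

    score = na(fields[5])
    return GffRecord(
        seqname=fields[0],
        source=fields[1],
        feature=fields[2],
        start=int(fields[3]),
        end=int(fields[4]),
        score=float(score) if score is not None else None,
        strand=na(fields[6]),
        frame=na(fields[7]),
        attribute=na(fields[8]),
    )


if __name__ == "__main__":
    rec = parse_gff_line("chr7 Nimblegen ID 40307603 40307652 1.668944 . . .")
    print(rec.seqname, rec.start, rec.end, rec.score, rec.strand)
```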
output file:
ID  Chr   Start     End       WinSz  PeakValue  # of FPs  FDR
1   chr7  40310931  40311266  4      1.663446   0.248817  0.248817
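As an illustration of working with the tabular output, the following Python sketch reads the peak table and keeps peaks at or below a chosen FDR. It assumes the file is tab-delimited with a header row whose column names match the example exactly; the function names and the thresholds are hypothetical.

```python
import csv
import io


def read_peaks(handle):
    """Read a tab-delimited peak table into a list of dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    peaks = []
    for row in reader:
        peaks.append({
            "id": int(row["ID"]),
            "chr": row["Chr"],
            "start": int(row["Start"]),
            "end": int(row["End"]),
            "win_sz": int(row["WinSz"]),
            "peak_value": float(row["PeakValue"]),
            "n_fps": float(row["# of FPs"]),
            "fdr": float(row["FDR"]),
        })
    return peaks


def filter_by_fdr(peaks, max_fdr):
    """Keep only peaks whose FDR does not exceed max_fdr."""
    return [p for p in peaks if p["fdr"] <= max_fdr]


if __name__ == "__main__":
    # The example output row, written with explicit tab separators.
    sample = (
        "ID\tChr\tStart\tEnd\tWinSz\tPeakValue\t# of FPs\tFDR\n"
        "1\tchr7\t40310931\t40311266\t4\t1.663446\t0.248817\t0.248817\n"
    )
    peaks = read_peaks(io.StringIO(sample))
    print(len(filter_by_fdr(peaks, 0.25)))
    print(len(filter_by_fdr(peaks, 0.05)))
```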
References
Zhang Y. (2008) Poisson approximation for significance in genome-wide ChIP-chip tiling arrays. Bioinformatics. 24(24):2825-31. Epub 2008 Oct 25.
Chen KB, Zhang Y. (2010) A varying threshold method for ChIP peak calling using multiple sources of information. Submitted.