What it does
PARalyzer is an algorithm that generates a high-resolution
map of interaction sites between RNA-binding proteins and their targets. The
algorithm uses the deep sequencing reads generated by the PAR-CLIP
(Photoactivatable-Ribonucleoside-Enhanced Crosslinking and
Immunoprecipitation) protocol. The use of photoactivatable nucleotides in the
PAR-CLIP protocol results in more efficient crosslinking between the
RNA-binding protein and its target relative to other CLIP methods; in addition,
a nucleotide substitution occurs at the site of crosslinking, providing
single-nucleotide resolution binding information. PARalyzer uses this
nucleotide substitution in a kernel density estimate classifier to generate
the high-resolution set of protein-RNA interaction sites.
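The classifier idea can be illustrated with a minimal sketch. It is not PARalyzer's implementation: the Gaussian kernel, the `bandwidth` value, and the function names `gaussian_kde` and `classify` are assumptions chosen for illustration. The sketch smooths per-position conversion counts (signal) and non-conversion counts (background) and calls a position positive where the signal density exceeds the background density.

```python
import math


def gaussian_kde(counts, bandwidth=3.0):
    """Smooth per-position event counts with a Gaussian kernel.

    The result is scaled by the total number of events, so densities
    built from different event totals can be compared directly.
    """
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    norm = total * bandwidth * math.sqrt(2 * math.pi)
    density = []
    for x in range(len(counts)):
        value = 0.0
        for pos, count in enumerate(counts):
            if count:
                z = (x - pos) / bandwidth
                value += count * math.exp(-0.5 * z * z)
        density.append(value / norm)
    return density


def classify(conversions, non_conversions, bandwidth=3.0):
    """Return True for each position where signal KDE > background KDE."""
    signal = gaussian_kde(conversions, bandwidth)        # conversion events
    background = gaussian_kde(non_conversions, bandwidth)  # non-conversion events
    return [s > b for s, b in zip(signal, background)]
```

For example, `classify([0, 0, 0, 5, 0, 0, 0, 0, 0, 0], [1] * 10, bandwidth=1.0)` marks the position carrying the five conversions as positive, while positions far from it stay negative.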
Approaches
EXTEND_BY_READ: including this line extends each cluster beyond the
signal to the end of any read that falls within the cluster and
contains a conversion, or until the minimum read depth
(MINIMUM_READ_COUNT_FOR_CLUSTER_INCLUSION parameter) is no longer met
HAFNER_APPROACH: identifies the location with the largest number of conversion
events and extends the cluster by up to
ADDITIONAL_NUCLEOTIDES_BEYOND_SIGNAL nt
in each direction from that point, or until the minimum
read depth (MINIMUM_READ_COUNT_FOR_CLUSTER_INCLUSION parameter) is no longer met
ADDITIONAL_NUCLEOTIDES_BEYOND_SIGNAL: the maximum number of nucleotides to
extend beyond the positive signal in each direction (default 0).
The cluster is defined as the region where the conversion KDE is above
the background KDE, extended by up to this many nucleotides or until the minimum
read depth (MINIMUM_READ_COUNT_FOR_CLUSTER_INCLUSION parameter) is no longer met
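The extension rules described above can be sketched as follows. This is an illustrative reading of the text, not PARalyzer's code: the function names, the 0-based inclusive indexing, handling of only the first positive region, and the tie-breaking at the conversion peak are all assumptions. `extend` stands in for ADDITIONAL_NUCLEOTIDES_BEYOND_SIGNAL and `min_depth` for MINIMUM_READ_COUNT_FOR_CLUSTER_INCLUSION.

```python
def extend_region(start, end, depth, extend, min_depth):
    """Grow [start, end] by up to `extend` positions per side.

    Growth on a side stops early once read depth drops below `min_depth`.
    """
    for _ in range(extend):
        if start - 1 >= 0 and depth[start - 1] >= min_depth:
            start -= 1
        else:
            break
    for _ in range(extend):
        if end + 1 < len(depth) and depth[end + 1] >= min_depth:
            end += 1
        else:
            break
    return start, end


def kde_cluster(signal, background, depth, extend=0, min_depth=1):
    """First contiguous run where signal KDE > background KDE, then extended."""
    positive = [i for i, (s, b) in enumerate(zip(signal, background)) if s > b]
    if not positive:
        return None
    start = end = positive[0]
    while end + 1 < len(signal) and signal[end + 1] > background[end + 1]:
        end += 1
    return extend_region(start, end, depth, extend, min_depth)


def hafner_cluster(conversions, depth, extend=0, min_depth=1):
    """Start at the position with the most conversion events, then extend."""
    # Assumption: ties are resolved in favour of the leftmost position.
    peak = max(range(len(conversions)), key=lambda i: conversions[i])
    return extend_region(peak, peak, depth, extend, min_depth)
```

With `extend=0` the KDE-based cluster is exactly the positive-signal region, which matches the documented default of 0 additional nucleotides.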
Outputs
- DISTRIBUTIONS: contains the signal KDE, background KDE, read count, and conversion percent for all locations within each group
- The data are in blocks of four lines for each group
- Groups on the reverse strand do not need to be reversed; the values always correspond to nucleotides from GroupStart to GroupEnd, regardless of Strand
- First Column = Chromosome = chromosome on which the group resides
- Second Column = Strand = orientation in which the group resides
- Third Column = GroupStart = beginning coordinate on the chromosome of the group
- Fourth Column = GroupEnd = ending coordinate on the chromosome of the group
- Fifth Column = GroupID = unique ID for the group
- Sixth Column = Information = reports if the current line contains the Signal, Background, Conversion Percent, or ReadCount
- All Subsequent Columns = the values for each nucleotide from GroupStart to GroupEnd
- All nucleotides that cannot have a conversion event are given a value of -1
- GROUPS: a comma-separated file containing information about the resulting groups
- Chromosome = chromosome on which the group resides
- Strand = orientation in which the group resides
- GroupStart = beginning coordinate on the chromosome of the group
- GroupEnd = ending coordinate on the chromosome of the group
- GroupID = unique ID for the group
- ReadCount = number of reads within the group
- CLUSTERS: a comma-separated file containing information about the resulting clusters
- Chromosome = chromosome on which the cluster resides
- Strand = orientation in which the cluster resides
- ClusterStart = beginning coordinate on the chromosome of the cluster
- ClusterEnd = ending coordinate on the chromosome of the cluster
- ClusterID = unique ID for the cluster
- ClusterSequence = sequence of the cluster
- ReadCount = number of reads that overlap the cluster by at least 1 nucleotide
- ModeLocation = coordinate of the location with the highest signal / (signal + background) value
- ModeScore = score of the highest signal / (signal + background) value
- ConversionLocationCount = number of unique locations where at least 1 conversion occurred
- ConversionEventCount = total number of conversions that occurred within the cluster
- NonConversionEventCount = total number of possible conversion events that did not occur
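The output files described above can be read with a short script. The sketch below rests on assumptions that should be checked against real output: that the CLUSTERS file has a header row using the column names listed here, and that DISTRIBUTIONS lines are comma-separated like the other two files. The helper names and the derived `ConversionFraction` field are hypothetical and not part of PARalyzer.

```python
import csv


def read_clusters(handle):
    """Yield one dict per cluster row, adding a derived ConversionFraction."""
    # Assumes a header row with the documented column names.
    for row in csv.DictReader(handle):
        converted = int(row["ConversionEventCount"])
        unconverted = int(row["NonConversionEventCount"])
        total = converted + unconverted
        # Fraction of possible conversion events that actually occurred.
        row["ConversionFraction"] = converted / total if total else 0.0
        yield row


def parse_distribution_line(line, delimiter=","):
    """Split one DISTRIBUTIONS line into its six leading columns and values."""
    fields = line.rstrip("\n").split(delimiter)
    chromosome, strand, start, end, group_id, information = fields[:6]
    return {
        "Chromosome": chromosome,
        "Strand": strand,
        "GroupStart": int(start),
        "GroupEnd": int(end),
        "GroupID": group_id,
        "Information": information,
        # One value per nucleotide; -1 marks positions that cannot convert.
        "values": [float(v) for v in fields[6:]],
    }
```

A typical use is `for cluster in read_clusters(open("clusters.csv", newline=""))`, where the file name is a placeholder for whatever path was configured for the CLUSTERS output.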