rMATS
rMATS is a computational tool for detecting differential alternative splicing events from RNA-Seq data. Its statistical model calculates the P-value and false discovery rate for the hypothesis that the difference in the isoform ratio of a gene between two conditions exceeds a user-defined threshold. From the RNA-Seq data, rMATS automatically detects and analyzes alternative splicing events corresponding to all major types of alternative splicing patterns. rMATS handles replicate RNA-Seq data from both paired and unpaired study designs.
INPUTS
BAM files
Reads can be mapped independently of rMATS with any aligner, and the resulting BAM files can then be used as input to rMATS. rMATS requires aligned reads to match --readLength unless --variable-read-length is given. rMATS also ignores alignments with soft or hard clipping unless --allow-clipping is given.
https://github.com/Xinglab/rmats-turbo#starting-with-bam-files
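The two read filters described above can be illustrated with a small sketch that works on CIGAR strings. This is not rMATS code: the function names are hypothetical, and measuring read length as the sum of the query-consuming CIGAR operations (M, I, S, =, X) is an assumption made for illustration.

```python
import re

# One CIGAR operation: a count followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume bases of the read (assumption: this is how
# read length is measured for the comparison with --readLength).
QUERY_CONSUMING = set("MIS=X")


def query_length(cigar):
    """Number of read bases described by the CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in QUERY_CONSUMING)


def has_clipping(cigar):
    """True if the alignment contains soft (S) or hard (H) clipping."""
    return any(op in "SH" for _, op in CIGAR_OP.findall(cigar))


def would_be_used(cigar, read_length, variable_read_length=False, allow_clipping=False):
    """Hypothetical helper mirroring the documented filters.

    variable_read_length stands in for --variable-read-length and
    allow_clipping stands in for --allow-clipping.
    """
    if not allow_clipping and has_clipping(cigar):
        return False
    if not variable_read_length and query_length(cigar) != read_length:
        return False
    return True
```

For example, `would_be_used("5S45M", 50)` is rejected because of the soft clip, while the same alignment passes with `allow_clipping=True`.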
OUTPUTS
https://github.com/Xinglab/rmats-turbo#output
Splicing Events
Each alternative splicing event type has a corresponding set of output files. In the filename templates below, [AS_Event] is replaced by one of [SE (skipped exon), MXE (mutually exclusive exons), A3SS (alternative 3' splice site), A5SS (alternative 5' splice site), RI (retained intron)] to give the event-specific filename.
- Output Files:
- summary.txt: Brief summary of all AS event types. Includes the total event counts and significant event counts. By default, events are counted as significant if FDR <= 0.05.
- [AS_Event].MATS.JC.txt: Final output including only reads that span junctions defined by rMATS (Junction Counts)
- [AS_Event].MATS.JCEC.txt: Final output including both reads that span junctions defined by rMATS (Junction Counts) and reads that do not cross an exon boundary (Exon Counts)
- fromGTF.[AS_Event].txt: All identified alternative splicing (AS) events derived from the GTF and the RNA
- fromGTF.novelJunction.[AS_Event].txt: Alternative splicing (AS) events that were identified only after considering the RNA (as opposed to analyzing the GTF in isolation). This does not include events with an unannotated splice site.
- fromGTF.novelSpliceSite.[AS_Event].txt: This file contains only those events which include an unannotated splice site. Only relevant if --novelSS is enabled.
- JC.raw.input.[AS_Event].txt: Event counts including only reads that span junctions defined by rMATS (Junction Counts)
- JCEC.raw.input.[AS_Event].txt: Event counts including both reads that span junctions defined by rMATS (Junction Counts) and reads that do not cross an exon boundary (Exon Counts)
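As a quick illustration of how the filename templates expand, the sketch below substitutes each event type into each template. The helper name `expected_outputs` is hypothetical; the templates and event types are the ones listed above.

```python
EVENT_TYPES = ["SE", "MXE", "A3SS", "A5SS", "RI"]

# Per-event filename templates, copied from the list above.
TEMPLATES = [
    "[AS_Event].MATS.JC.txt",
    "[AS_Event].MATS.JCEC.txt",
    "fromGTF.[AS_Event].txt",
    "fromGTF.novelJunction.[AS_Event].txt",
    "fromGTF.novelSpliceSite.[AS_Event].txt",
    "JC.raw.input.[AS_Event].txt",
    "JCEC.raw.input.[AS_Event].txt",
]


def expected_outputs(event_types=EVENT_TYPES):
    """Return summary.txt plus every template expanded for every event type."""
    names = ["summary.txt"]
    for event in event_types:
        names.extend(t.replace("[AS_Event]", event) for t in TEMPLATES)
    return names
```

A list like this can be used to check that an output directory is complete, keeping in mind that the novelSpliceSite files are only relevant when --novelSS is enabled.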
- Shared columns:
- ID: rMATS event id
- GeneID: Gene id
- geneSymbol: Gene name
- chr: Chromosome
- strand: Strand of the gene
- IJC_SAMPLE_1: Inclusion counts for sample 1. Replicates are comma separated
- SJC_SAMPLE_1: Skipping counts for sample 1. Replicates are comma separated
- IJC_SAMPLE_2: Inclusion counts for sample 2. Replicates are comma separated
- SJC_SAMPLE_2: Skipping counts for sample 2. Replicates are comma separated
- IncFormLen: Length of inclusion form, used for normalization
- SkipFormLen: Length of skipping form, used for normalization
- PValue: Significance of splicing difference between the two sample groups. (Only available if the statistical model is on)
- FDR: False Discovery Rate calculated from p-value. (Only available if statistical model is on)
- IncLevel1: Inclusion level for sample 1. Replicates are comma separated. Calculated from normalized counts
- IncLevel2: Inclusion level for sample 2. Replicates are comma separated. Calculated from normalized counts
- IncLevelDifference: average(IncLevel1) - average(IncLevel2)
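The relationship between the count, length, and inclusion-level columns can be sketched as follows. Several points here are assumptions rather than statements from this document: that the inclusion level is the length-normalized inclusion count divided by the sum of both length-normalized counts, that undefined values appear as NA and are skipped when averaging, and that a row is available as a dict keyed by column name. All function names are hypothetical.

```python
def parse_values(field):
    """Split a comma-separated replicate field; 'NA' entries become None."""
    return [None if v == "NA" else float(v) for v in field.split(",")]


def inclusion_level(ijc, sjc, inc_form_len, skip_form_len):
    """Inclusion level from length-normalized counts (assumed formula)."""
    inc = ijc / inc_form_len
    skip = sjc / skip_form_len
    total = inc + skip
    return None if total == 0 else inc / total


def mean(values):
    """Average of the defined values, or None if there are none."""
    kept = [v for v in values if v is not None]
    return sum(kept) / len(kept) if kept else None


def inc_level_difference(inc_level_1, inc_level_2):
    """average(IncLevel1) - average(IncLevel2), as in the column description."""
    m1, m2 = mean(inc_level_1), mean(inc_level_2)
    if m1 is None or m2 is None:
        return None
    return m1 - m2


def is_significant(row, fdr_cutoff=0.05):
    """Apply the default significance rule FDR <= 0.05 to one row (a dict)."""
    return row["FDR"] != "NA" and float(row["FDR"]) <= fdr_cutoff
```

For instance, `inc_level_difference(parse_values("0.8,0.6"), parse_values("0.2,0.4"))` averages each group first and then subtracts, matching the definition of IncLevelDifference.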
- Event specific columns (event coordinates):
- SE: exonStart_0base exonEnd upstreamES upstreamEE downstreamES downstreamEE
+ The inclusion form includes the target exon (exonStart_0base, exonEnd)
- MXE: 1stExonStart_0base 1stExonEnd 2ndExonStart_0base 2ndExonEnd upstreamES upstreamEE downstreamES downstreamEE
+ If the strand is + then the inclusion form includes the 1st exon (1stExonStart_0base, 1stExonEnd) and skips the 2nd exon
+ If the strand is - then the inclusion form includes the 2nd exon (2ndExonStart_0base, 2ndExonEnd) and skips the 1st exon
- A3SS, A5SS: longExonStart_0base longExonEnd shortES shortEE flankingES flankingEE
+ The inclusion form includes the long exon (longExonStart_0base, longExonEnd) instead of the short exon (shortES shortEE)
- RI: riExonStart_0base riExonEnd upstreamES upstreamEE downstreamES downstreamEE
+ The inclusion form includes (retains) the intron (upstreamEE, downstreamES)
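The rules above for which coordinates belong to the inclusion form can be collected into one lookup. This is a minimal sketch: `inclusion_region` is a hypothetical helper, a row is assumed to be a dict keyed by the column names listed above, and the coordinates are returned exactly as they appear in the columns without any conversion.

```python
def inclusion_region(event_type, row):
    """Return the (start, end) pair that characterizes the inclusion form."""
    if event_type == "SE":
        # The target exon.
        return int(row["exonStart_0base"]), int(row["exonEnd"])
    if event_type == "MXE":
        # The strand decides which of the two exons is the inclusion form.
        if row["strand"] == "+":
            return int(row["1stExonStart_0base"]), int(row["1stExonEnd"])
        return int(row["2ndExonStart_0base"]), int(row["2ndExonEnd"])
    if event_type in ("A3SS", "A5SS"):
        # The long exon is used instead of the short exon.
        return int(row["longExonStart_0base"]), int(row["longExonEnd"])
    if event_type == "RI":
        # The retained intron lies between the two flanking exons.
        return int(row["upstreamEE"]), int(row["downstreamES"])
    raise ValueError(f"unknown event type: {event_type}")
```

Keeping the strand check inside the MXE branch makes it explicit that MXE is the only event type whose inclusion form depends on strand.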