What it does
MiRDeep2 is a software package for identifying novel and known miRNAs in deep sequencing data. It can also be used for miRNA expression profiling across samples.
Input
A FASTA file with deep sequencing reads, a FASTA file of the corresponding genome, a file of reads mapped to the genome in miRDeep2 arf format, an optional FASTA file with known miRNAs of the species being analyzed, and an optional FASTA file of known miRNAs of related species.
Arf format:
The arf format is a proprietary file format generated and processed by miRDeep2. It contains information about reads mapped to a reference genome. Each line of an arf file contains 13 columns:
- read identifier
- length of read sequence
- start position in read sequence that is mapped
- end position in read sequence that is mapped
- read sequence
- identifier of the genome part to which a read is mapped. This is either a scaffold id or a chromosome name
- length of the genome sequence a read is mapped to
- start position in the genome where a read is mapped to
- end position in the genome where a read is mapped to
- genome sequence to which a read is mapped
- genome strand information. Plus means the read is aligned to the sense-strand of the genome. Minus means it is aligned to the antisense-strand of the genome.
- number of mismatches in the read mapping
- edit string that indicates matches by lowercase 'm' and mismatches by uppercase 'M'
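
The 13 columns listed above can be read into a small record type. The following is a minimal sketch, not part of miRDeep2 itself: it assumes the columns are tab-separated, and the names `ArfRecord`, `parse_arf_line`, and all field names are hypothetical, chosen here for illustration. The example line is made up and does not come from real data.

```python
from typing import NamedTuple


class ArfRecord(NamedTuple):
    """One arf line; field order follows the column list above (names are hypothetical)."""
    read_id: str
    read_length: int
    read_start: int
    read_end: int
    read_seq: str
    genome_id: str
    genome_length: int
    genome_start: int
    genome_end: int
    genome_seq: str
    strand: str
    mismatches: int
    edit_string: str


# Columns that hold integers; all others are kept as strings.
INT_FIELDS = {
    "read_length", "read_start", "read_end",
    "genome_length", "genome_start", "genome_end", "mismatches",
}


def parse_arf_line(line: str) -> ArfRecord:
    """Split one line into 13 columns (assumption: tab-separated) and convert types."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(ArfRecord._fields):
        raise ValueError(f"expected 13 columns, got {len(fields)}")
    converted = [
        int(value) if name in INT_FIELDS else value
        for name, value in zip(ArfRecord._fields, fields)
    ]
    return ArfRecord(*converted)


# Made-up example line: a 10-nt read with one mismatch at its last position.
example = "\t".join([
    "read_1", "10", "1", "10", "acgtacgtac",
    "chr1", "10", "101", "110", "acgtacgtaa",
    "+", "1", "mmmmmmmmmM",
])
record = parse_arf_line(example)
print(record.genome_id, record.genome_start, record.genome_end, record.mismatches)
```

A simple consistency check on a parsed record is that the number of uppercase 'M' characters in the edit string equals the mismatch count.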