
MiRDeep2 Mapper (version 2.0.0.8.1)
Reads in fastq or FASTA format
Remove all entries that have a sequence that contains letters other than a,c,g,t,u,n,A,C,G,T,U,N. (-j)
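Convert RNA to DNA alphabet (to map against the genome). (-i)
Clip 3' adapter sequence. (-k)
Discard reads shorter than this length, in nucleotides. Set to 0 to keep all reads. (-l)
Collapse reads (-m) and/or map to genome (-p)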
Map to genome. (-p)
If your genome of interest is not listed, contact your Galaxy admin.
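Map with one mismatch in the seed; mapping takes longer. (-q)
Map threshold: a read is allowed to map to at most this number of positions in the genome. (-r)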

What it does

The MiRDeep2 Mapper module processes deep sequencing reads and/or maps them to the reference genome. The module works in sequence space and can process or map sequence data in FASTA format. A number of the mapper module's functions are implemented specifically with Solexa/Illumina data in mind.
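As an illustration of the kind of read processing described here, the following minimal Python sketch mimics the filter documented for the -j option listed above: it drops every FASTA entry whose sequence contains letters other than a,c,g,t,u,n,A,C,G,T,U,N. It is not the mapper's own code; the function names `parse_fasta` and `keep_valid_reads` are hypothetical and chosen only for this example.

```python
import re

# Letters accepted by the -j filter, as listed in the option description.
VALID_READ = re.compile(r"[acgtunACGTUN]+")


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous entry, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped; join them later.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def keep_valid_reads(records):
    """Keep only entries whose whole sequence uses the accepted letters."""
    return [(h, s) for h, s in records if VALID_READ.fullmatch(s)]


# Made-up reads for illustration: the second one contains an 'X'.
reads = ">r1\nACGT\n>r2\nACXT\n>r3\nacgu\nnn\n"
for header, seq in keep_valid_reads(parse_fasta(reads)):
    print(f">{header}\n{seq}")
```

Entries with an empty sequence are also dropped in this sketch, because the pattern requires at least one accepted letter.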

Input

The default input is a file in FASTA, seq.txt, or qseq.txt format. Additional input may be required depending on the options used.

Output

The output depends on the options used: a FASTA file with processed reads, an arf file with mapped reads, or both.

Arf format: a proprietary file format generated and processed by miRDeep2. It contains information about reads mapped to a reference genome. Each line in such a file contains 13 columns:

  1. read identifier
  2. length of read sequence
  3. start position in read sequence that is mapped
  4. end position in read sequence that is mapped
  5. read sequence
  6. identifier of the genome part to which a read is mapped. This is either a scaffold id or a chromosome name
  7. length of the genome sequence a read is mapped to
  8. start position in the genome where a read is mapped
  9. end position in the genome where a read is mapped
  10. genome sequence to which a read is mapped
  11. genome strand information. Plus means the read is aligned to the sense-strand of the genome. Minus means it is aligned to the antisense-strand of the genome.
  12. number of mismatches in the read mapping
  13. edit string that indicates matches by lowercase 'm' and mismatches by uppercase 'M'
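
The 13 columns above can be read into named fields with a short Python sketch. It assumes the columns are tab-separated and that the strand is written as a single symbol such as "+"; neither detail is stated in this help text. The names `ArfRecord`, `parse_arf_line` and `edit_string_is_consistent` are hypothetical, and the sample line is made up for illustration.

```python
from typing import NamedTuple


class ArfRecord(NamedTuple):
    """One arf line; field order follows the 13 columns listed above."""
    read_id: str
    read_length: int
    read_start: int
    read_end: int
    read_seq: str
    genome_id: str
    genome_length: int
    genome_start: int
    genome_end: int
    genome_seq: str
    strand: str
    mismatches: int
    edit_string: str


# Columns that hold integers (columns 2, 3, 4, 7, 8, 9 and 12).
INT_FIELDS = {
    "read_length", "read_start", "read_end",
    "genome_length", "genome_start", "genome_end", "mismatches",
}


def parse_arf_line(line: str) -> ArfRecord:
    """Split one line into 13 columns (assumed tab-separated) and convert types."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 13:
        raise ValueError(f"expected 13 columns, got {len(fields)}")
    values = [
        int(value) if name in INT_FIELDS else value
        for name, value in zip(ArfRecord._fields, fields)
    ]
    return ArfRecord(*values)


def edit_string_is_consistent(record: ArfRecord) -> bool:
    """Check that the number of 'M' characters equals the mismatch count."""
    return record.edit_string.count("M") == record.mismatches


# Made-up example: a 22-nt read with one mismatch at its last position.
example = "\t".join([
    "read_1", "22", "1", "22", "tgaggtagtaggttgtatagtt",
    "chrII", "22", "11435", "11456", "tgaggtagtaggttgtatagta",
    "+", "1", "m" * 21 + "M",
])
record = parse_arf_line(example)
print(record.genome_id, record.mismatches, edit_string_is_consistent(record))
```

Comparing column 12 with the count of uppercase 'M' characters in column 13 is a cheap sanity check when reading arf files produced elsewhere.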