
CollectRnaSeqMetrics (version 3.1.1.0)
If empty, upload or import a SAM/BAM dataset
REFERENCE_SEQUENCE
RIBOSOMAL_INTERVALS; If not specified, no bases will be identified as being ribosomal. The list of intervals can be generated from BED or Interval datasets using the Galaxy BedToIntervalList tool.
STRAND_SPECIFICITY; For unpaired reads, use FIRST_READ_TRANSCRIPTION_STRAND if the reads are expected to be on the transcription strand.
MINIMUM_LENGTH; default=500
Sequences to ignore
RRNA_FRAGMENT_PERCENTAGE; default=0.8
METRIC_ACCUMULATION_LEVEL
ASSUME_SORTED
Setting stringency to SILENT can improve performance when processing a BAM file in which variable-length data (read, qualities, tags) do not otherwise need to be decoded.

Purpose

Collects metrics about the alignment of RNA to various functional classes of loci in the genome: coding, intronic, UTR, intergenic, ribosomal.


Dataset collections - processing large numbers of datasets at once

This will be added shortly


Obtaining gene annotations in refFlat format

This tool requires gene annotations in refFlat format. These data can be obtained from UCSC table browser directly through Galaxy by following these steps:

  1. Click Get Data in the upper part of the left pane of the Galaxy interface
  2. Click the UCSC Main link
  3. Set your genome and dataset of interest. It must be the same genome build against which you mapped the reads contained in the BAM file you are analyzing
  4. In the output format field, choose selected fields from primary and related tables
  5. Click the get output button
  6. In the first table presented at the top of the page, select (using the checkboxes) the first 11 fields: name chrom strand txStart txEnd cdsStart cdsEnd exonCount exonStarts exonEnds proteinId
  7. Click done with selection
  8. Click Send query to Galaxy
  9. A new dataset will appear in the current Galaxy history
  10. Use this dataset as the input for the Gene annotations in refFlat form dropdown of this tool
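
Before passing the downloaded table to the tool, it can help to confirm that it has the expected shape. The following is a minimal sketch, not part of Galaxy or Picard. It assumes the file is tab-separated and that its columns follow the field order listed in step 6. The function name check_annotation_rows and the specific checks are illustrative assumptions; adjust EXPECTED_FIELDS if your export differs.

```python
import csv
import io

# Field order as listed in step 6 above (an assumption of this sketch).
EXPECTED_FIELDS = [
    "name", "chrom", "strand", "txStart", "txEnd", "cdsStart",
    "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinId",
]


def check_annotation_rows(handle):
    """Return a list of (line_number, message) problems found in the table."""
    problems = []
    reader = csv.reader(handle, delimiter="\t")
    for line_number, row in enumerate(reader, start=1):
        # Skip blank lines and header/comment lines starting with '#'.
        if not row or row[0].startswith("#"):
            continue
        if len(row) != len(EXPECTED_FIELDS):
            problems.append(
                (line_number,
                 f"expected {len(EXPECTED_FIELDS)} columns, found {len(row)}")
            )
            continue
        record = dict(zip(EXPECTED_FIELDS, row))
        if record["strand"] not in ("+", "-"):
            problems.append((line_number, f"unexpected strand {record['strand']!r}"))
        try:
            exon_count = int(record["exonCount"])
        except ValueError:
            problems.append((line_number, "exonCount is not an integer"))
            continue
        # exonStarts/exonEnds are comma-separated lists, often with a trailing comma.
        starts = [s for s in record["exonStarts"].split(",") if s]
        ends = [e for e in record["exonEnds"].split(",") if e]
        if len(starts) != exon_count or len(ends) != exon_count:
            problems.append((line_number, "exonCount does not match exon lists"))
    return problems


if __name__ == "__main__":
    # Made-up example row, for illustration only.
    demo = "tx1\tchr1\t+\t100\t900\t150\t850\t2\t100,500,\t300,900,\tP00000\n"
    print(check_annotation_rows(io.StringIO(demo)))
```

An empty result means every data row has 11 columns, a strand of + or -, and exon lists whose lengths agree with exonCount. It does not prove that the annotations match the genome build used for mapping.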

Inputs, outputs, and parameters

Either a SAM file or a BAM file must be supplied. Galaxy automatically coordinate-sorts all uploaded BAM files.

From the Picard documentation (http://broadinstitute.github.io/picard/):

REF_FLAT=File                 Gene annotations in refFlat form.  Format described here:
                              https://genome.ucsc.edu/FAQ/FAQformat.html#format9  Required.

RIBOSOMAL_INTERVALS=File      Location of rRNA sequences in genome, in interval_list format.  If not specified no bases
                              will be identified as being ribosomal. Format described here:
                              https://samtools.github.io/htsjdk/javadoc/htsjdk/htsjdk/samtools/util/IntervalList.html
                              The file can be generated from BED datasets using Galaxy's wrapper for the
                              picard_BedToIntervalList tool.

STRAND_SPECIFICITY=StrandSpecificity
STRAND=StrandSpecificity      For strand-specific library prep. For unpaired reads, use FIRST_READ_TRANSCRIPTION_STRAND
                              if the reads are expected to be on the transcription strand.  Required. Possible values:
                              {NONE, FIRST_READ_TRANSCRIPTION_STRAND, SECOND_READ_TRANSCRIPTION_STRAND}

MINIMUM_LENGTH=Integer        When calculating coverage based values (e.g. CV of coverage) only use transcripts of this
                              length or greater.  Default value: 500.

IGNORE_SEQUENCE=String        If a read maps to a sequence specified with this option, all the bases in the read are
                              counted as ignored bases.

RRNA_FRAGMENT_PERCENTAGE=Double
                              This fraction of the length of a fragment must overlap one of the ribosomal intervals
                              for a read or read pair to be considered rRNA.  Default value: 0.8.

METRIC_ACCUMULATION_LEVEL=MetricAccumulationLevel
LEVEL=MetricAccumulationLevel The level(s) at which to accumulate metrics.    Possible values: {ALL_READS, SAMPLE,
                              LIBRARY, READ_GROUP} This option may be specified 0 or more times.

ASSUME_SORTED=Boolean
AS=Boolean                    If true (default), then the sort order in the header file will be ignored.  Default
                              value: true. Possible values: {true, false}
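
To show how the options above fit together in Picard's KEY=VALUE syntax, here is a minimal sketch that assembles an argument list. It is not the command that Galaxy itself runs. The function name build_picard_args and all file names are hypothetical, and INPUT and OUTPUT are assumed here because they do not appear in the listing above. The resulting list would be appended to whatever Picard launcher is installed on your system.

```python
# Allowed values for STRAND_SPECIFICITY, copied from the listing above.
STRAND_VALUES = {
    "NONE",
    "FIRST_READ_TRANSCRIPTION_STRAND",
    "SECOND_READ_TRANSCRIPTION_STRAND",
}


def build_picard_args(input_bam, output_metrics, ref_flat, strand="NONE",
                      ribosomal_intervals=None, minimum_length=500,
                      ignore_sequences=(), rrna_fragment_percentage=0.8,
                      levels=(), assume_sorted=True):
    """Assemble CollectRnaSeqMetrics arguments as a list of KEY=VALUE strings."""
    if strand not in STRAND_VALUES:
        raise ValueError(f"unknown STRAND_SPECIFICITY value: {strand!r}")
    args = [
        "CollectRnaSeqMetrics",
        f"INPUT={input_bam}",        # assumed option, not in the listing above
        f"OUTPUT={output_metrics}",  # assumed option, not in the listing above
        f"REF_FLAT={ref_flat}",
        f"STRAND_SPECIFICITY={strand}",
        f"MINIMUM_LENGTH={minimum_length}",
        f"RRNA_FRAGMENT_PERCENTAGE={rrna_fragment_percentage}",
        f"ASSUME_SORTED={'true' if assume_sorted else 'false'}",
    ]
    # Optional: without this file, no bases are reported as ribosomal.
    if ribosomal_intervals is not None:
        args.append(f"RIBOSOMAL_INTERVALS={ribosomal_intervals}")
    # These two options may be repeated, once per value.
    args.extend(f"IGNORE_SEQUENCE={name}" for name in ignore_sequences)
    args.extend(f"METRIC_ACCUMULATION_LEVEL={level}" for level in levels)
    return args


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_picard_args(
        "aligned.bam", "rnaseq_metrics.txt", "annotations.refflat",
        strand="SECOND_READ_TRANSCRIPTION_STRAND",
        ribosomal_intervals="rrna.interval_list",
        levels=("ALL_READS", "SAMPLE"),
    )))
```

Returning a list rather than a single string keeps each KEY=VALUE pair as one argument, so file names containing spaces are not split if the list is later handed to a process launcher.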

Additional information

Additional information about Picard tools is available from the Picard web site: http://broadinstitute.github.io/picard/