Galaxy | Tool Preview

Cuffdiff (version 2.2.1.6)
A transcript GFF3 or GTF file produced by cufflinks, cuffcompare, or other source.
Discard the tabular output.
Generate a SQLite database for use with cummeRbund.
Cuffdiff supports either CXB (from cuffquant) or SAM/BAM input files. Mixing is not supported. Default: SAM/BAM
Conditions
Condition 0
If using only one sample per condition, you must use the 'blind' dispersion estimation method.
The allowed false discovery rate.
The minimum number of alignments in a locus needed to conduct significance testing on changes in that locus observed between samples.
Tells Cufflinks to do an initial estimation procedure to more accurately weight reads mapping to multiple locations in the genome.
Bias detection and correction can significantly improve accuracy of transcript abundance estimates.
Read group datasets provide information on replicates.
Cuffdiff estimates the number of fragments that originated from each transcript, primary transcript, and gene in each sample. Primary transcript and gene counts are computed by summing the counts of transcripts in each primary transcript group or gene group.
Mode of length normalization applied to transcript FPKM.

Cuffdiff Overview

Cuffdiff is part of Cufflinks. Cuffdiff finds significant changes in transcript expression, splicing, and promoter use. Please cite: Trapnell C, Williams BA, Pertea G, Mortazavi AM, Kwan G, van Baren MJ, Salzberg SL, Wold B, Pachter L. Transcript assembly and abundance estimation from RNA-Seq reveals thousands of new transcripts and switching among isoforms. Nature Biotechnology doi:10.1038/nbt.1621


Know what you are doing

There is no such thing (yet) as an automated gearshift in expression analysis. It is all like stick-shift driving in San Francisco. In other words, running this tool with default parameters will probably not give you meaningful results. A way to deal with this is to understand the parameters by carefully reading the documentation and experimenting. Fortunately, Galaxy makes experimenting easy.


Input format

Cuffdiff takes Cufflinks or Cuffcompare GTF files as input, along with two or more SAM files containing the fragment alignments for two or more samples.
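
To illustrate how these inputs fit together, here is a minimal Python sketch that assembles a cuffdiff command line. It assumes the usual command-line layout (options first, then the GTF file, then one comma-separated list of replicate alignment files per condition). The helper name build_cuffdiff_command and all file names are hypothetical, and the Galaxy wrapper may build its command differently.

```python
def build_cuffdiff_command(gtf, conditions, output_dir="cuffdiff_out", fdr=0.05):
    """Return an argv list for cuffdiff (hypothetical helper, for illustration).

    conditions maps a condition label to the list of alignment files
    that are the replicates for that condition.
    """
    if len(conditions) < 2:
        raise ValueError("cuffdiff compares two or more conditions")
    labels = ",".join(conditions)  # condition labels, in insertion order
    cmd = ["cuffdiff", "-o", output_dir, "-L", labels, "--FDR", str(fdr), gtf]
    # One positional argument per condition; replicates are comma-separated.
    cmd.extend(",".join(files) for files in conditions.values())
    return cmd


command = build_cuffdiff_command(
    "transcripts.gtf",
    {"control": ["c1.sam", "c2.sam"], "treated": ["t1.sam", "t2.sam"]},
)
print(" ".join(command))
```

Building the command as a list rather than a single string keeps file names with spaces intact if the list is later passed to subprocess.run.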


Outputs

Cuffdiff produces many output files:

  1. Transcript FPKM (+count) expression tracking.
  2. Gene FPKM (+count) expression tracking; tracks the summed FPKM of transcripts sharing each gene_id
  3. Primary transcript FPKM (+count) tracking; tracks the summed FPKM of transcripts sharing each tss_id
  4. Coding sequence FPKM (+count) tracking; tracks the summed FPKM of transcripts sharing each p_id, independent of tss_id
  5. Transcript differential FPKM.
  6. Gene differential FPKM. Tests differences in the summed FPKM of transcripts sharing each gene_id
  7. Primary transcript differential FPKM. Tests differences in the summed FPKM of transcripts sharing each tss_id
  8. Coding sequence differential FPKM. Tests differences in the summed FPKM of transcripts sharing each p_id, independent of tss_id
  9. Differential splicing tests: this tab-delimited file lists, for each primary transcript, the amount of overloading detected among its isoforms, i.e. how much differential splicing exists between isoforms processed from a single primary transcript. Only primary transcripts from which two or more isoforms are spliced are listed in this file.
  10. Differential promoter tests: this tab-delimited file lists, for each gene, the amount of overloading detected among its primary transcripts, i.e. how much differential promoter use exists between samples. Only genes producing two or more distinct primary transcripts (i.e. multi-promoter genes) are listed here.
  11. Differential CDS tests: this tab-delimited file lists, for each gene, the amount of overloading detected among its coding sequences, i.e. how much differential CDS output exists between samples. Only genes producing two or more distinct CDS (i.e. multi-protein genes) are listed here.
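
The grouping behind the gene and primary transcript tracking files can be sketched in a few lines of Python. This is only an illustration of "summed FPKM of transcripts sharing each gene_id or tss_id", not Cuffdiff's implementation; the record layout and all values below are invented for the example.

```python
from collections import defaultdict

# Hypothetical transcript-level records; the FPKM values are made up.
transcripts = [
    {"id": "T1", "gene_id": "G1", "tss_id": "TSS1", "fpkm": 10.0},
    {"id": "T2", "gene_id": "G1", "tss_id": "TSS1", "fpkm": 5.0},
    {"id": "T3", "gene_id": "G1", "tss_id": "TSS2", "fpkm": 2.5},
    {"id": "T4", "gene_id": "G2", "tss_id": "TSS3", "fpkm": 8.0},
]


def summed_fpkm(records, key):
    """Sum transcript FPKM over all transcripts sharing the same value of key."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[key]] += rec["fpkm"]
    return dict(totals)


gene_fpkm = summed_fpkm(transcripts, "gene_id")  # gene-level tracking
tss_fpkm = summed_fpkm(transcripts, "tss_id")    # primary transcript tracking

# Genes with two or more distinct tss_id values are the multi-promoter genes
# that would be eligible for the differential promoter tests.
tss_per_gene = defaultdict(set)
for rec in transcripts:
    tss_per_gene[rec["gene_id"]].add(rec["tss_id"])
multi_promoter_genes = sorted(g for g, tss in tss_per_gene.items() if len(tss) >= 2)

print(gene_fpkm, tss_fpkm, multi_promoter_genes)
```

In this toy data, G1 has transcripts starting at two different TSS groups, so it is the only gene that qualifies as multi-promoter.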

Settings

All of the options have a default value. You can change any of them. Most of the options in Cuffdiff have been implemented here.


Cuffdiff parameter list

This is a list of implemented Cuffdiff options:

-m INT                         Average fragment length (SE reads); default 200
-s INT                         Fragment length standard deviation (SE reads); default 80
-c INT                         The minimum number of alignments in a locus needed to conduct significance testing on changes in that locus observed between samples. If no testing is performed, changes in the locus are deemed not significant, and the locus' observed changes don't contribute to correction for multiple testing. The default is 1,000 fragment alignments (up to 2,000 paired reads).
--FDR FLOAT                    The allowed false discovery rate. The default is 0.05.
--max-mle-iterations INT       Sets the number of iterations allowed during maximum likelihood estimation of abundances. Default: 5000
--library-norm-method          Library normalization method: geometric (default), classic-fpkm, quartile
--dispersion-method            Dispersion estimation method: pooled (default), per-condition, blind, poisson
-u                             Multi read correction tells Cufflinks to do an initial estimation procedure to more accurately weight reads mapping to multiple locations in the genome.
-b ref.fasta                   Bias correction. Bias detection and correction can significantly improve accuracy of transcript abundance estimates.
--no-effective-length-correction  Use standard length correction
--no-length-correction         Disable all length correction.
--library-type                 ff-firststrand, ff-secondstrand, ff-unstranded, fr-firststrand, fr-secondstrand, fr-unstranded, transfrags
--mask-file (gff3/gtf)         Ignore all alignments within transcripts in this file
--time-series                  Treat provided SAM files as a time series
--compatible-hits-norm         With this option, Cufflinks counts only those fragments compatible with some reference transcript towards the number of mapped fragments used in the FPKM denominator. Using this mode is generally recommended in Cuffdiff to reduce certain types of bias caused by differential amounts of ribosomal reads which can create the impression of falsely differentially expressed genes.
--total-hits-norm              With this option, Cufflinks counts all fragments, including those not compatible with any reference transcript, towards the number of mapped fragments used in the FPKM denominator
--max-bundle-frags             Sets the maximum number of fragments a locus may have before being skipped. Skipped loci are listed in skipped.gtf.
--num-frag-count-draws         Cuffdiff will make this many draws from each transcript's predicted negative binomial random number generator. Each draw is a number of fragments that will be probabilistically assigned to the transcripts in the transcriptome. Used to estimate the variance-covariance matrix on assigned fragment counts.
--num-frag-assign-draws        For each fragment drawn from a transcript, Cuffdiff will assign it this many times (probabilistically), thus estimating the assignment uncertainty for each transcript. Used to estimate the variance-covariance matrix on assigned fragment counts.
--min-reps-for-js-test         Cuffdiff won't test genes for differential regulation unless the conditions in question have at least this many replicates.
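
The difference between --compatible-hits-norm and --total-hits-norm comes down to which fragment total goes into the FPKM denominator. The sketch below uses the textbook FPKM definition (fragments per kilobase of transcript per million mapped fragments) to show the effect. It is an assumption-laden illustration: Cuffdiff itself works with effective lengths and a statistical model, and the counts used here are invented.

```python
def fpkm(fragments, transcript_length, mapped_fragments):
    """Textbook FPKM: fragments per kilobase of transcript per million mapped fragments."""
    return 1e9 * fragments / (transcript_length * mapped_fragments)


# Hypothetical library: 2,000,000 mapped fragments in total, of which only
# 1,000,000 are compatible with some reference transcript (the rest could
# be, for example, ribosomal reads).
total_hits = 2_000_000
compatible_hits = 1_000_000

# A 1,000 bp transcript with 100 assigned fragments.
fpkm_compatible = fpkm(100, 1_000, compatible_hits)  # as with --compatible-hits-norm
fpkm_total = fpkm(100, 1_000, total_hits)            # as with --total-hits-norm

print(fpkm_compatible, fpkm_total)
```

With the same transcript count, the total-hits denominator halves the reported FPKM here, which is why differing amounts of incompatible reads between samples can look like differential expression.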