comparison rsem/README @ 0:4edac0183857

Initial commit from tarball version 1.17
author victor
date Mon, 05 Mar 2012 11:12:34 -0500
1 # RSEM Galaxy Wrapper #
2
3 ## Introduction ##
4
5 RSEM (RNA-Seq by Expectation-Maximization) is a software package for the
6 estimation of gene and isoform abundances from RNA-Seq data. A key feature of
7 RSEM is its statistically principled approach to the handling of RNA-Seq
8 reads that map to multiple genes and/or isoforms. In addition, RSEM is
9 well-suited to performing quantification with de novo transcriptome
10 assemblies, as it does not require a reference genome.
11
12 ## Installation ##
13
14 Follow the [Galaxy Tool Shed
15 instructions](http://wiki.g2.bx.psu.edu/Tool_Shed) to add this wrapper from
16 the tool shed to your Galaxy instance. Once the files are in the tools
17 directory, RSEM references must also be installed. This can be done by:
18
19 1. Placing the file called `rsem_indices.loc` into the directory
20 `~/galaxy-dist/tool-data`. This file tells the RSEM wrapper how to find the
21 reference(s). It is formatted according to Galaxy's documentation with the
22 following tab-delimited format:
23
24 unique_build_id dbkey display_name file_base_path
25
26 For example,
27
28 human_refseq_NM human_refseq_NM human_refseq_NM /opt/galaxy/references/human/1.1.2/NM_refseq_ref
29
30 2. Downloading a pre-built RSEM reference from the [RSEM website](http://deweylab.biostat.wisc.edu/rsem/).
31
32 3. Placing the reference files into the `file_base_path` listed in the
33 `rsem_indices.loc` file.
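
Because the fields in `rsem_indices.loc` must be separated by tabs rather than spaces, a quick check of the file can catch formatting mistakes. The following is a minimal sketch, not part of the wrapper: `check_loc` is a hypothetical helper name, and skipping blank lines and lines starting with `#` is an assumption about how comments appear in the file.

```shell
# Hypothetical helper: report every entry in a .loc file that does not have
# exactly four tab-separated fields
# (unique_build_id, dbkey, display_name, file_base_path).
# Assumption: blank lines and lines starting with '#' are not entries.
check_loc() {
    awk -F'\t' '
        /^#/ || /^[ \t]*$/ { next }
        NF != 4 {
            printf "line %d: expected 4 tab-separated fields, found %d\n", NR, NF
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}
```

It could be run as `check_loc ~/galaxy-dist/tool-data/rsem_indices.loc`; a non-zero exit status means at least one line needs fixing.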
34
35 If you would rather build your own reference files, follow the instructions
36 below and then place the resulting reference files into the `file_base_path` listed
37 in the `rsem_indices.loc` file.
38
39 ### Building a custom RSEM reference ###
40
41 For instructions on how to build the RSEM reference files, first see the [RSEM
42 documentation](http://deweylab.biostat.wisc.edu/rsem/README.html).
43
44 #### Example ####
45
46 Suppose we have mouse RNA-Seq data and want to use the UCSC mm9 version of the
47 mouse genome. We have downloaded the UCSC Genes transcript annotations in GTF
48 format (as mm9.gtf) using the Table Browser and the knownIsoforms.txt file for
49 mm9 from the UCSC Downloads. We also have all chromosome files for mm9 in the
50 directory `/data/mm9`. We want to put the generated reference files under
51 `/opt/galaxy/references` with name `mouse_125`. We'll add poly(A) tails with
52 length 125. Please note that GTF files generated from UCSC's Table Browser do
53 not contain isoform-gene relationship information. For the UCSC Genes
54 annotation, this information can be obtained from the knownIsoforms.txt file.
55 We also want to build Bowtie indices, and the Bowtie executables are found in
56 `/sw/bowtie`.
57
58 To build the reference files, first run the command:
59
60 rsem-prepare-reference --gtf mm9.gtf \
61 --transcript-to-gene-map knownIsoforms.txt \
62 --bowtie-path /sw/bowtie \
63 /data/mm9/chr1.fa,/data/mm9/chr2.fa,...,/data/mm9/chrM.fa \
64 /opt/galaxy/references/mouse_125
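
Typing the comma-separated list of chromosome files by hand is error-prone. As a sketch, the list could be generated from the directory contents instead. `fasta_list` is a hypothetical helper, and it assumes that every `.fa` file in the directory belongs in the reference and that no file name contains whitespace or commas.

```shell
# Hypothetical helper: join all *.fa files in a directory into a single
# comma-separated string, in the shell's lexical glob order.
# Assumption: the directory contains at least one .fa file and every .fa
# file in it should be part of the reference.
fasta_list() {
    printf '%s\n' "$1"/*.fa | paste -sd, -
}

# Possible use with the command above (not run here):
#   rsem-prepare-reference --gtf mm9.gtf \
#       --transcript-to-gene-map knownIsoforms.txt \
#       --bowtie-path /sw/bowtie \
#       "$(fasta_list /data/mm9)" \
#       /opt/galaxy/references/mouse_125
```

If the directory also holds files that should be excluded, move them elsewhere first or list the wanted files explicitly.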
65
66 To add this reference to your Galaxy installation, add the following line to
67 the `rsem_indices.loc` file:
68
69 mouse_125 mouse_125 mouse_125 /opt/galaxy/references/mouse_125
70
71 Then restart Galaxy and you should see the `mouse_125` reference listed in the
72 RSEM wrapper.
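
Text editors sometimes replace tabs with spaces, so the `rsem_indices.loc` entry can instead be appended from the shell with explicit tab characters. This is a minimal sketch under stated assumptions: `add_loc_entry` is a hypothetical helper, and it reuses the reference name for the first three columns, as the example line above does.

```shell
# Hypothetical helper: append a four-column, tab-separated entry to a .loc
# file unless an entry with the same unique_build_id is already present.
#   $1 = path to the .loc file
#   $2 = reference name (used for unique_build_id, dbkey and display_name)
#   $3 = file_base_path
add_loc_entry() {
    loc_file=$1
    name=$2
    base=$3
    if [ -f "$loc_file" ] &&
       awk -F'\t' -v id="$name" '$1 == id { found = 1 } END { exit !found }' "$loc_file"
    then
        echo "entry '$name' already present in $loc_file" >&2
        return 0
    fi
    printf '%s\t%s\t%s\t%s\n' "$name" "$name" "$name" "$base" >> "$loc_file"
}
```

For the example above, the call would be `add_loc_entry ~/galaxy-dist/tool-data/rsem_indices.loc mouse_125 /opt/galaxy/references/mouse_125`.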
73
74 ## References ##
75
76 * [RSEM website (standalone package)](http://deweylab.biostat.wisc.edu/rsem/)
77
78 * B. Li and C. Dewey (2011) [RSEM: accurate transcript quantification from
79 RNA-Seq data with or without a reference
80 genome](http://www.biomedcentral.com/1471-2105/12/323).
81 BMC Bioinformatics 12:323.
82
83 * B. Li, V. Ruotti, R. Stewart, J. Thomson, and C. Dewey (2010) [RNA-Seq gene
84 expression estimation with read mapping
85 uncertainty](http://bioinformatics.oxfordjournals.org/content/26/4/493.abstract). Bioinformatics
86 26(4): 493-500.
87
88 ## Contact information ##
89 * RSEM Galaxy wrapper questions: ruotti@wisc.edu
90 * RSEM standalone package questions: bli@cs.wisc.edu
91 * [RSEM announcements mailing list](http://groups.google.com/group/rsem-announce)
92 * [RSEM users mailing list](http://groups.google.com/group/rsem-users)