comparison rsem/README @ 0:4edac0183857
Initial commit from tarball version 1.17
author | victor |
---|---|
date | Mon, 05 Mar 2012 11:12:34 -0500 |
parents | |
children |
# RSEM Galaxy Wrapper #

## Introduction ##

RSEM (RNA-Seq by Expectation-Maximization) is a software package for the
estimation of gene and isoform abundances from RNA-Seq data. A key feature of
RSEM is its statistically-principled approach to the handling of RNA-Seq
reads that map to multiple genes and/or isoforms. In addition, RSEM is
well-suited to performing quantification with de novo transcriptome
assemblies, as it does not require a reference genome.

## Installation ##

Follow the [Galaxy Tool Shed
instructions](http://wiki.g2.bx.psu.edu/Tool_Shed) to add this wrapper from
the tool shed to your Galaxy instance. Once the wrapper files are in the tools
directory, you also need to install one or more RSEM references. To do so:

1. Place the file called `rsem_indices.loc` into the directory
   `~/galaxy-dist/tool-data`. This file tells the RSEM wrapper how to find the
   reference(s). It follows the tab-delimited format described in Galaxy's
   documentation:

        unique_build_id	dbkey	display_name	file_base_path

   For example,

        human_refseq_NM	human_refseq_NM	human_refseq_NM	/opt/galaxy/references/human/1.1.2/NM_refseq_ref

2. Download a pre-built RSEM reference from the [RSEM website](http://deweylab.biostat.wisc.edu/rsem/).

3. Place the reference files into the `file_base_path` listed in the
   `rsem_indices.loc` file.

If you would rather build your own reference files, follow the instructions
below and then place the resulting reference files into the `file_base_path`
listed in the `rsem_indices.loc` file.
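The tab-delimited layout of `rsem_indices.loc` described in step 1 can be checked before restarting Galaxy. The following is a minimal Python sketch and is not part of the wrapper. The function name `parse_loc_lines` is hypothetical, and skipping blank lines and lines starting with `#` is an assumption about how the file is read.

```python
from typing import Dict, List

# Column names taken from the format line shown in the README.
LOC_FIELDS = ("unique_build_id", "dbkey", "display_name", "file_base_path")


def parse_loc_lines(lines: List[str]) -> List[Dict[str, str]]:
    """Parse rsem_indices.loc content into one dict per reference entry."""
    entries = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        # Assumption: blank lines and '#' comment lines carry no entry.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != len(LOC_FIELDS):
            raise ValueError(
                f"line {number}: expected {len(LOC_FIELDS)} tab-separated "
                f"fields, found {len(fields)}"
            )
        entries.append(dict(zip(LOC_FIELDS, fields)))
    return entries


# The example entry from the README, with real tab characters between fields.
example = "\t".join(
    ["human_refseq_NM"] * 3
    + ["/opt/galaxy/references/human/1.1.2/NM_refseq_ref"]
) + "\n"
for entry in parse_loc_lines([example]):
    print(entry["unique_build_id"], "->", entry["file_base_path"])
```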

### Building a custom RSEM reference ###

For instructions on how to build the RSEM reference files, first see the [RSEM
documentation](http://deweylab.biostat.wisc.edu/rsem/README.html).

#### Example ####

Suppose we have mouse RNA-Seq data and want to use the UCSC mm9 version of the
mouse genome. We have downloaded the UCSC Genes transcript annotations in GTF
format (as mm9.gtf) using the Table Browser and the knownIsoforms.txt file for
mm9 from the UCSC Downloads. We also have all chromosome files for mm9 in the
directory `/data/mm9`. We want to put the generated reference files under
`/opt/galaxy/references` with name `mouse_125`. We'll add poly(A) tails with
length 125. Please note that GTF files generated from UCSC's Table Browser do
not contain isoform-gene relationship information. For the UCSC Genes
annotation, this information can be obtained from the knownIsoforms.txt file.
Suppose we want to build Bowtie indices and Bowtie executables are found in
`/sw/bowtie`.

To build the reference files, first run the command:

    rsem-prepare-reference --gtf mm9.gtf \
                           --transcript-to-gene-map knownIsoforms.txt \
                           --bowtie-path /sw/bowtie \
                           /data/mm9/chr1.fa,/data/mm9/chr2.fa,...,/data/mm9/chrM.fa \
                           /opt/galaxy/references/mouse_125
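With many chromosome files, assembling the comma-separated FASTA argument by hand is error-prone. As a rough sketch, the command above could be built in Python. The helper `build_prepare_reference_cmd` is hypothetical and uses only the options shown in this example. The three chromosome names below are illustrative stand-ins for the full list. On a machine where RSEM and Bowtie are installed, the resulting list could be passed to `subprocess.run(cmd, check=True)`.

```python
import shlex
from typing import List, Sequence


def build_prepare_reference_cmd(
    gtf: str,
    gene_map: str,
    bowtie_path: str,
    fasta_files: Sequence[str],
    reference_name: str,
) -> List[str]:
    """Return the argument list for the rsem-prepare-reference call above."""
    if not fasta_files:
        raise ValueError("at least one FASTA file is required")
    return [
        "rsem-prepare-reference",
        "--gtf", gtf,
        "--transcript-to-gene-map", gene_map,
        "--bowtie-path", bowtie_path,
        # The FASTA files are passed as a single comma-separated argument.
        ",".join(fasta_files),
        reference_name,
    ]


# Illustrative subset only; list every chromosome file in practice.
fasta = [f"/data/mm9/chr{name}.fa" for name in ("1", "2", "M")]
cmd = build_prepare_reference_cmd(
    "mm9.gtf",
    "knownIsoforms.txt",
    "/sw/bowtie",
    fasta,
    "/opt/galaxy/references/mouse_125",
)
# Print a shell-safe rendering of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in cmd))
```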

To add this reference to your Galaxy installation, add the following
tab-delimited line to the `rsem_indices.loc` file:

    mouse_125	mouse_125	mouse_125	/opt/galaxy/references/mouse_125

Then restart Galaxy and you should see the `mouse_125` reference listed in the
RSEM wrapper.
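Registering a reference amounts to appending one tab-delimited line to `rsem_indices.loc`. The sketch below shows one way to do that without adding the same id twice. The function `add_loc_entry` is hypothetical, treating `#` lines as comments is an assumption, and the demonstration writes to a temporary file rather than a real Galaxy installation.

```python
import os
import tempfile


def add_loc_entry(loc_path: str, build_id: str, base_path: str) -> bool:
    """Append a reference entry unless build_id is already listed.

    Returns True if a line was written, False if the id was already present.
    """
    content = ""
    if os.path.exists(loc_path):
        with open(loc_path) as handle:
            content = handle.read()
    # Collect the first column of every non-blank, non-comment line.
    listed = {
        line.split("\t")[0]
        for line in content.splitlines()
        if line.strip() and not line.startswith("#")
    }
    if build_id in listed:
        return False
    # As in the README example, one name serves as id, dbkey and display name.
    entry = "\t".join([build_id, build_id, build_id, base_path])
    with open(loc_path, "a") as handle:
        # Make sure the new entry starts on its own line.
        if content and not content.endswith("\n"):
            handle.write("\n")
        handle.write(entry + "\n")
    return True


# Demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    loc = os.path.join(tmp, "rsem_indices.loc")
    first = add_loc_entry(loc, "mouse_125", "/opt/galaxy/references/mouse_125")
    second = add_loc_entry(loc, "mouse_125", "/opt/galaxy/references/mouse_125")
    print(first, second)
```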

## References ##

* [RSEM website (stand alone package)](http://deweylab.biostat.wisc.edu/rsem/)

* B. Li and C. Dewey (2011) [RSEM: accurate transcript quantification from
  RNA-Seq data with or without a reference
  genome](http://www.biomedcentral.com/1471-2105/12/323).
  BMC Bioinformatics 12:323.

* B. Li, V. Ruotti, R. Stewart, J. Thomson, and C. Dewey (2010) [RNA-Seq gene
  expression estimation with read mapping
  uncertainty](http://bioinformatics.oxfordjournals.org/content/26/4/493.abstract).
  Bioinformatics 26(4): 493-500.

## Contact information ##
* RSEM galaxy wrapper questions: ruotti@wisc.edu
* RSEM stand alone package questions: bli@cs.wisc.edu
* [RSEM announcements mailing list](http://groups.google.com/group/rsem-announce)
* [RSEM users mailing list](http://groups.google.com/group/rsem-users)