comparison README.md @ 0:84361ce36a11 draft

planemo upload commit fb90aafc93e5e63acfcdac4c27cfd865cdf06c5a-dirty
author nturaga
date Tue, 19 Apr 2016 11:10:25 -0400
parents
children
comparison
equal deleted inserted replaced
-1:000000000000 0:84361ce36a11
1 Minfi (for galaxy)
2 ====
3
4 *Maintainer: IUC*
5
6 *Contact: nitesh1989 on github*
7
8 The minfi package provides tools for analyzing Illumina’s Methylation arrays, with a special
9 focus on the new 450k array for humans.
10
11 In this package we refer to differentially methylated positions (DMPs) by which we mean
12 a single genomic position that has a different methylation level in two different groups of
13 samples (or conditions). This is different from differentially methylated regions (DMRs)
14 which imply more that more than one methylation positions are different between conditions.
15
16 ### Table of Contents
17 **[User Guides](#user-guides)**
18 **[Functions provided and description](#functions-provided-and-description)**
19 **[Minfi Pipeline to analyze illumina 450k data](#minfi-pipeline-to-analyze-illumina-450k-data)**
20 **[Minfi Analysis Pipeline for TCGA data hosted on GDAC Broad Institute](#minfi-analysis-pipeline-for-tcga-data-hosted-on-gdac-broad-institute)**
21 **[Test Data](#test-data)**
22
23
24 ##User Guides
25 Here are some excellent user guides about minfi, written by the contributors.
26
27 http://www.bioconductor.org/packages/release/bioc/vignettes/minfi/inst/doc/minfi.pdf
28
29 https://bioconductor.org/help/course-materials/2014/BioC2014/minfi_BioC2014.pdf
30
31 https://www.bioconductor.org/help/course-materials/2015/BioC2015/methylation450k.html
32
33 ##Functions provided and description
34
35 1. **Minfi** **Pipeline** to analyze illumina 450k data. (minfi_pipeline.xml)
36
37 2. **Minfi** **Analysis** **pipeline** for TCGA data hosted on GDAC-Broad Institute.(minfi_TCGA_pipeline.xml)
38
39
40 ##Minfi Pipeline to analyze illumina 450k data
41
42 ####Tool xml name: *minfi_pipeline.xml*
43
44 ####How to use:
45
46
47 * The first step is to upload Illumina 450k IDAT files using the Galaxy Upload feature.
48
49 <img src="https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-07%20at%205.43.07%20PM.png" >
50
51 * In this example, we uploaded "Case" samples from slide 572364052 - one Illumina slide with 6 IDAT files (3 samples with red/green channel pairs) - and "Control" samples from slide 572364053 - one Illumina slide with 6 IDAT files (3 samples with red/green channel pairs) - for a total of 12 IDAT files. Once uploaded, these files should appear in the history panel.
52
53 <img src="https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-07%20at%205.43.41%20PM.png">
54
55 * The next step is to group samples based on treatment type. First, select "Operations on multiple datasets" in your history panel.
56
57 <img src="https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-07%20at%205.44.32%20PM.png" width="150">
58
59 * Next, select the samples which belong to the "Case" treatment group (572364052_*.idat), and "For all selected", "Build a list of dataset pairs".
60
61 <img src="https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-07%20at%205.45.18%20PM.png" width="150">
62
63 * The "Create a collection of paired datasets" window will appear with the samples you selected. To pair the red/green channels, first "Clear Filters" and then type each dataset suffix - "_Grn.idat" and "_Red.idat" - in the filter bars. This should filter your list and display the "Green" channel files on one side and "Red" channel files on the other. If the datasets are correctly paied, click "Pair these datasets" in the middle of the window for each pair. Finally, name the collection - in this example "Case" based on the treatement type - and click "Create list".
64
65 ![pairCollection](https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-07%20at%205.46.59%20PM.png)
66
67 * Repeat these steps to build a paired dataset for the "Control" samples (572364053_*.idat).
68
69 <img src="https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-07%20at%205.47.30%20PM.png" width="150">
70
71 * In addition to the option to pair each dataset, the "Auto-pair" link will pair all datasets at once. Again, after pairing, name the collection - in this example "Control" - and click "Create list".
72
73 ![pairControl2](https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-07%20at%205.47.59%20PM.png)
74
75 * The history panel now displays two paired dataset collections: 13. "Case" and 14. "Control".
76
77 <img src="https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-07%20at%205.48.22%20PM.png" width="150">
78
79 * From the "Minfi" tools, choose "Minfi pipeline to analyse Illumina 450k data". In the tool form, select "Case" for "Condition1/Treatment" and "Control" for "Condition2/Wildtype". After choosing the "Quantile Normalization" preprocessing method and "Basic Default settings" for Minfi Parameters, run the tool.
80
81 ![minfitoolform](https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-08%20at%2010.30.54%20AM.png)
82
83 * Successful completion of the tool results in 4 items in the history panel: QC report, MDS plot, Differentially methylated positions, and Differentially methylated regions.
84
85 ![result](https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-07%20at%205.53.03%20PM.png)
86
87 * To view results, click the "eye" in the history panel corresponding to the result.
88
89 ![resultview](https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-07%20at%205.55.57%20PM.png)
90
91 ##Minfi Analysis Pipeline for TCGA data hosted on GDAC Broad Institute
92
93 ####Tool xml name: *minfi_TCGA_pipeline.xml*
94
95 ####How to use:
96
97 The minfi analysis pipeline for TCGA data provides only two functions as of now, to find differentially methylated regions and differentially methylated positions. This is because the data which is provided/used for this tool has been put through quality control measures. The Level 3 data is ready to use for analysis.
98
99 * The first step of this process is to fetch TCGA data from http://gdac.broadinstitute.org/runs/ where we need Standard data from a particular date, or the latest data which is available at http://gdac.broadinstitute.org/runs/stddata__latest/. We want the "Open" data avaiable to download.
100
101 <img src="https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-09%20at%202.42.17%20PM.png" width="300">
102
103 <img src="https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-09%20at%202.42.42%20PM.png">
104
105
106 * Choose the cancer type with the Illumina 450K methylation data, for the sake of this example we will use "UCEC" - Uterine Corpus Endometrial Carcinoma. In the directory structure, we want the data with the suffix ```__Level_3__within_bioassay_data_set_function__data.Level_3```. This file is usually the largest in the directory and can be easily spotted by sorting the directory based on Size. So for the example, we will choose the file ```gdac.broadinstitute.org_UCEC.Merge_methylation__humanmethylation450__jhu_usc_edu__Level_3__within_bioassay_data_set_function__data.Level_3.2015110100.0.0.tar.gz```.
107
108 <img src="https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-09%20at%202.41.05%20PM.png">
109
110 * There are two options to obtain the data: 1. Get the link to that file and paste it in your Galaxy Upload tool to fetch the data; 2. Download the data on to your local machine and upload the whole file.
111
112 Option 1 : Fetch the data at the URL directly to Galaxy
113
114 <img src="https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-11%20at%204.30.15%20PM.png">
115
116 Option 2: Upload a local file
117
118 <img src="https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-11%20at%204.34.30%20PM.png">
119
120 * Once the upload is finished, it should be available in the History panel of the session. Choose the tool "Minfi Analysis pipeline for TCGA data".
121
122 <img src="https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-12%20at%205.17.44%20PM.png">
123
124
125 * Galaxy will uncompress your dataset to show you only a ".tar" file. Choose that file as the input to the tool. This example will show the tool run with the Basic Default settings.
126
127 <img src="https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-11%20at%204.34.53%20PM.png">
128
129 * After the tool is executed, two results will appear in the history panel.
130
131 <img src="https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-11%20at%204.59.32%20PM.png" width="150">
132
133 * This shows the two results "Differentially methylated Regions" and "Differentially methylated Positions". You can ```View``` the results by clicking on the ```eye``` icon.
134
135 <img src="https://github.com/nitesh1989/tools-iuc/blob/methylation_2/tools/minfi/help/help-images/Screen%20Shot%202015-12-11%20at%204.59.59%20PM.png">
136
137
138 ##Test Data
139
140 The test_data folder contains some test files which are,
141
142 1. 572364052 : One Illumina slide with 6 IDAT files (3 samples with red/green channel pairs)
143
144 2. 572364053 : One Illumina slide with 6 IDAT files (3 samples with red/green channel pairs)
145
146 3. dmrs.csv : Differentially methylated regions as output.