comparison README.txt @ 2:9d363eb081b5 draft

Uploaded
author iarc
date Thu, 28 Apr 2016 03:43:25 -0400
parents 8c682b3a7c5b
children 46a10309dfe2
comparison
1:748b7a8b634c 2:9d363eb081b5
1 ============================== 1 ==============================
2 MutSpec-Suite 2 MutSpec-Suite
3 ============================== 3 ==============================
4 4
5 Created by Maude Ardin and Vincent Cahais (Mechanisms of Carcinogenesis Section, International Agency for Research on Cancer F69372 Lyon France, http://www.iarc.fr/) 5 Created by Maude Ardin and Vincent Cahais (Mechanisms of Carcinogenesis Section, International Agency for Research on Cancer F69372 Lyon France,
6 http://www.iarc.fr/)
6 7
7 Version 1.0 8 Version 1.0
8 9
9 Released under GNU public license version 2 (GPL v2) 10 Released under GNU public license version 2 (GPL v2)
10 11
11 Package description: Ardin et al. - 2016 - MutSpec: a Galaxy toolbox for streamlined analyses of somatic mutation spectra in human and mouse cancer genomes - BMC Bioinformatics 12 Package description: Ardin et al. - 2016 - MutSpec: a Galaxy toolbox for streamlined analyses of somatic mutation spectra in human and mouse
13 cancer genomes - BMC Bioinformatics
14 http://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1011-z
12 15
13 Test data: https://usegalaxy.org/u/maude-ardin/p/mutspectestdata 16 Test data: https://usegalaxy.org/u/maude-ardin/p/mutspectestdata
17
14 18
15 19
16 ### Requirements 20 ### Requirements
17 21
18 # python-dev 22 # python-dev
21 25
22 26
23 # Annovar 27 # Annovar
24 If you do not have ANNOVAR installed, you can download it here: http://www.openbioinformatics.org/annovar/annovar_download_form.php 28 If you do not have ANNOVAR installed, you can download it here: http://www.openbioinformatics.org/annovar/annovar_download_form.php
25 29
26 1) Once downloaded, install annovar per the installation instructions and edit the PATH variable in the galaxy daemon (/etc/init.d/galaxy) to reflect the location of the directory containing the perl scripts. 30 1) Once downloaded, install annovar per the installation instructions and edit the PATH variable in the galaxy daemon (/etc/init.d/galaxy)
31 to reflect the location of the directory containing the perl scripts.
27 32
28 2) Create directories for saving Annovar databases 33 2) Create directories for saving Annovar databases
29 2-a Create a folder (annovardb) for saving all Annovar databases, e.g. hg19db 34 2-a Create a folder (annovardb) for saving all Annovar databases, e.g. hg19db
30 2-b Create a subfolder (seqFolder) for saving the reference genome, e.g. hg19db/hg19_seq 35 2-b Create a subfolder (seqFolder) for saving the reference genome, e.g. hg19db/hg19_seq
31 36
46 and <annovardb> is the location where all database files should be stored, e.g. hg19db 51 and <annovardb> is the location where all database files should be stored, e.g. hg19db
47 52
48 The list of all available databases can be found here: http://annovar.openbioinformatics.org/en/latest/user-guide/download/ 53 The list of all available databases can be found here: http://annovar.openbioinformatics.org/en/latest/user-guide/download/
49 54
50 55
51 5) Edit the annovar_index.loc file (in the folder galaxy-dist/tool-data/toolshed/repos/iarc/mutspec/revision/) to reflect the location of the annovardb folder (containing all the database files downloaded from Annovar). 56 5) Edit the annovar_index.loc file (in the folder galaxy-dist/tool-data/toolshed/repos/iarc/mutspec/revision/) to reflect the location
57 of the annovardb folder (containing all the database files downloaded from Annovar).
52 Restart galaxy instance for changes in .loc file to take effect or reload it into the admin interface. 58 Restart galaxy instance for changes in .loc file to take effect or reload it into the admin interface.
53 59
54 6) Edit the file build_listAVDB.txt in the mutspec install directory to reflect the name and the type of the databases installed 60 6) Edit the file build_listAVDB.txt in the mutspec install directory to reflect the name and the type of the databases installed
55 61
56 62
57 ### Installation 63 ### Installation
58 64
59 # MutSpec-Stat and MutSpec-NMF 65 # MutSpec-Stat and MutSpec-NMF
60 By default 1 CPU is used by these tools, but you may edit mutspecStat_wrapper.sh and mutspecNmf_wrapper.sh to change this number to the maximum number of CPU available on your server. 66 By default 8 CPUs are used by these tools, but you may edit mutspecStat_wrapper.sh and mutspecNmf_wrapper.sh to change this number
67 to the maximum number of CPU available on your server.
61 68
62 MutSpec-Stat and MutSpec-NMF tools allow parallel computations that are time consuming. 69 MutSpec-Stat and MutSpec-NMF tools allow parallel computations that are time consuming.
63 It is recommended to use the highest number of cores available on the Galaxy server to reduce the computation time of these tools. 70 It is recommended to use the highest number of cores available on the Galaxy server to reduce the computation time of these tools.
64 71
65 72
66 73
74
67 # MutSpec-Annot 75 # MutSpec-Annot
68 The maximum CPU value needs to be specified when installing MutSpec package by editing the file mutspecAnnot.pl to reflect the maximum number of CPU available on your server (by default 1 CPU is used). 76 The maximum CPU value needs to be specified when installing MutSpec package by editing the file mutspecAnnot.pl to reflect the maximum number
77 of CPU available on your server.
69 78
70 This tool may be time consuming for large files. For example, annotating a file with more than 25,000 variants takes 1 hour using 1 CPU (2.6 GHz), while annotating this file using 8 CPUs takes only 5 minutes. 79 This tool may be time consuming for large files. For example, annotating a file of more than 25,000 variants takes 1 hour using 1 CPU (2.6 GHz),
80 while annotating this file using 8 CPUs takes only 5 minutes.
71 We have optimized MutSpec-Annot so that the tool uses more CPUs, if available, as follows: 81 We have optimized MutSpec-Annot so that the tool uses more CPUs, if available, as follows:
72 -files with less than 5,000 lines: 1 CPU is used 82 -files with less than 5,000 lines: 1 CPU is used
73 -files with more than 5,000 and less than 25,000 lines: 2 CPUs are used 83 -files with more than 5,000 and less than 25,000 lines: 2 CPUs are used
74 -files with more than 25,000 and less than 100,000 lines: 8 (or maximum CPUs, if less than 8 CPUs are available) are used (our benchmark results didn't show any time saving using more than 8 cores for files with more than 25,000 84 -files with more than 25,000 and less than 100,000 lines: 8 (or maximum CPUs, if less than 8 CPUs are available) are used (our benchmark
75 but less than 100,000 lines) 85 results didn't show any time saving using more than 8 cores for files with more than 25,000 but less than 100,000 lines)
76 -files with more than 100,000 lines: maximum CPUs are used 86 -files with more than 100,000 lines: maximum CPUs are used
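
The MutSpec-Annot CPU rules listed above can be sketched in a few lines of Python. This is only an illustration, not the code in mutspecAnnot.pl: the function name choose_cpu_count and the parameter max_cpu are hypothetical, the README does not say how files with exactly 5,000, 25,000 or 100,000 lines are handled (here each boundary value falls in the higher tier), and capping every tier at max_cpu is an assumption.

```python
def choose_cpu_count(n_lines, max_cpu):
    """Pick a CPU count from the number of lines in the input file.

    Hypothetical helper illustrating the tiers described in the README.
    """
    if max_cpu < 1:
        raise ValueError("max_cpu must be at least 1")

    if n_lines < 5000:
        wanted = 1          # small files: 1 CPU
    elif n_lines < 25000:
        wanted = 2          # medium files: 2 CPUs
    elif n_lines < 100000:
        wanted = 8          # large files: 8 CPUs (no gain beyond 8 in the benchmark)
    else:
        wanted = max_cpu    # very large files: all configured CPUs

    # Assumption: never request more CPUs than the server provides.
    return min(wanted, max_cpu)


if __name__ == "__main__":
    # Print the chosen CPU count for a few file sizes on a 16-CPU server.
    for n in (1000, 10000, 50000, 500000):
        print(n, choose_cpu_count(n, max_cpu=16))
```

With max_cpu set to 4, a 50,000-line file would get 4 CPUs rather than 8, matching the "or maximum CPUs, if less than 8 CPUs are available" rule.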