ResFinder documentation
=============

ResFinder identifies acquired antimicrobial resistance genes in completely or partially
sequenced isolates of bacteria.

## Content of the repository
1. run_resfinder.py - Use this script to run ResFinder
2. tests/data - Contains FASTA and FASTQ data for testing. More information in the "Test data" section
3. scripts/ - All scripts in this directory are unsupported but have been uploaded as they may be useful
4. cge/ - ResFinder code
5. dockerfile - Used to build the ResFinder Docker image (see the Docker section near the end)

## Installation
The installation described here first installs the ResFinder software itself,
then the dependencies, and finally the databases. A more detailed breakdown of the
installation is provided below:

1. Install ResFinder tool
2. Install python modules
3. Install BLAST (optional)
4. Install KMA (optional)
5. Download ResFinder database
6. Download PointFinder database
7. Index databases with KMA (if installed)
8. Test installation

A small script has been written to automate this process. It is available from the
scripts directory and is named install_resfinder.sh. It is very simple and might
not work in all environments. It is only meant as a supplement and no support will
be provided for any scripts in this directory. However, specific suggestions (with code)
for improvement are very welcome.

### ResFinder tool
Setting up the ResFinder script and database
```bash
# Go to wanted location for resfinder
cd /path/to/some/dir

# Clone the latest version and enter the resfinder directory
git clone https://git@bitbucket.org/genomicepidemiology/resfinder.git
cd resfinder
```

### Dependencies:
Depending on how you plan to run ResFinder, BLAST and KMA can be optional.
BLAST is used to analyse assemblies (i.e. FASTA files).
KMA is used to analyse read data (i.e. FASTQ files).

#### Python modules: Tabulate, BioPython, CGECore, GitPython and python-dateutil
To install the needed Python modules you can use pip:
```bash
pip3 install tabulate biopython cgecore gitpython python-dateutil
```
For more information, visit the respective websites:
```url
https://bitbucket.org/astanin/python-tabulate
https://biopython.org
https://bitbucket.org/genomicepidemiology/cge_core_module
https://gitpython.readthedocs.io/en/stable/index.html
```

#### BLAST (optional)
If you don't want to specify the path of blastn every time you run
ResFinder, make sure that blastn is in your PATH.

Blastn can be obtained from:
```url
ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
```

#### KMA (optional)
The instructions here will install KMA in the default location ResFinder uses. KMA
can be installed in another location, but the path to KMA will then need to be
specified every time you run ResFinder unless you add the kma program to your PATH.
```bash
# Go to the directory in which you installed the ResFinder tool
cd /path/to/some/dir/resfinder
cd cge
git clone https://bitbucket.org/genomicepidemiology/kma.git
cd kma && make
```

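Whether BLAST and KMA are reachable can be checked before running ResFinder. The sketch below is an illustration only and not part of ResFinder: it looks for an executable on PATH with `shutil.which` and then in optional fallback directories. The helper name `locate_tool` is hypothetical, and the fallback directory `cge/kma` is an assumption inferred from the KMA build steps above.

```python
import os
import shutil


def locate_tool(name, fallback_dirs=()):
    """Return the path to an executable, or None if it cannot be found.

    Hypothetical helper: searches PATH first, then each fallback directory.
    """
    found = shutil.which(name)
    if found:
        return found
    for directory in fallback_dirs:
        candidate = os.path.join(directory, name)
        # Accept the candidate only if it is a regular, executable file.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    # "cge/kma" is assumed from the KMA build steps, relative to the resfinder directory.
    for tool, fallbacks in (("blastn", ()), ("kma", ("cge/kma",))):
        path = locate_tool(tool, fallbacks)
        print(f"{tool}: {path if path else 'not found'}")
```

If a tool is reported as not found, either add it to your PATH or pass its location to ResFinder with `-b` (blastn) or `-k` (kma).
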
### Databases
This section describes how to install the databases at the ResFinder default locations.
The database locations can be changed, but must then be specified to ResFinder at run time.

#### ResFinder database
```bash
# Go to the directory in which you installed the ResFinder tool
cd /path/to/some/dir/resfinder
git clone https://git@bitbucket.org/genomicepidemiology/resfinder_db.git db_resfinder
```

#### PointFinder database
```bash
# Go to the directory in which you installed the ResFinder tool
cd /path/to/some/dir/resfinder
git clone https://git@bitbucket.org/genomicepidemiology/pointfinder_db.git db_pointfinder
```

#### Indexing databases with KMA
If you have KMA installed, you either need to have kma_index in your PATH or
you need to provide the path to kma_index to INSTALL.py.

**NOTE**: The documentation given here describes the procedure for the ResFinder database, but the procedure is identical for the PointFinder database.
**PointFinder database documentation**: https://bitbucket.org/genomicepidemiology/pointfinder_db

##### a) Run INSTALL.py in interactive mode
```bash
# Go to the database directory
cd path/to/db_resfinder
python3 INSTALL.py
```
If kma_index was found in your PATH, a lot of indexing information will be
printed to your terminal, ending with the word "done".

If kma_index wasn't found, you will receive the following output:
```bash
KMA index program, kma_index, does not exist or is not executable
Please input path to executable kma_index program or choose one of the options below:
1. Install KMA using make, index db, then remove KMA.
2. Exit
```
You can now enter the path to kma_index and finish with <enter>, or you can
enter "1" or "2" and finish with <enter>.

If "1" is chosen, the script will attempt to install KMA in your system's
default temporary location. If the installation is successful, it will proceed
to index your database and, when finished, delete the KMA installation again.

##### b) Run INSTALL.py in non_interactive mode
```bash
# Go to the database directory
cd path/to/db_resfinder
python3 INSTALL.py /path/to/kma_index non_interactive
```
The path to kma_index can be omitted if it exists in PATH or if the script
should attempt to do an automatic temporary installation of KMA.

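For scripted setups, the non-interactive call can be assembled programmatically. The following is a minimal sketch, assuming `kma_index` has already been located; `build_install_command` is a hypothetical helper, and the argument order simply mirrors the command shown above.

```python
import sys


def build_install_command(kma_index_path):
    """Build the argv for the documented non-interactive INSTALL.py call."""
    if not kma_index_path:
        raise ValueError("a path to kma_index is required in this sketch")
    # Mirrors: python3 INSTALL.py /path/to/kma_index non_interactive
    return [sys.executable, "INSTALL.py", kma_index_path, "non_interactive"]


if __name__ == "__main__":
    print(" ".join(build_install_command("/path/to/kma_index")))
```

The resulting list can be passed to `subprocess.run(cmd, cwd=db_dir, check=True)`, where `db_dir` is the database directory that contains INSTALL.py.
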
##### c) Index database manually (not recommended)
It is possible to index the databases manually, but this is generally not recommended
as it is more prone to error. If you choose to do so, be aware of the naming of
the indexed files.

This is an example of how to index the ResFinder and PointFinder database files:
```bash
# Go to the directory that contains db_resfinder and db_pointfinder
cd /path/to/some/dir/resfinder
# Create indexing directory
mkdir db_resfinder/kma_indexing
# Index ResFinder files using kma_index
kma_index -i db_resfinder/fusidicacid.fsa -o db_resfinder/kma_indexing/fusidicacid
kma_index -i db_resfinder/phenicol.fsa -o db_resfinder/kma_indexing/phenicol
kma_index -i db_resfinder/glycopeptide.fsa -o db_resfinder/kma_indexing/glycopeptide
kma_index -i db_resfinder/trimethoprim.fsa -o db_resfinder/kma_indexing/trimethoprim
kma_index -i db_resfinder/oxazolidinone.fsa -o db_resfinder/kma_indexing/oxazolidinone
kma_index -i db_resfinder/tetracycline.fsa -o db_resfinder/kma_indexing/tetracycline
kma_index -i db_resfinder/quinolone.fsa -o db_resfinder/kma_indexing/quinolone
kma_index -i db_resfinder/nitroimidazole.fsa -o db_resfinder/kma_indexing/nitroimidazole
kma_index -i db_resfinder/fosfomycin.fsa -o db_resfinder/kma_indexing/fosfomycin
kma_index -i db_resfinder/aminoglycoside.fsa -o db_resfinder/kma_indexing/aminoglycoside
kma_index -i db_resfinder/macrolide.fsa -o db_resfinder/kma_indexing/macrolide
kma_index -i db_resfinder/sulphonamide.fsa -o db_resfinder/kma_indexing/sulphonamide
kma_index -i db_resfinder/rifampicin.fsa -o db_resfinder/kma_indexing/rifampicin
kma_index -i db_resfinder/colistin.fsa -o db_resfinder/kma_indexing/colistin
kma_index -i db_resfinder/beta-lactam.fsa -o db_resfinder/kma_indexing/beta-lactam
# Index PointFinder files using kma_index
kma_index -i db_pointfinder/campylobacter/*.fsa -o db_pointfinder/campylobacter/campylobacter
kma_index -i db_pointfinder/escherichia_coli/*.fsa -o db_pointfinder/escherichia_coli/escherichia_coli
kma_index -i db_pointfinder/enterococcus_faecalis/*.fsa -o db_pointfinder/enterococcus_faecalis/enterococcus_faecalis
kma_index -i db_pointfinder/enterococcus_faecium/*.fsa -o db_pointfinder/enterococcus_faecium/enterococcus_faecium
kma_index -i db_pointfinder/neisseria_gonorrhoeae/*.fsa -o db_pointfinder/neisseria_gonorrhoeae/neisseria_gonorrhoeae
kma_index -i db_pointfinder/salmonella/*.fsa -o db_pointfinder/salmonella/salmonella
kma_index -i db_pointfinder/mycobacterium_tuberculosis/*.fsa -o db_pointfinder/mycobacterium_tuberculosis/mycobacterium_tuberculosis
```

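The one-command-per-file pattern used for the ResFinder database can also be generated rather than typed. This sketch is an illustration only, not ResFinder code: it lists the `.fsa` files in a ResFinder database directory and builds the matching `kma_index` argument lists, naming each index after its FASTA file. The helper name `resfinder_index_commands` is hypothetical, and only the `-i` and `-o` options that appear in this document are used.

```python
from pathlib import Path


def resfinder_index_commands(db_dir, kma_index="kma_index"):
    """Return one kma_index argument list per .fsa file in db_dir.

    Hypothetical helper; output names follow the pattern
    <db_dir>/kma_indexing/<file name without .fsa>.
    """
    db_dir = Path(db_dir)
    out_dir = db_dir / "kma_indexing"
    commands = []
    for fasta in sorted(db_dir.glob("*.fsa")):
        commands.append([kma_index, "-i", str(fasta), "-o", str(out_dir / fasta.stem)])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for cmd in resfinder_index_commands("db_resfinder"):
        print(" ".join(cmd))
```

Printing the commands first makes it easy to review them before running anything. Once the `kma_indexing` directory exists, each list could be executed with `subprocess.run(cmd, check=True)`.
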
### Test ResFinder installation
(This will not function with the Docker installation.)
If you did not install BLAST, tests 1 and 3 will fail. If you did not install KMA, tests 2
and 4 will fail.
The 4 tests will take approximately 5-60 seconds in total, depending on your system.
```bash
# Go to the directory in which you installed the ResFinder tool
cd /path/to/some/dir/resfinder

# See the unittest options for running the tests
python3 tests/functional_tests.py -h

# If blastn, kma, or the ResFinder or PointFinder databases are not in the locations described above, see the optional arguments for the tests with:
python3 tests/functional_tests.py -res_help

# Which outputs:
usage: functional_tests.py [-res_help] [-db_res DB_PATH_RES] [-b BLAST_PATH]
                           [-k KMA_PATH] [-db_point DB_PATH_POINT]

Options:
  -res_help, --resfinder_help
  -db_res DB_PATH_RES, --db_path_res DB_PATH_RES
                        Path to the databases for ResFinder
  -b BLAST_PATH, --blastPath BLAST_PATH
                        Path to blastn
  -k KMA_PATH, --kmaPath KMA_PATH
                        Path to KMA
  -db_point DB_PATH_POINT, --db_path_point DB_PATH_POINT
                        Path to the databases for PointFinder

# Run tests
python3 tests/functional_tests.py

# Output from successful tests
....
----------------------------------------------------------------------
Ran 4 tests in 8.263s

OK
```

### Test data
Test data can be found in the subdirectory /tests/data

## Usage

You can run ResFinder from the command line using python3.

**NOTE**: Species should be entered with their full scientific names (e.g. "escherichia coli"), using quotation marks. Names are not case sensitive.
An attempt has been made to capture some deviations like "ecoli" and "e.coli", but far from all deviations will be captured.

```bash
# Example of running resfinder
python3 run_resfinder.py -o path/to/outdir -s "Escherichia coli" -l 0.6 -t 0.8 --acquired --point -ifq test_isolate_01_*

# The program can be invoked with the -h option
usage: run_resfinder.py [-h] [-ifa INPUTFASTA]
                        [-ifq INPUTFASTQ [INPUTFASTQ ...]] [-scripts SCRIPTS]
                        [-o OUT_PATH] [-b BLAST_PATH] [-k KMA_PATH]
                        [-s SPECIES] [-l MIN_COV] [-t THRESHOLD]
                        [-db_res DB_PATH_RES] [-db_res_kma DB_PATH_RES_KMA]
                        [-d DATABASES] [-acq] [-c] [-db_point DB_PATH_POINT]
                        [-g SPECIFIC_GENE [SPECIFIC_GENE ...]] [-u]

optional arguments:
  -h, --help            show this help message and exit
  -ifa INPUTFASTA, --inputfasta INPUTFASTA
                        Input fasta file.
  -ifq INPUTFASTQ [INPUTFASTQ ...], --inputfastq INPUTFASTQ [INPUTFASTQ ...]
                        Input fastq file(s). Assumed to be single-end fastq if
                        only one file is provided, and assumed to be paired-
                        end data if two files are provided.
  -o OUT_PATH, --outputPath OUT_PATH
                        All output will be stored in this directory.
  -b BLAST_PATH, --blastPath BLAST_PATH
                        Path to blastn
  -k KMA_PATH, --kmaPath KMA_PATH
                        Path to kma
  -s SPECIES, --species SPECIES
                        Species in the sample
                        Available species: Campylobacter, Campylobacter jejuni, Campylobacter coli,
                        Enterococcus faecalis, Enterococcus faecium, Escherichia coli, Helicobacter pylori,
                        Klebsiella, Mycobacterium tuberculosis, Neisseria gonorrhoeae,
                        Plasmodium falciparum, Salmonella, Salmonella enterica, Staphylococcus aureus
                        -s "Other" can be used for metagenomic samples or samples with unknown species.
  -db_res DB_PATH_RES, --db_path_res DB_PATH_RES
                        Path to the databases for ResFinder
  -db_res_kma DB_PATH_RES_KMA, --db_path_res_kma DB_PATH_RES_KMA
                        Path to the ResFinder databases indexed with KMA.
                        Defaults to the 'kma_indexing' directory inside the
                        given database directory.
  -d DATABASES, --databases DATABASES
                        Databases chosen to search in - if none is specified
                        all is used
  -acq, --acquired      Run resfinder for acquired resistance genes
  -l MIN_COV, --min_cov MIN_COV
                        Minimum (breadth-of) coverage of ResFinder
                        Valid interval: 0.00-1.00
  -t THRESHOLD, --threshold THRESHOLD
                        Threshold for identity of ResFinder
                        Valid interval: 0.00-1.00
  -c, --point           Run pointfinder for chromosomal mutations
  -db_point DB_PATH_POINT, --db_path_point DB_PATH_POINT
                        Path to the databases for PointFinder
  -g SPECIFIC_GENE [SPECIFIC_GENE ...]
                        Specify genes existing in the database to search for -
                        if none is specified all genes are included in the
                        search.
  -u, --unknown_mut     Show all mutations found even if in unknown to the
                        resistance database
  -l_p MIN_COV_POINT, --min_cov_point MIN_COV_POINT
                        Minimum (breadth-of) coverage of Pointfinder. If None
                        is selected, the minimum coverage of ResFinder will be
                        used.
  -t_p THRESHOLD_POINT, --threshold_point THRESHOLD_POINT
                        Threshold for identity of Pointfinder. If None is
                        selected, the minimum coverage of ResFinder will be
                        used.
```

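When ResFinder is called from a pipeline, it can help to validate arguments before launching it. The sketch below is a hedged illustration, not part of ResFinder: `build_resfinder_command` is a hypothetical helper that uses only options listed in the help text. The default values 0.6 and 0.8 are taken from the example command rather than from ResFinder's own defaults, and rejecting more than two FASTQ files is an assumption based on the `-ifq` description.

```python
import sys


def build_resfinder_command(out_dir, species, fastq=None, fasta=None,
                            min_cov=0.6, threshold=0.8,
                            acquired=True, point=False):
    """Assemble an argv list for run_resfinder.py (hypothetical helper)."""
    # Exactly one input type must be given: FASTQ reads or a FASTA assembly.
    if (fastq is None) == (fasta is None):
        raise ValueError("provide either fastq files or one fasta file")
    if fastq is not None and len(fastq) not in (1, 2):
        raise ValueError("expected one (single-end) or two (paired-end) fastq files")
    # The help text gives 0.00-1.00 as the valid interval for -l and -t.
    for label, value in (("min_cov", min_cov), ("threshold", threshold)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{label} must be within 0.00-1.00, got {value}")

    cmd = [sys.executable, "run_resfinder.py", "-o", out_dir, "-s", species,
           "-l", str(min_cov), "-t", str(threshold)]
    if acquired:
        cmd.append("-acq")
    if point:
        cmd.append("-c")
    if fastq is not None:
        cmd += ["-ifq", *fastq]
    else:
        cmd += ["-ifa", fasta]
    return cmd


if __name__ == "__main__":
    print(build_resfinder_command("path/to/outdir", "Escherichia coli",
                                  fastq=["reads_1.fq", "reads_2.fq"], point=True))
```

Passing the arguments as a list, for example to `subprocess.run(cmd, check=True)`, avoids shell quoting problems with species names that contain spaces.
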
### Web-server

A webserver implementing the methods is available at the [CGE
website](http://www.genomicepidemiology.org/) and can be found here:
https://cge.cbs.dtu.dk/services/ResFinder/

### Install ResFinder with Docker
If you would like to build a Docker image with ResFinder, make sure you have cloned the ResFinder directory as well as installed and indexed the databases: `db_pointfinder` and `db_resfinder`. Then run the following commands:
```bash
# Go to ResFinder directory
cd path/to/resfinder
# Build docker image with name resfinder
docker build -t resfinder .
```
When running the Docker container, make sure to mount `db_resfinder` and `db_pointfinder` with the flag -v, as shown in the examples below.

You can test the installation by running the container with the test files:
```bash
cd path/to/resfinder/
mkdir results

# Run with raw data (this command mounts the results to the local directory "results")
docker run --rm -it -v $(pwd)/db_resfinder/:/usr/src/db_resfinder -v $(pwd)/results/:/usr/src/results resfinder -ifq /usr/src/tests/data/test_isolate_01_1.fq /usr/src/tests/data/test_isolate_01_2.fq -acq -db_res /usr/src/db_resfinder -o /usr/src/results

# Run with assembled data (this command mounts the results to the local directory "results")
docker run --rm -it -v $(pwd)/db_resfinder/:/usr/src/db_resfinder -v $(pwd)/results/:/usr/src/results resfinder -ifa /usr/src/tests/data/test_isolate_01.fa -acq -db_res /usr/src/db_resfinder -o /usr/src/results
```

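The long `docker run` lines are easier to maintain when the volume mounts are built from variables. The following is a minimal sketch under stated assumptions, not an official wrapper: `build_docker_command` is a hypothetical helper, and the container paths `/usr/src/db_resfinder` and `/usr/src/results` as well as the image name `resfinder` are copied from the Docker examples in this document.

```python
from pathlib import Path


def build_docker_command(resfinder_dir, results_dir, resfinder_args, image="resfinder"):
    """Assemble a `docker run` argv that mounts the database and results directories."""
    # Docker requires absolute host paths for bind mounts.
    resfinder_dir = Path(resfinder_dir).resolve()
    results_dir = Path(results_dir).resolve()
    return [
        "docker", "run", "--rm", "-it",
        "-v", f"{resfinder_dir / 'db_resfinder'}:/usr/src/db_resfinder",
        "-v", f"{results_dir}:/usr/src/results",
        image, *resfinder_args,
        "-db_res", "/usr/src/db_resfinder", "-o", "/usr/src/results",
    ]


if __name__ == "__main__":
    args = ["-ifa", "/usr/src/tests/data/test_isolate_01.fa", "-acq"]
    print(" ".join(build_docker_command(".", "results", args)))
```

The sketch only prints the command. If the list is run from a script without a terminal attached, the `-it` flags would likely need to be dropped.
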
Citation
=======

When using the method please cite:

ResFinder 4.0 for predictions of phenotypes from genotypes.
Bortolaia V, Kaas RF, Ruppe E, Roberts MC, Schwarz S, Cattoir V, Philippon A, Allesoe RL, Rebelo AR, Florensa AR, Fagelhauer L,
Chakraborty T, Neumann B, Werner G, Bender JK, Stingl K, Nguyen M, Coppens J, Xavier BB, Malhotra-Kumar S, Westh H, Pinholt M,
Anjum MF, Duggett NA, Kempf I, Nykäsenoja S, Olkkola S, Wieczorek K, Amaro A, Clemente L, Mossong J, Losch S, Ragimbeau C, Lund O, Aarestrup FM.
Journal of Antimicrobial Chemotherapy. 2020 Aug 11.
PMID: 32780112 doi: 10.1093/jac/dkaa345
[Epub ahead of print]

References
=======

1. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. BLAST+: architecture and applications. BMC Bioinformatics 2009; 10:421.
2. Clausen PTLC, Aarestrup FM, Lund O. Rapid and precise alignment of raw reads against redundant databases with KMA. BMC Bioinformatics 2018; 19:307.

License
=======

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.