# Quick guide to running ResFinder with Cromwell

### Disclaimer
Support is not offered for running Cromwell, and no files in this directory are
guaranteed to work. These files were uploaded as inspiration. Please do not
report issues relating to this directory.

## Prepare input files

Two input files are needed:

1. input_data.tsv
2. input.json

Templates can be found in the ResFinder directory scripts/wdl.

### input_data.tsv
Tab-separated file. It should contain the following columns, in this order:

1. Absolute path to fasta/fastq file 1
2. Absolute path to fastq file 2 (can be empty, but the column must exist)
3. Species
4. Type of data, must be one of: assembly, paired

Each row should contain a single sample.

#### Species
If the species cannot be provided, put "other" (case sensitive).

#### Type of data

* assembly: Fasta file containing contigs from a de novo assembly.
* paired: Pair of fastq files containing read data for forward and reverse
  reads.
* single: **Not implemented** Read data from single-end sequencing.

#### Example
```
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_01_1.fq	/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_01_2.fq	Escherichia coli	paired
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_05_1.fq	/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_05_2.fq	Escherichia coli	paired
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09a_1.fq	/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09a_2.fq	Escherichia coli	paired
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09b_1.fq	/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09b_2.fq	Escherichia coli	paired
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_01.fa		Escherichia coli	assembly
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_02.fa		Escherichia coli	assembly
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_03.fa		Escherichia coli	assembly
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_05.fa		Escherichia coli	assembly
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09a.fa		Escherichia coli	assembly
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09b.fa		Escherichia coli	assembly
```

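The column rules for input_data.tsv can be checked before starting a run. The following is a minimal sketch, not part of ResFinder: the function name `check_input_tsv` is hypothetical, and it only checks what this guide states (four tab-separated columns, an absolute path in column 1, a non-empty species, and a type of assembly or paired). Requiring an absolute path in column 2 for paired rows is an assumption drawn from the description of paired data.

```bash
# check_input_tsv FILE
# Hypothetical helper (not part of ResFinder): prints one message per
# problem found and returns non-zero if any row breaks the rules.
check_input_tsv() {
    awk -F '\t' '
        NF != 4 {
            printf "line %d: expected 4 tab-separated columns, found %d\n", NR, NF
            bad = 1
            next
        }
        $1 !~ /^\// {
            printf "line %d: column 1 is not an absolute path\n", NR
            bad = 1
        }
        $3 == "" {
            printf "line %d: species is empty (use \"other\" if unknown)\n", NR
            bad = 1
        }
        $4 != "assembly" && $4 != "paired" {
            printf "line %d: type must be assembly or paired, got \"%s\"\n", NR, $4
            bad = 1
        }
        # Assumption: paired data needs an absolute path to the second fastq file.
        $4 == "paired" && $2 !~ /^\// {
            printf "line %d: paired sample needs an absolute path in column 2\n", NR
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

After sourcing the function, `check_input_tsv input_data.tsv` prints nothing and returns 0 when every row passes. It does not check that the listed files exist.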
### input.json
JSON-formatted file containing input and output information.

The file should consist of a single dict/hash/map with the following keys:

* Resistance.inputSamplesFile: Absolute path to input_data.tsv.
* Resistance.outputDir: Absolute path to output directory.
* Resistance.geneCov: Fraction of gene coverage needed for resistance gene hits.
* Resistance.geneID: Fraction of nucleotide identity needed in resistance gene
  hits.
* Resistance.pointCov: Fraction of gene coverage needed for point mutation gene
  hits.
* Resistance.pointID: Fraction of nucleotide identity needed in point mutation gene
  hits.

If you are running on Computerome and using the input.json template, you probably
won't need to change the following:

* Resistance.python: Path to python3 interpreter.
* Resistance.kma: Path to kma application.
* Resistance.blastn: Path to blastn application.
* Resistance.resfinder: Path to run_resfinder.py.
* Resistance.resDB: Path to ResFinder database.
* Resistance.pointDB: Path to PointFinder database.

The values of Resistance.inputSamplesFile and Resistance.outputDir should be the
absolute paths to input_data.tsv and the desired output directory, respectively.

#### Example

```json
{
  "Resistance.inputSamplesFile": "/home/projects/cge/people/rkmo/delme/res_input.tsv",
  "Resistance.outputDir": "/home/projects/cge/people/rkmo/delme/",
  "Resistance.geneCov": 0.6,
  "Resistance.geneID": 0.8,
  "Resistance.pointCov": 0.6,
  "Resistance.pointID": 0.8,
  "Resistance.python": "python3",
  "Resistance.kma": "/home/projects/cge/apps/resfinder/resfinder/cge/kma/kma",
  "Resistance.blastn": "blastn",
  "Resistance.resfinder": "/home/projects/cge/apps/resfinder/resfinder/run_resfinder.py",
  "Resistance.resDB": "/home/projects/cge/apps/resfinder/resfinder/db_resfinder",
  "Resistance.pointDB": "/home/projects/cge/apps/resfinder/resfinder/db_pointfinder"
}
```

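A quick way to catch a forgotten key in input.json is to look for each key name before starting Cromwell. The sketch below is only an illustration: `check_input_json` is a hypothetical name, it treats all twelve keys listed in this guide as required (an assumption), and it only greps for the quoted key names, so it does not validate JSON syntax or value types.

```bash
# check_input_json FILE
# Hypothetical helper (not part of ResFinder or Cromwell).
# Assumption: all twelve keys described in this guide must be present.
check_input_json() {
    missing=0
    for key in inputSamplesFile outputDir geneCov geneID pointCov pointID \
               python kma blastn resfinder resDB pointDB; do
        # Look for the quoted key followed by a colon, e.g. "Resistance.geneCov":
        if ! grep -q "\"Resistance\.$key\"[[:space:]]*:" "$1"; then
            echo "missing key: Resistance.$key"
            missing=1
        fi
    done
    return "$missing"
}
```

After sourcing the function, `check_input_json input.json` prints one line per missing key and returns non-zero if any key is absent.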
## Run Cromwell

Cromwell needs Java to run. Load a valid Java module, for example:

```bash
module load openjdk/16
```

A Cromwell call looks like this:

```bash
java -Dconfig.file=<CONF> -jar <CROMWELL> run <WDL> --inputs <JSON>
```

### <CONF> and <CROMWELL>
Computerome specific.

* <CONF>: Path to the Computerome configuration for Cromwell. You need to change
  this if you are not running Cromwell on Computerome. Computerome path:
  /home/projects/cge/apps/resfinder/resfinder/scripts/wdl/computerome.conf

* <CROMWELL>: Path to the Cromwell jar file on Computerome:
  /services/tools/cromwell/50/cromwell-50.jar

### <WDL>
ResFinder specific.

* <WDL>: Path to the wdl file that specifies how to run ResFinder. Path to
  resfinder.wdl on Computerome:
  /home/projects/cge/apps/resfinder/resfinder/scripts/wdl/resfinder.wdl

### <JSON>
User/run specific.

Path to input.json. Specifies all the parameters for ResFinder (see above).

### Run example

```bash
java -Dconfig.file=/home/projects/cge/apps/resfinder/resfinder/scripts/wdl/computerome.conf -jar /services/tools/cromwell/50/cromwell-50.jar run /home/projects/cge/apps/resfinder/resfinder/scripts/wdl/resfinder.wdl --inputs /home/projects/cge/apps/resfinder/resfinder/scripts/wdl/input.json
```

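Because the call combines four paths, a small helper can confirm that each file exists before the command is run. This is a sketch under stated assumptions: `cromwell_cmd` is a hypothetical name, it only prints the call in the form shown in this guide and never starts Cromwell itself, and it does not quote paths that contain spaces.

```bash
# cromwell_cmd CONF CROMWELL WDL JSON
# Hypothetical helper: checks that the four files exist, then prints the
# Cromwell call so it can be inspected before it is run.
cromwell_cmd() {
    if [ "$#" -ne 4 ]; then
        echo "usage: cromwell_cmd CONF CROMWELL WDL JSON" >&2
        return 2
    fi
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "file not found: $f" >&2
            return 1
        fi
    done
    printf 'java -Dconfig.file=%s -jar %s run %s --inputs %s\n' "$1" "$2" "$3" "$4"
}
```

Called with the four Computerome paths from the run example, it prints the same command line, or an error naming the first path that is missing.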
### Post run

All ResFinder output will be located in the provided output directory.

In the directory where you execute Cromwell, the following two directories will
also be created:

* cromwell-executions
* cromwell-workflow-logs

They contain logging information and cached results.