comparison resfinder/scripts/wdl/README.md @ 0:55051a9bc58d draft default tip
Uploaded
author | dcouvin |
---|---|
date | Mon, 10 Jan 2022 20:06:07 +0000 |
# Quick guide to running ResFinder with Cromwell

### Disclaimer
Support is not offered for running Cromwell, and no files in this directory are
guaranteed to work. These files were uploaded as inspiration. Please do not
report issues relating to this directory.

## Prepare input files

Two input files are needed:

1. input_data.tsv
2. input.json

Templates can be found in the ResFinder directory scripts/wdl.

### input_data.tsv
Tab-separated file. It should contain the following columns, in this order:

1. Absolute path to fasta/fastq file 1
2. Absolute path to fastq file 2 (the value can be empty, but the column must exist)
3. Species
4. Type of data, must be one of: assembly, paired

Each row should contain a single sample.

#### Species
If the species cannot be provided, enter "other" (case sensitive).

#### Type of data

* assembly: Fasta file containing contigs from a de novo assembly.
* paired: Pair of fastq files containing read data for forward and reverse
  reads.
* single: **Not implemented** Read data from single-end sequencing.

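The column layout above can be produced programmatically. The following is a minimal Python sketch, not part of ResFinder: the function name `write_samples` and the sample paths are hypothetical. It writes the second column even when it is empty, and falls back to "other" when no species is given.

```python
import csv
import io


def write_samples(rows, handle):
    """Write (file_1, file_2, species, data_type) rows as tab-separated text."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for file_1, file_2, species, data_type in rows:
        # Column 2 is always written, even when it is empty (assemblies).
        writer.writerow([file_1, file_2 or "", species or "other", data_type])


# Hypothetical sample paths, for illustration only.
rows = [
    ("/data/sample_a_1.fq", "/data/sample_a_2.fq", "Escherichia coli", "paired"),
    ("/data/sample_b.fa", None, None, "assembly"),
]

buffer = io.StringIO()
write_samples(rows, buffer)
tsv_text = buffer.getvalue()
print(tsv_text, end="")
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")` as `handle`.
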
#### Example
```
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_01_1.fq	/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_01_2.fq	Escherichia coli	paired
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_05_1.fq	/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_05_2.fq	Escherichia coli	paired
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09a_1.fq	/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09a_2.fq	Escherichia coli	paired
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09b_1.fq	/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09b_2.fq	Escherichia coli	paired
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_01.fa		Escherichia coli	assembly
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_02.fa		Escherichia coli	assembly
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_03.fa		Escherichia coli	assembly
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_05.fa		Escherichia coli	assembly
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09a.fa		Escherichia coli	assembly
/home/projects/cge/apps/resfinder/resfinder/tests/data/test_isolate_09b.fa		Escherichia coli	assembly
```

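Because tabs and the empty second column are easy to get wrong, it may help to check the file before starting a run. The sketch below is an illustration, not something shipped with ResFinder: `check_line` and `VALID_TYPES` are hypothetical names, and the rules are only those stated in this guide (four tab-separated columns, absolute paths, a non-empty species, and a data type of assembly or paired).

```python
from pathlib import PurePosixPath

VALID_TYPES = {"assembly", "paired"}  # "single" is not implemented


def check_line(line, line_no):
    """Return a list of problems found in one line of input_data.tsv."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 4:
        return [f"line {line_no}: expected 4 tab-separated columns, got {len(fields)}"]

    file_1, file_2, species, data_type = fields
    problems = []
    if not PurePosixPath(file_1).is_absolute():
        problems.append(f"line {line_no}: file 1 is not an absolute path")
    if data_type not in VALID_TYPES:
        problems.append(f"line {line_no}: unknown data type {data_type!r}")
    # Only paired data needs a second file; for assemblies column 2 stays empty.
    if data_type == "paired" and not PurePosixPath(file_2).is_absolute():
        problems.append(f"line {line_no}: file 2 is not an absolute path")
    if not species:
        problems.append(f'line {line_no}: species is empty, use "other"')
    return problems


# A well-formed assembly row: column 2 is empty but present.
print(check_line("/data/sample_b.fa\t\tother\tassembly\n", 1))
# A row where the tabs were replaced by spaces.
print(check_line("/data/sample_b.fa other assembly\n", 2))
```
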
### input.json
JSON-formatted file containing input and output information.

The file should consist of a single JSON object (dict/hash/map) with the following keys:

* Resistance.inputSamplesFile: Absolute path to input_data.tsv.
* Resistance.outputDir: Absolute path to output directory.
* Resistance.geneCov: Fraction of gene coverage needed for resistance gene hits.
* Resistance.geneID: Fraction of nucleotide identity needed in resistance gene
  hits.
* Resistance.pointCov: Fraction of gene coverage needed for point mutation gene
  hits.
* Resistance.pointID: Fraction of nucleotide identity needed in point mutation gene
  hits.

If you are running on Computerome and using the input.json template, you
probably won't need to change the following:

* Resistance.python: Path to python3 interpreter.
* Resistance.kma: Path to kma application.
* Resistance.blastn: Path to blastn application.
* Resistance.resfinder: Path to run_resfinder.py.
* Resistance.resDB: Path to ResFinder database.
* Resistance.pointDB: Path to PointFinder database.

The values of Resistance.inputSamplesFile and Resistance.outputDir should be
the absolute path to input_data.tsv and the desired output directory,
respectively.

#### Example

```json
{
  "Resistance.inputSamplesFile": "/home/projects/cge/people/rkmo/delme/res_input.tsv",
  "Resistance.outputDir": "/home/projects/cge/people/rkmo/delme/",
  "Resistance.geneCov": 0.6,
  "Resistance.geneID": 0.8,
  "Resistance.pointCov": 0.6,
  "Resistance.pointID": 0.8,
  "Resistance.python": "python3",
  "Resistance.kma": "/home/projects/cge/apps/resfinder/resfinder/cge/kma/kma",
  "Resistance.blastn": "blastn",
  "Resistance.resfinder": "/home/projects/cge/apps/resfinder/resfinder/run_resfinder.py",
  "Resistance.resDB": "/home/projects/cge/apps/resfinder/resfinder/db_resfinder",
  "Resistance.pointDB": "/home/projects/cge/apps/resfinder/resfinder/db_pointfinder"
}
```

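If many runs are needed, the run-specific part of input.json could be generated rather than edited by hand. This is a minimal sketch under stated assumptions: `build_inputs` is a hypothetical helper, the default thresholds are simply the values from the example above, and the range check assumes that "fraction" means a value between 0 and 1. The tool and database paths from the template are left out here and would have to be merged in.

```python
import json


def build_inputs(samples_file, output_dir,
                 gene_cov=0.6, gene_id=0.8, point_cov=0.6, point_id=0.8):
    """Return the run-specific input.json entries as a dict."""
    thresholds = {
        "Resistance.geneCov": gene_cov,
        "Resistance.geneID": gene_id,
        "Resistance.pointCov": point_cov,
        "Resistance.pointID": point_id,
    }
    for key, value in thresholds.items():
        # Assumption: thresholds are fractions (0.8), not percentages (80).
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{key} must be between 0 and 1, got {value}")

    inputs = {
        "Resistance.inputSamplesFile": samples_file,
        "Resistance.outputDir": output_dir,
    }
    inputs.update(thresholds)
    return inputs


# Hypothetical paths, for illustration only.
inputs = build_inputs("/data/run1/input_data.tsv", "/data/run1/output/")
print(json.dumps(inputs, indent=2))
```
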
## Run Cromwell

Cromwell needs Java to run. Load a valid Java module, for example:

```bash
module load openjdk/16
```

A Cromwell call looks like this:

```bash
java -Dconfig.file=<CONF> -jar <CROMWELL> run <WDL> --inputs <JSON>
```

### <CONF> and <CROMWELL>
Computerome specific.

* <CONF>: Path to the Computerome configuration for Cromwell. You need to change
  this if you are not running Cromwell on Computerome. Computerome path:
  /home/projects/cge/apps/resfinder/resfinder/scripts/wdl/computerome.conf

* <CROMWELL>: Path to the Cromwell jar file. Path on Computerome:
  /services/tools/cromwell/50/cromwell-50.jar

### <WDL>
ResFinder specific.

* <WDL>: Path to the wdl file that specifies how to run ResFinder. Path to
  resfinder.wdl on Computerome:
  /home/projects/cge/apps/resfinder/resfinder/scripts/wdl/resfinder.wdl

### <JSON>
User/run specific.

Path to input.json. Specifies all the parameters for ResFinder (see above).

### Run example

```bash
java -Dconfig.file=/home/projects/cge/apps/resfinder/resfinder/scripts/wdl/computerome.conf -jar /services/tools/cromwell/50/cromwell-50.jar run /home/projects/cge/apps/resfinder/resfinder/scripts/wdl/resfinder.wdl --inputs /home/projects/cge/apps/resfinder/resfinder/scripts/wdl/input.json
```

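The four placeholders map onto the command line as sketched below. This is only an illustration: `cromwell_command` is a hypothetical helper, and the paths are the Computerome paths quoted in this guide. The command is printed rather than executed; the commented-out `subprocess.run` line shows one way it could be launched.

```python
import shlex


def cromwell_command(conf, cromwell_jar, wdl, inputs_json):
    """Assemble the Cromwell call as an argument list."""
    return [
        "java",
        f"-Dconfig.file={conf}",  # <CONF>
        "-jar", cromwell_jar,     # <CROMWELL>
        "run", wdl,               # <WDL>
        "--inputs", inputs_json,  # <JSON>
    ]


wdl_dir = "/home/projects/cge/apps/resfinder/resfinder/scripts/wdl"
cmd = cromwell_command(
    conf=f"{wdl_dir}/computerome.conf",
    cromwell_jar="/services/tools/cromwell/50/cromwell-50.jar",
    wdl=f"{wdl_dir}/resfinder.wdl",
    inputs_json=f"{wdl_dir}/input.json",
)
print(shlex.join(cmd))

# To launch the run from Python instead of the shell:
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems if any path contains spaces.
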
### Post run

All ResFinder output will be located in the provided output directory.

In the directory where you execute Cromwell, the following two directories will
also be created:

* cromwell-executions
* cromwell-workflow-logs

They contain logging information and cached results.