.TH xenome 1 "September 12, 2012" "Xenome User Manual"
.SH NAME
.PP
xenome \- a tool for classifying reads from xenograft sources.
.PP
Version 1.0.1
.SH SYNOPSIS
.PP
xenome index -T 8 -P idx -H mouse.fa -G human.fa
.PP
xenome classify -T 8 -P idx --pairs --host-name mouse
--graft-name human -i in_1.fastq -i in_2.fastq
.PP
xenome help
.SH DESCRIPTION
.PP
Shotgun sequence read data derived from xenograft material contains
a mixture of reads arising from the host and reads arising from the
graft.
Xenome is an application for classifying the read mixture to
separate the two, allowing for more precise analysis to be
performed.
.PP
Xenome uses host and graft reference sequences to characterise the
set of all possible k-mers according to whether they belong to:
.IP \[bu] 2
only the graft (and NOT the host)
.IP \[bu] 2
only the host (and NOT the graft)
.IP \[bu] 2
both references
.IP \[bu] 2
neither reference
.IP \[bu] 2
the subset of the host (or graft) k-mers which is one base
substitution away from being in the graft (or host) - we call these
k-mers \[lq]marginal\[rq]
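.PP
The categories above can be illustrated with a small Python sketch.
It is not xenome's implementation: it assumes plain string k-mers,
ignores reverse complements, and keeps marginal k-mers as a subset of
the graft-only and host-only sets.
The names kmers, neighbours and categorise are hypothetical.
.PP
.nf
```python
# Toy sketch of the k-mer categories; not xenome's real data structures.
BASES = "ACGT"


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def neighbours(kmer):
    """Yield every k-mer exactly one base substitution away from kmer."""
    for i, base in enumerate(kmer):
        for other in BASES:
            if other != base:
                yield kmer[:i] + other + kmer[i + 1:]


def categorise(graft_seq, host_seq, k):
    """Split reference k-mers into both, graft-only, host-only and marginal."""
    graft, host = kmers(graft_seq, k), kmers(host_seq, k)
    graft_only, host_only = graft - host, host - graft
    # A marginal k-mer is unique to one reference but one substitution
    # away from some k-mer of the other reference.
    marginal = {x for x in graft_only if any(n in host for n in neighbours(x))}
    marginal |= {x for x in host_only if any(n in graft for n in neighbours(x))}
    return {
        "both": graft & host,
        "graft": graft_only,
        "host": host_only,
        "marginal": marginal,
    }


cats = categorise("ACGTACG", "ACGTTC", 4)
for name in sorted(cats):
    print(name, sorted(cats[name]))
```
.fi
.PP
With these toy sequences, TACG occurs only in the graft and has no
one-substitution neighbour in the host, so it is not marginal.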
.PP
Given a read, or read pair, xenome will calculate which of the
above categories its k-mers belong to, and classify it as one of:
graft, host, both, neither, or ambiguous.
.PP
Xenome has two distinct stages, which are embodied in two separate
commands: `index' and `classify'.
Before reads can be classified, an index must be constructed from
the graft and host reference sequences.
The references must be in FASTA format, and may optionally be
compressed (gzip).
.PP
.nf
\f[CR]
xenome\ index\ -M\ 24\ -T\ 8\ -P\ idx\ -H\ mouse.fa\ -G\ human.fa
\f[]
.fi
.PP
A xenome index consists of a number of related files which can be
identified by a user-specified prefix, e.g.\ `idx' in the above
command.
The prefix may contain `/' characters, allowing the index to be in
a sub-directory.
(Any such sub-directory must already exist - xenome will not create
it.)
For example, the files comprising an index with prefix `idx'
are:
.PP
.nf
\f[CR]
idx-both.header
idx-both.kmers-d0
idx-both.kmers-d1
idx-both.kmers.header
idx-both.kmers.high-bits
idx-both.kmers.low-bits.lwr
idx-both.kmers.low-bits.upr
idx-both.lhs-bits
idx-both.rhs-bits
\f[]
.fi
.PP
Once an index is available, reads can be classified according to
whether they appear to contain graft or host material.
In the simplest case, xenome can classify each read from a single
source file individually.
.PP
.nf
\f[CR]
xenome\ classify\ -P\ idx\ -i\ in.fastq
\f[]
.fi
.PP
This step produces a file for each read category, containing all of
the reads which have been assigned that classification:
.PP
.nf
\f[CR]
ambiguous.fastq
both.fastq
graft.fastq
host.fastq
neither.fastq
\f[]
.fi
.PP
Input files contain base-space reads in FASTA format, in FASTQ
format, or in a format with one read per line.
Each input file may be plain text or compressed (gzip).
.PP
The files produced are in the same format as the input file, with
all of the input read data preserved.
For example, if the input reads are in FASTQ format, the reads
written to each of the output files will also be in FASTQ format.
.PP
Multiple input files may be specified, but all inputs in the same
format will be written to the same set of output files.
.PP
.nf
\f[CR]
xenome\ classify\ -P\ idx\ -i\ inA.fastq\ -i\ inB.fastq\ -I\ inC.fasta
\f[]
.fi
.PP
The above will result in the following set of files:
.PP
.nf
\f[CR]
ambiguous.fasta
ambiguous.fastq
both.fasta
both.fastq
graft.fasta
graft.fastq
host.fasta
host.fastq
neither.fasta
neither.fastq
\f[]
.fi
.PP
Each of the FASTQ files contains a mixture of reads from inA.fastq
and inB.fastq.
The FASTA files contain reads from inC.fasta.
.PP
If combining input reads from separate files is not desired,
xenome should be run separately for each input.
The output from different runs can be distinguished by prefixing
the filenames with a distinct string.
.PP
.nf
\f[CR]
xenome\ classify\ -P\ idx\ -i\ inA.fastq\ --output-filename-prefix\ A
xenome\ classify\ -P\ idx\ -i\ inB.fastq\ --output-filename-prefix\ B
\f[]
.fi
.PP
Running these two commands yields:
.PP
.nf
\f[CR]
A_ambiguous.fastq
A_both.fastq
A_graft.fastq
A_host.fastq
A_neither.fastq
B_ambiguous.fastq
B_both.fastq
B_graft.fastq
B_host.fastq
B_neither.fastq
\f[]
.fi
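.PP
When there are many inputs, the per-input commands can be generated
rather than typed.
The Python sketch below only builds argument lists from the flags
documented on this page; the helper name classify_commands and the
tags are illustrative assumptions, and nothing is executed.
.PP
.nf
```python
def classify_commands(index_prefix, inputs):
    """Build one xenome classify argument list per (tag, FASTQ path) pair."""
    return [
        ["xenome", "classify", "-P", index_prefix, "-i", path,
         "--output-filename-prefix", tag]
        for tag, path in inputs
    ]


commands = classify_commands("idx", [("A", "inA.fastq"), ("B", "inB.fastq")])
for argv in commands:
    # Each argv could be handed to subprocess.run(argv, check=True).
    print(" ".join(argv))
```
.fi
.PP
Keeping each command as a list avoids shell quoting problems if a
file name contains spaces.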
.PP
Xenome can also process pairs of reads.
.PP
.nf
\f[CR]
xenome\ classify\ -P\ idx\ --pairs\ -i\ in_1.fastq\ -i\ in_2.fastq
\f[]
.fi
.PP
This results in a pair of files for each read category.
The two reads of each pair are written to the corresponding `_1'
and `_2' files respectively.
.PP
.nf
\f[CR]
ambiguous_1.fastq
ambiguous_2.fastq
both_1.fastq
both_2.fastq
graft_1.fastq
graft_2.fastq
host_1.fastq
host_2.fastq
neither_1.fastq
neither_2.fastq
\f[]
.fi
.PP
If desired, more specific names can be used in place of `host' and
`graft'.
.PP
.nf
\f[CR]
xenome\ classify\ -P\ idx\ -i\ in.fastq\ --graft-name\ human\ --host-name\ mouse
\f[]
.fi
.PP
This will cause xenome to produce the following files.
.PP
.nf
\f[CR]
ambiguous.fastq
both.fastq
human.fastq
mouse.fastq
neither.fastq
\f[]
.fi
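.PP
The file naming shown in the examples above can be summarised in a
short Python sketch.
It is inferred from those examples only; the function name
output_files is hypothetical, and the combination of a filename
prefix with paired output is an assumption, since this page shows
the two options only separately.
.PP
.nf
```python
def output_files(ext, graft_name="graft", host_name="host",
                 prefix=None, pairs=False):
    """List the read files expected for one input format (ext)."""
    classes = sorted(["ambiguous", "both", graft_name, host_name, "neither"])
    suffixes = ["_1", "_2"] if pairs else [""]
    # The prefix is joined to the class name with an underscore.
    lead = prefix + "_" if prefix else ""
    return [lead + c + s + "." + ext for c in classes for s in suffixes]


print(output_files("fastq", graft_name="human", host_name="mouse"))
print(output_files("fastq", prefix="A"))
print(output_files("fastq", pairs=True))
```
.fi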
.PP
In addition to generating sets of output files, the classify
command produces statistics about the number and proportion of
reads assigned to each category.
These are printed to standard output at the end of a run and look
as follows:
.PP
.nf
\f[CR]
Statistics
B\ \ \ \ \ \ \ G\ \ \ \ \ \ \ H\ \ \ \ \ \ \ M\ \ \ \ \ \ \ count\ \ \ \ \ percent\ \ \ class
0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1900\ \ \ \ \ \ 0.938267\ \ "neither"
0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 21\ \ \ \ \ \ \ \ 0.0103703\ "both"
0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 28491\ \ \ \ \ 14.0696\ \ \ "definitely\ host"
0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 7366\ \ \ \ \ \ 3.63751\ \ \ "probably\ host"
0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 91895\ \ \ \ \ 45.38\ \ \ \ \ "definitely\ graft"
0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 30059\ \ \ \ \ 14.8439\ \ \ "probably\ graft"
0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 282\ \ \ \ \ \ \ 0.139259\ \ "ambiguous"
0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 330\ \ \ \ \ \ \ 0.162962\ \ "ambiguous"
1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 2878\ \ \ \ \ \ 1.42123\ \ \ "both"
1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 254\ \ \ \ \ \ \ 0.125431\ \ "probably\ both"
1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 610\ \ \ \ \ \ \ 0.301233\ \ "definitely\ host"
1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 5815\ \ \ \ \ \ 2.87159\ \ \ "probably\ host"
1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 3843\ \ \ \ \ \ 1.89777\ \ \ "definitely\ graft"
1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 27775\ \ \ \ \ 13.716\ \ \ \ "probably\ graft"
1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 99\ \ \ \ \ \ \ \ 0.0488886\ "ambiguous"
1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 883\ \ \ \ \ \ \ 0.436047\ \ "ambiguous"

Summary
count\ \ \ \ \ percent\ \ \ class
153572\ \ \ \ 75.8377\ \ \ "graft"
42282\ \ \ \ \ 20.8799\ \ \ "host"
3153\ \ \ \ \ \ 1.55703\ \ \ "both"
1900\ \ \ \ \ \ 0.938267\ \ "neither"
1594\ \ \ \ \ \ 0.787157\ \ "ambiguous"
\f[]
.fi
.PP
Both tables contain a single heading line, followed by rows of
TAB-separated elements, a format suitable for loading into R or a
spreadsheet.
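.PP
As an illustration of loading this output programmatically, the
Python sketch below parses the rows of a Summary table.
It assumes the Summary lines have already been separated from the
rest of standard output, that fields are separated by single tabs,
and that the class names are quoted exactly as displayed above.
The function name parse_summary is hypothetical.
.PP
.nf
```python
import csv

TAB = chr(9)  # the tab character, written without an escape sequence


def parse_summary(lines):
    """Parse Summary table lines into a dict of class -> (count, percent)."""
    rows = csv.reader(lines, delimiter=TAB)
    header = next(rows)
    if header != ["count", "percent", "class"]:
        raise ValueError("unexpected Summary header: " + repr(header))
    # csv.reader strips the double quotes around each class name.
    return {cls: (int(count), float(pct)) for count, pct, cls in rows}


# Sample rows copied from the Summary table above, joined with real tabs.
sample = [
    TAB.join(["count", "percent", "class"]),
    TAB.join(["153572", "75.8377", '"graft"']),
    TAB.join(["42282", "20.8799", '"host"']),
    TAB.join(["3153", "1.55703", '"both"']),
    TAB.join(["1900", "0.938267", '"neither"']),
    TAB.join(["1594", "0.787157", '"ambiguous"']),
]
summary = parse_summary(sample)
total_reads = sum(count for count, _ in summary.values())
print(summary["graft"], total_reads)
```
.fi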
.PP
Each row represents the number and proportion of reads assigned to
a particular class.
The B, G, H, and M fields represent the presence (1) or absence (0)
of k-mers belonging to the both, graft, host and marginal k-mer
subsets, according to the reference index.
.PP
The Statistics table contains 16 rows: one for each possible
combination of k-mer classes present within a read.
The first row of the above table indicates that for the given
input, 1,900 reads (or pairs) - 0.938267% of the total reads -
contained no k-mers that belonged to the B, G, H, or M k-mer
subsets, and are accordingly neither host nor graft reads.
Similarly, the fourteenth row states that 27,775 reads (or pairs)
- 13.716% of the total - contained k-mers that belong to the B, G,
and M subsets but not the H subset, and are therefore
\[lq]probably graft\[rq] reads.
.PP
In the Summary table, the B, G, H, and M columns are removed, and
the classes from the Statistics table have been collapsed into the
five shown; the definitely/probably graft/host classes are combined
into just graft/host classes, and \[lq]probably both\[rq] is counted
as both.
Notice that the different read output files, described earlier,
correspond exactly to these classes.
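.PP
The relationship between the B, G, H and M flags and the reported
class can be read off the example Statistics table.
The Python sketch below reproduces that pattern; the rule is
inferred from the example output on this page rather than taken from
xenome's source, and the function names are hypothetical.
.PP
.nf
```python
def detailed_class(b, g, h, m):
    """Map presence flags (0 or 1) to the class shown in the Statistics table."""
    if g and h:
        return "ambiguous"
    if g:
        return "probably graft" if m else "definitely graft"
    if h:
        return "probably host" if m else "definitely host"
    if b and m:
        return "probably both"
    if b or m:
        return "both"
    return "neither"


def summary_class(b, g, h, m):
    """Collapse the detailed class into one of the five Summary classes."""
    name = detailed_class(b, g, h, m)
    for qualifier in ("definitely ", "probably "):
        if name.startswith(qualifier):
            return name[len(qualifier):]
    return name


print(detailed_class(1, 1, 0, 1))  # probably graft
print(summary_class(1, 1, 0, 1))   # graft
```
.fi
.PP
Summing the example counts by summary_class gives the totals shown
in the Summary table.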
.SH OPTIONS COMMON TO ALL COMMANDS
.PP
The following options can be used with all of the \f[I]xenome\f[]
commands and are therefore not listed separately for each command.
.TP
.B -h, --help
Show a help message.
.RS
.RE
.TP
.B -l \f[I]FILE\f[], --log-file \f[I]FILE\f[]
Place to write progress messages.
Messages are only written if the -v flag is used.
If omitted, messages are written to stderr.
.RS
.RE
.TP
.B -T \f[I]INT\f[], --num-threads \f[I]INT\f[]
The maximum number of \f[I]worker\f[] threads to use.
The actual number of threads used by each algorithm depends on its
implementation.
\f[I]xenome\f[] may use a small number of additional threads for
performing operations that are not CPU-bound, such as file I/O.
.RS
.RE
.TP
.B --tmp-dir \f[I]DIRECTORY\f[]
A directory to use for temporary files.
This flag may be repeated in order to nominate multiple temporary
directories.
.RS
.RE
.TP
.B -v, --verbose
Show progress messages.
.RS
.RE
.TP
.B -V, --version
Show the software version.
.RS
.RE
.SH COMMANDS AND OPTIONS
.SS xenome index
.PP
xenome index [-k \f[I]INT\f[]] [-M \f[I]INT\f[]] -P \f[I]PREFIX\f[]
-G \f[I]FASTA-filename\f[] -H \f[I]FASTA-filename\f[]
.PP
Build the xenome reference index from the graft and host reference
sequences.
The input files must be in FASTA format.
They may be gzip compressed, in which case the filename suffix must
be \f[I]\&.gz\f[].
.PP
The k-mer size may be specified using the \f[I]-k\f[] flag.
If omitted, xenome defaults to k=25.
.PP
During index construction, xenome maintains a hash table of the
k-mers seen so far.
When this table fills, its contents are written to disk, and the
table is reinitialised.
The more memory xenome can use, the less often it will need to
write to disk, and the faster index construction will run.
By default, xenome will limit itself to 2 GB during index
construction.
The -M, --max-memory flag can be used to explicitly control the
amount of memory available to xenome (in GB).
To improve performance, this should generally be set close to the
amount of memory available in the system, after accounting for
operating system and other overhead.
.PP
\f[I]OPTIONS\f[]
.TP
.B -k \f[I]INT\f[], --kmer-size \f[I]INT\f[]
The k-mer size to use for building the index: in version 1.0.0 this
\f[I]must be an integer strictly less than 63\f[].
If not supplied, the default value of 25 is used.
.RS
.RE
.TP
.B -M \f[I]INT\f[], --max-memory \f[I]INT\f[]
The maximum amount of memory (in GB) to use.
Making more memory available will reduce the number of times xenome
writes intermediate index data to disk.
The default is 2 GB.
.RS
.RE
.TP
.B -P \f[I]PREFIX\f[], --prefix \f[I]PREFIX\f[]
The path prefix for all generated reference index files.
The prefix may contain directory separators (e.g.\ `/') in order to
have the index files written to another directory.
.RS
.RE
.TP
.B -G \f[I]FILE\f[], --graft \f[I]FILE\f[]
The name of the FASTA file containing the graft reference sequence.
If the filename ends in \f[I]\&.gz\f[] it will be read as a gzip
file.
.RS
.RE
.TP
.B -H \f[I]FILE\f[], --host \f[I]FILE\f[]
The name of the FASTA file containing the host reference sequence.
If the filename ends in \f[I]\&.gz\f[] it will be read as a gzip
file.
.RS
.RE
.SS xenome classify
.PP
xenome classify -P \f[I]PREFIX\f[] {-I \f[I]FASTA-filename\f[] | -i
\f[I]FASTQ-filename\f[] | --line-in \f[I]filename\f[]}+
[--pairs] [-M \f[I]INT\f[]] [--graft-name \f[I]STRING\f[]]
[--host-name \f[I]STRING\f[]] [--output-filename-prefix
\f[I]STRING\f[]] [--dont-write-reads] [--preserve-read-order]
.PP
Classifies input reads according to a pre-computed k-mer index.
The reads are written into separate files, according to their
classification, and a breakdown of the number and proportion of
reads in each class is printed.
.PP
If the total size of the index files is greater than available RAM,
xenome will perform poorly.
To overcome this, the -M, --max-memory flag may be used to
specify the maximum amount of memory (in GB) that xenome may use at
any time.
If this amount is less than the size of the index structures,
xenome will (effectively) partition the index into multiple
subsets, each no larger than the specified maximum memory size, and
classify the reads in multiple passes - with each pass using a
different index subset.
The results from each pass are combined, and the output is
produced as usual.
If run with the -v, --verbose flag, xenome will report the
number of passes it will perform.
Note that runtime will increase with the number of passes
performed; the biggest increase will occur with the step from one
pass to two.
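.PP
A rough feel for the number of passes can be had from the following
Python sketch.
It assumes only what is stated above, namely that each index subset
must be no larger than the memory limit, so it gives a lower bound;
the way xenome actually partitions the index may differ, and the -v
flag reports the real figure.
The function name estimated_passes is hypothetical.
.PP
.nf
```python
import math


def estimated_passes(index_size_gb, max_memory_gb):
    """Lower bound on passes if every index subset must fit in max_memory_gb."""
    if max_memory_gb <= 0:
        raise ValueError("max_memory_gb must be positive")
    return max(1, math.ceil(index_size_gb / max_memory_gb))


# Hypothetical sizes: a 20 GB index with 24 GB, then 8 GB, of memory.
print(estimated_passes(20, 24))
print(estimated_passes(20, 8))
```
.fi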
.PP
\f[I]OPTIONS\f[]
.TP
.B -P \f[I]PREFIX\f[], --prefix \f[I]PREFIX\f[]
The path prefix for all reference index files.
The prefix may contain directory separators (e.g.\ `/') in order to
read the index files from another directory.
.RS
.RE
.TP
.B -I \f[I]FILE\f[], --fasta-in \f[I]FILE\f[]
Input file in FASTA format.
.RS
.RE
.TP
.B -i \f[I]FILE\f[], --fastq-in \f[I]FILE\f[]
Input file in FASTQ format.
.RS
.RE
.TP
.B --line-in \f[I]FILE\f[]
Input file with one read per line and no other annotation.
.RS
.RE
.TP
.B --pairs
Treat reads from consecutive input files of the same type as pairs.
.RS
.RE
.TP
.B -M \f[I]INT\f[], --max-memory \f[I]INT\f[]
The maximum amount of memory (in GB) to use while classifying
reads.
If not specified, xenome will use as much memory as required to
classify all reads in a single pass.
When the maximum amount of memory is less than the size of the
reference index files, xenome will need to perform multiple passes
over the input data - increasing runtime.
.RS
.RE
.TP
.B --graft-name \f[I]STRING\f[]
The name of the graft reference to appear in filenames and
statistics.
If no explicit name is provided, the string \[lq]graft\[rq] is
used.
.RS
.RE
.TP
.B --host-name \f[I]STRING\f[]
The name of the host reference to appear in filenames and
statistics.
If no explicit name is provided, the string \[lq]host\[rq] is used.
.RS
.RE
.TP
.B --output-filename-prefix \f[I]STRING\f[]
An optional prefix to apply to all output read filenames.
The prefix is separated from the rest of the filename by an
underscore (`_').
.RS
.RE
.TP
.B --dont-write-reads
The reads will not be written to any files after classification,
and none of the usual per-category output files will be created.
The classification statistics will still be printed to standard
output.
.RS
.RE
.TP
.B --preserve-read-order
The relative ordering of reads within each output file will be the
same as that in the input files.
That is, if read \f[I]r1\f[] precedes \f[I]r2\f[] in a single output
file, then \f[I]r1\f[] also precedes \f[I]r2\f[] in the input.
Note: If this flag is specified, the -T/--num-threads flag is
ignored, and xenome will only operate with a single worker thread.
.RS
.RE
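.PP
The ordering guarantee described for --preserve-read-order amounts
to each output file being a subsequence of the input.
The Python sketch below checks that property for lists of read
identifiers; it assumes the identifiers have already been extracted
from the files, and the function name preserves_order is
hypothetical.
.PP
.nf
```python
def preserves_order(input_ids, output_ids):
    """True if output_ids appear in the same relative order as in input_ids."""
    remaining = iter(input_ids)
    # Each output read must be found further along the input than the last.
    return all(
        any(read == candidate for candidate in remaining)
        for read in output_ids
    )


reads = ["r1", "r2", "r3", "r4"]
print(preserves_order(reads, ["r1", "r3"]))  # True
print(preserves_order(reads, ["r3", "r1"]))  # False
```
.fi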
.SS xenome help
.PP
xenome help
.PP
Prints a summary of all of the xenome commands.
.SH FUTURE RELEASES
.PP
Bzip support will be introduced.
.SH AUTHORS
Bryan Beresford-Smith, Andrew Bromage, Thomas Conway, Jeremy Wazny.