changeset 0:6d87470d68aa draft default tip

Uploaded
author sangok
date Thu, 23 Apr 2020 08:32:34 -0400
parents
children
files xenome-1.0.1-r/README.txt xenome-1.0.1-r/license.txt xenome-1.0.1-r/xenome xenome-1.0.1-r/xenome.1
diffstat 4 files changed, 643 insertions(+), 0 deletions(-)
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/xenome-1.0.1-r/README.txt	Thu Apr 23 08:32:34 2020 -0400
@@ -0,0 +1,45 @@
+% Xenome README
+% September, 2012
+
+Xenome is a tool for classifying reads from xenograft sources.
+
+REQUIREMENTS
+===========
+
+Xenome should run on any standard 64-bit Linux environment, with as little as
+2 GB of free RAM. For best performance, when classifying reads according to a 
+mouse/human reference (k=25), we recommend using a machine with 16 GB of RAM.
+
+INSTALLATION
+===========
+
+The Xenome program and related files (including this README.txt) will be 
+extracted to a directory called xenome-*version* where *version* is the version
+number for the distribution. For example, for version 0.1.1 this will be the
+directory xenome-0.1.1.
+
+The following files should appear in this directory:
+
+***
+
+---------------------------------------------------------------------------
+File Name               Description
+-----------             -------------
+xenome                  The Xenome program.
+
+xenome.pdf              The Xenome manual in pdf format.
+
+xenome.1                The Xenome Unix manpage. To view, use:
+                            man ./xenome.1
+
+license.txt             The license for Xenome.
+
+README.txt              This readme file.
+
+---------------------------------------------------------------------------
+
+***
+
+Please refer to the Xenome manual for further details and for examples.
+
+This software is distributed under the included license.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/xenome-1.0.1-r/license.txt	Thu Apr 23 08:32:34 2020 -0400
@@ -0,0 +1,110 @@
+[NON-COMMERCIAL] SOFTWARE LICENSE AGREEMENT
+
+PLEASE READ THIS SOFTWARE LICENSE AGREEMENT CAREFULLY BEFORE DOWNLOADING,
+INSTALLING OR USING NATIONAL ICT AUSTRALIA LIMITED (NICTA) SUPPLIED SOFTWARE. BY
+DOWNLOADING, INSTALLING OR USING THE SOFTWARE YOU ARE CONSENTING TO BE BOUND BY
+THIS LICENSE. IF YOU DO NOT AGREE TO ALL OF THE TERMS OF THIS LICENSE, THEN DO
+NOT DOWNLOAD, INSTALL OR USE THE SOFTWARE. 
+
+This License Agreement is entered into between National ICT Australia Limited
+(ABN 62 102 206 173) (herein referred to as "Licensor") and you, the Licensee. 
+
+The computer program(s) and related documentation and materials (herein
+collectively referred to as "the Software") are licensed, not sold, to the
+Licensee for use only upon the terms of this license, and Licensor reserves any
+rights not expressly granted to Licensee. The following terms govern use of the
+Software by the Licensee.
+
+1.  Licensor hereby grants you a perpetual, non-exclusive, non-transferable,
+royalty free license to use the Software for academic and research purposes
+only. Licensee acknowledges that Licensee may not use the Software for
+Commercial Purposes. “Commercial purposes” and “Commercial Use” means to use
+sell, hire or otherwise exploit the Software as part of a product or process
+which is intended directly or indirectly to make a profit for the Licensee or
+any third party, or to license or sub-license the Software to any third party 
+2.  Licensee may not:
+    a.  translate, reverse engineer, decompile, decrypt, disassemble
+    (except to the extent applicable laws specifically prohibit such
+    restriction), or create derivative works based on the Software;
+    b.  copy the Software (except for back-up purposes), but subject to
+    clause 11;
+    c.  rent, lease, transfer, assign, sub-license or otherwise transfer
+    rights to the Software; 
+    d.  sell the Software to any third party; or
+    e.  remove any proprietary notices or labels on the Software.
+3.  Title, ownership rights, and intellectual property rights in and to the
+Software shall remain solely with Licensor.
+4.  To the extent permitted by law, the Software is provided on an "AS IS"
+basis, without warranty of any kind, including without limitation the warranties
+of merchantability, fitness for a particular purpose and non-infringement. The
+entire risk as to the quality and performance of the Software is borne by
+Licensee. Should the Software prove defective, Licensee assumes the entire cost
+of any service and repair. This disclaimer of warranty constitutes an essential
+part of this Agreement.
+5.  Except to the extent required by applicable law, Licensor shall not be
+under any liability (whether for breach of contract, breach of warranty or in
+tort, including negligence) to Licensee in respect of any loss or damage
+(including any direct, indirect, special, incidental or consequential loss or
+damage) howsoever caused, arising as a result of this Agreement. 
+6.  Licensee agrees to maintain and reproduce all copyright and other
+proprietary notices on all copies, in any form, of the Software in the same form
+and manner that such copyright and other proprietary notices are included on the
+Software. Licensee may make a reasonable number of copies of the Software and
+install those copies on separate machines which are owned or controlled by
+Licensee PROVIDED THAT Licensee does not redistribute the Software under any
+circumstances to any third party, and provided Licensee retains on those copies
+all copyright, confidentiality and proprietary notices that appear on the
+original.  
+7.  Licensee agrees that Licensor may request from time to time that the
+Licensee provide feedback to the Licensor on the Software. Licensee agrees that
+the Licensor owns all title, ownership rights and intellectual property rights
+in the feedback provided by Licensee.
+8.  This license will terminate automatically if Licensee fails to comply
+with the limitations described above. On termination, Licensee must destroy all
+copies of the Software in electronic or other form, including any copies on
+backup tapes or other media, and, at Licensor’s request, the Licensee, to the
+extent practicable, shall deliver to Licensor certification that all copies of
+the Software have been destroyed.
+9.  This Agreement represents the complete agreement concerning this license between
+the parties and supersedes all prior agreements and representations between
+them. It may be amended only by a writing executed by both parties. If any
+provision of this Agreement is held to be unenforceable for any reason, such
+provision shall be reformed only to the extent necessary to make it enforceable.
+This Agreement shall be governed by and construed under the laws of the State of
+New South Wales, Australia. The application of the United Nations Convention of
+Contracts for the International Sale of Goods is expressly excluded.
+10. As a condition of this license, Licensee shall ensure that any reports,
+academic papers  or published results obtained from use of the Software by
+Licensee (or by third parties who use the Software with Licensee’s permission
+under the terms of this License) will contain one or both of the following
+acknowledgements: 
+    i.  “Results obtained using Gossamer Software ©2012 National ICT
+    Australia Ltd (NICTA).” 
+    ii. "Gossamer - A Resource Efficient de novo Assembler", Thomas
+    Conway, Jeremy Wazny, Andrew Bromage, Justin Zobel and Bryan
+    Beresford-Smith, Bioinformatics 2012; doi:
+    10.1093/bioinformatics/bts297.
+11. The Licensee understands and accepts that the Software is proprietary to
+NICTA. The Licensee agrees to take all reasonable steps to ensure that all
+copies of the Software in their possession under the terms of the License are
+protected and secured from unauthorized disclosure, use, or redistribution.
+Licensee will treat the Software with at least the same level of care as
+Licensee would use to protect and secure its own proprietary computer programs
+and/or information, but using no less than a reasonable standard of care. 
+12. Licensee agrees to provide access to the Software only to any other
+person or entity who has agreed to abide by the terms of this Licence. If
+Licensee is an institution or corporation each individual person who uses the
+Software with the permission of that institution or corporation must agree to
+abide by the terms of this license. If the Licensee becomes aware of any
+unauthorized licensing, copying or use of the Software in breach of this
+License, the Licensee shall promptly do all things necessary to stop the breach
+and notify NICTA in writing. The Licensee expressly agrees to use the Software
+only in the manner and for the specific uses authorized in this Agreement.
+13. Commercial Use of the Software REQUIRES A COMMERCIAL LICENSE.  Should
+the Licensee wish to make Commercial Use of the Software, Licensee will contact
+NICTA (bioinformatics@nicta.com.au) to request an appropriate license for such
+use. In addition to the definition in clause 1 above, Commercial Use includes:
+(1) integration of all or part of the Software into a product for sale, lease or
+license by or on behalf of Licensee to third parties, or  (2) distribution of
+the Software to third parties that need it to commercialize a product sold or
+licensed by or on behalf of Licensee.
Binary file xenome-1.0.1-r/xenome has changed
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/xenome-1.0.1-r/xenome.1	Thu Apr 23 08:32:34 2020 -0400
@@ -0,0 +1,488 @@
+.TH xenome 1 "September 12, 2012" "Xenome User Manual"
+.SH NAME
+.PP
+xenome - a tool for classifying reads from xenograft sources.
+.PP
+Version 1.0.1
+.SH SYNOPSIS
+.PP
+xenome index -T 8 -P idx -H mouse.fa -G human.fa
+.PP
+xenome classify -T 8 -P idx --pairs --host-name mouse
+--graft-name human -i in_1.fastq -i in_2.fastq
+.PP
+xenome help
+.SH DESCRIPTION
+.PP
+Shotgun sequence read data derived from xenograft material contains
+a mixture of reads arising from the host and reads arising from the
+graft.
+Xenome is an application for classifying the read mixture to
+separate the two, allowing for more precise analysis to be
+performed.
+.PP
+Xenome uses host and graft reference sequences to characterise the
+set of all possible k-mers according to whether they belong to:
+.IP \[bu] 2
+only the graft (and NOT the host)
+.IP \[bu] 2
+only the host (and NOT the graft)
+.IP \[bu] 2
+both references
+.IP \[bu] 2
+neither reference
+.IP \[bu] 2
+the subset of the host (or graft) k-mers which is one base
+substitution away from being in the graft (or host) - we call these
+k-mers \[lq]marginal\[rq]
+.PP
+Given a read, or read pair, xenome will calculate which of the
+above categories its k-mers belong to, and classify it as one of:
+graft, host, both, neither, or ambiguous.
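The k-mer categories above can be illustrated with a minimal Python sketch. It is not Xenome's implementation: the function names are hypothetical, the k-mer sets are plain Python sets built from toy strings, reverse complements are ignored, and it assumes (based on the separate B, G, H and M columns in the statistics output) that marginal k-mers are held in their own set, apart from the graft-only and host-only sets.

```python
def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def neighbours(kmer):
    """Yield every k-mer that is one base substitution away from kmer."""
    for i, base in enumerate(kmer):
        for alt in "ACGT":
            if alt != base:
                yield kmer[:i] + alt + kmer[i + 1:]


def build_index(host_ref, graft_ref, k):
    """Partition reference k-mers into B(oth), G(raft), H(ost), M(arginal)."""
    host, graft = kmers(host_ref, k), kmers(graft_ref, k)
    host_only, graft_only = host - graft, graft - host
    # Marginal: unique to one reference, but one substitution from the other.
    marginal = {km for km in host_only
                if any(n in graft for n in neighbours(km))}
    marginal |= {km for km in graft_only
                 if any(n in host for n in neighbours(km))}
    return {
        "B": host & graft,
        "G": graft_only - marginal,  # assumption: M is kept out of G and H
        "H": host_only - marginal,
        "M": marginal,
    }


def read_flags(read, index, k):
    """Return (B, G, H, M) presence flags (1 or 0) for the k-mers of a read."""
    present = kmers(read, k)
    return tuple(int(bool(present & index[name])) for name in "BGHM")
```

A k-mer found in neither reference contributes to no flag, so a read whose k-mers are all unknown yields (0, 0, 0, 0).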
+.PP
+Xenome has two distinct stages, which are embodied in two separate
+commands: `index' and `classify'.
+Before reads can be classified, an index must be constructed from
+the graft and host reference sequences.
+The references must be in FASTA format, and may optionally be
+compressed (gzip).
+.PP
+\f[CR]
+      xenome\ index\ -M\ 24\ -T\ 8\ -P\ idx\ -H\ mouse.fa\ -G\ human.fa
+\f[]
+.PP
+A xenome index consists of a number of related files which can be
+identified by a user-specified prefix, e.g.\ `idx' in the above
+command.
+The prefix may contain `/' characters, allowing the index to be in
+a sub-directory.
+(Any such sub-directory must already exist - xenome will not create
+it.)
+For example, the set of files comprising an index with prefix `idx'
+are:
+.PP
+\f[CR]
+      idx-both.header
+      idx-both.kmers-d0
+      idx-both.kmers-d1
+      idx-both.kmers.header
+      idx-both.kmers.high-bits
+      idx-both.kmers.low-bits.lwr
+      idx-both.kmers.low-bits.upr
+      idx-both.lhs-bits
+      idx-both.rhs-bits
+\f[]
+.PP
+Once an index is available, reads can be classified according to
+whether they appear to contain graft or host material.
+In the simplest case, Xenome can classify each read from a single
+source file individually.
+.PP
+\f[CR]
+      xenome\ classify\ -P\ idx\ -i\ in.fastq\ 
+\f[]
+.PP
+This step produces a file for each read category, containing all of
+the reads which have been assigned that classification:
+.PP
+\f[CR]
+      ambiguous.fastq
+      both.fastq
+      graft.fastq
+      host.fastq
+      neither.fastq
+\f[]
+.PP
+Input files contain base-space reads in FASTA format, FASTQ format,
+or a plain format with one read per line.
+Any of these may be plain text or gzip compressed.
+.PP
+The files produced are in the same format as the input file, with
+all of the input read data preserved.
+For example, if the input reads are in FASTQ format, the reads written to
+each of the output files will also be in FASTQ format.
+.PP
+Multiple input files may be specified, but all inputs in the same
+format will be written to the same set of output files.
+.PP
+\f[CR]
+      xenome\ classify\ -P\ idx\ -i\ inA.fastq\ -i\ inB.fastq\ -I\ inC.fasta
+\f[]
+.PP
+The above will result in the following set of files:
+.PP
+\f[CR]
+      ambiguous.fasta
+      ambiguous.fastq
+      both.fasta
+      both.fastq
+      graft.fasta
+      graft.fastq
+      host.fasta
+      host.fastq
+      neither.fasta
+      neither.fastq
+\f[]
+.PP
+Each of the FASTQ files contains a mixture of reads from inA.fastq
+and inB.fastq.
+The FASTA files contain reads from inC.fasta.
+.PP
+If the combining of input reads from separate files is not desired,
+xenome should be run separately for each input.
+The output from different runs can be distinguished by prefixing
+the filenames with a distinct string.
+.PP
+\f[CR]
+      xenome\ classify\ -P\ idx\ -i\ inA.fastq\ --output-filename-prefix\ A
+      xenome\ classify\ -P\ idx\ -i\ inB.fastq\ --output-filename-prefix\ B
+\f[]
+.PP
+Running these two commands yields:
+.PP
+\f[CR]
+      A_ambiguous.fastq
+      A_both.fastq
+      A_graft.fastq
+      A_host.fastq
+      A_neither.fastq
+      B_ambiguous.fastq
+      B_both.fastq
+      B_graft.fastq
+      B_host.fastq
+      B_neither.fastq
+\f[]
+.PP
+Xenome can also process pairs of reads.
+.PP
+\f[CR]
+      xenome\ classify\ -P\ idx\ --pairs\ -i\ in_1.fastq\ -i\ in_2.fastq
+\f[]
+.PP
+This results in a pair of files for each read category.
+The two reads of each pair are written to the corresponding `_1'
+and `_2' files respectively.
+.PP
+\f[CR]
+      ambiguous_1.fastq
+      ambiguous_2.fastq
+      both_1.fastq
+      both_2.fastq
+      graft_1.fastq
+      graft_2.fastq
+      host_1.fastq
+      host_2.fastq
+      neither_1.fastq
+      neither_2.fastq
+\f[]
+.PP
+If desired, more specific names can be used in place of `host' and
+`graft'.
+.PP
+\f[CR]
+      xenome\ classify\ -P\ idx\ -i\ in.fastq\ --graft-name\ human\ --host-name\ mouse
+\f[]
+.PP
+This will cause xenome to produce the following files.
+.PP
+\f[CR]
+      ambiguous.fastq
+      both.fastq
+      human.fastq
+      mouse.fastq
+      neither.fastq
+\f[]
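The naming scheme shown in the listings above can be summarised with a short Python sketch. The function `output_filenames` is hypothetical and only reproduces the names documented here; how Xenome combines a filename prefix with the `_1`/`_2` pair suffixes is not shown in this manual, so that combination is an assumption.

```python
def output_filenames(extension, prefix=None, pairs=False,
                     graft_name="graft", host_name="host"):
    """List the per-category output files for one input format."""
    categories = ["ambiguous", "both", graft_name, host_name, "neither"]
    # With --pairs, each category gets a _1 and a _2 file.
    suffixes = ["_1", "_2"] if pairs else [""]
    # --output-filename-prefix is joined to the name with an underscore.
    lead = f"{prefix}_" if prefix else ""
    return [f"{lead}{category}{suffix}.{extension}"
            for category in categories
            for suffix in suffixes]
```

For instance, `output_filenames("fastq", graft_name="human", host_name="mouse")` gives the five names listed for the --graft-name/--host-name example.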
+.PP
+In addition to generating sets of output files, the classify
+command produces statistics about the number and proportion of
+reads assigned to each category.
+These are printed to standard out at the end of a run and look as
+follows:
+.PP
+\f[CR]
+      Statistics
+      B\ \ \ \ \ \ \ G\ \ \ \ \ \ \ H\ \ \ \ \ \ \ M\ \ \ \ \ \ \ count\ \ \ \ \ percent\ \ \ class
+      0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1900\ \ \ \ \ \ 0.938267\ \ "neither"
+      0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 21\ \ \ \ \ \ \ \ 0.0103703\ "both"
+      0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 28491\ \ \ \ \ 14.0696\ \ \ "definitely\ host"
+      0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 7366\ \ \ \ \ \ 3.63751\ \ \ "probably\ host"
+      0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 91895\ \ \ \ \ 45.38\ \ \ \ \ "definitely\ graft"
+      0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 30059\ \ \ \ \ 14.8439\ \ \ "probably\ graft"
+      0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 282\ \ \ \ \ \ \ 0.139259\ \ "ambiguous"
+      0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 330\ \ \ \ \ \ \ 0.162962\ \ "ambiguous"
+      1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 2878\ \ \ \ \ \ 1.42123\ \ \ "both"
+      1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 254\ \ \ \ \ \ \ 0.125431\ \ "probably\ both"
+      1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 610\ \ \ \ \ \ \ 0.301233\ \ "definitely\ host"
+      1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 5815\ \ \ \ \ \ 2.87159\ \ \ "probably\ host"
+      1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 3843\ \ \ \ \ \ 1.89777\ \ \ "definitely\ graft"
+      1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 27775\ \ \ \ \ 13.716\ \ \ \ "probably\ graft"
+      1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 0\ \ \ \ \ \ \ 99\ \ \ \ \ \ \ \ 0.0488886\ "ambiguous"
+      1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 1\ \ \ \ \ \ \ 883\ \ \ \ \ \ \ 0.436047\ \ "ambiguous"
+      
+      Summary
+      count\ \ \ \ \ percent\ \ \ class
+      153572\ \ \ \ 75.8377\ \ \ "graft"
+      42282\ \ \ \ \ 20.8799\ \ \ "host"
+      3153\ \ \ \ \ \ 1.55703\ \ \ "both"
+      1900\ \ \ \ \ \ 0.938267\ \ "neither"
+      1594\ \ \ \ \ \ 0.787157\ \ "ambiguous"
+\f[]
+.PP
+Both tables contain a single heading line, followed by rows of
+TAB-separated elements; a format suitable for loading into R or a
+spreadsheet.
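As an alternative to R or a spreadsheet, the two tables can be read with Python's standard csv module. This is a sketch under assumptions: `parse_tables` is a hypothetical helper, and it assumes the captured standard output contains the literal title lines "Statistics" and "Summary", each followed by one heading line and TAB-separated rows, as in the example above.

```python
import csv
import io


def parse_tables(stdout_text):
    """Split the report into {'Statistics': [...], 'Summary': [...]}."""
    tables, current, header = {}, None, None
    for line in stdout_text.splitlines():
        line = line.strip()
        if line in ("Statistics", "Summary"):
            # A title line starts a new table; its heading line comes next.
            current, header = line, None
            tables[current] = []
        elif current and line:
            # The csv module also removes the double quotes around class names.
            fields = next(csv.reader(io.StringIO(line), delimiter="\t"))
            if header is None:
                header = fields
            else:
                tables[current].append(dict(zip(header, fields)))
    return tables
```

All values are returned as strings; convert `count` and `percent` with `int()` and `float()` as needed.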
+.PP
+Each row represents the number and proportion of reads assigned to
+a particular class.
+The B, G, H, and M fields represent the presence (1) or absence (0)
+of k-mers belonging to the both, graft, host and marginal k-mer
+subsets, according to the reference index.
+.PP
+The Statistics table contains 16 rows; one for each possible
+combination of k-mer classes present within a read.
+The first row of the above table indicates that for the given
+input, 1,900 reads (or pairs) - 0.938267% of the total reads -
+contained no k-mers that belonged to the B, G, H, or M k-mer
+subsets, and are accordingly neither host nor graft reads.
+Similarly, the fourteenth line states that 27,775 reads (or pairs)
+- 13.716% of the total - contained k-mers that belong to the B, G,
+M, but not H subsets, and are therefore \[lq]probably graft\[rq]
+reads.
+.PP
+In the Summary table, the B, G, H, and M columns are removed, and
+the classes from the Statistics table have been collapsed into the
+five shown; the definitely/probably graft/host classes are combined
+into just graft/host classes.
+Notice that the different read output files, described earlier,
+correspond exactly to these classes.
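The relationship between the two tables can be checked with a short Python sketch. The rule in `detailed_class` is inferred from the sixteen rows of the example Statistics table, not taken from Xenome's source, and the function names are hypothetical. The counts are copied from the example above.

```python
def detailed_class(b, g, h, m):
    """Map (B, G, H, M) presence flags to the class in the Statistics table."""
    if g and h:
        return "ambiguous"
    if g or h:
        side = "graft" if g else "host"
        return ("probably " if m else "definitely ") + side
    if b and m:
        return "probably both"
    if b or m:
        return "both"
    return "neither"


def summary_class(detailed):
    """Collapse definitely/probably variants into the five Summary classes."""
    return detailed.split()[-1]


# Counts from the example Statistics table, keyed by (B, G, H, M).
EXAMPLE_COUNTS = {
    (0, 0, 0, 0): 1900,  (0, 0, 0, 1): 21,
    (0, 0, 1, 0): 28491, (0, 0, 1, 1): 7366,
    (0, 1, 0, 0): 91895, (0, 1, 0, 1): 30059,
    (0, 1, 1, 0): 282,   (0, 1, 1, 1): 330,
    (1, 0, 0, 0): 2878,  (1, 0, 0, 1): 254,
    (1, 0, 1, 0): 610,   (1, 0, 1, 1): 5815,
    (1, 1, 0, 0): 3843,  (1, 1, 0, 1): 27775,
    (1, 1, 1, 0): 99,    (1, 1, 1, 1): 883,
}


def summarise(counts):
    """Sum the per-flag counts into Summary-table totals."""
    totals = {}
    for flags, count in counts.items():
        name = summary_class(detailed_class(*flags))
        totals[name] = totals.get(name, 0) + count
    return totals
```

Applied to `EXAMPLE_COUNTS`, `summarise` reproduces the counts in the example Summary table.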
+.SH OPTIONS COMMON TO ALL COMMANDS
+.PP
+The following options can be used with all of the \f[I]xenome\f[]
+commands and are therefore not listed separately for each command.
+.TP
+.B -h, --help
+Show a help message.
+.RS
+.RE
+.TP
+.B -l \f[I]FILE\f[], --log-file \f[I]FILE\f[]
+Place to write progress messages.
+Messages are only written if the -v flag is used.
+If omitted, messages are written to stderr.
+.RS
+.RE
+.TP
+.B -T \f[I]INT\f[], --num-threads \f[I]INT\f[]
+The maximum number of \f[I]worker\f[] threads to use.
+The number of threads actually used depends on the algorithm being
+run.
+\f[I]xenome\f[] may use a small number of additional threads for
+performing non-CPU-bound operations, such as file I/O.
+.RS
+.RE
+.TP
+.B --tmp-dir \f[I]DIRECTORY\f[]
+A directory to use for temporary files.
+This flag may be repeated in order to nominate multiple temporary
+directories.
+.RS
+.RE
+.TP
+.B -v, --verbose
+Show progress messages.
+.RS
+.RE
+.TP
+.B -V, --version
+Show the software version.
+.RS
+.RE
+.SH COMMANDS AND OPTIONS
+.SS xenome index
+.PP
+xenome index [-k \f[I]INT\f[]] [-M \f[I]INT\f[]] -P \f[I]PREFIX\f[]
+-G \f[I]FASTA-filename\f[] -H \f[I]FASTA-filename\f[]
+.PP
+Build the xenome reference index from the graft and host reference
+sequences.
+The input files must be in FASTA format.
+They may be gzip compressed, in which case the filename suffix must
+be \f[I]\&.gz\f[].
+.PP
+The k-mer size may be specified using the \f[I]-k\f[] flag.
+If omitted, xenome defaults to k=25.
+.PP
+During index construction, xenome maintains a hash table of the
+k-mers seen so far.
+When this table fills, its contents are written to disk, and the
+table is reinitialised.
+The more memory xenome can use, the less often it will need to
+write to disk, and the faster index construction will run.
+By default, xenome will limit itself to 2 GB during index
+construction.
+The -M, --max-memory flag can be used to explicitly control the
+amount of memory available to xenome (in GB).
+To improve performance, this should generally be set close to the
+amount of memory available in the system, after accounting for
+operating system and other overhead.
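The fill, write, and reinitialise cycle described above is a common external-memory pattern. The following Python sketch illustrates that general pattern only; it is not Xenome's code. The function names, the one-k-mer-per-line run files, and the use of an entry count in place of a memory limit in GB are all assumptions made for illustration.

```python
import heapq
import os
import tempfile


def spill_kmers(kmer_stream, max_in_memory, tmp_dir):
    """Collect k-mers in a table; write a sorted run to disk when it fills."""
    runs, table = [], set()

    def flush():
        if not table:
            return
        fd, path = tempfile.mkstemp(dir=tmp_dir, suffix=".run", text=True)
        with os.fdopen(fd, "w") as out:
            out.writelines(kmer + "\n" for kmer in sorted(table))
        runs.append(path)
        table.clear()  # reinitialise the table

    for kmer in kmer_stream:
        table.add(kmer)
        if len(table) >= max_in_memory:
            flush()
    flush()  # write whatever is left at the end
    return runs


def merge_runs(runs):
    """Merge the sorted runs, yielding each distinct k-mer once, in order."""
    files = [open(path) for path in runs]
    try:
        previous = None
        for line in heapq.merge(*files):
            kmer = line.rstrip("\n")
            if kmer != previous:
                yield kmer
                previous = kmer
    finally:
        for handle in files:
            handle.close()
```

A larger `max_in_memory` produces fewer runs and therefore fewer disk writes, which mirrors the trade-off that -M controls.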
+.PP
+\f[I]OPTIONS\f[]
+.TP
+.B -k \f[I]INT\f[], --kmer-size \f[I]INT\f[]
+The k-mer size to use for building the index: in version 1.0.0 this
+\f[I]must be an integer strictly less than 63\f[].
+If not supplied, the default value of 25 is used.
+.RS
+.RE
+.TP
+.B -M \f[I]INT\f[], --max-memory \f[I]INT\f[]
+The maximum amount of memory (in GB) to use.
+Making more memory available will reduce the number of times xenome
+writes intermediate index data to disk.
+The default is 2 GB.
+.RS
+.RE
+.TP
+.B -P \f[I]PREFIX\f[], --prefix \f[I]PREFIX\f[]
+The path prefix for all generated reference index files.
+The prefix may contain directory separators (e.g.\ `/') in order to
+have the index files written to another directory.
+.RS
+.RE
+.TP
+.B -G \f[I]FILE\f[], --graft \f[I]FILE\f[]
+The name of the FASTA file containing the graft reference sequence.
+If the filename ends in \f[I]\&.gz\f[] it will be read as a gzip
+file.
+.RS
+.RE
+.TP
+.B -H \f[I]FILE\f[], --host \f[I]FILE\f[]
+The name of the FASTA file containing the host reference sequence.
+If the filename ends in \f[I]\&.gz\f[] it will be read as a gzip
+file.
+.RS
+.RE
+.SS xenome classify
+.PP
+xenome classify -P \f[I]PREFIX\f[] {-I \f[I]FASTA-filename\f[] | -i
+\f[I]FASTQ-filename\f[] | --line-in \f[I]filename\f[]}+
+[--pairs] [-M \f[I]INT\f[]] [--graft-name \f[I]STRING\f[]]
+[--host-name \f[I]STRING\f[]] [--output-filename-prefix
+\f[I]STRING\f[]] [--dont-write-reads] [--preserve-read-order]
+.PP
+Classifies input reads according to a pre-computed k-mer index.
+The reads are written into separate files, according to their
+classification, and a breakdown of the number and proportion of
+reads in each class is printed.
+.PP
+If the total size of the index files is greater than available RAM,
+xenome will perform poorly.
+To overcome this, the -M, --max-memory flag may be used to
+specify the maximum amount of memory (in GB) that xenome may use at
+any time.
+If this amount is less than the size of the index structures,
+xenome will (effectively) partition the index into multiple
+subsets, each no larger than the specified maximum memory size, and
+classify the reads in multiple passes - with each pass using a
+different index subset.
+The results from each pass are combined, and the result is
+produced as usual.
+If run with the -v, --verbose flag, xenome will report the
+number of passes it will perform.
+Note that runtime will increase with the number of passes
+performed; the biggest increase will occur with the step from one
+pass to two.
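A rough way to anticipate the number of passes is to divide the total index size by the memory limit. The Python sketch below is only an estimate under stated assumptions: the helper names are hypothetical, it assumes every index file name starts with the prefix followed by a hyphen (as in the `idx-both.*` listing), it treats 1 GB as 2**30 bytes, and the count Xenome reports with -v may differ.

```python
import glob
import math
import os


def index_size_bytes(prefix):
    """Total size of the files whose names start with '<prefix>-'."""
    pattern = glob.escape(prefix) + "-*"
    return sum(os.path.getsize(path) for path in glob.glob(pattern))


def estimate_passes(index_bytes, max_memory_gb, bytes_per_gb=2**30):
    """Estimate how many index subsets are needed to stay within the limit."""
    if max_memory_gb <= 0:
        raise ValueError("max_memory_gb must be positive")
    return max(1, math.ceil(index_bytes / (max_memory_gb * bytes_per_gb)))
```

For example, `estimate_passes(index_size_bytes("idx"), 8)` estimates the passes for a run with `-P idx -M 8`.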
+.PP
+\f[I]OPTIONS\f[]
+.TP
+.B -P \f[I]PREFIX\f[], --prefix \f[I]PREFIX\f[]
+The path prefix for all reference index files.
+The prefix may contain directory separators (e.g.\ `/') in order to
+read the index files from another directory.
+.RS
+.RE
+.TP
+.B -I \f[I]FILE\f[], --fasta-in \f[I]FILE\f[]
+Input file in FASTA format.
+.RS
+.RE
+.TP
+.B -i \f[I]FILE\f[], --fastq-in \f[I]FILE\f[]
+Input file in FASTQ format.
+.RS
+.RE
+.TP
+.B --line-in \f[I]FILE\f[]
+Input file with one read per line and no other annotation.
+.RS
+.RE
+.TP
+.B --pairs
+Treat reads from consecutive input files of the same type as pairs.
+.RS
+.RE
+.TP
+.B -M \f[I]INT\f[], --max-memory \f[I]INT\f[]
+The maximum amount of memory (in GB) to use while classifying
+reads.
+If not specified, xenome will use as much memory as required to
+classify all reads in a single pass.
+When the maximum amount of memory is less than the size of the
+reference index files, xenome will need to perform multiple passes
+over the input data - increasing runtime.
+.RS
+.RE
+.TP
+.B --graft-name \f[I]STRING\f[]
+The name of the graft reference to appear in filenames and
+statistics.
+If no explicit name is provided, the string \[lq]graft\[rq] is
+used.
+.RS
+.RE
+.TP
+.B --host-name \f[I]STRING\f[]
+The name of the host reference to appear in filenames and
+statistics.
+If no explicit name is provided, the string \[lq]host\[rq] is used.
+.RS
+.RE
+.TP
+.B --output-filename-prefix \f[I]STRING\f[]
+An optional prefix to apply to all output read filenames.
+The prefix is separated from the rest of the filename by an
+underscore (`_').
+.RS
+.RE
+.TP
+.B --dont-write-reads
+The reads will not be written to any files after classification,
+and none of the usual per-category output files will be created.
+The classification statistics will still be printed to standard
+out.
+.RS
+.RE
+.TP
+.B --preserve-read-order
+The relative ordering of reads within each output file will be the
+same as that in the input files.
+i.e.\ if read \f[I]r1\f[] precedes \f[I]r2\f[] in a single output
+file, then \f[I]r1\f[] also precedes \f[I]r2\f[] in the input.
+Note: If this flag is specified, the -T/--num-threads flag is
+ignored, and xenome will only operate with a single worker thread.
+.RS
+.RE
+.SS xenome help
+.PP
+xenome help
+.PP
+Prints a summary of all of the xenome commands.
+.SH FUTURE RELEASES
+.PP
+Bzip support will be introduced.
+.SH AUTHORS
+Bryan Beresford-Smith, Andrew Bromage, Thomas Conway, Jeremy Wazny.
+