usage: drep compare [-p PROCESSORS] [-d] [-h] [-ms MASH_SKETCH] [--S_algorithm {ANIn,goANI,ANImf,gANI}] [-n_PRESET {normal,tight}] [-pa P_ANI] [-sa S_ANI] [--SkipMash] [--SkipSecondary] [-nc COV_THRESH] [-cm {total,larger}] [--clusterAlg CLUSTERALG] [--run_tax] [--tax_method {percent,max}] [-per PERCENT] [--cent_index CENT_INDEX] [--warn_dist WARN_DIST] [--warn_sim WARN_SIM] [--warn_aln WARN_ALN] [-g [GENOMES [GENOMES ...]]] work_directory

I/O PARAMETERS:

-g [GENOMES [GENOMES ...]], --genomes [GENOMES [GENOMES ...]]: genomes to cluster in .fasta format (default: None)
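As a rough illustration of how the options in the usage line fit together, the sketch below assembles a command line in Python without running it. The function name `build_compare_command`, the executable name `dRep`, and the example paths are assumptions made for illustration; only the flags themselves come from the usage text above.

```python
import shlex


def build_compare_command(work_directory, genomes, p_ani=0.9, s_ani=0.99,
                          processors=None, executable="dRep"):
    """Build an argument list for a compare run (hypothetical helper).

    `executable` is an assumption: the usage line prints the program
    name as `drep`, so adjust it to match the installed entry point.
    """
    # The work directory goes before -g, because -g accepts a variable
    # number of values and would otherwise absorb the positional argument.
    cmd = [executable, "compare", work_directory]
    if processors is not None:
        cmd += ["-p", str(processors)]
    cmd += ["-pa", str(p_ani), "-sa", str(s_ani)]
    cmd += ["-g", *genomes]
    return cmd


# Hypothetical paths, used only to show the resulting command line.
cmd = build_compare_command(
    "compare_out",
    ["genomes/a.fasta", "genomes/b.fasta"],
    s_ani=0.95,
)
print(shlex.join(cmd))
```

The list form can be passed to `subprocess.run` as-is, which avoids shell quoting problems with genome paths that contain spaces.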

GENOME COMPARISON PARAMETERS:

-ms MASH_SKETCH, --MASH_sketch MASH_SKETCH: MASH sketch size (default: 1000)
--S_algorithm {goANI,ANIn,ANImf,gANI}: Algorithm for secondary clustering comparisons. ANImf = (RECOMMENDED) align whole genomes with nucmer, filter the alignment, and compare aligned regions; ANIn = align whole genomes with nucmer and compare aligned regions; gANI = identify and align ORFs, then compare aligned ORFs (default: ANImf)
-n_PRESET {normal,tight}: Presets to pass to nucmer. tight = only align highly conserved regions; normal = default ANIn parameters (default: normal)

CLUSTERING PARAMETERS:

-pa P_ANI, --P_ani P_ANI: ANI threshold to form primary (MASH) clusters (default: 0.9)
-sa S_ANI, --S_ani S_ANI: ANI threshold to form secondary clusters (default: 0.99)

`--SkipMash`
	Skip MASH clustering, just do secondary clustering on all genomes (default: False)
`--SkipSecondary`
	Skip secondary clustering, just perform MASH clustering (default: False)

-nc COV_THRESH, --cov_thresh COV_THRESH: Minimum level of overlap between genomes when doing secondary comparisons (default: 0.1)
-cm {total,larger}, --coverage_method {total,larger}: Method to calculate coverage of an alignment (for ANIn/ANImf only; gANI can only use the larger method). total = 2*(aligned length) / (sum of total genome lengths); larger = max((aligned length / genome 1 length), (aligned length / genome 2 length)) (default: larger)
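The two coverage formulas can be compared with a short Python sketch. It only restates the arithmetic given in the option description; the function names and the example lengths are hypothetical and are not taken from the tool's source code.

```python
def coverage_total(aligned_length, genome1_length, genome2_length):
    """total = 2*(aligned length) / (sum of total genome lengths)."""
    return 2 * aligned_length / (genome1_length + genome2_length)


def coverage_larger(aligned_length, genome1_length, genome2_length):
    """larger = the greater of the two per-genome aligned fractions."""
    return max(aligned_length / genome1_length,
               aligned_length / genome2_length)


# Hypothetical example: 1 Mbp aligned between a 2 Mbp and a 4 Mbp genome.
aligned, g1, g2 = 1_000_000, 2_000_000, 4_000_000
total_cov = coverage_total(aligned, g1, g2)    # 2e6 / 6e6, about 0.333
larger_cov = coverage_larger(aligned, g1, g2)  # max(0.5, 0.25) = 0.5
print(total_cov, larger_cov)
```

When the genomes differ in size, `larger` reports the coverage of the smaller genome, so it is never lower than `total`. Both values here are above the default `--cov_thresh` of 0.1.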

`--clusterAlg CLUSTERALG`
	Algorithm used to cluster genomes (passed to scipy.cluster.hierarchy.linkage) (default: average)

TAXONOMY:

`--run_tax`
	Generate taxonomy information (Tdb) (default: False)

--tax_method {percent,max}: Method of determining taxonomy. percent = the most descriptive taxonomic level with at least (per) hits; max = the centrifuge taxonomic level with the most overall hits (default: percent)
-per PERCENT, --percent PERCENT: minimum percent for percent method (default: 50)

`--cent_index CENT_INDEX`
	Path to centrifuge index (for example, /home/mattolm/download/centrifuge/indices/b+h+v) (default: None)

WARNINGS:

`--warn_dist WARN_DIST`
	How far from the threshold to throw cluster warnings (default: 0.25)
`--warn_sim WARN_SIM`
	Similarity threshold for warnings between dereplicated genomes (default: 0.98)
`--warn_aln WARN_ALN`
	Minimum aligned fraction for warnings between dereplicated genomes (ANIn) (default: 0.25)