# HG changeset patch
# User nml
# Date 1590772194 14400
# Node ID 08d801182fa1bd68b140a1c1b4d52e7ae65b8b7b
# Parent fb3683870b74beb1333e76eeb38f986004f5fdf0
"planemo upload for repository https://github.com/phac-nml/ecoli_serotyping commit 6615f6e5ae2eac1f8e90f25e1707c8b7ab161517"

diff -r fb3683870b74 -r 08d801182fa1 ectyper.xml
--- a/ectyper.xml Mon Dec 30 10:10:44 2019 -0500
+++ b/ectyper.xml Fri May 29 13:09:54 2020 -0400
@@ -1,7 +1,7 @@
-
+
 ectyper is a standalone serotyping module for Escherichia coli. It supports fasta and fastq file formats.
- ectyper
+ ectyper
-
- - + + + + - + -
+ + +
 adv_param['logging']==True
-
+
+
+ adv_param['blastresults']==True
+
@@ -76,35 +94,34 @@
 **Syntax**
-This tool identifies the serotype of assembled or assembly-free Escherichia coli genome sample based on a set of either *wzm/wzt* or *wzx/wzy* and *fliC/flkA/flmA* alleles corresponding to O and H antigens, respectively.
-The non-E.coli genomes and other Escherichia genus species are successfully identified and well handled. The 0.9.0 version improves tool sensitivy when target alleles are truncated or
-poorly covered by raw reads.
+This tool identifies the serotype of both assembled or assembly-free Escherichia coli genome samples based on a set of the key O and H antigen determinant genes including *wzm/wzt* or *wzx/wzy* and *fliC/flkA/flmA*.
+Unique to the tool, species identification module allows for non-E.coli genomes identification including other Escherichia genus species.
+This version improves antigen call rates on "difficult samples" by use of an adaptive threshold. This is especially useful when antigen genes are truncated or poorly covered by raw reads.
+If no antigen call is being predicted by the tool, try to lower %coverage parameter first. For more information on the new Quality Control module and running parameter details please visit https://github.com/phac-nml/ecoli_serotyping.
-For more information please visit https://github.com/phac-nml/ecoli_serotyping.
-
 -----
 **Input:**
 Accepts a variety of inputs including both single and/or multiple FASTQ and/or FASTA file(s). Inputs might contain pure raw reads, but for more accurate results, draft assemblies are recommended.
-The default MASH RefSeq genome sketch is included and updated every 6 months, but one can supply custom sketch file for species identification.
-One can download RefSeq genome sketch containing approximately 91,283 genomes from https://gembox.cbcb.umd.edu/mash/refseq.genomes.k21s1000.msh.
+The default MASH RefSeq genome sketch (https://gembox.cbcb.umd.edu/mash/refseq.genomes.k21s1000.msh) containing approximately 91K genomes is included and automatically updated every 6 months.
+
 **Output:**
-Tab-delimited report listing identified O and H antigens together with corresponding highest scoring alleles and normalized BLAST score defined as (%identity x query coverage length) / 10000
+Tab-delimited report listing identified O and H antigens together with corresponding the highest-scoring alleles and normalized BLAST score defined as (%identity x %coverage) / 1e4.
+If *verifyEcoli* parameter is enabled, final report will contain allele quality control information on results for reporting purposes. PASS (REPORTABLE) QC flag means that O and H antigen calls are of sufficient to unambiguously resolve them from all other antigens.
 -----
 **Parameters (Optional):**
-
- - **Print the allele sequences as the final columns of the output?** Turn ON/OFF addition of the actual O and H antigen allelic sequences in the report
- - **Enable E. coli species verification:** Turn ON/OFF for more rigorous species verification (recommended)
- - **Include log file in the run outputs?:** Turn ON/OFF optional output of the ectyper log file for a more detailed results assessment
+ - **Enable E. coli species verification:** for species verification in case samples are of non-E.coli origin
+ - **Include BLAST allele alignment results tab-delim file in the outputs?** Get reference allele sequences and detailed BLAST output
+ - **Include log file in the run outputs?:** Get optional logs of the ectyper run for a more detailed results assessment and troubleshooting
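
Reviewer note: the updated **Output** text defines the normalized BLAST score as (%identity x %coverage) / 1e4. The arithmetic can be sketched as below. This is only an illustration of the stated formula, assuming both inputs are percentages in the range 0 to 100. The function name `normalized_blast_score` is hypothetical and is not part of ectyper's code or command-line interface.

```python
def normalized_blast_score(percent_identity: float, percent_coverage: float) -> float:
    """Illustrative helper (hypothetical name, not ectyper's API).

    Applies the formula quoted in the help text:
    (%identity x %coverage) / 1e4, so two percentages map to a value in [0, 1].
    """
    # Assumption: both values are percentages between 0 and 100 inclusive.
    for name, value in (("percent_identity", percent_identity),
                        ("percent_coverage", percent_coverage)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    return percent_identity * percent_coverage / 1e4


if __name__ == "__main__":
    # A full-length, perfect-identity hit gives the maximum score.
    print(normalized_blast_score(100, 100))
    # A hit with 95% identity over 90% of the allele scores lower.
    print(normalized_blast_score(95, 90))
```

Under this reading, a score of 1 corresponds to a full-length, identical match, and the score drops as either identity or coverage drops.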