Repository 'drep_dereplicate'
hg clone https://toolshed.g2.bx.psu.edu/repos/iuc/drep_dereplicate

Changeset 0:8dfcdbeaeed8 (2020-05-05)
Next changeset 1:ef7cd2e7bc05 (2022-02-12)
Commit message:
"planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/drep commit 8fa5ff35b45c2b046c7f4800410cf39cb89a299a"
added:
drep_dereplicate.xml
macros.xml
test-data/Enterococcus_casseliflavus_EC20.fasta
test-data/Enterococcus_faecalis_T2.fna
test-data/Enterococcus_faecalis_TX0104.fa
diff -r 000000000000 -r 8dfcdbeaeed8 drep_dereplicate.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/drep_dereplicate.xml Tue May 05 06:12:47 2020 -0400
@@ -0,0 +1,100 @@
+<tool id="drep_dereplicate" name="dRep dereplicate" version="@VERSION@.0" python_template_version="3.5">
+    <description>De-replicate a list of genomes</description>
+    <macros>
+        <import>macros.xml</import>
+    </macros>
+    <expand macro="requirements" />
+    <command detect_errors="exit_code"><![CDATA[
+         @PREPARE_GENOMES@
+         dRep dereplicate outdir
+         @FILTER_OPTIONS@
+         @GENOME_COMPARISON_OPTIONS@
+         @CLUSTERING_OPTIONS@
+         @SCORING_OPTIONS@
+         @TAXONOMY_OPTIONS@
+         @WARNING_OPTIONS@
+         @GENOMES@
+         --debug
+         || (rc=\$?; 
+             ls -ltr `find outdir -type f`;
+             cat outdir/data/checkM/checkM_outdir/checkm.log;
+             cat outdir/log/logger.log;
+             exit \$rc)
+    ]]></command>
+    <inputs>
+        <expand macro="genomes"/>
+        <expand macro="filtering_options"/>
+        <expand macro="genome_comparison_options"/>
+        <expand macro="clustering_options"/>
+        <expand macro="scoring_options"/>
+        <expand macro="taxonomy_options"/>
+        <expand macro="warning_options"/>
+        <expand macro="select_drep_outputs"/>
+    </inputs>
+    <outputs>
+        <collection name="dereplicated_genomes" type="list" label="dereplicated_genomes">
+             <discover_datasets pattern="__designation__" directory="outdir/dereplicated_genomes" ext='fasta'/>
+        </collection>
+        <expand macro="drep_outputs" />
+    </outputs>
+    <tests>
+        <expand macro="test_defaults_log">
+            <has_text text="dRep dereplicate finished" />
+        </expand>
+        <test>
+            <param name="genomes" ftype="fasta" value="Enterococcus_casseliflavus_EC20.fasta,Enterococcus_faecalis_T2.fna,Enterococcus_faecalis_TX0104.fa"/>
+            <conditional name="filter">
+                <param name="set_options" value="yes"/>
+                <conditional name="quality">
+                    <param name="source" value="checkm"/>
+                    <param name="checkM_method" value="taxonomy_wf"/>
+                </conditional>
+            </conditional>
+            <output name="log">
+                <assert_contents>
+                    <has_text text="dRep dereplicate finished" />
+                </assert_contents>
+            </output>
+        </test>
+    </tests>
+    <help><![CDATA[
+**dRep dereplicate**
+
+`dRep <https://drep.readthedocs.io/en/latest/overview.html>`_ performs rapid pair-wise comparison of genome sets.
+
+
+
+
+
+`De-replication <https://drep.readthedocs.io/en/latest/overview.html#genome-de-replication>`_ is the process of identifying sets of genomes that are the “same” in a list of genomes, and removing all but the “best” genome from each redundant set. How similar genomes need to be to be considered the “same”, how to determine which genome is “best”, and other important decisions are discussed in `Choosing parameters <https://drep.readthedocs.io/en/latest/choosing_parameters.html>`_. Detailed options for each module are described at: https://drep.readthedocs.io/en/latest/module_descriptions.html
+
+A common use for genome de-replication is the individual assembly of metagenomic data. If metagenomic samples are collected in a series, a common way to assemble the short reads is a “co-assembly”, that is, combining the reads from all samples and assembling them together. The problem with this is that assembling similar strains together can severely fragment assemblies, precluding recovery of a good genome bin. An alternative is to assemble each sample separately, and then “de-replicate” the bins from each assembly to make a final genome set.
+
+The steps to this process are:
+
+  - Assemble each sample separately using your favorite assembler. You can also perform a co-assembly to catch low-abundance microbes
+  - Bin each assembly (and co-assembly) separately using your favorite binner
+  - Pull the bins from all assemblies together and run **dRep** on them
+  - Perform downstream analysis on the de-replicated genome list
+
+
+
+**INPUTS**
+
+  - Genome sets in fasta format.
+
+
+**OUTPUTS**
+
+  - `Figures <https://drep.readthedocs.io/en/latest/example_output.html#figures>`_ that show the relationship of the Genome inputs.
+  - `Warnings <https://drep.readthedocs.io/en/latest/example_output.html#warnings>`_ report two things: de-replicated genomes that are similar to each other, and secondary clusters that were almost split into different clusters.
+  - A Dataset collection of the “best” genome of each secondary cluster.
+  - `Tables from intermediate steps <https://drep.readthedocs.io/en/latest/advanced_use.html>`_ 
+
+    * Chdb.csv # CheckM results for Bdb
+    * Widb.csv # Winning genomes' checkM information 
+
+
+    ]]></help>
+    <expand macro="citations" />
+</tool>
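
The command block in drep_dereplicate.xml links each input genome under a sanitized file name and then hands those names to dRep after -g. Below is a minimal Python sketch of that assembly step, not the wrapper's actual code. The helper names sanitize_name and build_drep_command are hypothetical. Only the regular expression, the output directory name "outdir", and the flags -g, --checkM_method and --debug are taken from the wrapper and its macros.

```python
import re


def sanitize_name(element_identifier: str) -> str:
    """Mirror the wrapper's renaming (hypothetical helper): keep the last
    path component and replace every character that is not a word
    character, hyphen, underscore, or dot with an underscore."""
    basename = element_identifier.split("/")[-1]
    return re.sub(r"[^\w\-_.]", "_", basename)


def build_drep_command(identifiers, checkm_method=None, outdir="outdir"):
    """Return an argv list for a `dRep dereplicate` call (hypothetical helper).

    The order follows the wrapper: options first, then -g with the genome
    file names, then --debug.
    """
    argv = ["dRep", "dereplicate", outdir]
    if checkm_method:
        argv += ["--checkM_method", checkm_method]
    argv += ["-g"] + [sanitize_name(name) for name in identifiers]
    argv.append("--debug")
    return argv
```

Building the command as a list instead of one shell string avoids quoting problems with unusual dataset names, which is the same problem the wrapper solves by renaming the inputs.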
diff -r 000000000000 -r 8dfcdbeaeed8 macros.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/macros.xml Tue May 05 06:12:47 2020 -0400
b'@@ -0,0 +1,474 @@\n+<macros>\n+    <token name="@VERSION@">2.5.4</token>\n+    <xml name="requirements">\n+        <requirements>\n+            <requirement type="package" version="@VERSION@">drep</requirement>\n+            <yield/>\n+        </requirements>\n+    </xml>\n+    <xml name="citations">\n+        <citations>\n+            <citation type="doi">10.1038/ismej.2017.126</citation>\n+            <yield />\n+        </citations>\n+    </xml>\n+\n+\n+    <xml name="genomes">\n+        <param argument="--genomes" type="data" format="fasta" label="genomes fasta files" multiple="true"/>\n+    </xml>\n+    <token name="@PREPARE_GENOMES@"><![CDATA[\n+    #import re \n+    #set $genomefiles = [] \n+    #for $genome in $genomes\n+        #set $input_name = $re.sub(\'[^\\w\\-_.]\', \'_\',str($genome.element_identifier.split(\'/\')[-1]))\n+        ln -s \'${genome}\' \'${input_name}\' &&\n+        $genomefiles.append($input_name)\n+    #end for\n+]]></token>\n+    <token name="@GENOMES@"><![CDATA[\n+    -g \n+    #for $genomefile in $genomefiles\n+    \'${genomefile}\' \n+    #end for\n+]]></token>\n+\n+\n+    <xml name="checkm_method">\n+        <param argument="--checkM_method" type="select" label="checkm method" optional="true">\n+           <option value="taxonomy_wf">taxonomy_wf (faster)</option>\n+           <option value="lineage_wf">lineage_wf (more accurate)</option>\n+        </param>\n+    </xml>\n+    <token name="@CHECKM_METHOD@"><![CDATA[\n+    #if $checkM_method:\n+    --checkM_method $checkM_method \n+    #end if\n+]]></token>\n+\n+    <xml name="filtering_options">\n+        <conditional name="filter">\n+            <param name="set_options" type="select" label="set filtering options">\n+                <option value="yes">Yes</option>\n+                <option value="no" selected="true">No (use --checkM_method taxonomy_wf)</option>\n+            </param>\n+            <when value="yes">\n+                <param argument="--length" type="integer" 
value="50000" label="Minimum genome length"/>\n+                <param argument="--completeness" type="integer" value="75" min="0" max="100" label="Minimum genome completeness percent"/>\n+                <param argument="--contamination" type="integer" value="25" min="0" max="100" label="Maximum genome contamination percent"/>\n+                 \n+                <conditional name="quality">\n+                    <param argument="source" type="select" label="genome quality">\n+                        <help>\n+                            --ignoreGenomeQuality is useful with\n+                            bacteriophages or eukaryotes or things where checkM\n+                            scoring does not work. Will only choose genomes based\n+                            on length and N50. \n+                        </help>\n+                        <option value="checkm" selected="true">Run checkM</option>\n+                        <option value="genomeInfo">User supplied genomeInfo csv file</option>\n+                        <option value="ignoreGenomeQuality">--ignoreGenomeQuality (NOT RECOMMENDED!)</option>\n+                    </param>\n+                    <when value="checkm">\n+                        <param argument="--checkM_method" type="select" label="checkm method" optional="true">\n+                            <help>\n+                                Using the checkm method of lineage_wf can require more than 40 GB of RAM.\n+                            </help>\n+                            <option value="taxonomy_wf">taxonomy_wf (faster)</option>\n+                            <option value="lineage_wf">lineage_wf (more accurate)</option>\n+                        </param>\n+                    </when>\n+                    <when value="genomeInfo">\n+                        <param argument="--genomeInfo" type="data" format="csv" label="genome information CSV file">\n+                            <help><![CDATA[\n+                            A CSV dataset that must 
contain: [\n+                            "genome"(history dataset name of .fasta dataset of that genome'..b'lt: ANImf)\n+\n+  -n_PRESET {normal,tight}\n+                        Presets to pass to nucmer\n+                        tight   = only align highly conserved regions\n+                        normal  = default ANIn parameters (default: normal)\n+\n+]]></token>\n+\n+    <token name="@CLUSTERING_HELP@"><![CDATA[\n+CLUSTERING PARAMETERS:\n+  -pa P_ANI, --P_ani P_ANI\n+                        ANI threshold to form primary (MASH) clusters\n+                        (default: 0.9)\n+  -sa S_ANI, --S_ani S_ANI\n+                        ANI threshold to form secondary clusters\n+                        (default: 0.99)\n+\n+  --SkipMash            Skip MASH clustering, just do secondary clustering on\n+                        all genomes (default: False)\n+  --SkipSecondary       Skip secondary clustering, just perform MASH clustering\n+                        (default: False)\n+\n+  -nc COV_THRESH, --cov_thresh COV_THRESH\n+                        Minimum level of overlap between genomes when doing\n+                        secondary comparisons (default: 0.1)\n+  -cm {total,larger}, --coverage_method {total,larger}\n+                        Method to calculate coverage of an alignment\n+                        (for ANIn/ANImf only; gANI can only do larger method)\n+                        total   = 2*(aligned length) / (sum of total genome lengths)\n+                        larger  = max((aligned length / genome 1), (aligned_length / genome2))\n+                        (default: larger)\n+\n+  --clusterAlg CLUSTERALG\n+                        Algorithm used to cluster genomes (passed to\n+                        scipy.cluster.hierarchy.linkage) (default: average)\n+\n+]]></token>\n+\n+    <token name="@SCORING_HELP@"><![CDATA[\n+SCORING CRITERIA\n+Based off of the formula: \n+A*Completeness - B*Contamination + C*(Contamination * (strain_heterogeneity/100)) + 
D*log(N50) + E*log(size)\n+\n+A = completeness_weight; B = contamination_weight; C = strain_heterogeneity_weight; D = N50_weight; E = size_weight:\n+  -comW COMPLETENESS_WEIGHT, --completeness_weight COMPLETENESS_WEIGHT\n+                        completeness weight (default: 1)\n+  -conW CONTAMINATION_WEIGHT, --contamination_weight CONTAMINATION_WEIGHT\n+                        contamination weight (default: 5)\n+  -strW STRAIN_HETEROGENEITY_WEIGHT, --strain_heterogeneity_weight STRAIN_HETEROGENEITY_WEIGHT\n+                        strain heterogeneity weight (default: 1)\n+  -N50W N50_WEIGHT, --N50_weight N50_WEIGHT\n+                        weight of log(genome N50) (default: 0.5)\n+  -sizeW SIZE_WEIGHT, --size_weight SIZE_WEIGHT\n+                        weight of log(genome size) (default: 0)\n+\n+\n+]]></token>\n+\n+    <token name="@TAXONOMY_HELP@"><![CDATA[\n+TAXONOMY:\n+  --run_tax             generate taxonomy information (Tdb)\n+                        (default: False)\n+\n+  --tax_method {percent,max}\n+                        Method of determining taxonomy\n+                        percent = The most descriptive taxonomic level with at least (per) hits\n+                        max     = The centrifuge taxonomic level with the most overall hits\n+                        (default: percent)\n+\n+\n+  -per PERCENT, --percent PERCENT\n+                        minimum percent for percent method\n+                        (default: 50)\n+\n+\n+  --cent_index CENT_INDEX\n+                        path to centrifuge index (for example,\n+                        /home/mattolm/download/centrifuge/indices/b+h+v)\n+                        (default: None)\n+\n+]]></token>\n+\n+    <token name="@WARNINGS_HELP@"><![CDATA[\n+WARNINGS:\n+  --warn_dist WARN_DIST\n+                        How far from the threshold to throw cluster warnings\n+                        (default: 0.25)\n+  --warn_sim WARN_SIM   Similarity threshold for warnings between dereplicated\n+            
            genomes (default: 0.98)\n+  --warn_aln WARN_ALN   Minimum aligned fraction for warnings between\n+                        dereplicated genomes (ANIn) (default: 0.25)\n+\n+]]></token>\n+\n+\n+</macros>\n'
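
The @SCORING_HELP@ token in macros.xml states the formula used to rank genomes within a cluster. Below is a small Python sketch of that formula, meant only as an illustration. The function name score_genome is hypothetical, and the base-10 logarithm is an assumption, since the help text only says "log". The default weights are the ones listed in the help text.

```python
import math


def score_genome(completeness, contamination, strain_heterogeneity, n50, size,
                 completeness_weight=1.0, contamination_weight=5.0,
                 strain_heterogeneity_weight=1.0, n50_weight=0.5,
                 size_weight=0.0):
    """Score a genome with the formula from the scoring help text:

    A*Completeness - B*Contamination
      + C*(Contamination * (strain_heterogeneity/100))
      + D*log(N50) + E*log(size)

    Assumption: log is taken as base 10 (the help text does not say).
    """
    return (completeness_weight * completeness
            - contamination_weight * contamination
            + strain_heterogeneity_weight
            * (contamination * (strain_heterogeneity / 100))
            + n50_weight * math.log10(n50)
            + size_weight * math.log10(size))
```

With the default weights, contamination is penalized five times as strongly as completeness is rewarded, and genome size does not contribute at all unless size_weight is raised above 0.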
diff -r 000000000000 -r 8dfcdbeaeed8 test-data/Enterococcus_casseliflavus_EC20.fasta
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/Enterococcus_casseliflavus_EC20.fasta Tue May 05 06:12:47 2020 -0400
b'@@ -0,0 +1,48964 @@\n+>gi|478483683|ref|NC_020995.1| Enterococcus casseliflavus EC20, complete genome\n+AAAGTATTTTTTTCTAACCTTTTTTATCGTAATCTGTGGAAAACTTTTTCAATCCGTGCTATTTTAGTTA\n+TATCTATTCTTAGTTATAGGAGGACAATTTATGCCATCTGCTGATTCTATTTGGCAAGATCTTCAACGTT\n+CCTTTAAAGAAGAGCTGAATCCGGCCAGTTATAGTGCTTGGATCGAGACTGCCAATGTCTTGTCGTTTGA\n+AAAAAATCAGCTGCTGATCGAAGTACCCAGCGATCTTCATAAATCTTATTGGGAAAAAAATCTAGCTGCC\n+AAAATTGTTGAAATGGGATTTATGAAAACTGGTGAAGAATTGATTCCTAGTTTTGTGACTGTCGAAGAAG\n+CAGAAGCTTTAAAAACAGCCCCTTCTACTATTCAAACAGCTGCAGAAGAAAACGAGCGGCCGCCGAAATC\n+GATCTTAAATGAAAAATACACATTTGATACCTTTGTCATCGGGAAAGGCAATCAGATGGCCCACGCTGCT\n+GCTTTAGTTGTTGCAGAAGATCCTGGGTCTATTTATAATCCGCTGTTCTTCTATGGTGGCGTTGGTTTAG\n+GGAAAACCCACTTGATGCACGCGATCGGTCATCAAATGTTGCTGAAACGTCCCAATGCCAAAATCAAGTA\n+TGTTAGTAGTGAAAATTTCACCAATGATTTCATTACTTCTATTCAAAAGAACAAAATGGAAGATTTTCGA\n+AACGAATACCGCAATGTTGATCTTTTGCTGGTGGATGATATTCAATTCTTAGTCAATAAAGAAGGAACCC\n+AAGAAGAATTCTTTAATACCTTCGAAGAACTGTATCGCAATAATAAACAGATCGTTCTGACAAGTGATCG\n+TTTGCCAAATGAGATCCCGACTTTGCCGGAACGTTTGGTTTCCCGTTTTGCTTGGGGCTTGTCCGTTGAT\n+ATCACCCCGCCGGATCTAGAGACGCGGACTGCGATTTTGCGGAAAAAAGCCGAAGCCGAACGCTTGGAGA\n+TCCCCGACGATACCTTAAGTTATATCGCTGGGCAGATCGATTCGAATATCCGAGAACTTGAAGGAGCACT\n+CGTGCGGGTGCAAGCTTTTGCTACGATGCAAAACTCAGACATTACAACGAGCTTGGCAGCTGAAGCCATC\n+AAAGCCTTAAAATCAAGCCATGGCTCGACCCAAGTTTCGATTTTGCAAATCCAAGAAGAAGTCGCAAAAT\n+ACTATCACATTCATGTCAATGATCTAAAAGGGAAAAAACGGGTCAAAGGCATCGTGGTTCCACGGCAGAT\n+CGCGATGTATCTCTCTCGAGAATTGACCGATAGTTCTTTACCAAAAATCGGCGGCGAATTTGGCGGCAAA\n+GACCATACAACGGTCATTCATGCCCATGAAAAAATTCAGCATTTAGTCGAAACAGATCCCACGATCAAAA\n+ATGAGATCGCTGAAATCAAACAAATCCTCTTCAGCTGATCTGTGGATAAGAAAAGAAGAACCAAAAAAGT\n+TGTCCACAAGTTATTCACAGGCATTTTCGTTAGTCTAATCACTCTTTTCTCGAGTTATCCACATTACTAA\n+CAAGCCTATTACTACTATTACTTTTATTTAATAACTATAAATTAAAGGAGTATCGCTATGAAGCTAACTT\n+TAAACCGAACAGAGTTCATGCAAGAATTACAAACTGTCCAACGGGCGATTTCAACCAAAACCACCATCCC\n+GATCTTAACTGGAGTAAAATTATCCCTTTCAGAAAAAGGATTGACCATGACTGGGAGCAACGCCGATATT\n+TCCATTGAAACTTTTTTAAGTGTGGAAAACGAAAAAGCGCAAATGCAAATCGAAAAAACAGGAGCGATC
G\n+TTTTACAAGCACGTTTCTTCAGTGAAATCGTTCGTCGTTTGCCTGAAAGTACCTTAACCTTAGAAGTATT\n+AGACAATAATCAAGTAGCGATCACTTCTGGAAAAGCCAACTTTACCGTCAACGGCTTGGATGCCGATAGT\n+TATCCACATTTACCAGTTGTCGAAAGTCAAGATTCGATCGAGATTCCAGCGCACGTGTTGAATAAGGTCG\n+TTAGTGAAACAGTCTTTGCGGTTTCGCAACACGAAAGCCGTCCGATCTTGACTGGGGTCCACTTTGTCTT\n+AGAAAATCAAAAATTATTAGCTGTTGCGACGGACTCACACCGTCTGAGCCAACGGGTGATTCCATTGGAA\n+AGTGGAGAAACAGCCTTCAACATCGTAATTCCTGGCAAAAGCTTAACGGAACTTTCTCGTTCCTTAACAG\n+ATGAAGAAGAAGCGATCCAAATCAGCATTATGGATAACCAAGTGTTGTTCCAAACGAAAACCATGAAATT\n+CTATTCTCGTTTATTGGAAGGAACTTACCCAGATACCAACCGTCTGATTCCTTCAAGCTTCAATACTGAG\n+ATTGAATTTTCTGTCCCAGAATTGTTACAAGCCATCGAACGGGCGTCATTGCTTTCTCATGAAGGCCGTA\n+ATAACATCGTTCGTTTGGCGATCTCCGAAGAAGCCGTTGTCTTATATGGAAACTCACCAGAAATCGGGAA\n+AGTCGAAGAAGATCTTTCTTTTGAAAAAGTGACCGGCGACCCATTAGAGATCTCTTTCAATCCTGACTAT\n+ATGAAAGCAGCACTACGAGCATTTGGTGACACCAGCATTGTGATCCGCTTTATCTCAGCGATCCGTCCCT\n+TTACATTGGAGCCGACAGAGAGCAAAGGCAGCTTTATCCAGCTGATCACACCGGTGCGAACCAACTAGTT\n+TTTCATGTCTTTTGAAAAAAGTTGAAACAATCATCTGAAAATGAATAAATGAGTTAAAAGGGCTTAAAAT\n+CGTTTTTAAGCCCTTTTTTCTATTTTGGCTTCTTTTTTGTCTAAAAGCAGTAAGTCTTCTGAGAATGAAA\n+AATATGCCGATGAATTTGTTTTCTTGGCAAAAAAAGAGTATAATAGAGCTAAACGCTCTTTGATAGAATT\n+GAGGGGAATTATGAAAATGCAAATACCGTTAGAAACGGAATACATGACACTTGGACAAATGCTCAAAGAA\n+GTCAGTGTGATCAGCAGCGGCGGCCAAGCGAAATGGTACCTTGCAGAGCACACCGTTTTTGTCGACGGCG\n+AGCCAGAAAATCGACGAGGGCGCAAATTGTATGCGGGAATGCGTGTTGAGCTACCTGATGAAGGTACTTT\n+TTTTATGGTGAAGAAGGAAGACGCCGATGCGCCTGAATGAGTTGCATTTAAGCAATTATCGGAACTATGA\n+TTCGCTGACACTGACTTTTGAGAAAGGTCTGGTCATTTTTTTAGGCGAAAACGCGCAAGGAAAAACCAAT\n+ATTTTAGAAAGTATCTATGTATTGGCGATGACCAAAAGCCACCGCACCTCCAGCGAGCAAGAGCTTATCC\n+GCTGGGACACAGAAGGTGCGCGGATCTCTGGCAGTGTCAGTCGGGGACGCTCAACGATCCCGTTAGAACT\n+GTTTTTGTCAAAAAAAGGACGAAAAACGAAAGTAAACCACATTGAGCAAAAAAAGCTCAGTAGTTACATA\n+GGGCAGTTGAATGTCATTTTATTTGCTCCAGAAGACCTCTCCCTTGTAAAAGGAAGTCCCCAAGTCCGCC\n+GTAAATTTCTCGATATGGAAATTGGGCAGATCGATCCAATCTATCTCTATGATCTCGTTCAATACCAATC\n+CGTTTTGAAACAGCGCAATCAATACTTAAAACAGCTGAATGAAAAAAAACAGACCGATGAGATCTATTTA\n+GATGTTTTGACGGAACAGCTGGTGG
CTTTCGGCAGTAAAATCATTTTAGCCAGACAACGATTTGTTCAGC\n+GCTTGGCGT'..b'ACTGCGAGATTCATCGCCTTATCGGTCT\n+TGGCACGAATCAGGTCCATCACCGCCTCGGCTTGAGAAAGGTCAACCCGTCCGTTTAAAAAGGCCCGCTT\n+CGTGAATTCTCCCGGCTCAGCCAACCGTGCGCCTTGTCGCAAAACCAGCTGCAAGAGTTGATTGACGACA\n+ACGATCCCGCCGTGGCAGTTGATCTCAACGACATCTTCTCGGGTAAAGGTCCGCGGCTTTTTCATCACTG\n+ATAGCATCACTTCATCCATCAAACGGTTTTCTTCTGGGTCTACGATATGACCATAATGGATCGTATGACT\n+AGGGACTTGGGCGAGGGTTTTAGTGCCTGCTTGAAAAATCCGATCCGCAATTGCGATCGCTTTTTCCCCG\n+CTTAATCGCACAATACTGATGGCCCCTTCGCCTGGCGGGGTGGAAATCGCGGCAATCGTATCAAATTCTT\n+GCGTTATATTCGCCATGTTGCTTGCTCCTCCTATTTTTTCACATAAAAAAAGTGCCCACTCCACCCGTCA\n+AATTGATTGAAAATCAATAAAAACGAGTAGACTTAGCACTTTGACTGCTTGTCAATTAAATTCTGGAAAT\n+AGATTAGCACATTTTTGCGTGAGTCGCAAGCTTCCTTCAAAAACTTTTAAATTTTGCATCTGTCCGCAAA\n+AACCTTTAAAATAAAAGCAAAGAATAAGAAAAGAGGGATCTATTATGGAAACCATCAAATCATCAAACAG\n+CGCTGCTCGCATTAAAGAAATCATTTTATCGACAGGAAACGTGAATCGGCCTTACGTCGTGCGGGATATC\n+GTCTTTGCGGCAGACAAAATCGAAGTAGATCTATTTGATACATCTGTGAATTTGAACGATCTGTTGGCAG\n+ATGTGACCTATCGCTTAAAAGAAACCGCCCACAGCTACGGAGCGAATGCAGTGATCAACTGTCATTTCGA\n+ACATGACCGCATCGTCGAAGGCGACAAAACCTACCTTGAGATCTTTGCCTATGGTACAGTGGTTCAATTT\n+ACTCAATCAACCATCGGCGGCTAATTGCCTCTATCATCAAAGAAAAATACAAAAAAGAGTCCGCGGCATA\n+CGGACTCTTTTTTGATACCAAAGGCTGACCTACGTAACGTATGTTGACTTTGATAAGAATGATTTCTAGT\n+CAAGGATACTTGGTTTTAAATTTGTTGTCAATCGCTTTTTCCCATCAAAAAAGGGAAAGGCTTGTCAGAC\n+CGCCTTTCCCTTTTTTCATTAAATTCGTTTGTCCGCAGGTTCCACTACTAAGTAGCGATACGGTTCATCC\n+CCTTCAGAATGAGTTTGAACATACGCATCTTTACTCAAAACTGAATGGATCTGCTTTCTTTCAAACGCCG\n+GCATCGGTTCTAAAAAGACTGGGCGGCCTGTTCGTTTGACTTTTTCAGCAGTACGGCTAGCCAAACGTTC\n+TAAGATCGCTTGACGTTTTTCACGGTAATCTCCGACATTCACTACAATCGACAATTTGTTTTTCGCGATG\n+CGGTGAATATACACTTGCGCCAAATATTGCAGTGCATTCAAGGTTTTGCCGTGCTTCCCAATCAAAATGC\n+CTTGTTTTTGCGTTTCTAAATGAAAGACGACCACGCCGTCTTGACGAGCGGTTTTTACTAAGGCGGGTGC\n+ATTCAATGCTTTTGAAATATTTGTCAGATAGAGTCCCAGCTGAGCCAGCGCTTCCTCATCTGAAAGATCC\n+GTCAGCAAAACAGGTTCGCTCGCTGCTGCTTCCATCACACTGCTTGCATCCTCTGTCGTCACAGGATCTG\n+GTTCTACCAGCGGTTTTGCTGGTACTTCTTCAACGGGTGCCGGTTCTTCAACGATTCGTTTTTCAATGGA\n+GACCCTTGC
TTGTTTTTTTCCTAGACCCAAGAAGCCTTTTTTTCCTTCATCCAGGACTTCGATTACTGCT\n+TGATCTTTGGTGATACCAAGGACTTGCAATCCCTCTTGAATTGCTTCATCTACTGTCAGATTTTCATAAA\n+TCGGCATTGCAGTGCAACCTCCTCTTATTTTTTCCGTTTCTTTTTAGGACTCATTGCTTTTTTCAGCGCA\n+CGTTCACGCTCGCGTTCTTTGCGGGCAGCTTCTTCTCTTTCTTGGCGAATCTTGAATGGGTTATTTAAGA\n+TCATTGTTTGCACAACTTGGAATGCATTGGAAACGACCCAGTACAAGGAAAGACCGCTGGCAATGTTGAT\n+CCCCATCAGTAAGATCATCATCGGCATAGCGAAATTCATGATTTTCAGACTCGCATTTGATTCCACTTGG\n+CTCATACTTGATAAGTAAGTACTTGCAAAAGTAAAGAGCGCCGCTAAAATCGGCAAGATGAAGTATGGGT\n+CACGATCACCCAGCTGCAGCCATAAGAATTTTCCTTCTTGCAGAGCAGGAACCCGTGAGATCGATTGCCA\n+CAAGGCCATCAAAATCGGCATTTGTACCAATAATGGCAGACAGCCGGCATAAGGATTGACATTGTTTTCC\n+GCATACAAACGTTGTGTTTCTTCTTTTAACTTGCTTTGTGTTTCTGTATCTTTTGACGCATATTGCTGTT\n+GCAGCGCTTTTAGCTTCGGTTGTAATTCTTGCGTTTTGCGCATGCTTTTGGTTTGGAAGTGCATCAACGG\n+CATCAAAATGACCCGAATGATCAATGTAAAGAGGATGATCCCGATCCCGGCATTGCCAAAGGAGAGGGCT\n+TTGATCGCCTCTGCAAAATAATAAACGATGTAACGGTCCCATAGACCAGTACTTTGTGCATTTACTTCAC\n+CTGTTGCACCGCAGGCAGATAAGACAACTACTAATGAGAGGACTCCCAAAAATAGTAAGACGCGTTTCCC\n+TTTTTTCACGTAACTAATTCCTTTCCATTAATGAATAATTTTGGCAAGTTTTAAGACATGCTCTAAATTC\n+GAGTAGATTTCCTTTGAGGAACAAGTGGCGATACTTGGACGGGCGATCACGATGAAATCGACCGCTGGTT\n+CGATTCGCGGCTTCATTTCCGTTAAACTTGCACGGATCTTGCGCTTGATCTCATTTCGCTTCACTGCATT\n+ACCGACTTTTTTCCCTACAGATAAACCAACTCGAAAATGTGATTGCTCTTTAGGTAAGACATACACCACG\n+AACTTCCGATTTGCAAAAGAATTTCCTGCCTGAAAAACTTTTTGAAAATCTTGCTCTGTTTTTACTCGAT\n+AGGATTTTCTCATCTTCGGTTCCTTCACTTCTTTCAATTAAGCTTCGCGAAACGGATTGAATCCGTAACC\n+ACTCACCTGTCAATAATACCATATTTGCCAAAAGTCAGACACTATATTCCGCAAAAACGGCATAAAAAAA\n+AACCACTGGCATTCAGTGGCTTACGCAGAAATAACCTTTCTGCCTTTACGACGACGGCTAGCTAAAACAC\n+GACGTCCATTTTTTGTACTCATACGTTTACGGAATCCGTGAACTTTTTGACGTTTACGTTTATTTGGTTG\n+ATACGTTCTTTTCATTTATTTCCACCTCCATCAGGATAATGCTTATGCGCACAGACATACTTTGATAGTA\n+TAACGAGTCAAGTACTAGTTTGTCAATAAGAAGTTTGCTCAAAATCAGGCATTCTTTATCCACAATCCCT\n+TTGATTTGTGTGGGTTTACTGTTTATAGCCAATTTTTGTGGATAACTTTTTCCCCAGAGTTATTTTGGCT\n+TTTTCCACAAACAAGCAGCCTGTAGAAAAGTTTTCCACAATCACTTTTTTAACTGTTGATAACTACCCCG\n+AAACCTAGTGTTATCAAGGTGTGATCTTCACATCGATC
TGTGGATAAAGGAGATTGTTTTACAGTTATGC\n+ACATTCTTTTTCTACCTGTGGATAGTCGGATTTCGAACATGTGCATTTCCATTTTTCTTTTATCCCGTTT\n+TGTGGA\n+\n'
diff -r 000000000000 -r 8dfcdbeaeed8 test-data/Enterococcus_faecalis_T2.fna
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/Enterococcus_faecalis_T2.fna Tue May 05 06:12:47 2020 -0400
b'@@ -0,0 +1,40833 @@\n+>NZ_GG692854.1 Enterococcus faecalis T2 plasmid scaffold supercont1.22, whole genome shotgun sequence\n+TAAAATTAATGTGCCAGTCTTTTTTATCAAAACACCTATTAATAGAGGGGTATTTGAAGAGATCTTCGGCGAAACATTAA\n+AAGGATAGGGTGAAAAAAATGAGTAATAAAAATGAACACGGGTTTTGGGAATGGCTACAAATAGACTACTTTTCAAGGTT\n+TCCAGATGCAACGAATGATGATGTGACTAAATTCTTGTTACGCTTTACAGAAGCTAGTAAAAATTCAACCAAAGAAGGAT\n+CAAAAATCATTGAAGAATTGTTTGAGGAAGAACGAAAACGCAGAAAAGGACGGTGATTTTTCGATGAACGAACGTGAAAA\n+AGATATAAAAAAGTGGCTTTGCCAGTTATTAGATCAGACCTACCTAAATGCGGAAGCGTATAAAAATTTTTTTGTAAGAG\n+TTCTTCCAAAACAGCGTAAACGGACACTTGGTAACTATTTAGAAGTAGAACGTGTATTAGAAGTGAGCAATTTGTTACGA\n+GAGCCTGTGGAAGTCATGCTGACACTGATACGCTTATTAGCTGCTCATATTGTTGTTGTTAATCGTGAGCAATTTCAAGA\n+GGAAGAAGCAAAAGAGAAAATAGTGAAGGAATTATTAGGGGAATTGCTTAAGCAAGGGAAAATTAGTCAAAAAGAACAAA\n+CAATTATGTTAGGCACTGGTTTTTTAGAAGAAGAAACTGCTCTTTATGGCGAGTTAGAAAGTTGGGCGACTGATGCGAAA\n+GAAACCATTTATTGTACGGTAGTTGAGAATGGCTTCCCTATTAAAATGGAGTTACACAAACTTGGGTATCAATGGCTAAA\n+AAGTCGCCAAGCTTGGGTGAAAAGTTACGAGACACAAGAAGCTGCTGAAGTCGCCAAAGGTCAGCTTTGGGCATTAAGTA\n+GTGAAATAGAAGTTAGTGTAGAGACACCGATTACTTGTTTGTTTCATTTTGATTATTATTTATCCGTTAAGCCAGCGGAA\n+CGTTACAATGAAACGATTGTTGCTTTTGGGTATATCTATGAAAATTACGGCTTTAAAAAGAAATTTGTCAAACAAGTACC\n+AGTAAAAGATTTTTCAGGAGAACGTGAGCGTTTAGCACGATTAGAAATTCCTTTTGAACTAGTAGTACCAAAAGAAAGAC\n+AAGTGATTTATTGACTATTTAAGAATAAATTAGGAGGAATAACAATGGCAGTGAAAATTGATGGAACGATAAAAAAACGT\n+GTTCAATCTTTAATGGCTTTAAATGGACAGTCTTATGAAGAATGGCTTAATAATCAGCACCAAAAGTATTTAAATGAACA\n+ATCAGAAGTCATTGACCGACTATTGAAAAAAGAATTGGAACGGAAAAAAGGAACAAATGAATAGTTAAAGTAAGTAGAAT\n+AATTTAGGAGGAAATCAACGTGAAAAACAAAATTAAGAAAAAAGTGAAGTATTTTACAGCTGTAATTCAAACGATTATTG\n+GTCTTGGATGGATTGAAGTGAGTACGATGATGCCGGCATTTGCCGATGTTGAACGAACAATTCAAGGCGTTGAAACTGGT\n+TTAGGATCGGAATTTAAAAAGTTTGCTAATCCGGCATTAGGTATAGCAGTCCTCATTTATGGAGGAGCTAGATTTATGGG\n+ACATGATATTGCACAATGGGCAAAAAAATGGGTTTTCGGAGCATTTGTTGGTGCAGCAATTATTGTGAACTTTACTTGGA\n+TTAAAGATACTGTTTGGGGTTGGTTAGGAGGTTGATTCTTAAATGACTGTAGAAGTGGAGATTTTATCGTCTAACGTTGA\n+AGTCTTAGAAAAACAACCACCGTTGGAGTTACTTCCCTTCGCAGAT
TTTATTGAAATCACAAGATGCATGTATGTTAAGG\n+AGGCAGTAACTTTCCAACTATTGGAAGTTGGTGGAGATCCTTTTTATCGGGGAAGTTTTGAAAAATTAAATGATGAATGG\n+ACACTAGAAAAAGATATACAAAAGAAGTTAAATCAACTAGTCAAAAAAAAGAAAATGACGGCAGAGCAGGCGGAGGGCTT\n+ACTCTACAAAATTCCTTTTAAGAATCTACAGGAGGCTAGTTTTTCGAACGAAAAGAAAACGAGATTCTTCAAAAAAGAGA\n+AGAAGCCCAAAAATACCCAAAAAGTCGGGAGGCTTAAAAGGAAAATACACGTTATTTCTCCAATAAAACTTTCCCAGAAA\n+CAATTGAGAATTTTAGGAGTGTTTGTGATAGGAATACTCTTAATTGTGATTGGTTGGAAATTTATGGCGGGAAGTCAATC\n+GACGGCCAAAACTAGTTTAGAACCTACGTATCAGCAGTTAGTCAATAAAGAAAAGTACGCTGAAATAGTAAAAAAATATC\n+CTGAGAAAGAACCAGAGCTAATTGAGGAGTTATTTCAAAAAGAAGATAAAGCTGGTTTGAAAAAAATAGCTGAACACTCT\n+AATACACAGTTAGCTCAGCTATATCTTGCTTTTTTAGACAAAGATTGGCAGAAAGTAACAGAACTTTCCAAGTTACCACA\n+GGATAGTGATGTTCAAGCAATGGTAGGCTATGCTTTTTTAGAACAAGGTAAGATAGAAGAAGCAAAGCTTATTAATAAAG\n+AAATTCAAAACGATACGCTAACAAAACAAATCAAAAGTAAGGAAATCGAACAGGCGTATAAACTTCTTAGAGAGCGAAAA\n+ATATCTGAAGCGGAGAGAATAAACGAAAGATTGAAGGATAATGGATTATCCGAAGCAATTAAAGTAGCCAAAAGTATTCA\n+TAACTTGTTAGAAAAATACGCCAAAGATAAGGAAAATAAAGAATTATCAGAAAACGAGCGAAAAGAAGCTGCTGATAATT\n+ATCAGCTATGGCTGAAAAATTTGGAACAAATTGGTAAGTCTGTTCATTAATTCGGAGGATAAAATTATGAATGAATTTAC\n+GAATTTCAACTCTTTGTTCCACTGATTACTTTTACGGATTAGATAAAAAAAGAGCGAAAAATGTGATATAATCACCAAGA\n+GTAAATGGAAGTTGAAGACGGTAGCATAAATCCACCGCAAAGGAGCTGGTGCCTATGAGAGTACTCATAGTAAATCAGAA\n+AAACTGGCAAAGGAGTTGTGCAGTTGTGTCTGTATTAGAAGTGTTGGCCTTGCTTACACTATTAATACAAGTCTATAAAT\n+TAGGCAAAAAAGACGACAACAAAAAAGACCGTCGGTAAACTTTAGCAGGTTTCGACGGTCTTTTTTTGATTGTTGAAATA\n+TATGCTGCCGATCTTCTATCGGTCATTTACTCTAAGGGGACATGTTTCGAGCATGTCCTCTTTTCATAATCTATTATACG\n+TATAACGAACGAGAAAGTAAATAAAAAAGCAAAAAAAGGGCTGTTCTAAAAAGAATAGCTCTTTTTTTGTACACAACAAT\n+TGAGAGGGATGAAAGAAATGAACGGAAATCAAAAAGAAACCGCTAAAAAGCAGCATAAATATCTAATTATTGGATTATGT\n+TCAGTTGCGTTACTAGGAAGTGGCTTGACGTATGCCGCACTCAACCAGGGGGAAAAGAAGGAAGCACAGACAGAGCAACA\n+AGGTACAAAACCTAAAGAAGAGCGTCAAACACCAAAAAGTAAACAGTCCCCTTGGGAACGGAAAGTGACGGAGAATGAAG\n+AAAAAAACAAGGATAAAGACAAAAAAATTAAAAGTAAGCAGCCACATAAAACAAAAGATTCGTTAGCAGAAATAGTTAGC\n+GGTTTTGAACGTACAAAAGAAGAAAAACCAAAACTTTTTGGTGTGGAGATACCT
GAAATAAAAAGCGATTTATTAGGACA\n+ACTAGCGAATGCTCTTGTTCA'..b'TGAGA\n+ACTACATTTATTATTACAATCATCACCGAATCAAGGAAAAACTTAACTGGAAAAGCCCAGTAGAATTTCGACAATTCAAT\n+CAAAAAACTGCATAAAAATAGAGTGGAAAAATCCACTCTA\n+>NZ_GG692836.1 Enterococcus faecalis T2 genomic scaffold supercont1.4, whole genome shotgun sequence\n+GAAAAGAGTTCTAAATGATAAATACAAACATGCACTAGAGCTAATGGAAACAAACAGCATGCGAGAAGTGGAACGAAAAA\n+CAGGTATTTCTTTGTCTACTCTCAAAAGAATCAAGAAACAAGCAAAGGAAGAACAGTTACTTAATGAGAAATAATTCAGG\n+AGTAGTTATTATGGAAAATAGAGAGAAAATTATTCAGTTGTTGAAGAATCCTTTAGTAACAGGTTATGGGATTGAGATGA\n+TGTCAAATGGGCGACTCTATTCAGCGAACTTTCAAAGATATAGGAATCGGATGAAGAAAGAAGAAAATCCAATGATTATC\n+TTTGATACTATGACTGAAAAAGTTGAAAAGGTATTTTTAGAATTAGCTGAAGAAGTCATACGAACGAACCCTAAAACAAA\n+ACAAGAATTCAAAGATATGATTAAAGAATATAGTTATAAGGAGGATAACAAATGGTAGTTCGAAAAACATATGATCATTG\n+GGGGATCGAAATCAGCACATGGAACAAAAGTAATATAGTTACTTTTATTGATTGTGACTGCGGTCAATTAGCTAAAAGAG\n+AACTAGGAAAGTATAATCAGTATAAATGTGATAGCTGCAATAAAGAATACAAATTGTATCAAGGCAATTATATCGCAATT\n+GATGAAAAAATAAATGAAGTAGCTCAAAACGATTGAAAAAGTAAAAAAGATTTTTGATAGTGTGATAGTGTGCTTTCAAT\n+AATT\n+>NZ_GG692835.1 Enterococcus faecalis T2 genomic scaffold supercont1.3, whole genome shotgun sequence\n+TCAGAATGGTGCATCCCTCAAAACGAGGGAAAATCCCCTAAAACGAGGGATAAAACATCCCTCAAATTGGGGGATTGCTA\n+TCCCTCAAAACAGGGGGACACAAAAGACACTATTACAAAAGAAAAAAGAAAAGATTATTCGTCAGAGAATTCTCATGTTT\n+GACAGCTTATCATCGATAAGCTTTAATGCGGTAGTTTATCACAGTTAAATTGCTAACGCAGTCAGGCACCGTGTATGAAA\n+TCTAACAATGCGCTCATCGTCATCCTCGGCACCGTCACCCTGGATGCTGTAGGCATAGGCTTGGTTATGCCGGTACTGCC\n+GGGCCTCTTGCGGGATATCGTCCATTCCGACAGCATCGCCAGTCACTATGGCGTGCTGCTAGCGCTATATGCGTTGATGC\n+AATTTCTATGCGCACCCGTTCTCGGAGCACTGTCCGACCGCTTTGGCCGCCGCCCAGTCCTGCTCGCTTCGCTACTTGGA\n+GCCACTATCGACTACGCGATCATGGCGACCACACCCGTCCTGTGGATCCTCTACGCCGGACGCATCGTGGCCGGCATCAC\n+CGGCGCCACAGGTGCGGTTGCTGGCGCCTATATCGCCGACATCACCGATGGGGAAGATCGGGCTCGCCACTTCGGGCTCA\n+TGAGCGCTTGTTTCGGCGTGGGTATGGTGGCAGGCCCCGTGG\n+>NZ_GG692834.1 Enterococcus faecalis T2 genomic scaffold supercont1.2, whole genome shotgun 
sequence\n+AGCCCTTCAATCGCCAGAGAAATCTACGAGATGTATGAAGCGGTTAGTATGCAGCCGTCACTTAGAAGTGAGTATGAGTA\n+CCCTGTTTTTTCTCATGTTCAGGCAGGGATGTTCTCACCTAAGCTTAGAACCTTTACCAAAGGTGATGCGGAGAGATGGG\n+TAAGCACAACCAAAAAAGCCAGTGATTCTGCATTCTGGCTTGAGGTTGAAGGTAATTCCATGACCGCACCAACAGGCTCC\n+AAGCCAAGCTTTCCTGACGGAATGTTAATTCTCGTTGACCCTGAGCAGGCTGTTGAGCCAGGTGATTTCTGCANTAGCCA\n+GACTTGGGGGTGATGAGTTTACCTTCAAGAAACTGATCAGGGATAGCGGTCAGGTGTTTTTACAACCACTAAACCCACAG\n+TACCCAATGATCCCATGCAATGAGAGTTGTTCCGTTGTGGGGAAAGTTATCGCTAGTCAGTGGCCTGAAGAGACGTTTGG\n+CTGATCGGCAAGGTGTTCTGGTCGGCGCATAGCTGATAACAATTGAGCAAGAATCTTCATCGAATTAGGGGAATTTTCAC\n+TCCCCTCAGAACATAACATAGTAAATGGATTGAATTATGAAGAATGGTTTTTATGCGACTTACCGCAGC\n+>NZ_GG692833.1 Enterococcus faecalis T2 genomic scaffold supercont1.1, whole genome shotgun sequence\n+GGGAGCGTCAATAATTTTGTGTAAATAAATTGTCCTCCTGCAAAATAATTAGTTACTCAGTAAACATTGAAACTAATGTA\n+TCGGTTACCTGTTGAAAACCTTTATGGCTTCTGTTTAGAAATTTTTGATTGTATGTATCAAAAATGCTGACTAGAAAGCG\n+TTCTAGTGATTCTTCATTTTGAAACTGCTCTTTTCTACGGCTGTATCTTTTAATTTGCTTATTGAAAGACTCGATTAGAT\n+TGGTTGAGTAAATGGTTCTACGAATGCTAGGTGGAAAATCATAAAAAGTTAATAAGTCTTGGTTTTCTATGAGTGACTGC\n+GTCACTTTAGGATAGTTTTTCTTCCATTTCTCAATCATGCCGGATAAGAAGGTATTCGCTTCTTCTTTTGAGTTAGCTTG\n+ATAAACAGCCTTAAAGTCATCACAGATTTCTTTTCGGTCTTTGACACGTACTTTATGAGCGATATTACGAGATACATGGA\n+TACAACAATGCTGATATTTTGCTTTAGGATAAATTTGATGGATAGTATCTTTCATGCCTTTTAAGCCGTCCGTAATAAAA\n+AGCAAGACTTCTTGAACTCCTCTGGAGTTAATATCCTGTAGCAGCTCATTCCAAACGTATGTTGATTCAGTTGGAGCAAT\n+CGCATAACTCAGTACTTCTTTAGTGCCGTCTTCTCGTATACCAATGGCAATATAAATCGCTTCTTTGGATACGGTTTGAC\n+GTTTTAGTGGAATGTAAGTAGCGTCCATAAAAATAGCGACATACTTATCATTTAAGGCTCTGGATTTAAAGGCATTTACT\n+TCTTCAGTCAGAACTTTAGTCATGTTGGACATGGTTTGTGGAGTATAGTGATGACCGTACATTTTTTCGATCAAATCAGC\n+AATTTCAGACATCGTAACACCTTTTTCGAATAAATGGATAATAGTGGTTTCCAATGTATCGTTTGTTCTTTTGTAGGCTG\n+GTAAAGTTTGTTGTTTAAACTCACCATTACGATCTCTAGGTATTTCCAATGTTAATTCACCATATTCGGTTTTGATTGAT\n+CGAAAGTAAGAACCGTTTCTCGAATTACCTGAATTAAAACCAGTGCGATCATATTTTTCGTAATCTAAAAAAGCCGTTAA\n+TTCAGTCCGTAGGAGTGTGTTTATCGCTTTTTCTAAGTGCGAACGGAATAATTCATTTAAATCGCCTTTAG
TGACTAGAG\n+TTTGCACAATTTCTGTAGTAAAATCATTCATAGGGAAGTCCTCTTTTCTGTGAATTGGTTGTCGTTAACTTTATTCTACA\n+GAAGCGACTTCCTTTTTTGTATGGATTTTTTCATTTACACAAAATATTTTACACTCTC\n'
diff -r 000000000000 -r 8dfcdbeaeed8 test-data/Enterococcus_faecalis_TX0104.fa
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/Enterococcus_faecalis_TX0104.fa Tue May 05 06:12:47 2020 -0400
b'@@ -0,0 +1,39594 @@\n+>NZ_GG669016.1 Enterococcus faecalis TX0104 genomic scaffold SCAFFOLD95, whole genome shotgun sequence\n+TTCAAGATAAGTTTTAAGTCTGTGTCCTTACACGAGATTTTTTACGCAAAAATAATTCTTTGTAGTTCATCAGCACAAGC\n+ACATTTTTATATAACTGATTAATTTTGTTGAATTATAGTTATATCTATATTGATTAATAGCTGATCTTGCTAAGCATGGA\n+TTTAATAAGAATATTTTTGTTAAAAAATCATATAACCTTACGT\n+>NZ_GG669015.1 Enterococcus faecalis TX0104 genomic scaffold SCAFFOLD94, whole genome shotgun sequence\n+ACAAAGAAAGAGAATAAATAAATGAGATAGGAAGTGTTTCAATTTTTTTGTTACGAGTAATCCAAGAAGAAACAGTTCCT\n+TGAGAGAAAGCATGTAATTCACAAAATTCTTCTACAGTAATCCCTAGATGCTTGATAATGTATGCGTTGATAGGGTGTGG\n+ATATACAAAGGTTTTTCTAGGCATGGTGGGTAAGTTTCCTTTCTT\n+>NZ_GG669014.1 Enterococcus faecalis TX0104 genomic scaffold SCAFFOLD93, whole genome shotgun sequence\n+AATTTTTCTTGGATGGCGCGGGACAGAATCGAACTGCCGACACATGGAGCTTCAATCCATTGCTCTACCAACTGAGCTAC\n+CGAGCCAAAAACGGTCTGGACGGGACTCGAACCCGCGACCTCCTGCGTGACAGGCAGGCATTCTAACCAGCTGAACTACC\n+AAACCAATTCGTTTTTGCTTTATGCAAATGTATGCTTTAAAAAAT\n+>NZ_GG669013.1 Enterococcus faecalis TX0104 genomic scaffold SCAFFOLD92, whole genome shotgun sequence\n+GGAGGATTACCCAAGTCCGGCTGAAGGGAACGGTCTTGAAAACCGTCAGGCGGGTAAAACCGTGCAAGGGTTCGAATCCC\n+TTATCCTCCTTTCTTAGGAATCAATTTTCCNTGGTCTAATTATCGCGGGGTGGAGCAGTCAGGTAGCTCGTCGGGCTCAT\n+AACCCGAAGGTCGTAGGTTCAAATCCTGCCCCCGCAATTGCTTTT\n+>NZ_GG669012.1 Enterococcus faecalis TX0104 genomic scaffold SCAFFOLD91, whole genome shotgun sequence\n+GGAGTCTGACGAAGCTTATAAGCAGTTTATTGATGAGTATTTTCCATCTTACGACTATGCAAAAGTCAATCGTCTATTGC\n+AATTACGAGCAGACATTTTTTCTACTCTTGCAGGTGAAGCAATCGCAAGCGACGTTAACGGTAAATTTAATAACGACTTA\n+GAAAACATTACAAAACGAATCTACAATTCTAATTCTAATGCGTTGATG\n+>NZ_GG669011.1 Enterococcus faecalis TX0104 genomic scaffold SCAFFOLD90, whole genome shotgun sequence\n+GGAATCAGATGAAGTTTATAAACAATTCATTGATGAATATTTTCCATCCTTTGACTATGCGAAAGTTAATCGCTTGTTAC\n+AATTACGAGCAGACATTTTTTCTACCATTGCAGGTGAAGCAATAGCTAGTGATGTTAACGGTAAATTTAATAACGACTTA\n+GAGAATATCACAAAACGAATCTACAATTCTAATTCTAATGCGTTGATA\n+>NZ_GG669010.1 Enterococcus faecalis TX0104 genomic 
scaffold SCAFFOLD89, whole genome shotgun sequence\n+GTCGTATTTTTCACGAATAAAGATATCTGCATTTTCGTATTTTTCAGCTTTCCATGGTTCGATATTGCTCGCCCATAATT\n+GCCAACCTTCCTTGGTTTCAACAGGTTCTAACAAACGAAAAGCAACACCGCAAGCACCTTGAAACCGAGCTGTGTCAGAA\n+TCAAGCATGGCAAACCGCATATCATTCACTAACTCTGTCAGCCGTTCGAACTCTTT\n+>NZ_GG669009.1 Enterococcus faecalis TX0104 genomic scaffold SCAFFOLD88, whole genome shotgun sequence\n+CGCCAATGAAACATTAACCGCCGTAGCGAAAAACGCCAGCGGTACAGAAAGTACGCCAACAACGTTCCAAACGCCAGCGG\n+ATGAGACAACCGTAACCGCACCAACAATCACAGGAGTGACAGGTAATTCAACGGCAGGTTACGAGGTTAAAGGAACTACT\n+GATGCCAATGCCACGGTTGAGATCCGAAATGCAGGAGGTGCCGTGATAGGCACAGGGAGCGCC\n+>NZ_GG669008.1 Enterococcus faecalis TX0104 genomic scaffold SCAFFOLD87, whole genome shotgun sequence\n+CAGGATAGTATTTTCAGATGTATTCCCTGGTGTAGAAATAAAATCAGGTGATGGCGCTATGAATTTGTGGAGTTTGAACG\n+GTGGGTACAATAATTATTTAGCGACATCCCCAACAGGAACAGCTACAGGTTTTGGTGCAGACATTATTATCATTGATGAT\n+TTAATTAAAAATGCCGAAGAAGCAAATAATGCTATGGTTTTAGAGAAGCACTGGGATTGGTTTACC\n+>NZ_GG669007.1 Enterococcus faecalis TX0104 genomic scaffold SCAFFOLD86, whole genome shotgun sequence\n+GTATTTTTCATTTGCAACTATCCTTTTATTTTTTATTTGTATCAGTATCAATTTTACAGTAAACATGCATTTATGCCGAG\n+AAAATTTATTGATGTTGAGAAGAACCCTTAACTAAACTTGGAGACGAATGTCGGCATAGCGTGAGCTATTAAGCCGACCA\n+TTCGACAAGTTTTGGGATTGTTAAGGGTTCCGAGGCTCAACGTCAATAAAGCAATTGGAATAAAGCAAT\n+>NZ_GG669006.1 Enterococcus faecalis TX0104 genomic scaffold SCAFFOLD85, whole genome shotgun sequence\n+ATCAACTTTTCATAGTTCTTCTCAATATACTTACGGTACTTCTCTTGGTCTCGTTTGCTAAAATCTTCTAGCATCTCACT\n+TTGAGTAATACCGTGTAAATCAGCTTGTGCCAATAACTGTCTTTGAATTTTAACTAAAGCACGTTCGAAAACAGATTCTA\n+GTTCACTAAGAGTTTTCTTTTCCAGTTTCAAGCGTGCTTTGTCTTCTAATTCACGACGTTTTTCCCAGTA\n+>NZ_GG669005.1 Enterococcus faecalis TX0104 genomic scaffold SCAFFOLD84, whole genome shotgun 
sequence\n+TGGCAAAAGGTGCCGATTTTATTGGTGGCGTGGATCCTTATTCACTAGATGGTGATTACAAAAAATCATTGGCTGAAACA\n+TTCCGCTTAGCAGATAAACATGGTGTCGGTGTGGATATTCATTTACATGACCGTCATGAAGCTGGGACAACAACGATTAA\n+AGAAATTATTCGTTTAACGAAAGAATATGGTCTACAAGACAAAGTATTTATCAGTCATGCCTTCGGGTTAAA\n+>NZ_GG669004.1 Enterococcus faecalis TX0104 genomic scaffold SCAFFOL'..b'GTGGTATTACCCAAGTTACATTTTGTGTTACTGGAGAAACCCAACTTAAAATCTGTTACTAAATA\n+AGCCACACTAGACATTACAGCAGGACATAACAAAAATGGAATAAATAAAATAGGATTTAATACAGTAGGTAATCCATACA\n+TTACTGGTTCACCAATATTAAACAAAACAGGCATAATTCCAAGTTTTGCAACTTCTTTATAATGACTATTTTTTGATACC\n+CAAAAAATCGCAATTAATAAACCTAAAGTTGCATACCAGACAAACGATCCGTATGATACACTTGTCCAAACATAAGGAAT\n+AGTTTCATTGTTTTGATAAGCTTCTAAATTAACTAAAAGCGCTACTCCAAATACACCTTCCATAATAGGTGCCATAACAT\n+TTCCACCATGAATACCGAAAAACCAGAAAAATTGAACTAAGAAAGCAACTAATATAACAACAAAAAAACTTTGGGAAAGT\n+CCTAACATAGGGCGTTGTAATATTTCGTAAATTACATCTGTTAAAATTTTTCCTGTAATTCTATTTAACAAAAATGTCAA\n+AATAGCAATTATATAAAGAGAAACTAAAGCCGGAATAATTGATAAAAATGGTTTAGCGATAGCTGGAGGAACTGTATCAG\n+GTAATTTAATTGTCCAATTTTTATTCATGAGTTTACAAAAAATAATTGAAGATAAAAAACCTATGATTATTGCGGTAAAA\n+TAACCATTTGAATTAATTTGTGTACCAGGCAACAATCCTGATATAGTCACATTTAAACTGTTACCTGTAACTGTTATTCC\n+CTCAACATCAGTAAATAGTGTTGTTAAGTCAATATTATTATTATTCGATAAATTATAAGTTGAAGTCATAGAATTACTTA\n+TAGAAATAATAAATGAAGACAGGGCGACTAGTCCTGAAGACAAAGTATCTGTTTTATATATTTTAGCAATATTTACTCCT\n+AAACAATAAATAAACAGTAAGGAAACAATAGAAATGCTTCCTTTTGATATCAAATTATTTATATCTACTAACCATTGAAA\n+ATAATCAGTAATCTTCTCATAGCCAAATTGCATAGGAAAATCTACTAAAAAAGCATTTAATAATATTGCAACCGAACCTG\n+TCATTATTACAGGCATCGTACCCATAAATGAATCTCTTAACGCTACTAAAAATCGTTGATTCCCAATCTTAGTAGCTATA\n+GGTAAAACTTTATTCTGGATTGTATTCATTATTTTCTCACTCATTATGATTGACTCCTTCAAAAAATAGTTAGAAAGCGC\n+TATCTAATTATAGTATAGATTTTTTTGACCATTTGAACATGATAATCTATACTTAAAACATGGTATTTTCTTCATAGTTA\n+GTTCTATAAAGGAGGATAATAATGGAGGATTTTTGGTATCATAATAAGTCAGTTTCTGCTCCAATTTCCCTTTCCCAATG\n+CGGATATGAATCTTATCATCCCAATTCTTCTATTCGTAATTATATAGTTCAACAAAAATGGATATTTCATTATGTTTTAT\n+CTGGTAAGGGATTCTTAGAAGTAGAAGCTCAACATTTTGAACTTATAGAACACGATATTTTCTTTTTTTTCAAGGTCAAA\n+AAGTGAAGTATTATACAGATAAAAAGGAAC
CTTGGACACTGATTTGGTTAGGTATACAAGGTGATAAGACTTCTGAATTT\n+TTGAAAGAAACAACTTTACTAAATACTCATACAGTTAGCTTGACTAAGAATATAAATAAAAAACACACTATTGAAAATAG\n+TATGTGAAAATAGAGAACTAATTGGGGTTTGTCAACTAAACTGTGGAAGTTAAATAGTTAAGAGTTTTTAACCACCATAA\n+TTCTCTCGGCTATTTTGAAATCAGATAAATTTTGATCGGACACATAGTTGAATAAACCTACCGTTGTTACGGAAGGTAAA\n+TCGCATACTTTTCATTCTAGAGGAGAAGGTCTCGTTGATCATTAACTGCTTAAGCACCATAGCCCAGTTTTGAATTCGAC\n+CACCTGACCACTTCGCGTGTAATTCTTTCGTGCGTAAATAAAGTAGTTTTAGGAGTGCATTCTCATTAGAGAACGCTCCT\n+TTTTTTGTGACTTTTCTGAAGCTGGAGTGGACACTTTCAACCGCATTGGTAGTGTACATAATTTTTCGAATGGCACTACC\n+ATAATCAAATAGTTGTTCAACATGTGCAAAGTTCCGTTTCCAGACATCTACAGCACCAGAATAATGAGACCACCGATTTT\n+GAAAACTGCCAAAAGCAGCATGTGCAGCGTTTAGAGAAGAAGCACCGTAGAACTTTTTCATATCTCGGCAGACTTCCTTA\n+TAGTCCTTACTTGGAATATAGCGCAATGCATTTCGAACAAGATGAACAATACAGCGCTGAACAATTACCGATGGAAAGAT\n+CGCTTTTGCGCCTTCTTCGAGGCCAGAAACACCGTCCATCGAAATGAAAAAGACATCTTCGACACCACGTGCTTTCAGTT\n+CATCAAATACTTGCATCCAGCGATTTTTAGATTCTGTTTGATTTAACCATAATCCTAAAATCTCCTTATTTCCTTTGAGA\n+TCATAGCCAAGAATGGTGTATACAGCATATTCTTTGGCTTCATAATTTTCTCGTAAAGTAACATACATACAATCAACGAA\n+TAGAAAGGCATAACACTTTGCTAGGGGGCGGGCTTGCCATTCTTCCAATTCAGGAAGGACAGCGTCAGTGATATCTGAAA\n+TCATTTCATGGGAAATATCAAAGCCATAGATAGCTTCGACGGTTGCGGCAATATCTCGTTGACTCATTCCTCGTGCATAC\n+ATGGAAAGAACCTTCCCTTCGATGTCGGAGACATCTCGTTTTCTCTTAGGAATTAACTCTGGTTCAAAGGAAGCTTCCCG\n+GTCTCTAGGAACATCAATAGCTACTTCACCAAAACTGGTTTTAAGCGTTTTAGTTCCATAGCCATTTCGACGGTTATCGT\n+GTTCCTTAGGCTCTTTAGAATGGGCATCATAACCTAAATGATTATTCAATTCTCCTTGAAGCATTTTTTCAAAAAGGGGC\n+CCAAACACATCTTTCAAAGCATCTTGCATGTCATCGACAGATTCAGGTTGATAGGCATTCAGAATGGATTCAGCTAACTT\n+TTCGGCATCAGGATTTCTTTTCTTTCTAGCCATCGTGTAATCACCTCTTGATCTTATTGTAGAAAAAGAAAAACCGCGAA\n+GCAACCACCTGCCTAGGATTAATGGTTACTTACACGGTTTACATTACACTCTCATCATTAAAAAATATCTTTCGATATTA\n+CTAAAAAATGAAATTATCTATTTGAAATTTAAATCTGATAATATATTGAGTGTTCCAATGAGGGCTTTGATCCTATTATC\n+CATATCTTTTTCAGTTTTAATATTGATACACATTATAAGAAAAAGAAACTTTTGGTTAATTTATTTCATCCCAGTGCAAA\n+ATATAAAAAAGTAAAATAGACCATAGGCAATAGATTAACAATTAATATAATACAAGCAAATATTCTATTCATTGTATTTC\n+TGCTTTCAAAAAGTTCTAATATTGAAAGTATAATTCCA
CAAATGCCTATAATAAGTATGGGAATAAGATTAGAAAATAAC\n+ACCCCGAGCCAATCTGTGAAAATTACTATATTTACTAATATCCCTATTAACATACCTATCATATTTATTAAATCAATTTT\n+TTTCATTAGATATTTCTCCTTTATTAAATCTGAAGTACTTATATTAAAATAATTCTAAACTATTTACCTCAAAAAGTGCA\n+ATAAATTATTTCTGAACAATAAAATGATAGTCAAAACAAAAAAATATTGTTAATCGTTCAGAGCCATCCACCATTTTTAC\n+AATTAATGTATTCATATAAACTATTTACGATACTATAATAAAAGCATTATAGTTATTTATAAATAATTAAAGCAACT\n'