Repository 'hyphy_fubar'
hg clone https://toolshed.g2.bx.psu.edu/repos/iuc/hyphy_fubar

Changeset 6:3285fd1f4bde (2020-02-20)
Previous changeset 5:bece0bad8e89 (2020-02-17) Next changeset 7:ce53071d60a9 (2020-02-29)
Commit message:
"planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/hyphy/ commit 8d5ae1d04c43988fdcc458f4f08376a15e72db8e"
modified:
hyphy_fubar.xml
macros.xml
test-data/gard-in1.fa
test-data/gard-out1.nex
test-data/meme-in1.fa
test-data/meme-in1.nhx
test-data/meme-out1.json
diff -r bece0bad8e89 -r 3285fd1f4bde hyphy_fubar.xml
--- a/hyphy_fubar.xml Mon Feb 17 14:52:11 2020 -0500
+++ b/hyphy_fubar.xml Thu Feb 20 18:10:47 2020 -0500
@@ -12,29 +12,19 @@
             --alignment ./fubar_input.fa
             --tree ./fubar_input.nhx
             --code '$gencodeid'
-            --method '$posterior'
+            --method '$posteriorEstimationMethod.method'
             --grid '$grid_points'
-            --chains '$mcmc'
-            --chain-length '$chain_length'
-            --burn-in '$samples'
-            --samples '$samples_per_chain'
+            @posteriorEstimationMethod_cmd@
             --concentration_parameter '$concentration'
             > '$fubar_log'
     ]]></command>
     <inputs>
         <expand macro="inputs"/>
         <expand macro="gencode"/>
-        <param name="grid_points" type="integer" value="20" min="5" max="50" label="Grid points"/>
-        <param name="posterior" type="select" label="Posterior estimation method">
-            <option value="Metropolis-Hastings">Full Metropolis-Hastings MCMC algorithm</option>
-            <option value="Collapsed-Gibbs">Collapsed Gibbs sampler</option>
-            <option value="Variational-Bayes">0-th order Variational Bayes approximations</option>
-        </param>
-        <param name="mcmc" type="integer" value="5" min="2" max="20" label="Number of MCMC chains"/>
-        <param name="chain_length" type="integer" value="2000000" min="500000" max="50000000" label="Length of each chain"/>
-        <param name="samples" type="integer" value="1000000" min="100000" max="1900000" label="Samples to use for burn-in"/>
-        <param name="samples_per_chain" type="integer" value="100" min="50" max="1000000" label="Samples to draw from each chain"/>
-        <param name="concentration" type="float" value="0.5" min="0.001" max="1" label="Concentration parameter of the Dirichlet prior"/>
+        <param argument="--grid" name="grid_points" type="integer" value="20" min="5" max="50" label="Grid points" />
+        <expand macro="conditional_posteriorEstimationMethod" />
+        <param argument="--concentration_parameter" name="concentration" type="float" value="0.5" min="0.001" max="1" label="Concentration parameter of the Dirichlet prior" />
+
     </inputs>
     <outputs>
         <data name="fubar_log" format="txt"/>
@@ -49,13 +39,91 @@
         </test>
     </tests>
     <help><![CDATA[
-Model-based selection analyses (such as those performed by PAML and HyPhy) can be slow, becoming impractical for large alignments. We present a method to model and detect selection much faster than existing methods and to leverage Bayesian MCMC to robustly account for parameter estimation errors.
+
+FUBAR : Fast Unconstrained Bayesian AppRoximation
+=================================================
+
+What question does this method answer?
+--------------------------------------
+
+Which site(s) in a gene are subject to pervasive, i.e. consistently across the entire phylogeny, diversifying selection?
+
+Recommended Applications
+------------------------
+
+The phenomenon of pervasive selection is generally most prevalent in pathogen evolution and any biological system influenced by evolutionary arms race dynamics
+(or balancing selection), including adaptive immune escape by viruses. As such, FUBAR is ideally suited to identify sites under positive selection which
+represent candidate sites subject to strong selective pressures across the entire phylogeny.
+
+FUBAR is our recommended method for detecting pervasive selection at individual sites on large (> 500 sequences) datasets for which other methods have prohibitive runtimes, unless you have access to a computer cluster.
+
+Brief description
+-----------------
 
-Results: By exploiting some commonly used approximations, FUBAR can perform detection of positive selection under a model that allows rich site- to-site rate variation about 30 to 50 times faster than existing random effects likelihood methods, and 10 to 30 times faster than existing fixed effects likelihood methods. We introduce an ultra-fast MCMC routine that allows a flexible prior specification, with no parametric constraints on the prior shape. Furthermore, our method allows us to visualize Bayesian inference for each site, revealing the model supported by the data.
+Perform a Fast Unconstrained Bayesian AppRoximation (FUBAR) analysis of a
+coding sequence alignment to determine whether some sites have been
+subject to pervasive purifying or diversifying selection. There are three methods
+for estimating the posterior distribution of
+grid weights: collapsed Gibbs MCMC (faster), 0-th order Variational
+Bayes approximation (fastest), and full Metropolis-Hastings (slowest).
+
+Input
+-----
+
+1. A *FASTA* sequence alignment.
+2. A phylogenetic tree in the *Newick* format.
+
+Note: the names of sequences in the alignment must match the names of the sequences in the tree.
+
+
+Output
+------
+
+A JSON file with analysis results (http://hyphy.org/resources/json-fields.pdf).
 
-See the online documentation_ for more information.
+A custom visualization module for viewing these results is available (see http://vision.hyphy.org/FUBAR for an example).
+
+Further reading
+---------------
+
+http://hyphy.org/methods/selection-methods/#FUBAR
+
+
+Tool options
+------------
+::
+
+
+    --code             Which genetic code to use
+
+    --grid             The number of grid points
+                        Smaller : faster
+                        Larger : more precise posterior estimation but slower
+                        default value: 20
 
-.. _documentation: http://hyphy.org/methods/selection-methods/#fubar
+    --method           Inference method to use
+                            Variational-Bayes : 0-th order Variational Bayes approximation; fastest [default]
+                            Metropolis-Hastings : Full Metropolis-Hastings MCMC algorithm; original method [slowest]
+                            Collapsed-Gibbs  : Collapsed Gibbs sampler [intermediate speed]
+
+
+    --chains           How many MCMC chains to run (does not apply to Variational-Bayes)
+                            default value: 5
+
+    --chain-length     MCMC chain length (does not apply to Variational-Bayes)
+                            default value: 2,000,000
+
+    --burn-in          MCMC chain burn in (does not apply to Variational-Bayes)
+                            default value: 1,000,000
+
+    --samples          MCMC samples to draw (does not apply to Variational-Bayes)
+                            default value: 100
+
+    --concentration_parameter
+                        The concentration parameter of the Dirichlet prior
+                        default value: 0.5
+
+
     ]]></help>
     <expand macro="citations">
         <citation type="doi">10.1093/molbev/mst030</citation>
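The `hyphy_fubar.xml` hunks above move the MCMC flags out of the command block and into the `@posteriorEstimationMethod_cmd@` token, so those flags are only passed when a sampler is selected. As a rough illustration of the resulting behaviour (not the wrapper's actual code; the function name `build_fubar_args` is hypothetical, and the defaults are copied from the parameter definitions in this changeset), the logic could be sketched in Python as:

```python
def build_fubar_args(method="Variational-Bayes", grid=20, concentration=0.5,
                     chains=5, chain_length=2_000_000, burn_in=1_000_000,
                     samples=100):
    """Return the method-dependent FUBAR options as an argument list.

    Hypothetical helper: it mirrors the order of options in the command
    block of this diff, but is not part of Galaxy or HyPhy.
    """
    args = ["--method", method, "--grid", str(grid)]
    # Mirrors the "#if ... != Variational-Bayes" guard in the token:
    # MCMC options are only emitted for the two sampling methods.
    if method != "Variational-Bayes":
        args += [
            "--chains", str(chains),
            "--chain-length", str(chain_length),
            "--burn-in", str(burn_in),
            "--samples", str(samples),
        ]
    args += ["--concentration_parameter", str(concentration)]
    return args


if __name__ == "__main__":
    print(" ".join(build_fubar_args("Variational-Bayes")))
    print(" ".join(build_fubar_args("Collapsed-Gibbs", chains=4)))
```

With the default method the list contains only `--method`, `--grid` and `--concentration_parameter`; selecting either sampler adds the four MCMC options between `--grid` and `--concentration_parameter`.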
diff -r bece0bad8e89 -r 3285fd1f4bde macros.xml
--- a/macros.xml Mon Feb 17 14:52:11 2020 -0500
+++ b/macros.xml Thu Feb 20 18:10:47 2020 -0500
@@ -6,6 +6,8 @@
     </xml>
     <xml name="substitution">
         <param name="model" type="select" label="Substitution model">
+            <option value="GTR">GTR - General time reversible
+            model</option>
             <option value="LG">LG - Generalist empirical model from
             Le and Gascuel (2008)</option>
             <option value="HIVBm">HIVBm - Specialist empirical model
@@ -26,10 +28,43 @@
             for invertebrate mitochondrial genomes</option>
             <option value="gcpREV">gcpREV - Specialist empirical
             model for green plant chloroplast genomes</option>
-            <option value="GTR">GTR - General time reversible
-            model</option>
         </param>
     </xml>
+
+    <xml name="conditional_posteriorEstimationMethod">
+        <conditional name="posteriorEstimationMethod">
+            <param argument="--method" type="select" label="Posterior estimation method">
+                <option value="Variational-Bayes">0-th order Variational Bayes approximation</option>
+                <option value="Metropolis-Hastings">Full Metropolis-Hastings MCMC algorithm</option>
+                <option value="Collapsed-Gibbs">Collapsed Gibbs sampler</option>
+            </param>
+            <when value="Variational-Bayes">
+            </when>
+            <when value="Metropolis-Hastings">
+                <expand macro="mcmc_options" />
+            </when>
+            <when value="Collapsed-Gibbs">
+                <expand macro="mcmc_options" />
+            </when>
+        </conditional>
+    </xml>
+
+    <token name="@posteriorEstimationMethod_cmd@">
+            #if $posteriorEstimationMethod.method != "Variational-Bayes"
+                --chains '$posteriorEstimationMethod.chains'
+                --chain-length '$posteriorEstimationMethod.chain_length'
+                --burn-in '$posteriorEstimationMethod.samples'
+                --samples '$posteriorEstimationMethod.samples_per_chain'
+            #end if
+    </token>
+
+    <xml name="mcmc_options">
+        <param argument="--chains" type="integer" value="5" min="2" max="20" label="Number of MCMC chains" />
+        <param argument="--chain-length" name="chain_length" type="integer" value="2000000" min="500000" max="50000000" label="Length of each chain" />
+        <param argument="--burn-in" name="samples" type="integer" value="1000000" min="100000" max="1900000" label="Samples to use for burn-in" />
+        <param argument="--samples" name="samples_per_chain" type="integer" value="100" min="50" max="1000000" label="Samples to draw from each chain" />
+    </xml>
+
     <xml name="gencode">
         <param name="gencodeid" type="select" label="Genetic code">
             <option value="Universal">Universal code</option>
@@ -62,27 +97,28 @@
             <option value="All">All branches</option>
             <option value="Internal">Internal branches</option>
             <option value="Leaves">Leaf branches</option>
-            <option value="'Unlabeled-branches'">Unlabeled
-            branches</option>
+            <option value="'Unlabeled-branches'">Unlabeled branches</option>
         </param>
     </xml>
     <xml name="citations">
         <citations>
-            <citation type="doi">10.1093/bioinformatics/bti079</citation>
+            <citation type="doi">10.1093/molbev/msz197</citation>
             <yield/>
         </citations>
     </xml>
-    <token name="@VERSION@">2.5.3</token>
+    <token name="@VERSION@">2.5.4</token>
     <xml name="requirements">
         <requirements>
-            <requirement type="package" version="@VERSION@">
-            hyphy</requirement>
+            <requirement type="package" version="@VERSION@">hyphy</requirement>
             <yield/>
         </requirements>
     </xml>
     <token name="@HYPHYMPI@">\${GALAXY_MPIRUN:-mpirun -np \${GALAXY_SLOTS:-1}} HYPHYMPI</token>
-    <token name="@HYPHY_ENVIRONMENT@"><![CDATA[export HYPHY=`which hyphy` &&
-export HYPHY_PATH=`dirname \$HYPHY` &&
-export HYPHY_LIB=`readlink -f \$HYPHY_PATH/../lib/hyphy` &&]]></token>
-    <token name="@HYPHY_INVOCATION@"><![CDATA[@HYPHY_ENVIRONMENT@ hyphy LIBPATH=\$HYPHY_LIB ]]></token>
+    <token name="@HYPHY_ENVIRONMENT@"><![CDATA[
+        export HYPHY=`which hyphy` &&
+        export HYPHY_PATH=`dirname \$HYPHY` &&
+        export HYPHY_LIB=`readlink -f \$HYPHY_PATH/../lib/hyphy` &&]]></token>
+    <token name="@HYPHY_INVOCATION@"><![CDATA[
+        @HYPHY_ENVIRONMENT@ hyphy LIBPATH=\$HYPHY_LIB
+    ]]></token>
 </macros>
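The `mcmc_options` macro in the `macros.xml` diff above bounds each MCMC parameter independently, which means a burn-in of up to 1,900,000 can be combined with a chain length as short as 500,000. A minimal sketch of a pre-submission check follows. The name `check_mcmc_options` is hypothetical, the ranges are copied from the `min`/`max` attributes in the diff, and the cross-check between burn-in and chain length is an assumption about sensible usage rather than something the wrapper enforces:

```python
# Ranges copied from the min/max attributes of the mcmc_options macro.
MCMC_LIMITS = {
    "chains": (2, 20),
    "chain_length": (500_000, 50_000_000),
    "burn_in": (100_000, 1_900_000),
    "samples": (50, 1_000_000),
}


def check_mcmc_options(chains=5, chain_length=2_000_000,
                       burn_in=1_000_000, samples=100):
    """Return a list of human-readable problems (empty if none found)."""
    values = {
        "chains": chains,
        "chain_length": chain_length,
        "burn_in": burn_in,
        "samples": samples,
    }
    problems = []
    for name, (low, high) in MCMC_LIMITS.items():
        if not low <= values[name] <= high:
            problems.append(f"{name}={values[name]} is outside [{low}, {high}]")
    # Assumption (not enforced by the wrapper): the burn-in should be
    # shorter than the chain, otherwise no post-burn-in samples remain.
    if burn_in >= chain_length:
        problems.append(
            f"burn_in={burn_in} is not smaller than chain_length={chain_length}"
        )
    return problems


if __name__ == "__main__":
    print(check_mcmc_options())
    print(check_mcmc_options(chain_length=500_000, burn_in=1_000_000))
```

The defaults from the macro pass both checks; the second call stays within every individual range yet is flagged by the cross-check.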
diff -r bece0bad8e89 -r 3285fd1f4bde test-data/gard-in1.fa
--- a/test-data/gard-in1.fa Mon Feb 17 14:52:11 2020 -0500
+++ b/test-data/gard-in1.fa Thu Feb 20 18:10:47 2020 -0500
b'@@ -1,735 +1,120 @@\n->TREESPARROW_HENAN_1_2004\n-ATGGAGAAAATAGTGCTTCTTCGTGCAATGATCAATCTTGTTAAAAGTGA\n-TCAGATTGGCGTTGGTTACCATGCAGACTACTCGACAGAGCAGGGTGACA\n-CAATAATGGAAAAGAACGTTACTGTTACACATGCTCAAGACATATTGGAA\n-AAGACACACAACGGGAAGCTCTGCGACCTAGATGGAGTGAAGCCTCTAAT\n-TTTGAAAGATTGTAGTGTAGCTGGATGGCTCCTCGGAAACCCAATGTGTG\n-ACGAATTCATCAATGTGCCGGAGTGGTCTTACATAGTGGAGAAGGCCAGT\n-CCAGCCAATGACCTCTGTTACCCAGGGGATTTCAACGACTATGAAGAACT\n-GAAACACCTATTGAGCAGAATAAACCATTTTGAAAAAATTCAGATCATCC\n-CCAAAAGTTCTTGGTCCGATCATGAAGCCTCATCAGGGGTGAGCTCAGCA\n-TGTCCATACCTGGGGAAGCCCTCCTTTTTCAGAAATGTGGTATGGCTTAT\n-CAAAAAGAACAGTACATACCCAACAATAAAGAGGGGCTACAATAATACCA\n-ACCCAGAAGATCTTTTGGTACTGTGGGGGATTCACCATCCTAATGATGCG\n-GCAGAGCAGATAAAGCTCTATCAAAACCCAACCACCTATATTTCCGTTGG\n-AACATCAACACTAAACCAGAGATTGGTACCAAAAATAGCTACTAGATCCA\n-AAGTAAATGGGCAAAGTGGAAGAATGGAGTTCTTCTGGACAATTTTAAAG\n-CCGAATGACGCTATCAACTTCGAGAGTAATGGAAATTTCATTGCTCCAGA\n-ATATGCATACCAAATTGTCAAGAAAGGGGACTCAGCAATTATGAAAAGTG\n-AATTGGAATATGGTAACTGCAACACCAAGTGTCAAACTCCAATGGGGGCG\n-ATAAACTCTAGTATGCCATTCCACAACATACACCCTCTCACCATCGGGGA\n-ATGCCCCAAATATGTGAAATCAAACAGATTAGTCCTTGCGACAGGGCTCA\n-GAAATAGCCCTCAAAGAGAGAGAAGAAGAAAAAAGAGAGGACTATTTGGA\n-GCTATAGCAGGTTTTATAGAGGGAGGATGGCAGGGAATGGTAGATGGTTG\n-GTATGGGTACCACCATAGCAATGAGCAGGGGAGTGGATACGCTGCAGACA\n-AAGAATCCACTCAAAAAGCAATAGATGGAGTCACCAATAAGGTCAACTCG\n-ATCATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATTTAA\n-TAACTTAGAAAGGAGAATAGAAAATTTAAACAAGAAGATGGAGGACGGAT\n-TCCTAGATGTCTGGACTTATAATGCTGAACTTCTGGTTCTCATGGAAAAT\n-GAGAGAACTCTAGACTTTCATGACTCAAATGTCAAGAACCTTTACGAAAA\n-GGTCCGACTACAACTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTT\n-TCGAGTTCTATCACAAATGTGATAATGAATGTATGGAAAGTGTAAAAAAC\n-GGAACGTATGACTACCCGCAGTATTCAGAAGAAGCAAGACTAAACAGAGA\n-GGAAATAAGTGGAGTAAAATTGGAATCAATAGGAACTTACCAAATACTGT\n-CAATTTATTCAACAGTGGCGAGTTCCCTAGCACTGGCAATCATGGTAGCT\n-GGTCTATCTTTATGGATGTGCTCCAATGGATCGTTACAATGCAGAATT\n->HUMAN_VIETNAM_CL105_2005\n-ATGGAGAAAATAGTGCTTCTTTTTGCGATAGTCAGTCTTGTTAAAAGTGA\n-TCAGATTTGCATTGGTTACCATGCAAACAACTCGACAGAGCAGGTTGACA\n-CAATAATGGAAAAG
AACGTTACTGTTACACATGCCCAAGACATACTGGAA\n-AAGACACACAACGGGAAGCTCTGCGATCTAGATGGAGTGAAGCCTCTAAT\n-TTTGAGAGATTGTAGTGTAGCTGGATGGCTACTCGGAAACCCAATGTGTG\n-ACGAATTCATCAATGTGCCGGAATGGTCTTACATAGTGGAGAAGGTCAAT\n-CCAGTCAATGACCTCTGTTACCCAGGGGTTTTCAATGACTATGAAGAATT\n-GAAACACCTATTGAGCAGAATAAACCATTTTGAGAAAATTCAGATCATCC\n-CCAAAAGTTCTTGGTCCAGTCATGAAGCCTCATTAGGGGTGAGCTCAGTA\n-TGTCCATACCAGGGAAAGTCCTCCTTTTTCAGAAATGTGGTATGGCTTAT\n-CAAAAAGAACAGTACATACCCAACAATAAAGAGGAGCTACAATAATACCA\n-ACCAAGAAGATCTTTTGGTAATATGGGGGATTCATCATCCTAATGATGCG\n-GCAGAGCAGATAAAGCTCTATCAAAACCCAACCACCTATATTTCCGTTGG\n-GACATCAACACTAAACCAGAGATTGGTACCAAGAATAGCTACTAGATCCA\n-AAGTAAACGGGCAAAGTGGGAGGATGGAGTTCTTCTGGACAATTTTAAAA\n-CCGAATGATGCAATCAACTTCGAGAGTAATGGAAATTTCATTGCTCCAGA\n-ATATGCATACAAGATTGTCAAGAAAGGGGACTCAACAATTATGAAAAGTG\n-AATTGGAATATGGTAACTGCAACACCAAGTGTCAAACTCCAATGGGGGCG\n-ATAAACTCTAGTATGCCATTCCACAATATACACCCTCTCACCATCGGGGA\n-ATGCCCCAAATATGTGAAATCAAACAGATTAGTCCTTGCGACTGGGCTCA\n-GAAATAGCCCTCAAAAAGAGAGAAGAAGAAAAAAGAGAGGATTATTTGGA\n-GCTATAGCAGGTTTTATAGAGGGAGGATGGCAGGGAATGGTAGATGGTTG\n-GTATGGGTACCACCATAGCAATGAGCAGGGGAGTGGGTACGCTGCAGACA\n-AAGAATCCACTCAAAAGGCAATAGATGGAGTCACCAATAAGGTCAACTCG\n-ATCATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATTTAA\n-CAACTTAGAAAGGAGAATAGAGAATTTAAACAAGAAGATGGAAGACGGGT\n-TCCTAGATGTCTGGACTTATAATGCTGAACTTCTGGTTCTCATGGAAAAT\n-GAGAGAACTCTAGACTTTCATGACTCAAATGTCAAGAACCTTTACGACAA\n-GGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTT\n-TCGAGTTCTATCATAAATGTGATAATGAATGTATGGAAAGTGTAAGAAAC\n-GGAACGTATGACTACCCGCAGTATTCAGAAGAAGCAAGATTAAAAAGAGA\n-GGAAATAAGTGGAGTAAAATTGGAATCAATAGGAATTTACCAAATACTGT\n-CAATTTATTCTACAGTGGCGAGTTCCCTAGCACTGGCAATCATGATAGCT\n-GGTCTATCCTTATGGATGTGCTCCAATGGGTCGTTACAA---------\n->TREESPARROW_HENAN_4_2004\n-ATGGAGAAAATAGTGCTTCTTCTTGCAATAGTCAGTCTTGTTAAAAGTGA\n-TCAGATTTGCATTGGTTACCATGCAAACAACTCGACAGAGCAGGTTGACA\n-CAATAATGCAAAAGAACGTTACTGTTACACATGCCCAAGACATACTGGAA\n-AAGACACACAACGGGAAGCTCTGCGATCTAGATGGAGTGAAACCTCTAAT\n-TTTAAGAGATTGTAGTGTAGCTGGATGGCTCCTCGGAAACCCAATGTGTG\n-ACGAATTCATCAATGTGCCGGAATGGT
CTTACATAGTGGAGAAGGCCAGT\n-CCAGCCAATGACCTCTGTTACCCAGGGGATTTCAACGACTATGAAGAACT\n-G'..b'TTAC-TGCCAGCCACCATGAATATTGTACGGTACCATAAA-TACTTGAC\n+CACCTGTAGTACATAAAAACCC-AATCC--ACATCAAAA----CCCCCCC\n+CC-CATGCTTACAAGCAAGTACAGCAACCAACCCTCAA-CTATCATACAT\n+CAACTGCAACTCCAAAGCCAC-CCCTCACCCAC-TAGGATACCAACAAAC\n+CTACCCACCC-TTAACAGTACATAGTACATAAAGCCATTTACCGTACATA\n+GCACATTACA-GTCAAATCCCTTCTCGTCCCC-ATGG-ATGACCCCCC-T\n+CAGAT-AGGGGTCCCTTGACCACCATCC\n+>BRO5\n+ATTCTAATTTAAACTATTCT-CTGTTCTTTCATGGGGAAGCAGATTTGGG\n+TACCACCCAAGTATTGACTCACCCATCAACAACCGCTATGTATTTCGTAC\n+ATTAC-TGCCAGCCACCATGAATATTGTACAGTACCATAAA-TACTTGAC\n+CACCTGTAGTACATAAAAACCC-AATCC--ACACCAAAA----CCCCCCC\n+CC-CATGCTTACAAGCAAGTACAGCAACCAACCCTCAA-CTATCACACAT\n+CAACTGCAACTCCAAAGCCAC-CCCTCACCCAC-TAGGATACCAACAAAC\n+CTACCCACCC-TTAACAGTACATAGTACATAAAGCCATTTACCGTACATA\n+GCACATTACA-GTCAAATCCCTTCTCGTCCCC-ATGG-ATGACCCCCC-T\n+CAGAT-AGGGGTCCCTTGACCACCATCC\n+>BRO6\n+ATTCTAATTTAAACTATTCT-CTGTTCTTTCATGGGGAAGCAGATTTGGG\n+TACCACCCAAGTATTGACTCACCCATCAACAACCGCTATGTATTTCGTAC\n+ATTAC-TGCCAGCCACCATGAATATTGTACGGTACCATAAA-TACTTGAC\n+CACCTGTAATACATAAAAACCC-AATTC--ACACCAAAA----CCCCCCC\n+CC-CATGCTTACAAGCAAGTACAGCAACCAACCCTCAA-CTATCACACAT\n+CAACTGCAACTCCAAAGCCAC-CCCTCACCCAC-TAGGATACCAACAAAC\n+CTACCCACCC-TTAACAGTACATAGTACATAAAGCCATTTACCGTACATA\n+GCACATTACA-GTCAAATCCCTTCTCGTCCCC-ATGG-ATGACCCCCC-T\n+CAGAT-AGGGGTCCCTTGACCACCATCC\n+>BRO7_1\n+ATTCTAATTTAAACTATTCT-CTGTTCTTTCATGGGGAAGCAGATTTGGG\n+TACCACCCAAGTATTGACTCACCCATCAACAACCGCTATGTATTTCGTAC\n+ATTAC-TGCCAGCCACCATGAATATTGTACGGTACCATAAA-TACTTGAC\n+CACCTGTAGTACATAAAAACCC-AATCC--ACACCAAAA----CCCCCCC\n+CC-CATGCTTACAAGCAAGTACAGCAACCAACCCTCAA-CTATCACACAT\n+CAACTGCAACTCCAAAGCCAC-CCCTCACCCAC-TAGGATACCAACAAAC\n+CTACCCACCC-TTAACAGTACATAGTACATAAAGCCATTTACCGTACATA\n+GCACATTACA-GTCAAATCCCTTCTCGTCCCC-ATGG-ATGACCCCCC-T\n+CAGAT-AGGGGTCCCTTGACCACCATCC\n+>BRO8\n+ATTCTAATTTAAACTATTCT-CTGTTCTTTCATGGGGAAGCAGATTTGGG\n+TACCACCCAAGTATTGACTCACCCATCAACAACCGCTATGTATTTCGTAC\n+ATTAC-TGCCAGCCACCATGAATATTGTACGGTACCATAAA-TACTTGAC\n+CACCTGTAG
TACATAAAAACCC-AATCC--ACATCAAAA----CCCCCCC\n+CC-CATGCTTACAAGCAAGTACAGCAACCAACCCTCAA-CTATCACACAT\n+CAACTGCAACTCCAAAGCCAC-CCCTCACCCAC-TAGAATACCAACAAAC\n+CTACCCACCC-TTAACAGTACATAGTACATAAAGCCATTTACCGTACATA\n+GCACATTACA-GTCAAATCCCTTCTCGTCCCC-ATGG-ATGACCCCCC-T\n+CAGAT-AGGGGTCCCTTGACCACCATCC\n+>BRO9\n+ATTCTAATTTAAACTATTCT-CTGTTCTTTCATGGGGAAGCAGATTTGGG\n+TACCACCCAAGTATTGACTCACCCATCAACAACCGCTATGTATTTCGTAC\n+ATTAC-TGCCAGCCACCATGAATATTGTACGGTACCATAAA-TACTTGAC\n+CACCTGTAGTACATAAAAACCC-AACCC--ACATCAAAA----CCCCCCC\n+CC-CATGCTTACAAGCAAGTACAGCAACCAACCCTCAA-CTATCACACAT\n+CAATTGCAACTCCAAAGCCAC-CCCTCACCCAC-TAGGATACCAACAAAC\n+CTACCCACCC-TTAACAGTACATAGTACATAAAGCCATTTACCGTACATA\n+GCACATTACA-GTCAAATCCCTTCTCGTCCCC-ATGG-ATGACCCCCC-T\n+CAGAT-AGGGGTCCCTTGACCACCATCC\n+>BRO10\n+ATTCTAATTTAAACTATTCT-CTGTTCTTTCATGGGGAAGCAGATTTGGG\n+TACCACCCAAGTATTGACTCACCCATCAACAACCGCTATGTATTTCGTAC\n+ATTAC-TGCCAGCCACCATGAATATTGTACGGTACCATAAA-TACTTGAC\n+CACCTGTAGTACATAAAAACCC-AATCC--ACATCAAAA----CCCCCCC\n+CC-CATGCTTACAAGCAAGTACAGCAATCAACCTTCAA-CTATCACACAT\n+CAACTGCAACTCCAAAGCCAC-CCCTCACCCAC-TAGGATACCAACAAAC\n+CTACCCACCC-TTAACAGTACATAGCACATAAAGCCATTTATCGTACATA\n+GCACATTACA-GTCAAATCCCTTCTCGTCCCC-ATGG-ATGACCCCCC-T\n+CAGAT-AGGGGTCCCTTGACCACCATCC\n+>BRO11_1\n+ATTCTAATTTAAACTATTCT-CTGTTCTTTCATGGGGAAGCAGATTTGGG\n+TACCACCCAAGTATTGACTCACCCATCAACAACCGCTATGTATTTCGTAC\n+ATTAC-TGCCAGCCACCATGAATATTGTACGGTACCATAAA-TACTTGAC\n+TACCTGTAGTACATAAAAACCC-AACCC--ACATCAAAA----CCCTGCC\n+CC-CATGCTTACAAGCAAGTACAGCAATCAACCTTCAA-CTGTCACACAT\n+CAACTGCAACTCCAAAGCCAC-CCCTCACCCAC-TAGGATACCAACAAAC\n+CTACCCACCC-TTAACAGTACATAGCACATAAAGTCATTTACCGTACATA\n+GCACATTACA-GTCAAATCCCTTCTCGTCCCC-ATGG-ATGACCCCCC-T\n+CAGAT-AGGGGTCCCTTGACCACCATCC\n+>BRO12\n+ATTCTAATTTAAACTATTCT-CTGTTCTTTCATGGGGAAGCAGATTTGGG\n+TACCACCCAAGTATTGACTCACCCATCAACAACCGCTATGTATTTCGTAC\n+ATTAC-TGCCAGCCACCATGAATATTGTACAGTACCATAAA-TACTTGAC\n+TACCTGTAGTACATAAAAACCC-AATCC--ACATCAAAA----CCCCCTC\n+CC-CATGCTTACAAGCAAGTACAGCAATCAACCTTCAA-CTATCACACAT\n+CAACTGCAACTCCAAAGCCAC-CCCTCACCCAC-TAGGATACCAAC
AAAC\n+CTACCCACCC-TTAACAGTACATAGTACATAAAGCCATTTACCGTACATA\n+GCACATTACA-GTCAAATCCCTTCTCGTCCCC-ATGG-ATGACCCCCC-T\n+CAGAT-AGGGGTCCCTTGACCACCATCC\n\\ No newline at end of file\n'
diff -r bece0bad8e89 -r 3285fd1f4bde test-data/gard-out1.nex
--- a/test-data/gard-out1.nex Mon Feb 17 14:52:11 2020 -0500
+++ b/test-data/gard-out1.nex Thu Feb 20 18:10:47 2020 -0500
b"@@ -1,50 +0,0 @@\n-#NEXUS\n-\n-BEGIN TAXA;\n-\tDIMENSIONS NTAX = 21;\n-\tTAXLABELS\n-\t\t'TREESPARROW_HENAN_1_2004' 'TREESPARROW_HENAN_3_2004' 'TREESPARROW_HENAN_4_2004' 'CHICKEN_HEBEI_326_2005' 'SWINE_ANHUI_2004' 'TREESPARROW_HENAN_2_2004' 'CHICKEN_HONGKONG_915_97' 'GOOSE_HONGKONG_W355_97' 'DUCK_HONGKONG_Y283_97' 'HONGKONG_97_98' 'HONGKONG_538_97' 'DUCK_GUANGZHOU_20_2005' 'GOOSE_SHANTOU_2216_2005' 'PEREGRINEFALCON_HK_D0028_2004' 'CK_HK_WF157_2003' 'HUMAN_VIETNAM_CL105_2005' 'DUCK_VIETNAM_376_2005' 'VIETNAM_3062_2004' 'MALLARD_VIETNAM_16_2003' 'CHICKEN_THAILAND_KANCHANABURI_CK_160_2005' 'DUCK_VIETNAM_272_2005' ;\n-END;\n-\n-BEGIN CHARACTERS;\n-\tDIMENSIONS NCHAR = 1698;\n-\tFORMAT\n-\t\tDATATYPE = DNA\n-\t\tGAP=-\n-\t\tMISSING=?\n-\t;\n-\n-MATRIX\n-\t'TREESPARROW_HENAN_1_2004'                  ATGGAGAAAATAGTGCTTCTTCGTGCAATGATCAATCTTGTTAAAAGTGATCAGATTGGCGTTGGTTACCATGCAGACTACTCGACAGAGCAGGGTGACACAATAATGGAAAAGAACGTTACTGTTACACATGCTCAAGACATATTGGAAAAGACACACAACGGGAAGCTCTGCGACCTAGATGGAGTGAAGCCTCTAATTTTGAAAGATTGTAGTGTAGCTGGATGGCTCCTCGGAAACCCAATGTGTGACGAATTCATCAATGTGCCGGAGTGGTCTTACATAGTGGAGAAGGCCAGTCCAGCCAATGACCTCTGTTACCCAGGGGATTTCAACGACTATGAAGAACTGAAACACCTATTGAGCAGAATAAACCATTTTGAAAAAATTCAGATCATCCCCAAAAGTTCTTGGTCCGATCATGAAGCCTCATCAGGGGTGAGCTCAGCATGTCCATACCTGGGGAAGCCCTCCTTTTTCAGAAATGTGGTATGGCTTATCAAAAAGAACAGTACATACCCAACAATAAAGAGGGGCTACAATAATACCAACCCAGAAGATCTTTTGGTACTGTGGGGGATTCACCATCCTAATGATGCGGCAGAGCAGATAAAGCTCTATCAAAACCCAACCACCTATATTTCCGTTGGAACATCAACACTAAACCAGAGATTGGTACCAAAAATAGCTACTAGATCCAAAGTAAATGGGCAAAGTGGAAGAATGGAGTTCTTCTGGACAATTTTAAAGCCGAATGACGCTATCAACTTCGAGAGTAATGGAAATTTCATTGCTCCAGAATATGCATACCAAATTGTCAAGAAAGGGGACTCAGCAATTATGAAAAGTGAATTGGAATATGGTAACTGCAACACCAAGTGTCAAACTCCAATGGGGGCGATAAACTCTAGTATGCCATTCCACAACATACACCCTCTCACCATCGGGGAATGCCCCAAATATGTGAAATCAAACAGATTAGTCCTTGCGACAGGGCTCAGAAATAGCCCTCAAAGAGAGAGAAGAAGAAAAAAGAGAGGACTATTTGGAGCTATAGCAGGTTTTATAGAGGGAGGATGGCAGGGAATGGTAGATGGTTGGTATGGGTACCACCATAGCAATGAGCAGGGGAGTGGATACGCTGCAGACAAAGAATCCACTCAAAAAGCAATAGATGGAGTCACCAATAAGGT
CAACTCGATCATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATTTAATAACTTAGAAAGGAGAATAGAAAATTTAAACAAGAAGATGGAGGACGGATTCCTAGATGTCTGGACTTATAATGCTGAACTTCTGGTTCTCATGGAAAATGAGAGAACTCTAGACTTTCATGACTCAAATGTCAAGAACCTTTACGAAAAGGTCCGACTACAACTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTTTCGAGTTCTATCACAAATGTGATAATGAATGTATGGAAAGTGTAAAAAACGGAACGTATGACTACCCGCAGTATTCAGAAGAAGCAAGACTAAACAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAATAGGAACTTACCAAATACTGTCAATTTATTCAACAGTGGCGAGTTCCCTAGCACTGGCAATCATGGTAGCTGGTCTATCTTTATGGATGTGCTCCAATGGATCGTTACAATGCAGAATT\n-\t'TREESPARROW_HENAN_3_2004'                  ATGGAGAAAATAGTGCTTCTTCTTGCAATAGTCAGTCTTGTTAAAAGTGATCAGATTTGCATTGGTTACCATGCAAACAACTCGACAGAGCAGGTTGACACTATAATGGAAAAGAACGTTACTGTTACACATGCCCAAGACATACTGGAAAAGACACACAACGGGAAGCTCTGCGATCTAGATGGAGTGAAGCCTCTAATTTTGAGAGATTGTAGTGTAGCTGGATGGCTCCTCGGAAACCCAATGTGTGACGAATTCATCAATGTGCCGGAATGGTCTTACATAGTGGAGAAGGCCAGTCCAGCCAATGACCTCTGTTACCCAGGGGATTTCAACGACTATGAAGAACTGAAACACCTATTGAGCAGAATAAACCATTTTGAGAAAATTCGGATCATCCCCAAAAGTTCTTGGTCCAATCATGATGCCTCATCAGGGGTGAGCTCAGCATGTCCATACCAGGGGAAGCCCTCCTTTTTCAGAAATGTGGTATGGCTTATCAAAAAGAACAGTACATACCCAACGATAAAGAGGAGCTACAATAATACCAACCCAGAAGATCTTTTGGTACTGTGGGGGATTCACCATCCTAATGATGCGGCAGAGCAGATAAAGCTCTATCAAAACCCAACCACCTATATTTCCGTTGGAACATCAACACTAAACCAGAGATTGGTACCAAAAATAGCTACTAGATCCAAAGTAAATGGGCAAAGTGGAAGAATGGAGTTCTTCTGGACAATTTTAAAGCCGAATGACGCTATCAACTTCGAGAGTAATGGAAATTTCATTGCTCCAGAATATGCATACAAAATTGTCAAGAAAGGGGACTCAGCAATTATGAAAAGTGAATTGGAATATGGTAACTGCAACACCAAGTGTCAAACTCCAATGGGGGCGATAAATTCTAGTATGCCATTCCACAACATACACCCTCTCACCATCGGGGAATGCCCCAAATATGTGAAATCAAACAGATTAGTCCTTGCGACAGGGCTCAGAAATAGCCCTCAAAGAGAGGGAAGAAGAAAAAAGAGAGGACTATTTGGAGCTATAGCAGGTTTTATAGAGGGAGGATGGCAGGGAATGGTAGATGGTTGGTATGGGTACCACCATAGCAATGAGCAGGGGAGTGGATACGCTGCAGACAAAGAATCCACTCAAAAAGCAATAGATGGAGTCACCAATAAGGTCAACTCGATCATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATTTAATAACTTAGAAAGGAGAATAGAAAATTTAAACAAGAAGATGGAGGACGGATTCCTAGATGTCTGGACTTATAATGCTGAACTTCTGGTTCTCATGGAAAATGAGAGAACTCTAGACTTTCATGACTCAAATGTCAAGAACCTTTACGAAAAGGTCCGACTACAACTTAGGGATAATGCAAAGGAGCTGGGTAACGGT
TGTTTCGAGTTCTATCACAAATGTGATAATGAATG"..b'7)Node6:0.006520820816328921,((DUCK_GUANGZHOU_20_2005:0.009483822383439561,(GOOSE_SHANTOU_2216_2005:0.007374093130581441,PEREGRINEFALCON_HK_D0028_2004:0.007242107381463219)Node14:0.0007056004954733679)Node12:0.00710363602212726,CK_HK_WF157_2003:0.00757594740908166)Node11:0.004227199989057285)Node5:0.001170889307235767,TREESPARROW_HENAN_3_2004:0.008919169512223372)Node4:0.0008380584919822353,(TREESPARROW_HENAN_4_2004:0.006708184198226799,((CHICKEN_HONGKONG_915_97:1e-10,(HONGKONG_97_98:0.006483231977175029,HONGKONG_538_97:1e-10)Node24:0.0003031375872603963)Node22:0.0001037022453470389,(GOOSE_HONGKONG_W355_97:1e-10,DUCK_HONGKONG_Y283_97:1e-10)Node27:0.002891685438446985)Node21:0.02744187404194088)Node19:0.002944759819938243)Node3:0.001672904530995233,SWINE_ANHUI_2004:0.01260983958335299)Node2:0.01303612329048956,DUCK_VIETNAM_272_2005:0.008088746451854261)Node1:0.0006545914783555439,((HUMAN_VIETNAM_CL105_2005:0.01105269263722943,DUCK_VIETNAM_376_2005:0.006417734235732284)Node33:0.002021990910251294,VIETNAM_3062_2004:0.002304874937323682)Node32:0.0006817702832121299,(MALLARD_VIETNAM_16_2003:0.005709114871229771,CHICKEN_THAILAND_KANCHANABURI_CK_160_2005:0.008864540916547606)Node37:9.336374870077841e-05);\n-\tTREE tree_2 = 
((((((TREESPARROW_HENAN_1_2004:0.002586740118529802,TREESPARROW_HENAN_3_2004:0.003828953390398881)Node5:0.00187457851284572,TREESPARROW_HENAN_4_2004:0.0002656521581222459)Node4:0.008299727091877812,((CHICKEN_HEBEI_326_2005:0.01484958179784225,(SWINE_ANHUI_2004:1e-10,TREESPARROW_HENAN_2_2004:0.002239639293515125)Node12:0.01361271232984685)Node10:0.01278549184522065,((CHICKEN_HONGKONG_915_97:1e-10,GOOSE_HONGKONG_W355_97:1e-10)Node16:0.001326679361120118,((DUCK_HONGKONG_Y283_97:0.0006676977779578139,HONGKONG_538_97:1e-10)Node20:0.0006271503105586317,HONGKONG_97_98:0.004109899012531796)Node19:0.001653355437332724)Node15:0.01224714418221361)Node9:0.003429034672045344)Node3:0.005031763140085913,(((DUCK_GUANGZHOU_20_2005:0.004262245747717084,GOOSE_SHANTOU_2216_2005:0.004308213533754201)Node26:0.002157320455556664,CK_HK_WF157_2003:0.00647028357035998)Node25:4.000752717957338e-05,PEREGRINEFALCON_HK_D0028_2004:0.002090471815463569)Node24:0.003260422306103818)Node2:0.00707214212698231,DUCK_VIETNAM_272_2005:0.01454290899825338)Node1:0.0007147389821184919,(((HUMAN_VIETNAM_CL105_2005:0.004710041013757377,DUCK_VIETNAM_376_2005:0.01496207542884359)Node34:0.001166828623060772,MALLARD_VIETNAM_16_2003:0.003061545152147631)Node33:0.001134022328667363,VIETNAM_3062_2004:1e-10)Node32:0.0002767006572468676,CHICKEN_THAILAND_KANCHANABURI_CK_160_2005:0.00405819047265216);\n-\tTREE tree_3 = 
(TREESPARROW_HENAN_1_2004:1.793736422068138e-05,((TREESPARROW_HENAN_3_2004:0.005512611253559831,TREESPARROW_HENAN_4_2004:0.000171467489882283)Node3:0.001655366533442949,(CHICKEN_HEBEI_326_2005:0.005483900345571787,(SWINE_ANHUI_2004:0.006320038544899231,(((CHICKEN_HONGKONG_915_97:1e-10,HONGKONG_97_98:1e-10)Node12:0.001309621960005619,(GOOSE_HONGKONG_W355_97:1e-10,HONGKONG_538_97:1e-10)Node15:0.0005799496924062597)Node11:0.01406566295495963,((DUCK_HONGKONG_Y283_97:999.9836725287659,(((HUMAN_VIETNAM_CL105_2005:0.002027831968935447,DUCK_VIETNAM_376_2005:0.003754856650868615)Node23:0.00177604858958343,(VIETNAM_3062_2004:0.002068037509299944,MALLARD_VIETNAM_16_2003:1e-10)Node26:0.0001623690336813484)Node22:0.0008290773242978894,(CHICKEN_THAILAND_KANCHANABURI_CK_160_2005:0.004523465945045592,DUCK_VIETNAM_272_2005:0.005056282226890766)Node29:0.001128740164377167)Node21:0.01150309854222087)Node19:0.001994389759866116,(((DUCK_GUANGZHOU_20_2005:0.003847909099661096,GOOSE_SHANTOU_2216_2005:0.005686200296849465)Node34:0.006130725555048717,PEREGRINEFALCON_HK_D0028_2004:8.387427766777967e-05)Node33:0.0009906361822363255,CK_HK_WF157_2003:0.005714641337818262)Node32:0.004952071733007253)Node18:0.002028752021615747)Node10:0.004186888950236505)Node8:0.002350770416932014)Node6:0.002154523112988181)Node2:0.0002651492563879275,TREESPARROW_HENAN_2_2004:0.007562549920247083);\n-END;\n'
diff -r bece0bad8e89 -r 3285fd1f4bde test-data/meme-in1.fa
--- a/test-data/meme-in1.fa Mon Feb 17 14:52:11 2020 -0500
+++ b/test-data/meme-in1.fa Thu Feb 20 18:10:47 2020 -0500
b'@@ -1,1 +1,1 @@\n->HUMAN_VIETNAM_CL105_2005\rATGGAGAAAATAGTGCTTCTTTTTGCGATAGTCAGTCTTGTTAAAAGTGA\rTCAGATTTGCATTGGTTACCATGCAAACAACTCGACAGAGCAGGTTGACA\rCAATAATGGAAAAGAACGTTACTGTTACACATGCCCAAGACATACTGGAA\rAAGACACACAACGGGAAGCTCTGCGATCTAGATGGAGTGAAGCCTCTAAT\rTTTGAGAGATTGTAGTGTAGCTGGATGGCTACTCGGAAACCCAATGTGTG\rACGAATTCATCAATGTGCCGGAATGGTCTTACATAGTGGAGAAGGTCAAT\rCCAGTCAATGACCTCTGTTACCCAGGGGTTTTCAATGACTATGAAGAATT\rGAAACACCTATTGAGCAGAATAAACCATTTTGAGAAAATTCAGATCATCC\rCCAAAAGTTCTTGGTCCAGTCATGAAGCCTCATTAGGGGTGAGCTCAGTA\rTGTCCATACCAGGGAAAGTCCTCCTTTTTCAGAAATGTGGTATGGCTTAT\rCAAAAAGAACAGTACATACCCAACAATAAAGAGGAGCTACAATAATACCA\rACCAAGAAGATCTTTTGGTAATATGGGGGATTCATCATCCTAATGATGCG\rGCAGAGCAGATAAAGCTCTATCAAAACCCAACCACCTATATTTCCGTTGG\rGACATCAACACTAAACCAGAGATTGGTACCAAGAATAGCTACTAGATCCA\rAAGTAAACGGGCAAAGTGGGAGGATGGAGTTCTTCTGGACAATTTTAAAA\rCCGAATGATGCAATCAACTTCGAGAGTAATGGAAATTTCATTGCTCCAGA\rATATGCATACAAGATTGTCAAGAAAGGGGACTCAACAATTATGAAAAGTG\rAATTGGAATATGGTAACTGCAACACCAAGTGTCAAACTCCAATGGGGGCG\rATAAACTCTAGTATGCCATTCCACAATATACACCCTCTCACCATCGGGGA\rATGCCCCAAATATGTGAAATCAAACAGATTAGTCCTTGCGACTGGGCTCA\rGAAATAGCCCTCAAAAAGAGAGAAGAAGAAAAAAGAGAGGATTATTTGGA\rGCTATAGCAGGTTTTATAGAGGGAGGATGGCAGGGAATGGTAGATGGTTG\rGTATGGGTACCACCATAGCAATGAGCAGGGGAGTGGGTACGCTGCAGACA\rAAGAATCCACTCAAAAGGCAATAGATGGAGTCACCAATAAGGTCAACTCG\rATCATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATTTAA\rCAACTTAGAAAGGAGAATAGAGAATTTAAACAAGAAGATGGAAGACGGGT\rTCCTAGATGTCTGGACTTATAATGCTGAACTTCTGGTTCTCATGGAAAAT\rGAGAGAACTCTAGACTTTCATGACTCAAATGTCAAGAACCTTTACGACAA\rGGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTT\rTCGAGTTCTATCATAAATGTGATAATGAATGTATGGAAAGTGTAAGAAAC\rGGAACGTATGACTACCCGCAGTATTCAGAAGAAGCAAGATTAAAAAGAGA\rGGAAATAAGTGGAGTAAAATTGGAATCAATAGGAATTTACCAAATACTGT\rCAATTTATTCTACAGTGGCGAGTTCCCTAGCACTGGCAATCATGATAGCT\rGGTCTATCCTTATGGATGTGCTCCAATGGGTCGTTACAA---------\r>CHICKEN_HEBEI_326_2005\rATGGAGAGAATAGTGCTTCTTCTTGCAATAATCGGTCTTGTTAAAAGTGA\rTCAGATTTGCATTGGTTACCATGCAAACAACTCGACAGAGCAGGTTGACA\rCAATAATGGAAAAGAACGTTACTGTTACACATGCTCAAGACATACTGGAG\rAAGACA
CACAACGGGAAGCTCTGCAACCCAGATGGAGTGAAGCCTCTAAT\rTTTGAAAGATTGTAGTGTAGCTGGATGGCTCCTCGGAAACCCAATGTGTG\rACGAATTTATCAATGTGCCGGAATGGTCTTACATAGTGGAGAAGGCCAGT\rCCAGCCAATGGCCTCTGTTACCCAGGGGATTTCAATGACTATGAAGAACT\rGAAACACCTATTGAGCAGAATAAACCATTTTGAGAAAATTCAGATCATCC\rCCAAAAGTTCTTGGTCCGATCATGGAGCCTCATCAGGGGTGAGCTCAGCA\rTGTTCCTATCTGGGGAAGCCCTCCTTTTTCAGAAATGTGGTATGGCTTAT\rCAAAAAGAATAATACATACCCACCAATAAAGGTGAGCTACAACAATACCA\rACCAAGAAGATCTTTTGGTACTGTGGGGGATTCACCATCCCAATGATGAG\rGCAGAGCAGATAAAGATCTATCAAAACCCAACCACCTATATTTCCGTTGG\rAACATCAACACTAAACCAGAGATTGGTACCAAAAATAGCTACTAGATCCA\rAAGTAAACGGGCAAAGTGGAAGAATGGAGTTCTTCTGGACAATTTTAAAG\rCCGAATGATGCTATCAATTTCGATAGTAATGGAAATTTCATTGCTCCAGA\rATATGCATACAAAATTGTCAAGAAAGGGGACTCAGCGATTATGAAAAGTG\rAATTGGAATATGGCAACTGCAACACCAAGTGTCAAACTCCAATGGGGGCG\rATAAATTCTAGTATGCCATTCCACAACATACACCCTCTCACCGTCGGGGA\rATGCCCCAAATATGTGAAATCAAACAGATTAGTCCTCGCGACTGGACTCA\rGAAATGCCCCTCAAAGAGAGGGAGGAAGAAAAAAGAGAGGACTATTTGGA\rGCCATAGCAGGGTTTATAGAGGGAGGATGGCAGGGAATGGTAGATGGTTG\rGTATGGGTACCACCATAGCAATGAGCAGGGGAGTGGATACGCTGCAGACA\rAAGAATCCACTCAAAAGGCAATAGATGGAGTCACCAATAAGGTCAACTCG\rATCATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATTTAA\rTAACTTAGAAAGGAGAATAGAAAATTTAAACAAGAAGATGGAGGACGGAT\rTCCTAGATGTCTGGACTTATAACGCTGAACTTCTGGTTCTCATGGAAAAT\rGAGAGAACTCTAGACTTTCATGACTCAAATGTCAAGAACCTTTACGAAAA\rGGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTT\rTCGAGTTCTATCACAAATGTGATAATGAATGTATGGAAAGTGTAAAAAAC\rGGAACGTATGACTACCCGCAGTATTCAGAAGAAGCAAGACTAAACAGAGA\rGGAAATAAGTGGAGTAAAATTGGAATCAATGGGAACTTACCAAATACTGT\rCAATTTATTCAACAGTGGCGAGTTCCCTAGCATTGGCAATCATGGTAGCT\rGGTCTATCTTTATGGATGTGCTCCAATGGATCGTTACAATGCAGAATT\r>CHICKEN_HONGKONG_915_97\rATGGAGAAAATAGTGCTTCTTCTTGCAACAGTCAGTCTTGTTAAAAGTGA\rTCAGATTTGCATTGGTTACCATGCAAACAACTCGACAGAGCAGGTTGACA\rCAATAATGGAAAAGAATGTTACTGTTACACATGCCCAAGACATACTGGAA\rAGGACACACAACGGGAAGCTCTGCGATCTAAATGGAGTGAAACCTCTCAT\rTTTGAGGGATTGTAGTGTAGCTGGATGGCTCCTCGGAAACCCTATGTGTG\rACGAATTCATCAATGTGCCGGAATGGTCTTACATAGTGGAGAAGGCCAGT\rCCAGCCAATGACCTCTGTTATCCAGGGAATTTCAACGACTATGAAGAACT\rGAAACA
CCTATTGAGCAGAATAAACCATTTTGAGAAAATTCAGATCATCC\rCCAAAAGTTCTTGGTCCAATCATGATGCCTCATCA'..b'GGGAATTTAA\rCAACTTAGAAAGGAGAATAGAGAATTTAAACAAGAAGATGGAAGACGGGT\rTCCTAGATGTCTGGACTTATAATGCTGAACTTCTGGTTCTCATGGAAAAT\rGAGAGAACTCTAGACTTTCATGACTCAAATGTCAAGAACCTTTACGACAA\rGGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTT\rTCGAGTTCTATCATAAATGTGATAATGAATGTATGGAAAGTGTAAGGAAC\rGGAACGTATGACTACCCGCAGTATTCAGAAGAAGCAAGACTAAAAAGAGA\rGGAAATAAGTGGAGTAAAATTGGAATCAATAGGAATTTACCAAATACTGT\rCAATTTATTCTACAGTAGCGAGTTCCCTAGCACTGGCAATCATGGTAGCT\rGGTCTATCCTTATGGATGTGCTTCAATGGGTCGTTACAATGCAGAATT\r>CK_HK_WF157_2003\rATGGAGAAAATAGTGCTTCTTCTTGCAATAGTCAGTCTTGTTAAAAGTGA\rTCAGATTTGCATTGGTTACCATGCAAACAACTCGACAGAGCAGGTTGACA\rCAATAATGGAAAAGAACGTTACTGTTACACATGCCCAAGACATACTGGAA\rAAGACCCACAACGGGAAGCTCTGCGACCTAGATGGAGTGAAGCCTCTAAT\rTTTGAGAGATTGTAGTGTAGCTGGATGGCTCCTCGGGAACCCAATGTGTG\rACGAATTCATCAATGTACCGGAATGGTCTTACATAGTGGAGAAGGCCAGT\rCCATCCAATGACCTCTGTTACCCAGGGGATTTCAACAATTATGAAGAACT\rGAAACACCTATTGAGCAGAATAAACCATTTTGAGAAAATTCAGATCATCC\rCCAAAAGCTCTTGGTCCAATCATGAAGCCTCATCAGGGGTGAGCTCAGCA\rTGTCCATACCTGGGAAAGCCCTCCTTTTTCAGAAATGTGGTATGGCTTAT\rCAAAAAGAACAGTACATACCCAACAATAAAGAGGAGCTACAATAATACCA\rACCAAGAAGATCTTTTGGTACTGTGGGGGATTCACCATCCTAATGATGCG\rGCAGAGCAGATAAAGCTCTATCAAAACCCAACCACCTATATTTCCGTTGG\rAACATCAACACTAAACCAGAGATTGGTACCAAAAATAGCTACTAGATCCA\rAAGTAAACGGGCAAAGTGGAAGGATGGAGTTCTTCTGGACAATTTTAAAA\rCCGAATGATGCAATCAACTTCGAGAGTAATGGAAATTTCATTGCTCCAGA\rATATGCATACAAAATTGTCAAGAAAGGGGACTCAGCAATTATGAAAAGTG\rAATTGGAATATGGTAACTGCAACACCAAGTGTCAAACTCCAATGGGGGCG\rATAAACTCTAGTATGCCCTTCCACAACATACACCCTCTCACCATCGGGGA\rATGCCCCAAATATGTGAAATCAAACAGACTAGTCCTTGCGACTGGGCTCA\rGAAATAGCCCTCAAAGAGAGAGAAGAAGAAAAAAGAGAGGACTATTTGGA\rGCTATAGCGGGTTTTATAGAGGGAGGATGGCAGGGAATGGTAGATGGTTG\rGTATGGATACCACCATAGCAATGAGCAGGGGAGTGGATACGCTGCAGACA\rAAGAATCCACTCAAAAGGCAATAGATGGAGTCACCAATAAGGTCAACTCG\rATCATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATTTAA\rTAACTTAGAAAGGAGAATAGAGAATTTAAACAAGAAGATGGAAGACGGAT\rTCCTAGATGTCTGGACTTATAATGCTGAACTTCTAGTTCTCATGGAAAAT\rGAGAGAACTCTAG
ACTTTCATGACTCAAATGTCAAGAACCTTTACGACAA\rGGTCCGACTACAGCTTAGGGATAATGCAAAAGAGCTGGGTAACGGTTGTT\rTCGAGTTCTATCACAAATGTGATAATGAATGTATGGAAAGTGTAAGAAAC\rGGAACGTATGACTACTCGCAGTATTCAGAAGAAGCAAGACTAAAAAGAGA\rGGAAATAAGTGGAGTAAAATTGGAATCAATAGGAACTTACCAAATACTGT\rCAATTTATTCAACAGTGGCGAGTTCCCTAGCACTGGCAATCATGGTAGCT\rGGTCTATCTTTATGGATGTGCTCCAATGGTTCGTTACAATGT------\r>SWINE_ANHUI_1_2004\rATGGAGAAAATAGTGCTTCTTCTTGCAATAGTCAGTCTTGTTAAAGGTGA\rTCAGATTTGCACTGGTTACCATGCAAACAACTCGACAGAGCAGGTTGACA\rCAATAATGGAAAAGAACGTTACTGTTACACATGCTCAAGACATACTGGAA\rAAGACACACAACGGGAAGCTCTGCGACCTAGATGGAGTGAAGCCTCTAAT\rTTTAAGAGATTGTAGTGTAGCTGGATGGCTCCTCGGGAACCCAATGTGTG\rACGAATTCATCAATGTGCCGGAATGGTCTTACATAGTGGAGAAGGCCAAT\rCCAGCCAATGACCTCTGTTACCCAGGGGATTTCAACGACTATGAAGAACT\rGAAACACCTATTGAGCAGAATAAACCATTTTGAGAAAATTCAGATCATCC\rCCAAAAGTTCTTGGTCCGATCATGAAGCCTCATCAGGGGTGAGCTCAGCA\rTGTCCATACCAGGGAAGGTCCTCCTTTTTCAGAAATGTGGTATGGCTTAT\rCAAAAAGAACAGTGCATACCCAACAATAAAGAGGAGCTACAATAATACCA\rACCAAGAAGATCTTTTGGTACTGTGGGGGATTCACCACCCTAATGATGCG\rGCAGAGCAGATAAAGCTCTATCAAAACCCAACCACCTATATTTCCGTTGG\rGACATCAACACTAAACCAGAGATTGGTACCAAAAATAGCTACTAGATCCA\rAAGTAAACGGACAAAGTGGAAGAATGGAGTTCTTCTGGACAATTTTAAAA\rCCGAATGATGCTATCAATTTCGAGAGTAATGGAAATTTCATTGCTCCAGA\rATATGCATACAAAATTGTCAAGAAAGGGGACTCTGCAATTATGAAAAGTG\rAATTGGAATATGGCAACTGCAACACCAAGTGTCAAACTCCAGTGGGGGCG\rATAAATTCTAGCATGCCATTCCACAACATACACCCTCTCACCATCGGGGA\rATGCCCCAAATATGTGAAATCAAACAGATTAGTCCTTGCGACTGGACTCA\rGAAATGCCCCTCAAAGAGAGGGAAGAAGAAAAAAGAGAGGACTATTTGGA\rGCTATAGCAGGGTTTATAGAGGGAGGATGGCAGGGGATGGTAGATGGTTG\rGTATGGGTACCACCATAGCAATGAGCAGGGGAGTGGATACGCTGCAGACA\rAAGAATCCACTCAAAAAGCAATAGATGGAGTCACCAATAAGGTCAACTCG\rATCATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATTTAA\rTAACTTAGAAAGGAGAATAGAAAATTTAAACAAGAAGATGGAGGACGGAT\rTCCTAGATGTCTGGACTTATAATGCTGAACTTCTGGTTCTCATGGAAAAT\rGAGAGAACTCTAGACTTTCATGACTCAAATGTCAAGAACCTTTACGACAA\rGGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTT\rTCGAGTTCTATCACAGATGTGATAATGAATGTATGGAAAGTGTAAGAAAC\rGGAACGTATGACTACCCGCAGTATTCGGAAGAAGCAAGACTAAACAGAGA\rGGAAATAAGTGGAGTAAA
ATTGGAATCAATAGGAACTTACCAAATACTGT\rCAATTTATTCAACAGTGGCGAGTTCCCTAGCACTGGCAATCATGGTGGCT\rGGTCTATCTTTATGGATGTGCTCCAATGGATCGTTACAATGCAGAATT\r\n\\ No newline at end of file\n'
diff -r bece0bad8e89 -r 3285fd1f4bde test-data/meme-in1.nhx
--- a/test-data/meme-in1.nhx Mon Feb 17 14:52:11 2020 -0500
+++ b/test-data/meme-in1.nhx Thu Feb 20 18:10:47 2020 -0500
@@ -1,1 +1,1 @@
-((((((CHICKEN_HEBEI_326_2005:0.02100885319673648,(SWINE_ANHUI_1_2004:0.007702393698306516):0.002095219592954275):0.003887237703073042):0.003404921392531202,(((CHICKEN_HONGKONG_915_97,(GOOSE_HONGKONG_W355_97):0.002899766890966483):0.001306376767766534,HONGKONG_1_97_98:0.003844928589233716):0.000182535836694054,HONGKONG_1_538_97):0.02096173714686362):0.003130608143291779,(((GOOSE_SHANTOU_2216_2005:0.006207842095177651):0.002530613798219486,PEREGRINEFALCON_HK_D0028_2004:0.003608296348267232):0.003288900909856382,CK_HK_WF157_2003:0.00663129750258774):0.004655295319725731):0.0116808817874948,(((HUMAN_VIETNAM_CL105_2005:0.006521522005742001):0.001967887030302483,HUMAN_VIETNAM_3062_2004:0.001641397323851184):0.0003125512332168847,MALLARD_VIETNAM_16_2003:0.003244216605281072):0.0004125980823087554):0.0003400912533471183,CHICKEN_CK_160_2005:0.006168335080088849)
+((((((CHICKEN_HEBEI_326_2005:0.02100885319673648,(SWINE_ANHUI_1_2004:0.007702393698306516):0.002095219592954275):0.003887237703073042)):0.003130608143291779,(CK_HK_WF157_2003:0.00663129750258774):0.004655295319725731):0.0116808817874948,(MALLARD_VIETNAM_16_2003:0.003244216605281072):0.0004125980823087554):0.0003400912533471183,CHICKEN_CK_160_2005:0.006168335080088849)
diff -r bece0bad8e89 -r 3285fd1f4bde test-data/meme-out1.json
--- a/test-data/meme-out1.json Mon Feb 17 14:52:11 2020 -0500
+++ b/test-data/meme-out1.json Thu Feb 20 18:10:47 2020 -0500
[
b'@@ -4,568 +4,568 @@\n      "0":      [\n [0, 0, 1, 0, 0, 0, 1, 0, 0],\n       [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 0, 1.817789942174416, 1, 0, 0.6666666666666666, 0, 0.1215618554703716],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 0, 1.900735833403759, 1, 0, 0.6666666666666666, 0, 0.1271087320415013],\n-      [4.223295318660594, 0, 0.1812832787403597, 0, 0.8187167212596403, 0, 0.6666666666666666, 0, 0.1220041843130002],\n-      [0, 0, 0, 1.454313013998262, 1, 0, 0.6666666666666666, 0, 0.0972549051541481],\n-      [0, 0, 0, 3.87077558632765, 1, 0, 0.6666666666666666, 0, 0.2588520551613089],\n-      [5.621376790757513, 1.370160457631764, 0.9999999999999731, 0.7647952134694137, 2.686739719592879e-14, 0, 0.6666666666666666, 0, 0.254019836850675],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 0, 1.424462463669818, 1, 0, 0.6666666666666666, 0, 0.09525869635105789],\n-      [5.523939453749047, 0, 0.10495989356895, 0, 0.89504010643105, 0, 0.6666666666666666, 0, 0.1595776938144102],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 0, 1.857975844217296, 1, 0, 0.6666666666666666, 0, 0.1242492247327626],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [3.058609686328263, 0, 0.2099197871379, 0, 0.7900802128621, 0, 0.6666666666666666, 0, 0.08835829648556724],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [4.559377199598298, 0, 0.21, 0, 0.79, 0, 0.6666666666666666, 0, 0.1317130473340185],\n-      [0, 0, 1, 0, 0, 
0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [12.52835534598018, 0, 0.1028092111730828, 0, 0.8971907888269173, 0, 0.6666666666666666, 0, 0.3619239620814606],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [4.526013289893084, 0, 0.21, 0, 0.79, 0, 0.6666666666666666, 0, 0.1307492178402363],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [5.778235281128926, 0, 0.1048549336753811, 0, 0.895145066324619, 0, 0.6666666666666666, 0, 0.1669238897710598],\n-      [0, 0, 0, 1.811443451630381, 1, 0, 0.6666666666666666, 0, 0.1211374438547285],\n-      [4.16975333524863, 0, 0.1049198024571935, 0, 0.8950801975428065, 0, 0.6666666666666666, 0, 0.1204574428422312],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [6.067135083387604, 6.067135083387516, 0.313721188257573, 0, 0.686278811742427, 0, 0.6666666666666666, 0, 0.3025558868541813],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [12.64639196707219, 0, 0.1815339017197267, 0, 0.8184660982802733, 0, 0.6666666666666666, 0, 0.3653338495245107],\n-      [11.47465033410186, 1.6250977062539, 0.6784748819617452, 1.547845247613481, 0.3215251180382548, 0, 0.6666666666666666, 0, 0.4384989474073118],\n-      [2.150572226072722, 2.150572226072722, 0, 3.001300331424568, 1, 0, 0.6666666666666666, 0, 0.2628338207031371],\n-      [0, 0, 0, 1.596774706161373, 1, 0, 0.6666666666666666, 0, 0.1067818077026798],\n-      [4.743412316579805, 0, 0.0586026566410284, 0, 0.9413973433589716, 0, 0.6666666666666666, 0, 0.1370295247854218],\n-      [0, 0, 1, 0, 0, 0, 1, 0, 0],\n-      [3.682413523444095, 0, 
0.10495989356895, 0, 0.89504010643105, 0, 0.6666666666666666, 0, 0.1063789823661781'..b'[0.2307510906732697],\n+      [0.2365287112368824] \n       ],\n-     "Log Likelihood":-3679.541894127126,\n+     "Log Likelihood":-3056.215005465311,\n      "Rate Distributions":{\n-       "Substitution rate from nucleotide A to nucleotide C":0.2089484190839437,\n+       "Substitution rate from nucleotide A to nucleotide C":0.1810796022573985,\n        "Substitution rate from nucleotide A to nucleotide G":1,\n-       "Substitution rate from nucleotide A to nucleotide T":0.1069376934774225,\n-       "Substitution rate from nucleotide C to nucleotide G":0.01756475048192434,\n-       "Substitution rate from nucleotide C to nucleotide T":1.483513244796427,\n-       "Substitution rate from nucleotide G to nucleotide T":0.1186889750817524\n+       "Substitution rate from nucleotide A to nucleotide T":0.08708610799393901,\n+       "Substitution rate from nucleotide C to nucleotide G":0.02931879134098465,\n+       "Substitution rate from nucleotide C to nucleotide T":1.419136867291551,\n+       "Substitution rate from nucleotide G to nucleotide T":0.1065442350926398\n       },\n      "display order":0,\n-     "estimated parameters":36\n+     "estimated parameters":20\n     }\n   },\n  "input":{\n-   "file name":"/tmp/tmpu23i1b/job_working_directory/000/3/working/meme_input.fa",\n-   "number of sequences":13,\n+   "file name":"/tmp/tmpdqyv1aux/job_working_directory/000/3/working/./meme_input.fa",\n+   "number of sequences":5,\n    "number of sites":566,\n    "partition count":1,\n    "trees":{\n-     
"0":"(((((CHICKEN_HEBEI_326_2005:0.02100885319673648,(SWINE_ANHUI_1_2004:0.007702393698306516)Node7:0.002095219592954275)Node5:0.003887237703073042)Node4:0.003404921392531202,(((CHICKEN_HONGKONG_915_97:-1,(GOOSE_HONGKONG_W355_97:-1)Node13:0.002899766890966483)Node11:0.001306376767766534,HONGKONG_1_97_98:0.003844928589233716)Node10:0.000182535836694054,HONGKONG_1_538_97:-1)Node9:0.02096173714686362)Node3:0.003130608143291779,(((GOOSE_SHANTOU_2216_2005:0.006207842095177651)Node19:0.002530613798219486,PEREGRINEFALCON_HK_D0028_2004:0.003608296348267232)Node18:0.003288900909856382,CK_HK_WF157_2003:0.00663129750258774)Node17:0.004655295319725731)Node2:0.0116808817874948,(((HUMAN_VIETNAM_CL105_2005:0.006521522005742001)Node25:0.001967887030302483,HUMAN_VIETNAM_3062_2004:0.001641397323851184)Node24:0.0003125512332168847,MALLARD_VIETNAM_16_2003:0.003244216605281072)Node23:0.0004125980823087554,CHICKEN_CK_160_2005:0.006168335080088849)"\n+     "0":"(((((CHICKEN_HEBEI_326_2005:0.02100885319673648,(SWINE_ANHUI_1_2004:0.007702393698306516)Node7:0.002095219592954275)Node5:0.003887237703073042)Node4:-1)Node3:0.003130608143291779,(CK_HK_WF157_2003:0.00663129750258774)Node9:0.004655295319725731)Node2:0.0116808817874948,(MALLARD_VIETNAM_16_2003:0.003244216605281072)Node11:0.0004125980823087554,CHICKEN_CK_160_2005:0.006168335080088849)"\n     }\n   },\n  "tested":{\n    "0":{\n      "CHICKEN_CK_160_2005":"test",\n      "CHICKEN_HEBEI_326_2005":"test",\n-     "CHICKEN_HONGKONG_915_97":"test",\n      "CK_HK_WF157_2003":"test",\n-     "GOOSE_HONGKONG_W355_97":"test",\n-     "GOOSE_SHANTOU_2216_2005":"test",\n-     "HONGKONG_1_538_97":"test",\n-     "HONGKONG_1_97_98":"test",\n-     "HUMAN_VIETNAM_3062_2004":"test",\n-     "HUMAN_VIETNAM_CL105_2005":"test",\n      "MALLARD_VIETNAM_16_2003":"test",\n-     "Node10":"test",\n      "Node11":"test",\n-     "Node13":"test",\n-     "Node17":"test",\n-     "Node18":"test",\n-     "Node19":"test",\n      "Node2":"test",\n-     "Node23":"test",\n- 
    "Node24":"test",\n-     "Node25":"test",\n      "Node3":"test",\n      "Node4":"test",\n      "Node5":"test",\n      "Node7":"test",\n      "Node9":"test",\n-     "PEREGRINEFALCON_HK_D0028_2004":"test",\n      "SWINE_ANHUI_1_2004":"test"\n     }\n   },\n  "timers":{\n    "MEME analysis":{\n      "order":2,\n-     "timer":249\n+     "timer":437\n     },\n    "Model fitting":{\n      "order":1,\n-     "timer":2\n+     "timer":4\n     },\n    "Total time":{\n      "order":0,\n-     "timer":252\n+     "timer":442\n     }\n   }\n }\n\\ No newline at end of file\n'