Repository 'ncbi_entrez_direct_efetch'
hg clone https://toolshed.g2.bx.psu.edu/repos/iuc/ncbi_entrez_direct_efetch

Changeset 0:91efba463050 (2022-03-22)
Next changeset 1:4d0b8d9aee09 (2022-12-17)
Commit message:
"planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/ncbi_entrez_direct commit 8f96f378620bb663dcce2845ecb14355413f7afa"
added:
README.rst
__efetch_build_options.py
efetch.xml
macros.xml
diff -r 000000000000 -r 91efba463050 README.rst
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/README.rst Tue Mar 22 22:30:08 2022 +0000
@@ -0,0 +1,17 @@
+Galaxy NCBI Entrez Direct Tools
+===============================
+
+This repo requires a readme as administrators should be very aware of some
+restrictions NCBI places on the use of the Entrez service.
+
+NCBI requests that you please limit large jobs to either weekends or
+between 9:00 PM and 5:00 AM Eastern time during weekdays. This is not a
+request that the Galaxy tool can easily service, so we've included it in
+the disclaimer on every tool quite prominently.
+
+Failure to comply with NCBI's policies may result in a block.
+
+Note that these are *IP* level blocks, so the Galaxy tools use a
+concatenation of the administrators' emails and the user's email, in
+hopes that NCBI will contact all relevant parties should their system be
+abused.
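
The contact string described in the README can be sketched in plain Python. This is only an illustration of the joining rule that the @ECONTACT@ token in macros.xml expresses in Cheetah; the function name and the example addresses are hypothetical and are not part of the repository.

```python
def build_contact_email(admin_users: str, user_email: str = "") -> str:
    """Join the admin addresses and the optional user address with ';'."""
    # Assumption: the admin list arrives as one comma-separated string,
    # which is what the split(',') in the @ECONTACT@ token implies.
    contact = ";".join(admin_users.split(","))
    if user_email:
        # The user's address is appended only when one is set.
        contact = contact + ";" + user_email
    return contact


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    print(build_contact_email("admin1@example.org,admin2@example.org",
                              "user@example.org"))
```

The resulting string is what the tools pass to econtact as the -email value, so that NCBI has every relevant address in one place.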
diff -r 000000000000 -r 91efba463050 __efetch_build_options.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/__efetch_build_options.py Tue Mar 22 22:30:08 2022 +0000
@@ -0,0 +1,225 @@
+#!/usr/bin/env python
+
+# Daniel Blankenberg
+# Creates the options for tool interface
+
+# http://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi
+db_list = '''<DbName>pubmed</DbName>
+<DbName>protein</DbName>
+<DbName>nuccore</DbName>
+<DbName>nucleotide</DbName>
+<DbName>nucgss</DbName>
+<DbName>nucest</DbName>
+<DbName>structure</DbName>
+<DbName>genome</DbName>
+<DbName>annotinfo</DbName>
+<DbName>assembly</DbName>
+<DbName>bioproject</DbName>
+<DbName>biosample</DbName>
+<DbName>blastdbinfo</DbName>
+<DbName>books</DbName>
+<DbName>cdd</DbName>
+<DbName>clinvar</DbName>
+<DbName>clone</DbName>
+<DbName>gap</DbName>
+<DbName>gapplus</DbName>
+<DbName>grasp</DbName>
+<DbName>dbvar</DbName>
+<DbName>gene</DbName>
+<DbName>gds</DbName>
+<DbName>geoprofiles</DbName>
+<DbName>homologene</DbName>
+<DbName>medgen</DbName>
+<DbName>mesh</DbName>
+<DbName>ncbisearch</DbName>
+<DbName>nlmcatalog</DbName>
+<DbName>omim</DbName>
+<DbName>orgtrack</DbName>
+<DbName>pmc</DbName>
+<DbName>popset</DbName>
+<DbName>probe</DbName>
+<DbName>proteinclusters</DbName>
+<DbName>pcassay</DbName>
+<DbName>biosystems</DbName>
+<DbName>pccompound</DbName>
+<DbName>pcsubstance</DbName>
+<DbName>pubmedhealth</DbName>
+<DbName>seqannot</DbName>
+<DbName>snp</DbName>
+<DbName>sra</DbName>
+<DbName>taxonomy</DbName>
+<DbName>unigene</DbName>
+<DbName>gencoll</DbName>
+<DbName>gtr</DbName>'''.replace("<DbName>", "").replace("</DbName>", "").split("\n")
+
+
+help = '''  (all)
+                 docsum                      DocumentSummarySet XML
+                 docsum             json     DocumentSummarySet JSON
+                 full                        Same as native except for mesh
+                 uid                         Unique Identifier List
+                 url                         Entrez URL
+                 xml                         Same as -format full -mode xml
+
+  bioproject
+                 native                      BioProject Report
+                 native             xml      RecordSet XML
+
+  biosample
+                 native                      BioSample Report
+                 native             xml      BioSampleSet XML
+
+  biosystems
+                 native             xml      Sys-set XML
+
+  gds
+                 native             xml      RecordSet XML
+                 summary                     Summary
+
+  gene
+                 gene_table                  Gene Table
+                 native                      Gene Report
+                 native             asn.1    Entrezgene ASN.1
+                 native             xml      Entrezgene-Set XML
+                 tabular                     Tabular Report
+
+  homologene
+                 alignmentscores             Alignment Scores
+                 fasta                       FASTA
+                 homologene                  Homologene Report
+                 native                      Homologene List
+                 native             asn.1    HG-Entry ASN.1
+                 native             xml      Entrez-Homologene-Set XML
+
+  mesh
+                 full                        Full Record
+                 native                      MeSH Report
+                 native             xml      RecordSet XML
+
+  nlmcatalog
+                 native                      Full Record
+                 native             xml      NLMCatalogRecordSet XML
+
+  pmc
+                 medline                     MEDLINE
+                 native             xml      pmc-articleset XML
+
+  pubmed
+                 abstract                    Abstract
+                 medline                     MEDLINE
+                 native             asn.1    Pubmed-entry ASN.1
+                 native             xml      PubmedArticleSet XML
+
+  (sequences)
+                 acc                         Accession Number
+                 est                         EST Report
+                 fasta                       FASTA
+    ..   INSDSet XML
+                 gss                         GSS Report
+                 ipg                         Identical Protein Report
+                 ipg                xml      IPGReportSet XML
+                 native             text     Seq-entry ASN.1
+                 native             xml      Bioseq-set XML
+                 seqid                       Seq-id ASN.1
+
+  snp
+                 chr                         Chromosome Report
+                 docset                      Summary
+                 fasta                       FASTA
+                 flt                         Flat File
+                 native             asn.1    Rs ASN.1
+                 native             xml      ExchangeSet XML
+                 rsr                         RS Cluster Report
+                 ssexemplar                  SS Exemplar List
+
+  sra
+                 native             xml      EXPERIMENT_PACKAGE_SET XML
+                 runinfo            xml      SraRunInfo XML
+
+  structure
+                 mmdb                        Ncbi-mime-asn1 strucseq ASN.1
+                 native                      MMDB Report
+                 native             xml      RecordSet XML
+
+  taxonomy
+                 native                      Taxonomy List
+                 native             xml      TaxaSet XML'''.split("\n")
+
+db = {}
+name = None
+all = "(all)"
+for line in help:
+    if line.strip() and line[2] != ' ':
+        name = line.strip()
+        db[name] = {}
+    elif line.strip():
+        format = line[0:len("                 docsum             ")].strip()
+        mode = line[len("                 docsum             "):len("                 docsum             json     ")].strip()
+        if format not in db[name]:
+            db[name][format] = []
+        db[name][format].append(mode)
+
+for name in db_list:
+    if name not in db:
+        db[name] = {}
+
+db["sequences"] = db["(sequences)"]
+del db["(sequences)"]
+
+print('<conditional name="db">')
+print('    <param name="db" type="select" label="Database" argument="-db">')
+for name in sorted(db.keys()):
+    if name == all:
+        continue
+    print('        <option value="%s">%s</option>' % (name, name))
+print('        <option value="">Manual Entry</option>')
+print('    </param>')
+
+for name in sorted(db.keys()):
+    if name == all:
+        continue
+    my_dict = db[all].copy()
+
+    for format, modes in db[name].items():
+        if format in my_dict:
+            for mode in modes:
+                if mode not in my_dict[format]:
+                    my_dict[format].append(mode)
+        else:
+            my_dict[format] = modes
+    if "" not in my_dict:
+        my_dict[""] = [""]
+    print('    <when value="%s">' % name)
+    print('        <conditional name="format">')
+    print('            <param name="format" type="select" label="Format" argument="-format">')
+    for format in sorted(my_dict.keys()):
+        print('                <option value="%s">%s</option>' % (format, format or "None"))
+    print('            </param>')
+    for format in sorted(my_dict.keys()):
+        print('            <when value="%s">' % format)
+        print('                <param name="mode" type="select" label="Mode" argument="-mode">')
+        if "" not in my_dict[format]:
+            my_dict[format].append("")
+        for mode in sorted(my_dict[format]):
+            print('                    <option value="%s">%s</option>' % (mode, mode or "None"))
+        print('                </param>')
+        print('            </when>')
+    print('        </conditional>')
+    print('    </when>')
+print('    <when value="">')
+print('        <param name="db_manual" type="text" label="Database" argument="-db"/>')
+print('        <param name="format" type="text" label="Format" argument="-format"/>')
+print('        <param name="mode" type="text" label="Mode" argument="-mode"/>')
+print('    </when>')
+print('</conditional>')
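
The build script above reads the efetch help table by fixed column positions instead of splitting on whitespace, because the mode column is often empty. The sketch below isolates that step on a small excerpt. The helper names (row, parse_help) and the trimmed sample are hypothetical; the column widths are derived from the same "docsum" padding strings the script uses.

```python
# Column boundaries, measured the same way the build script measures them.
FORMAT_END = len("                 docsum             ")
MODE_END = len("                 docsum             json     ")


def row(fmt: str, mode: str, description: str) -> str:
    """Build one fixed-width help row (hypothetical helper for the sample)."""
    return " " * 17 + fmt.ljust(FORMAT_END - 17) + mode.ljust(MODE_END - FORMAT_END) + description


SAMPLE = [
    "  gene",
    row("gene_table", "", "Gene Table"),
    row("native", "asn.1", "Entrezgene ASN.1"),
    row("native", "xml", "Entrezgene-Set XML"),
    "",
    "  sra",
    row("runinfo", "xml", "SraRunInfo XML"),
]


def parse_help(lines):
    """Map database name -> format -> list of modes ('' means no mode)."""
    db = {}
    name = None
    for line in lines:
        if not line.strip():
            continue  # blank separator between database sections
        if line[2] != " ":
            # Section headers are indented by exactly two spaces.
            name = line.strip()
            db[name] = {}
        else:
            fmt = line[:FORMAT_END].strip()
            mode = line[FORMAT_END:MODE_END].strip()
            db[name].setdefault(fmt, []).append(mode)
    return db


if __name__ == "__main__":
    print(parse_help(SAMPLE))
```

Slicing by position keeps a row such as gene_table, which has no mode, from having its description mistaken for one.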
diff -r 000000000000 -r 91efba463050 efetch.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/efetch.xml Tue Mar 22 22:30:08 2022 +0000
@@ -0,0 +1,2581 @@
+<tool id="ncbi_entrez_direct_efetch" name="NCBI EFetch" version="@TOOL_VERSION@">
+  <description>fetch records from NCBI</description>
+  <macros>
+    <import>macros.xml</import>
+  </macros>
+  <expand macro="requirements"/>
+  <version_command>efetch -version</version_command>
+  <command detect_errors="exit_code"><![CDATA[
+      @ECONTACT@
+      efetch
+      #if str( $db.db ) == "":
+          #if str( $db.db_manual ):
+              -db "${db.db_manual}"
+          #end if
+          #if str( $db.format ):
+              -format "${db.format}"
+          #end if
+          #if str( $db.mode ):
+              -mode "${db.mode}"
+          #end if
+      #else:
+        #if str( $db.db ):
+            -db "${db.db}"
+        #end if
+        #if str( $db.format.format ):
+            -format "${db.format.format}"
+        #end if
+        #if str( $db.format.mode ):
+            -mode "${db.format.mode}"
+        #end if
+      #end if
+      #if str( $ranges.seq_start ):
+          -seq_start "${ranges.seq_start}"
+      #end if
+      #if str( $ranges.seq_stop ):
+          -seq_stop "${ranges.seq_stop}"
+      #end if
+      #if str( $ranges.strand ):
+          -strand "${ranges.strand}"
+      #end if
+      #if str( $ranges.complexity ):
+          -complexity "${ranges.complexity}"
+      #end if
+      #if str( $ranges.chr_start ):
+          -chr_start "${ranges.chr_start}"
+      #end if
+      #if str( $ranges.chr_stop ):
+          -chr_stop "${ranges.chr_stop}"
+      #end if
+      #if str( $query.source ) == "id":
+          -id "${query.id}"
+      #else:
+          < '$query.input_file'
+      #end if
+      > '${output_result}'
+  ]]>
+  </command>
+  <inputs>
+    <conditional name="query">
+      <param name="source" type="select" label="Select query source">
+        <option value="history">NCBI WebEnv History</option>
+        <option value="id">Direct Entry</option>
+      </param>
+      <when value="history">
+        <param label="Input File" name="input_file" type="data" format="xml"/>
+      </when>
+      <when value="id">
+        <param label="Query ID" name="id" type="text" argument="-id"/>
+      </when>
+    </conditional>
+
+<conditional name="db">
+    <param name="db" type="select" label="Database" argument="-db">
+        <option value="annotinfo">annotinfo</option>
+        <option value="assembly">assembly</option>
+        <option value="bioproject">bioproject</option>
+        <option value="biosample">biosample</option>
+        <option value="biosystems">biosystems</option>
+        <option value="blastdbinfo">blastdbinfo</option>
+        <option value="books">books</option>
+        <option value="cdd">cdd</option>
+        <option value="clinvar">clinvar</option>
+        <option value="clone">clone</option>
+        <option value="dbvar">dbvar</option>
+        <option value="gap">gap</option>
+        <option value="gapplus">gapplus</option>
+        <option value="gds">gds</option>
+        <option value="gencoll">gencoll</option>
+        <option value="gene">gene</option>
+        <option value="genome">genome</option>
+        <option value="geoprofiles">geoprofiles</option>
+        <option value="grasp">grasp</option>
+        <option value="gtr">gtr</option>
+        <option value="homologene">homologene</option>
+        <option value="medgen">medgen</option>
+        <option value="mesh">mesh</option>
+        <option value="ncbisearch">ncbisearch</option>
+        <option value="nlmcatalog">nlmcatalog</option>
+        <option value="nuccore">nuccore</option>
+        <option value="nucest">nucest</option>
+        <option value="nucgss">nucgss</option>
+        <option value="nucleotide">nucleotide</option>
+        <option value="omim">omim</option>
+        <option value="orgtrack">orgtrack</option>
+        <option value="pcassay">pcassay</option>
+        <option value="pccompound">pccompound</option>
+        <option value="pcsubstance..            </when>
+            <when value="xml">
+                <param name="mode" type="select" label="Mode" argument="-mode">
+                    <option value="">None</option>
+                </param>
+            </when>
+        </conditional>
+    </when>
+    <when value="">
+        <param name="db_manual" type="text" label="Database" argument="-db"/>
+        <param name="format" type="text" label="Format" argument="-format"/>
+        <param name="mode" type="text" label="Mode" argument="-mode"/>
+    </when>
+</conditional>
+
+
+
+
+    <section name="ranges" title="Set Ranges" expanded="False">
+        <param label="Seq Start" name="seq_start" type="integer" optional="True" min="0" argument="-seq_start"/>
+        <param label="Seq End" name="seq_stop" type="integer" optional="True" min="0" argument="-seq_stop"/>
+        <param label="strand" name="strand" type="text" argument="-strand"/>
+        <param label="Complexity" name="complexity" type="integer" optional="True" min="0" argument="-complexity"/>
+        <param label="Chr Start" name="chr_start" type="integer" optional="True" min="0" argument="-chr_start"/>
+        <param label="Chr End" name="chr_stop" type="integer" optional="True" min="0" argument="-chr_stop"/>
+    </section>
+  </inputs>
+  <outputs>
+    <data format="txt" name="output_result">
+      <change_format>
+        <when input="db.format" value="xml" format="xml"/>
+        <when input="db.format.mode" value="xml" format="xml"/>
+        <when input="db.format.mode" value="json" format="json"/>
+        <when input="db.format" value="tabular" format="tabular"/>
+        <when input="db.format" value="fasta" format="fasta"/>
+        <when input="db.format" value="fasta_cds_aa" format="fasta"/>
+        <when input="db.format" value="fasta_cds_na" format="fasta"/>
+        <when input="db.format" value="gene_fasta" format="fasta"/>
+      </change_format>
+    </data>
+  </outputs>
+  <tests>
+    <test>
+      <param name="db|db" value=""/>
+      <param name="db|db_manual" value="sra"/>
+      <param name="db|format" value="runinfo"/>
+      <param name="query|source" value="id"/>
+      <param name="query|id" value="SRX8542266"/>
+      <output name="output_result">
+          <assert_contents>
+              <has_text_matching expression="Run" />
+          </assert_contents>
+      </output>
+    </test>
+  </tests>
+  <help><![CDATA[
+NCBI Entrez EFetch
+==================
+
+Responds to a list of UIDs in a given database with the corresponding data
+records in a specified format.
+
+Example Queries
+---------------
+
+Fetch PMIDs 17284678 and 9997 as text abstracts:
+
++----------------------+--------------------------------------+
+| Parameter            | Value                                |
++======================+======================================+
+| NCBI Database to Use | PubMed                               |
++----------------------+--------------------------------------+
+| ID List              | 17284678 9997                        |
++----------------------+--------------------------------------+
+| Output Format        | Abstract                             |
++----------------------+--------------------------------------+
+
+Fetch FASTA for a transcript and its protein product (GIs 312836839 and 34577063)
+
++----------------------+--------------------------------------+
+| Parameter            | Value                                |
++======================+======================================+
+| NCBI Database to Use | Protein                              |
++----------------------+--------------------------------------+
+| ID List              | 312836839 34577063                   |
++----------------------+--------------------------------------+
+| Output Format        | Fasta                                |
++----------------------+--------------------------------------+
+
+@DISCLAIMER@
+      ]]></help>
+  <expand macro="citations"/>
+</tool>
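
The command template in efetch.xml emits a flag only when its parameter is set. A rough Python equivalent of that rule is sketched below for the direct-entry (-id) case. The function name and signature are hypothetical. The flags themselves (-db, -format, -mode, -seq_start, -seq_stop, -id) are the ones that appear in the template, and the sample values come from the tool's own test case.

```python
import shlex


def build_efetch_args(db, fmt="", mode="", ids="", seq_start=None, seq_stop=None):
    """Assemble an efetch argument list, skipping parameters left empty."""
    args = ["efetch"]
    # Text parameters: an empty string means "not set".
    for flag, value in (("-db", db), ("-format", fmt), ("-mode", mode)):
        if value:
            args += [flag, value]
    # Optional integers: None means "not set"; 0 is passed through,
    # matching the template's str(...) truthiness check.
    for flag, value in (("-seq_start", seq_start), ("-seq_stop", seq_stop)):
        if value is not None:
            args += [flag, str(value)]
    if ids:
        args += ["-id", ids]
    return args


if __name__ == "__main__":
    # Values taken from the <test> block in efetch.xml.
    args = build_efetch_args("sra", fmt="runinfo", ids="SRX8542266")
    print(shlex.join(args))
```

Building a list and quoting it with shlex.join avoids hand-written quoting. The template itself instead wraps each value in double quotes.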
diff -r 000000000000 -r 91efba463050 macros.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/macros.xml Tue Mar 22 22:30:08 2022 +0000
@@ -0,0 +1,102 @@
+<macros>
+    <token name="@TOOL_VERSION@">13.3</token>
+    <xml name="requirements">
+        <requirements>
+            <requirement type="package" version="@TOOL_VERSION@">entrez-direct</requirement>
+        </requirements>
+    </xml>
+    <token name="@ECONTACT@"><![CDATA[
+        #set $__contact_email__ = ';'.join( str( $__admin_users__ ).split( ',' ) )
+        #if str( $__user_email__ ):
+            #set $__contact_email__ = $__contact_email__ + ";" + str( $__user_email__ )
+        #end if
+        econtact -email "${ __contact_email__ }" -tool "galaxy_ncbi_entrez_direct" > /dev/null ;
+        ]]>        
+    </token>
+    <token name="@DISCLAIMER@"><![CDATA[
+Usage Guidelines and Requirements
+=================================
+
+Frequency, Timing, and Registration of E-utility URL Requests
+-------------------------------------------------------------
+
+In order not to overload the E-utility servers, NCBI recommends that users
+limit large jobs to either weekends or between 9:00 PM and 5:00 AM Eastern time
+during weekdays. Failure to comply with this policy may result in an IP address
+being blocked from accessing NCBI.
+
+Minimizing the Number of Requests
+---------------------------------
+
+If a task requires searching for and/or downloading a large number of
+records, it is much more efficient to use the Entrez History to upload
+and/or retrieve these records in batches rather than using separate
+requests for each record. Please refer to Application 3 in Chapter 3
+for an example. Many thousands of IDs can be uploaded using a single
+EPost request, and several hundred records can be downloaded using one
+EFetch request.
+
+
+Disclaimer and Copyright Issues
+-------------------------------
+
+In accordance with requirements of NCBI's E-Utilities, we must provide
+the following disclaimer:
+
+Please note that abstracts in PubMed may incorporate material that may
+be protected by U.S. and foreign copyright laws. All persons
+reproducing, redistributing, or making commercial use of this
+information are expected to adhere to the terms and conditions asserted
+by the copyright holder. Transmission or reproduction of protected
+items beyond that allowed by fair use (PDF) as defined in the copyright
+laws requires the written permission of the copyright owners. NLM
+provides no legal advice concerning distribution of copyrighted
+materials. Please consult your legal counsel. If you wish to do a large
+data mining project on PubMed data, you can enter into a licensing
+agreement and lease the data for free from NLM. For more information on
+
+The `full disclaimer <https://www.ncbi.nlm.nih.gov/home/about/policies/>`__ is available on
+their website
+
+Liability
+~~~~~~~~~
+
+For documents and software available from this server, the
+U.S. Government does not warrant or assume any legal liability or
+responsibility for the accuracy, completeness, or usefulness of any
+information, apparatus, product, or process disclosed.
+
+Endorsement
+~~~~~~~~~~~
+
+NCBI does not endorse or recommend any commercial
+products, processes, or services. The views and opinions of authors
+expressed on NCBI's Web sites do not necessarily state or reflect those
+of the U.S. Government, and they may not be used for advertising or
+product endorsement purposes.
+
+External Links
+~~~~~~~~~~~~~~
+
+Some NCBI Web pages may provide links to other Internet
+sites for the convenience of users. NCBI is not responsible for the
+availability or content of these external sites, nor does NCBI endorse,
+warrant, or guarantee the products, services, or information described
+or offered at these other Internet sites. Users cannot assume that the
+external sites will abide by the same Privacy Policy to which NCBI
+adheres. It is the responsibility of the user to examine the copyright
+and licensing restrictions of linked pages and to secure all necessary
+permissions.
+        ]]></token>
+    <xml name="citations">
+        <citations>
+            <citation type="bibtex">@Book{ncbiEDirect,
+          author = {Jonathan Kans},
+          title = {Entrez Direct: E-utilities on the UNIX Command Line},
+          year = {2013},
+          publisher = {National Center for Biotechnology Information, Bethesda, Maryland},
+          note = {http://www.ncbi.nlm.nih.gov/books/NBK179288/}
+            }</citation>
+        </citations>
+    </xml>
+</macros>
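
The timing guideline in the disclaimer (large jobs on weekends, or between 9:00 PM and 5:00 AM Eastern on weekdays) is not enforced by the tool. An administrator who wants to gate large jobs could check the window along the lines below. This is a hedged sketch: the function name is hypothetical, and it assumes the datetime passed in is already expressed in US Eastern local time, so no time zone conversion is done.

```python
from datetime import datetime, time

# Weekday window boundaries taken from the disclaimer text.
WINDOW_START = time(21, 0)  # 9:00 PM
WINDOW_END = time(5, 0)     # 5:00 AM


def is_off_peak(eastern_local: datetime) -> bool:
    """Return True if the given Eastern-time moment is in the requested window."""
    # Monday is 0, so 5 and 6 are Saturday and Sunday.
    if eastern_local.weekday() >= 5:
        return True
    now = eastern_local.time()
    # The weekday window wraps around midnight.
    return now >= WINDOW_START or now < WINDOW_END


if __name__ == "__main__":
    print(is_off_peak(datetime(2022, 3, 22, 22, 30)))  # a Tuesday evening
    print(is_off_peak(datetime(2022, 3, 22, 12, 0)))   # a Tuesday at noon
```

A production check would convert from UTC with a proper time zone database instead of trusting the caller to supply Eastern time.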