
Changeset 0:91efba463050 (2022-03-22)

Next changeset 1:4d0b8d9aee09 (2022-12-17)

Commit message:
"planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/ncbi_entrez_direct commit 8f96f378620bb663dcce2845ecb14355413f7afa"

added:
README.rst
__efetch_build_options.py
efetch.xml
macros.xml

diff -r 000000000000 -r 91efba463050 README.rst
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/README.rst Tue Mar 22 22:30:08 2022 +0000

@@ -0,0 +1,17 @@
+Galaxy NCBI Entrez Direct Tools
+===============================
+
+This repo requires a readme as administrators should be very aware of some
+restrictions NCBI places on the use of the Entrez service.
+
+NCBI requests that you please limit large jobs to either weekends or
+between 9:00 PM and 5:00 AM Eastern time during weekdays. This is not a
+request that the Galaxy tool can easily service, so we've included it in
+the disclaimer on every tool quite prominently.
+
+Failure to comply with NCBI's policies may result in a block.
+
+Note that these are *IP* level blocks so the Galaxy tools use a
+concatenation of the administrator's emails, and the user email, in
+hopes that NCBI will contact all relevant parties should their system be
+abused.
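The contact string described in the README is assembled by the `@ECONTACT@` token in macros.xml. A minimal Python sketch of the same joining logic follows, assuming the admin list arrives as one comma-separated string (as the template's `split( ',' )` suggests). The function name `build_contact_email` and the example addresses are hypothetical and not part of the tool.

```python
def build_contact_email(admin_users: str, user_email: str = "") -> str:
    """Join admin and user addresses with ';', mirroring the @ECONTACT@ token."""
    # Assumption: admin_users is a comma-separated string of addresses.
    contact = ";".join(admin_users.split(","))
    # The user's address is appended only when one is available.
    if user_email:
        contact = contact + ";" + user_email
    return contact


if __name__ == "__main__":
    print(build_contact_email("admin1@example.org,admin2@example.org",
                              "user@example.org"))
```

The resulting string is what the token passes to `econtact -email`, so NCBI has a chance of reaching both the administrators and the user behind a request.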

diff -r 000000000000 -r 91efba463050 __efetch_build_options.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/__efetch_build_options.py Tue Mar 22 22:30:08 2022 +0000


b'@@ -0,0 +1,225 @@\n+#!/usr/bin/env python\n+\n+# Daniel Blankenberg\n+# Creates the options for tool interface\n+\n+# http://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi\n+db_list = \'\'\'<DbName>pubmed</DbName>\n+<DbName>protein</DbName>\n+<DbName>nuccore</DbName>\n+<DbName>nucleotide</DbName>\n+<DbName>nucgss</DbName>\n+<DbName>nucest</DbName>\n+<DbName>structure</DbName>\n+<DbName>genome</DbName>\n+<DbName>annotinfo</DbName>\n+<DbName>assembly</DbName>\n+<DbName>bioproject</DbName>\n+<DbName>biosample</DbName>\n+<DbName>blastdbinfo</DbName>\n+<DbName>books</DbName>\n+<DbName>cdd</DbName>\n+<DbName>clinvar</DbName>\n+<DbName>clone</DbName>\n+<DbName>gap</DbName>\n+<DbName>gapplus</DbName>\n+<DbName>grasp</DbName>\n+<DbName>dbvar</DbName>\n+<DbName>gene</DbName>\n+<DbName>gds</DbName>\n+<DbName>geoprofiles</DbName>\n+<DbName>homologene</DbName>\n+<DbName>medgen</DbName>\n+<DbName>mesh</DbName>\n+<DbName>ncbisearch</DbName>\n+<DbName>nlmcatalog</DbName>\n+<DbName>omim</DbName>\n+<DbName>orgtrack</DbName>\n+<DbName>pmc</DbName>\n+<DbName>popset</DbName>\n+<DbName>probe</DbName>\n+<DbName>proteinclusters</DbName>\n+<DbName>pcassay</DbName>\n+<DbName>biosystems</DbName>\n+<DbName>pccompound</DbName>\n+<DbName>pcsubstance</DbName>\n+<DbName>pubmedhealth</DbName>\n+<DbName>seqannot</DbName>\n+<DbName>snp</DbName>\n+<DbName>sra</DbName>\n+<DbName>taxonomy</DbName>\n+<DbName>unigene</DbName>\n+<DbName>gencoll</DbName>\n+<DbName>gtr</DbName>\'\'\'.replace("<DbName>", "").replace("</DbName>", "").split("\\n")\n+\n+\n+help = \'\'\' (all)\n+ docsum DocumentSummarySet XML\n+ docsum json DocumentSummarySet JSON\n+ full Same as native except for mesh\n+ uid Unique Identifier List\n+ url Entrez URL\n+ xml Same as -format full -mode xml\n+\n+ bioproject\n+ native BioProject Report\n+ native xml RecordSet XML\n+\n+ biosample\n+ native BioSample Report\n+ native xml BioSampleSet XML\n+\n+ biosystems\n+ native xml Sys-set XML\n+\n+ gds\n+ native xml RecordSet XML\n+ summary 
Summary\n+\n+ gene\n+ gene_table Gene Table\n+ native Gene Report\n+ native asn.1 Entrezgene ASN.1\n+ native xml Entrezgene-Set XML\n+ tabular Tabular Report\n+\n+ homologene\n+ alignmentscores Alignment Scores\n+ fasta FASTA\n+ homologene Homologene Report\n+ native Homologene List\n+ native asn.1 HG-Entry ASN.1\n+ native xml Entrez-Homologene-Set XML\n+\n+ mesh\n+ full Full Record\n+ native MeSH Report\n+ native xml RecordSet XML\n+\n+ nlmcatalog\n+ native Full Record\n+ native xml NLMCatalogRecordSet XML\n+\n+ pmc\n+ medline MEDLINE\n+ native xml pmc-articleset XML\n+\n+ pubmed\n+ abstract Abstract\n+ medline MEDLINE\n+ native asn.1 Pubmed-entry ASN.1\n+ native xml PubmedArticleSet XML\n+\n+ (sequences)\n+ acc Accession Number\n+ est EST Report\n+ fasta FASTA\n+ '..b' INSDSet XML\n+ gss GSS Report\n+ ipg Identical Protein Report\n+ ipg xml IPGReportSet XML\n+ native text Seq-entry ASN.1\n+ native xml Bioseq-set XML\n+ seqid Seq-id ASN.1\n+\n+ snp\n+ chr Chromosome Report\n+ docset Summary\n+ fasta FASTA\n+ flt Flat File\n+ native asn.1 Rs ASN.1\n+ native xml ExchangeSet XML\n+ rsr RS Cluster Report\n+ ssexemplar SS Exemplar List\n+\n+ sra\n+ native xml EXPERIMENT_PACKAGE_SET XML\n+ runinfo xml SraRunInfo XML\n+\n+ structure\n+ mmdb Ncbi-mime-asn1 strucseq ASN.1\n+ native MMDB Report\n+ native xml RecordSet XML\n+\n+ taxonomy\n+ native Taxonomy List\n+ native xml TaxaSet XML\'\'\'.split("\\n")\n+\n+db = {}\n+name = None\n+all = "(all)"\n+for line in help:\n+ if line.strip() and line[2] != \' \':\n+ name = line.strip()\n+ db[name] = {}\n+ elif line.strip():\n+ format = line[0:len(" docsum ")].strip()\n+ mode = line[len(" docsum "):len(" docsum json ")].strip()\n+ if format not in db[name]:\n+ db[name][format] = []\n+ db[name][format].append(mode)\n+\n+for name in db_list:\n+ if name not in db:\n+ db[name] = {}\n+\n+db["sequences"] = db["(sequences)"]\n+del db["(sequences)"]\n+\n+print(\'<conditional name="db">\')\n+print(\' <param name="db" type="select" 
label="Database" argument="-db">\')\n+for name in sorted(db.keys()):\n+ if name == all:\n+ continue\n+ print(\' <option value="%s">%s</option>\' % (name, name))\n+print(\' <option value="">Manual Entry</option>\')\n+print(\' </param>\')\n+\n+for name in sorted(db.keys()):\n+ if name == all:\n+ continue\n+ my_dict = db[all].copy()\n+\n+ for format, modes in db[name].items():\n+ if format in my_dict:\n+ for mode in modes:\n+ if mode not in my_dict[format]:\n+ my_dict[format].append(mode)\n+ else:\n+ my_dict[format] = modes\n+ if "" not in my_dict:\n+ my_dict[""] = [""]\n+ print(\' <when value="%s">\' % name)\n+ print(\' <conditional name="format">\')\n+ print(\' <param name="format" type="select" label="Format" argument="-format">\')\n+ for format in sorted(my_dict.keys()):\n+ print(\' <option value="%s">%s</option>\' % (format, format or "None"))\n+ print(\' </param>\')\n+ for format in sorted(my_dict.keys()):\n+ print(\' <when value="%s">\' % format)\n+ print(\' <param name="mode" type="select" label="Mode" argument="-mode">\')\n+ if "" not in my_dict[format]:\n+ my_dict[format].append("")\n+ for mode in sorted(my_dict[format]):\n+ print(\' <option value="%s">%s</option>\' % (mode, mode or "None"))\n+ print(\' </param>\')\n+ print(\' </when>\')\n+ print(\' </conditional>\')\n+ print(\' </when>\')\n+print(\' <when value="">\')\n+print(\' <param name="db_manual" type="text" label="Database" argument="-db"/>\')\n+print(\' <param name="format" type="text" label="Format" argument="-format"/>\')\n+print(\' <param name="mode" type="text" label="Mode" argument="-mode"/>\')\n+print(\' </when>\')\n+print(\'</conditional>\')\n'
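The script above gives every database the formats listed under "(all)" plus its own, and always adds an empty ("None") format and mode. The sketch below isolates that merge step under stated assumptions: `merge_formats` is a hypothetical helper name, the sample dictionaries are a small hand-picked subset of the help table, and, unlike the script's shallow `copy()`, each mode list is copied so the shared "(all)" entry is never mutated.

```python
def merge_formats(common, specific):
    """Combine the "(all)" formats with one database's own formats."""
    # Copy each mode list so the shared "(all)" entry stays untouched.
    merged = {fmt: list(modes) for fmt, modes in common.items()}
    for fmt, modes in specific.items():
        target = merged.setdefault(fmt, [])
        for mode in modes:
            if mode not in target:
                target.append(mode)
    # An empty format and an empty mode are always offered ("None" in the UI).
    merged.setdefault("", [])
    for modes in merged.values():
        if "" not in modes:
            modes.append("")
    return merged


if __name__ == "__main__":
    # Sample data only: a subset of the "(all)" and "gene" rows.
    common = {"docsum": ["", "json"], "uid": [""]}
    gene = {"native": ["", "asn.1", "xml"], "tabular": [""]}
    for fmt, modes in sorted(merge_formats(common, gene).items()):
        print(repr(fmt), sorted(modes))
```

Copying the lists is a deliberate departure from the original: it keeps modes added for one database from leaking into the options generated for the next.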

diff -r 000000000000 -r 91efba463050 efetch.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/efetch.xml Tue Mar 22 22:30:08 2022 +0000


b'@@ -0,0 +1,2581 @@\n+<tool id="ncbi_entrez_direct_efetch" name="NCBI EFetch" version="@TOOL_VERSION@">\n+ <description>fetch records from NCBI</description>\n+ <macros>\n+ <import>macros.xml</import>\n+ </macros>\n+ <expand macro="requirements"/>\n+ <version_command>efetch -version</version_command>\n+ <command detect_errors="exit_code"><![CDATA[\n+ @ECONTACT@\n+ efetch\n+ #if str( $db.db ) == "":\n+ #if str( $db.db_manual ):\n+ -db "${db.db_manual}"\n+ #end if\n+ #if str( $db.format ):\n+ -format "${db.format}"\n+ #end if\n+ #if str( $db.mode ):\n+ -mode "${db.mode}"\n+ #end if\n+ #else:\n+ #if str( $db.db ):\n+ -db "${db.db}"\n+ #end if\n+ #if str( $db.format.format ):\n+ -format "${db.format.format}"\n+ #end if\n+ #if str( $db.format.mode ):\n+ -mode "${db.format.mode}"\n+ #end if\n+ #end if\n+ #if str( $ranges.seq_start ):\n+ -seq_start "${ranges.seq_start}"\n+ #end if\n+ #if str( $ranges.seq_stop ):\n+ -seq_stop "${ranges.seq_stop}"\n+ #end if\n+ #if str( $ranges.strand ):\n+ -strand "${ranges.strand}"\n+ #end if\n+ #if str( $ranges.complexity ):\n+ -complexity "${ranges.complexity}"\n+ #end if\n+ #if str( $ranges.chr_start ):\n+ -chr_start "${ranges.chr_start}"\n+ #end if\n+ #if str( $ranges.chr_stop ):\n+ -chr_stop "${ranges.chr_stop}"\n+ #end if\n+ #if str( $query.source ) == "id":\n+ -id "${query.id}"\n+ #else:\n+ < \'$query.input_file\'\n+ #end if\n+ > \'${output_result}\'\n+ ]]>\n+ </command>\n+ <inputs>\n+ <conditional name="query">\n+ <param name="source" type="select" label="Select query source">\n+ <option value="history">NCBI WebEnv History</option>\n+ <option value="id">Direct Entry</option>\n+ </param>\n+ <when value="history">\n+ <param label="Input File" name="input_file" type="data" format="xml"/>\n+ </when>\n+ <when value="id">\n+ <param label="Query ID" name="id" type="text" argument="-id"/>\n+ </when>\n+ </conditional>\n+\n+<conditional name="db">\n+ <param name="db" type="select" label="Database" argument="-db">\n+ <option 
value="annotinfo">annotinfo</option>\n+ <option value="assembly">assembly</option>\n+ <option value="bioproject">bioproject</option>\n+ <option value="biosample">biosample</option>\n+ <option value="biosystems">biosystems</option>\n+ <option value="blastdbinfo">blastdbinfo</option>\n+ <option value="books">books</option>\n+ <option value="cdd">cdd</option>\n+ <option value="clinvar">clinvar</option>\n+ <option value="clone">clone</option>\n+ <option value="dbvar">dbvar</option>\n+ <option value="gap">gap</option>\n+ <option value="gapplus">gapplus</option>\n+ <option value="gds">gds</option>\n+ <option value="gencoll">gencoll</option>\n+ <option value="gene">gene</option>\n+ <option value="genome">genome</option>\n+ <option value="geoprofiles">geoprofiles</option>\n+ <option value="grasp">grasp</option>\n+ <option value="gtr">gtr</option>\n+ <option value="homologene">homologene</option>\n+ <option value="medgen">medgen</option>\n+ <option value="mesh">mesh</option>\n+ <option value="ncbisearch">ncbisearch</option>\n+ <option value="nlmcatalog">nlmcatalog</option>\n+ <option value="nuccore">nuccore</option>\n+ <option value="nucest">nucest</option>\n+ <option value="nucgss">nucgss</option>\n+ <option value="nucleotide">nucleotide</option>\n+ <option value="omim">omim</option>\n+ <option value="orgtrack">orgtrack</option>\n+ <option value="pcassay">pcassay</option>\n+ <option value="pccompound">pccompound</option>\n+ <option value="pcsubstance'..b' </when>\n+ <when value="xml">\n+ <param name="mode" type="select" label="Mode" argument="-mode">\n+ <option value="">None</option>\n+ </param>\n+ </when>\n+ </conditional>\n+ </when>\n+ <when value="">\n+ <param name="db_manual" type="text" label="Database" argument="-db"/>\n+ <param name="format" type="text" label="Format" argument="-format"/>\n+ <param name="mode" type="text" label="Mode" argument="-mode"/>\n+ </when>\n+</conditional>\n+\n+\n+\n+\n+ <section name="ranges" title="Set Ranges" expanded="False">\n+ <param 
label="Seq Start" name="seq_start" type="integer" optional="True" min="0" argument="-seq_start"/>\n+ <param label="Seq End" name="seq_stop" type="integer" optional="True" min="0" argument="-seq_stop"/>\n+ <param label="strand" name="strand" type="text" argument="-strand"/>\n+ <param label="Complexity" name="complexity" type="integer" optional="True" min="0" argument="-complexity"/>\n+ <param label="Chr Start" name="chr_start" type="integer" optional="True" min="0" argument="-chr_start"/>\n+ <param label="Chr End" name="chr_stop" type="integer" optional="True" min="0" argument="-chr_stop"/>\n+ </section>\n+ </inputs>\n+ <outputs>\n+ <data format="txt" name="output_result">\n+ <change_format>\n+ <when input="db.format" value="xml" format="xml"/>\n+ <when input="db.format.mode" value="xml" format="xml"/>\n+ <when input="db.format.mode" value="json" format="json"/>\n+ <when input="db.format" value="tabular" format="tabular"/>\n+ <when input="db.format" value="fasta" format="fasta"/>\n+ <when input="db.format" value="fasta_cds_aa" format="fasta"/>\n+ <when input="db.format" value="fasta_cds_na" format="fasta"/>\n+ <when input="db.format" value="gene_fasta" format="fasta"/>\n+ </change_format>\n+ </data>\n+ </outputs>\n+ <tests>\n+ <test>\n+ <param name="db|db" value=""/>\n+ <param name="db|db_manual" value="sra"/>\n+ <param name="db|format" value="runinfo"/>\n+ <param name="query|source" value="id"/>\n+ <param name="query|id" value="SRX8542266"/>\n+ <output name="output_result">\n+ <assert_contents>\n+ <has_text_matching expression="Run" />\n+ </assert_contents>\n+ </output>\n+ </test>\n+ </tests>\n+ <help><![CDATA[\n+NCBI Entrez EFetch\n+==================\n+\n+Responds to a list of UIDs in a given database with the corresponding data\n+records in a specified format.\n+\n+Example Queries\n+---------------\n+\n+Fetch PMIDs 17284678 and 9997 as text abstracts:\n+\n++----------------------+--------------------------------------+\n+| Parameter | Value 
|\n++======================+======================================+\n+| NCBI Database to Use | PubMed |\n++----------------------+--------------------------------------+\n+| ID List | 17284678 9997 |\n++----------------------+--------------------------------------+\n+| Output Format | Abstract |\n++----------------------+--------------------------------------+\n+\n+Fetch FASTA for a transcript and its protein product (GIs 312836839 and 34577063)\n+\n++----------------------+--------------------------------------+\n+| Parameter | Value |\n++======================+======================================+\n+| NCBI Database to Use | Protein |\n++----------------------+--------------------------------------+\n+| ID List | 312836839 34577063 |\n++----------------------+--------------------------------------+\n+| Output Format | Fasta |\n++----------------------+--------------------------------------+\n+\n+@DISCLAIMER@\n+ ]]></help>\n+ <expand macro="citations"/>\n+</tool>\n'

diff -r 000000000000 -r 91efba463050 macros.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/macros.xml Tue Mar 22 22:30:08 2022 +0000


@@ -0,0 +1,102 @@
+<macros>
+    <token name="@TOOL_VERSION@">13.3</token>
+    <xml name="requirements">
+        <requirements>
+            <requirement type="package" version="@TOOL_VERSION@">entrez-direct</requirement>
+        </requirements>
+    </xml>
+    <token name="@ECONTACT@"><![CDATA[
+        #set $__contact_email__ = ';'.join( str( $__admin_users__ ).split( ',' ) )
+        #if str( $__user_email__ ):
+            #set $__contact_email__ = $__contact_email__ + ";" + str( $__user_email__ )
+        #end if
+        econtact -email "${ __contact_email__ }" -tool "galaxy_ncbi_entrez_direct" > /dev/null ;
+        ]]>
+    </token>
+    <token name="@DISCLAIMER@"><![CDATA[
+Usage Guidelines and Requirements
+=================================
+
+Frequency, Timing, and Registration of E-utility URL Requests
+-------------------------------------------------------------
+
+In order not to overload the E-utility servers, NCBI recommends that users
+limit large jobs to either weekends or between 9:00 PM and 5:00 AM Eastern time
+during weekdays. Failure to comply with this policy may result in an IP address
+being blocked from accessing NCBI.
+
+Minimizing the Number of Requests
+---------------------------------
+
+If a task requires searching for and/or downloading a large number of
+records, it is much more efficient to use the Entrez History to upload
+and/or retrieve these records in batches rather than using separate
+requests for each record. Please refer to Application 3 in Chapter 3
+for an example. Many thousands of IDs can be uploaded using a single
+EPost request, and several hundred records can be downloaded using one
+EFetch request.
+
+
+Disclaimer and Copyright Issues
+-------------------------------
+
+In accordance with requirements of NCBI's E-Utilities, we must provide
+the following disclaimer:
+
+Please note that abstracts in PubMed may incorporate material that may
+be protected by U.S. and foreign copyright laws. All persons
+reproducing, redistributing, or making commercial use of this
+information are expected to adhere to the terms and conditions asserted
+by the copyright holder. Transmission or reproduction of protected
+items beyond that allowed by fair use (PDF) as defined in the copyright
+laws requires the written permission of the copyright owners. NLM
+provides no legal advice concerning distribution of copyrighted
+materials. Please consult your legal counsel. If you wish to do a large
+data mining project on PubMed data, you can enter into a licensing
+agreement and lease the data for free from NLM. For more information on
+
+The `full disclaimer <https://www.ncbi.nlm.nih.gov/home/about/policies/>`__ is available on
+their website.
+
+Liability
+~~~~~~~~~
+
+For documents and software available from this server, the
+U.S. Government does not warrant or assume any legal liability or
+responsibility for the accuracy, completeness, or usefulness of any
+information, apparatus, product, or process disclosed.
+
+Endorsement
+~~~~~~~~~~~
+
+NCBI does not endorse or recommend any commercial
+products, processes, or services. The views and opinions of authors
+expressed on NCBI's Web sites do not necessarily state or reflect those
+of the U.S. Government, and they may not be used for advertising or
+product endorsement purposes.
+
+External Links
+~~~~~~~~~~~~~~
+
+Some NCBI Web pages may provide links to other Internet
+sites for the convenience of users. NCBI is not responsible for the
+availability or content of these external sites, nor does NCBI endorse,
+warrant, or guarantee the products, services, or information described
+or offered at these other Internet sites. Users cannot assume that the
+external sites will abide by the same Privacy Policy to which NCBI
+adheres. It is the responsibility of the user to examine the copyright
+and licensing restrictions of linked pages and to secure all necessary
+permissions.
+        ]]></token>
+    <xml name="citations">
+        <citations>
+            <citation type="bibtex">@Book{ncbiEDirect,
+          author = {Jonathan Kans},
+          title = {Entrez Direct: E-utilities on the UNIX Command Line},
+          year = {2013},
+          publisher = {National Center for Biotechnology Information, Bethesda, Maryland},
+          note = {http://www.ncbi.nlm.nih.gov/books/NBK179288/}
+            }</citation>
+        </citations>
+    </xml>
+</macros>
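The "Minimizing the Number of Requests" guidance in `@DISCLAIMER@` recommends retrieving records in batches rather than one request per record. One way to picture this on the client side is a simple chunking helper, sketched below. `batch_ids` is a hypothetical name, and the default batch size of 200 is an assumption standing in for "several hundred", not a limit stated by NCBI.

```python
from typing import Iterator, List, Sequence


def batch_ids(ids: Sequence[str], batch_size: int = 200) -> Iterator[List[str]]:
    """Yield consecutive slices of ids, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(ids), batch_size):
        yield list(ids[start:start + batch_size])


if __name__ == "__main__":
    uids = [str(n) for n in range(1, 451)]
    for batch in batch_ids(uids):
        # Each batch would feed a single EFetch request.
        print(len(batch), batch[0], batch[-1])
```

For very large ID sets, the Entrez History route described in the disclaimer (EPost followed by EFetch) remains the approach NCBI recommends; the helper only limits how many records each request asks for.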