view versioned_data.xml @ 1:5c5027485f7d draft

Uploaded correct file
author damion
date Sun, 09 Aug 2015 16:07:50 -0400
parents
children
line wrap: on
line source

<tool id="versioned_data" name="Versioned data retrieval" version="0.1.03">
	<description>Retrieve versioned sequence files and/or their blast, bowtie, etc. database indexes</description>
	<macros>
		<token name="@BINARY@">versioned_data.py</token>
		<import>bccdc_macros.xml</import>
	</macros>
	<expand macro="requirements" />
	<command interpreter="python">
		#assert $__user__, Exception( 'You must be logged in to use this tool.' )
		versioned_data.py
		#if $globalRetrievalDate.strip() > ''
			-d "$globalRetrievalDate"
		#end if
		-r 
		"
		#for $v in $versions:
			${v.database},
			#for $r in $v.retrieval:
				${r.retrievalId}
			#end for
			,
			#for $w in $v.workflows:
				${w.workflow}
			#end for
			|
		#end for
		"		
		-o "$log"
		-O "$__app__.security.encode_id($log.id)"
		--api_info_path "$api_info_path" ##Actually a file path to configfile that holds api key
	</command>
	<!--  #:$log.hid:$log.id dataset_id -->
	<expand macro="stdio" />
	
	<inputs>
		<!-- Implement as datepicker? http://www.learnfaceit.org/for-developers/adding-parameter-types-to-tool -->
		<param name="globalRetrievalDate" type="text" label="Global retrieval date [YYYY-MM-DD]" help="The recall system will use this date to try to select the appropriate versions below.  Leave empty to select current versions." size="25" />

		<param name="api_info" display="radio" type="drill_down" label="For user with Galaxy API Key" dynamic_options="vdb_init_tool_user(__trans__)" />

		<repeat name="versions" title="Data Source" min="1" max="15">

			<param name="database" type="select" label="Data" dynamic_options="vdb_get_databases()" multiple="false" />

			<repeat name="retrieval" title="Retrieval" min="0" max="1">
				<param name="retrievalId" label="Version date/id" type="select" dynamic_options="vdb_get_versions(database, globalRetrievalDate)"/>
			</repeat>
			
			<repeat name="workflows" title="Workflow" min="0" max="5" > 
				<param name="workflow" type="select" label="Name" dynamic_options="vdb_get_workflows(database)" />
			</repeat>

		</repeat>
				
	</inputs>
	
	<configfiles>
		<configfile name="api_info_path">${__user__.api_keys[0].key}
			$api_info
		</configfile>
	</configfiles>

	<outputs>
		<data name="log" format="txt" label="Versioned Data Retrieval" />
    </outputs>

	<code file="versioned_data_form.py" />
	
	<tests>
		<test>
			<param name="db_type" value="nucl"/>
			<!-- ... -->
		</test>
	</tests>

    <help>
    
.. class:: infomark


**What it does**

This tool retrieves links to current or past versions of fasta or other types of  
data from a cache kept in the Galaxy data library called "Versioned Data".  It then places 
them into one's current history so that subsequent tools can work with that data.  

For example, after using this tool to select a version of the NCBI nt database, a blast search can be carried out on it by selecting "BLAST database from your history" from the "Subject database/sequences" field of the NCBI BLAST+ search tool.

You can select one or more files or databases by version date or id. This list 
is supplied from the Shared Data > Data Libraries > Versioned Data folder that has
been set up by an administrator.

The Workflows section allows you to select one or more pre-defined workflows 
to execute on the versioned data.  The results are placed in your history for use 
by other tools or workflows.
  
A caching system exists to cache the versioned data or workflow data that the tool generates.  
If you request versioned data or derivative data that isn't cached, it may take time to regenerate. 

The top-level "Global retrieval date [YYYY-MM-DD]" field that the form starts with will be applied to 
all selected databases.  This can be overriden by a retrieval date or version that 
you supply for a particular database.  Leave it and any "Retrievals" inputs empty if you just need the latest version of selected databases.

-------

.. class:: warningmark

**Note**

Again, some past database versions can take time to regenerate if there is no cached version available, for example NCBI nt is a 50+ gigabyte file that needs to be read through to get a fasta version, and a makeblastdb workflow on top of that can take hours on the first call.  Access to cached versions is immediate.

Setup of versioned data sources and workflow options can only be done by a Galaxy administrator.

-------

**References**

If you use this Galaxy tool in work leading to a scientific publication please
cite the following paper:

*Reference coming soon...*

    </help>
</tool>