annotate doc/galaxy_tool.md @ 1:5c5027485f7d draft

Uploaded correct file
author damion
date Sun, 09 Aug 2015 16:07:50 -0400
# The Galaxy Versioned Data Tool

This tool retrieves links to current or past versions of FASTA or other types of data from a cache kept in the Galaxy data library called "Versioned Data". It then places them into the current history so that subsequent tools can work with that data. For example, a BLAST search can be run against a version of a FASTA database from a year ago.

![Galaxy Versioned Data Tool](versioned_data_retrieval.png)

You can select one or more files by version date or id. (This list is supplied from the Shared Data > Data Libraries > Versioned Data folder that has been set up by a Galaxy administrator.)

In the Versioned Data tool, you select a data source and then select a version to retrieve (by date or version id).
If a cached version of that database exists, it is linked into your history.
Otherwise a new version of it is created, placed in the cache, and linked into your history.

The Versioned Data form starts with an optional top-level "Global retrieval date", which is applied to all selected databases. This can be overridden by a retrieval date or version that you supply for a particular database.

Finally, if you select a data source but supply neither a global retrieval date nor a particular version, the most recent version of that data source is retrieved.

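The selection rules above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual code: `Version` and `resolve_version` are hypothetical names, and two behaviours are assumptions rather than documented facts, namely that a retrieval date selects the newest version created on or before that date, and that a version id takes precedence over a date when both are supplied.

```python
from datetime import date
from typing import NamedTuple, Optional, Sequence


class Version(NamedTuple):
    """Hypothetical record for one stored version of a data source."""
    version_id: str
    created: date


def resolve_version(
    versions: Sequence[Version],
    global_date: Optional[date] = None,
    db_date: Optional[date] = None,
    db_version_id: Optional[str] = None,
) -> Version:
    """Pick the version to retrieve for a single data source."""
    if not versions:
        raise ValueError("data source has no versions")

    # A version id given for this database wins outright (assumed precedence).
    if db_version_id is not None:
        for version in versions:
            if version.version_id == db_version_id:
                return version
        raise LookupError(f"no version with id {db_version_id!r}")

    # A per-database date overrides the global retrieval date.
    cutoff = db_date if db_date is not None else global_date

    # Nothing specified at all: fall back to the most recent version.
    if cutoff is None:
        return max(versions, key=lambda v: v.created)

    # Assumption: a date means "newest version created on or before that date".
    candidates = [v for v in versions if v.created <= cutoff]
    if not candidates:
        raise LookupError(f"no version on or before {cutoff.isoformat()}")
    return max(candidates, key=lambda v: v.created)
```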
The caching system caches both the versioned data and the workflow data that the tool generates. If you request versioned or derivative data that isn't cached, it may take time to regenerate, depending on the size of the archive.

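The cache behaviour described above (link the data if it is cached, otherwise generate it, cache it, and then link it) follows a common get-or-create pattern. The sketch below shows that pattern in isolation. The `retrieve` function, the dictionary standing in for the cache, the list standing in for a history, and the `regenerate` callback are all hypothetical and are not part of the tool or of Galaxy's API.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical cache key: (data source name, version id).
CacheKey = Tuple[str, str]


def retrieve(
    cache: Dict[CacheKey, str],
    history: List[str],
    source: str,
    version_id: str,
    regenerate: Callable[[str, str], str],
) -> str:
    """Link a versioned dataset into a history, regenerating it only on a cache miss."""
    key = (source, version_id)
    if key not in cache:
        # Cache miss: this is the potentially slow step for large archives.
        cache[key] = regenerate(source, version_id)
    dataset = cache[key]
    # Whether cached or freshly generated, the result is linked into the history.
    history.append(dataset)
    return dataset
```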
## Generation of workflow data

The Workflows section allows you to select one or more pre-defined workflows to execute on the versioned data. Currently this includes any workflow whose name begins with the phrase "Versioned: ". The results are placed in your history for use by other tools or workflows.

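The naming convention above amounts to a simple prefix filter. As a minimal sketch, assuming the workflow names are already available as plain strings and that the match is case-sensitive and includes the trailing space (`selectable_workflows` is a hypothetical helper, not a function of the tool):

```python
from typing import Iterable, List

# Prefix taken from the convention described above, including the trailing space.
WORKFLOW_PREFIX = "Versioned: "


def selectable_workflows(names: Iterable[str]) -> List[str]:
    """Return only the workflow names that follow the "Versioned: " convention."""
    return [name for name in names if name.startswith(WORKFLOW_PREFIX)]
```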
Currently, workflow parameters must be entirely specified ("canned") when the workflow is created or updated, rather than at runtime. This means that a separate workflow with fixed settings must be predefined for each desired retrieval process (e.g. a blastdb with regions of low complexity filtered out, which takes several steps to build: dustmasker, makeblastdb, etc.).

Any user who needs more specific parameters for creating a reference database can invoke the relevant tools or steps directly after using the Versioned Data tool to retrieve the raw FASTA data. The only drawback in this case is that the derivative data can't be cached; it has to be regenerated each time the tool is run.