annotate doc/data_provenance.md @ 2:269d246ce6d0 draft default tip

Uploaded
author damion
date Fri, 23 Oct 2015 17:53:29 -0400
parents 5c5027485f7d
children
5c5027485f7d Uploaded correct file
damion
## Data Provenance and Reproducibility

When a user selects a particular version id or date for a versioned data retrieval, the selection is recorded for future reference and can be seen in the "Input Parameters" section of a history item's "View details" (info icon) report. If a user left the global date field blank or didn't select a particular version of a data source, they or another user can still rerun the Versioned Data retrieval to recreate the results: note the "Created" date in the original history item's "View details" report and enter it into the form's global retrieval date field.

![A history dataset has a detailed view link](history_view_details.png)

![Data provenance information is available in the detail view](history_dataset_details.png)
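
The rule for choosing a date when rerunning a retrieval can be sketched in a few lines of Python. This is only an illustration of the logic described above, not part of the Versioned Data tool: the `details` dictionary and its keys (`input_parameters`, `global_retrieval_date`, `created`) are hypothetical stand-ins for the values a user reads off the "View details" report.

```python
def retrieval_date(details):
    """Return the date to enter in the global retrieval date field.

    `details` is a hypothetical dict mirroring the "View details" report:
      - details["input_parameters"]["global_retrieval_date"]: the date the
        original user entered, if any (assumed key name)
      - details["created"]: the history item's "Created" date (assumed key name)
    """
    params = details.get("input_parameters") or {}
    recorded = params.get("global_retrieval_date")
    if recorded:
        # The original run pinned a date, so reuse it as-is.
        return recorded
    # No date was pinned: fall back to when the item was created.
    return details["created"]
```

Called with a report whose date field was left blank, the function returns the "Created" value unchanged, which is the value to type into the form.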

The particular dates/versions of the data retrieved by a Versioned Data history item are also shown in the "Info" field of its "Edit Attributes" (pencil icon) report.

Because Galaxy also preserves the version id of any Galaxy tool it runs (e.g. the makeblastdb version number), rerunning a history or workflow that uses these tools should also apply the appropriate software version when generating the secondary data.
However, the tool version ids contained within a workflow are not recorded by the Versioned Data tool itself; they exist only in the selected workflow's design template, so some care must be taken to freeze or version any workflows used to generate derivative data.
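
One possible way to notice when a workflow's tool versions have drifted is to record them alongside the derivative data. The sketch below is an assumption-laden illustration, not something the Versioned Data tool does: it assumes the workflow has been exported to Galaxy's JSON format and that the export has a top-level `steps` mapping whose entries carry `tool_id` and `tool_version` keys. The function names and the sample tool id and version are made up for the example.

```python
import hashlib
import json


def tool_versions(workflow):
    """List (step key, tool id, tool version) for every tool step.

    `workflow` is the parsed JSON of an exported workflow. The "steps",
    "tool_id" and "tool_version" keys are assumed to be present in that format.
    """
    versions = []
    for key, step in sorted(workflow.get("steps", {}).items()):
        tool_id = step.get("tool_id")
        if tool_id:  # skip steps without a tool, e.g. data inputs
            versions.append((key, tool_id, step.get("tool_version")))
    return versions


def fingerprint(workflow):
    """Return a SHA-256 hex digest that changes when any tool version changes."""
    canonical = json.dumps(tool_versions(workflow), sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Made-up sample; a real export would be loaded with json.load().
    sample = {
        "steps": {
            "0": {"tool_id": None, "tool_version": None},
            "1": {"tool_id": "makeblastdb", "tool_version": "0.1.00"},
        }
    }
    print(tool_versions(sample))
    print(fingerprint(sample))
```

Storing the fingerprint with each generated dataset gives a cheap check: if the value computed from the current workflow differs from the stored one, the workflow's tool versions have changed since the data was produced.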