comparison restore_attributes.xml @ 22:95a05c1ef5d5

update to devshed revision aaece207bd01
author Richard Burhans <burhans@bx.psu.edu>
date Mon, 11 Mar 2013 11:28:06 -0400
parents
children 91e835060ad2
21:d6b961721037 22:95a05c1ef5d5
<tool id="gd_restore_attributes" name="Restore Attributes" version="1.0.0">
  <description>: Fill in missing properties for a gd_snp dataset</description>

  <command interpreter="python">
    cp.py "$dst" "$output"
  </command>
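
<!--
The command above hands the "dst" dataset and the output path to cp.py, whose
source is not part of this file. A minimal sketch of such a script, assuming it
does nothing more than copy its first argument to its second, might look like
the following. The function names copy_dataset and main are hypothetical.

```python
import shutil
import sys


def copy_dataset(input_path, output_path):
    """Copy the file at input_path to output_path, byte for byte."""
    shutil.copyfile(input_path, output_path)


def main(argv):
    """Mirror the tool's call shape: cp.py INPUT OUTPUT.

    Returns 0 on success and 1 when the argument count is wrong.
    """
    if len(argv) != 3:
        sys.stderr.write("usage: cp.py INPUT OUTPUT\n")
        return 1
    copy_dataset(argv[1], argv[2])
    return 0
```

Under this assumption the script never reads the "src" dataset. The SNP rows
would come from "dst", and the metadata would come from "src" only because the
output element declares metadata_source="src".
-->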

  <inputs>
    <param name="src" type="data" format="gd_snp" label="SNP dataset to copy attributes from" />
    <param name="dst" type="data" format="gd_snp" label="SNP dataset to receive attributes" />
  </inputs>

  <outputs>
    <data name="output" format="gd_snp" metadata_source="src" />
  </outputs>

  <help>

**Dataset formats**

All of the input and output datasets are in gd_snp_ format. (`Dataset missing?`_)

.. _gd_snp: ./static/formatHelp.html#gd_snp
.. _Dataset missing?: ./static/formatHelp.html

-----

**What it does**

This tool copies metadata from one SNP dataset to another, leaving the SNP
data itself unchanged. Datasets in gd_snp format have a number of "extra"
properties associated with them, such as the focus species (which may be
different from the reference assembly), names of individuals, and the column
numbers containing certain data fields. These values are stored in the
dataset's metadata, in addition to the more usual attributes like dataset name
and assembly build. You can see some of these by clicking on the pencil icon
for the dataset.

The Genome Diversity tools need this information to perform their tasks.
However, these additional attributes may be lost if the datatype is changed.
For example, suppose you want to see which SNPs overlap some other dataset in
your history, like coding regions or TAL1 binding sites. The Intersect tool
only works on datasets that are in interval format, so you might use the Compute
tool to append a new column with the End position of the SNP (= Start + 1),
then use the pencil icon to change the datatype to "interval". This works
well for the intersection, but if you then want to run one of the Genome
Diversity tools on the resulting SNPs, there is a problem: you can change the
datatype back to gd_snp easily enough, but the extra attributes were lost
in the conversion to interval.

As long as the proper values of the lost attributes have not changed, this
tool can restore them by copying from the old gd_snp dataset in your history.
In the above example, appending a column does not change the numbering of the
earlier columns, and deleting rows via Intersect does not affect the extra
attributes either. Note that all of the metadata is copied, not just the extra
attributes specific to gd_snp (though standard items like the assembly build,
the number of lines, and the name for the output dataset are updated
automatically by the Galaxy framework).

  </help>
</tool>
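
<!--
The help text above notes that appending an End column (= Start + 1) leaves the
numbering of the earlier columns unchanged, which is why the stored column
numbers stay valid after the round trip through interval format. A small
sketch of that step, assuming tab-separated rows already split into lists of
strings. The function name, the sample rows, and the 0-based start_col index
are illustrative assumptions, not part of the tool.

```python
def append_end_column(rows, start_col):
    """Return new rows with End = Start + 1 appended as the last column.

    rows is a list of lists of strings; start_col is the 0-based index of the
    Start position. The input rows are not modified.
    """
    return [row + [str(int(row[start_col]) + 1)] for row in rows]


# Hypothetical sample rows: chromosome, Start position, two alleles.
snps = [
    ["chr1", "100", "A", "G"],
    ["chr2", "250", "C", "T"],
]
with_end = append_end_column(snps, 1)

# Every original column keeps its position; only a trailing column is added.
unchanged = all(new[:len(old)] == old for old, new in zip(snps, with_end))
```

Because the new column is added at the end, any attribute that records a
column number for an existing field still points at the same data.
-->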