comparison sample.xml @ 0:447272175720 draft default tip
"planemo upload for repository https://github.com/shenwei356/csvtk commit 3a97e1b79bf0c6cdd37d5c8fb497b85531a563ab"
author | nml |
---|---|
date | Tue, 19 May 2020 17:12:29 -0400 |
parents | |
children |
-1:000000000000 | 0:447272175720 |
---|---|
1 <tool id="csvtk_sample" name="csvtk-sample" version="@VERSION@+@GALAXY_VERSION@"> | |
2 <description> random proportion of dataset</description> | |
3 <macros> | |
4 <import>macros.xml</import> | |
5 </macros> | |
6 <expand macro="requirements" /> | |
7 <expand macro="version_cmd" /> | |
8 <command detect_errors="exit_code"><![CDATA[ | |
9 | |
10 ################### | |
11 ## Start Command ## | |
12 ################### | |
13 | |
14 csvtk sample --num-cpus "\${GALAXY_SLOTS:-1}" | |
15 | |
16 ## Add additional flags as specified ## | |
17 ####################################### | |
18 $global_param.illegal_rows | |
19 $global_param.empty_rows | |
20 $global_param.header | |
21 $global_param.lazy_quotes | |
22 | |
23 ## Set Tabular input/output flag if input is tabular ## | |
24 ####################################################### | |
25 #if $in_1.is_of_type("tabular"): | |
26 -t -T | |
27 #end if | |
28 | |
29 ## Set Input ## | |
30 ############### | |
31 '$in_1' | |
32 | |
33 ## other ## | |
34 ########### | |
35 -p '$proportion' | |
36 -s '$seed' | |
37 $line_number | |
38 | |
39 ## To output ## | |
40 ############### | |
41 &> sampled | |
42 | |
43 ]]></command> | |
44 <inputs> | |
45 <expand macro="singular_input" /> | |
46 <param name="proportion" type="float" argument="-p" value="0.5" | |
47 min="0" | |
48 max="1" | |
49 label="Proportion of Data to Sample" | |
50 /> | |
51 <param name="seed" type="integer" argument="-s" value="1900" | |
52 label="Random Seed" | |
53 help="Specify a seed number to sample data with" | |
54 /> | |
55 <param name="line_number" type="boolean" checked="false" argument="-n" | |
56 truevalue="-n" | |
57 falsevalue="" | |
58 label="Create column with original line numbers of sampled data" | |
59 /> | |
60 <expand macro="global_parameters" /> | |
61 </inputs> | |
62 <outputs> | |
63 <data format_source="in_1" name="sampled" from_work_dir="sampled" label="${proportion} of ${in_1.name} sampled" /> | |
64 </outputs> | |
65 <tests> | |
66 <test> | |
67 <param name="in_1" value="plot.csv" /> | |
68 <param name="proportion" value="0.5" /> | |
69 <param name="seed" value="11" /> | |
70 <output name="sampled" value="sampled_1.csv" /> | |
71 </test> | |
72 <test> | |
73 <param name="in_1" value="plot.csv" /> | |
74 <param name="proportion" value="0.7" /> | |
75 <param name="seed" value="11" /> | |
76 <param name="line_number" value="true" /> | |
77 <output name="sampled" value="sampled_2.csv" /> | |
78 </test> | |
79 </tests> | |
80 <help><![CDATA[ | |
81 | |
82 Csvtk - Sample Help | |
83 ------------------- | |
84 | |
85 Info | |
86 #### | |
87 | |
88 Csvtk-sample draws a random sample of a given proportion from a dataset. The seed determines which rows are selected, so the same seed, input, and proportion reproduce the same sample. | |
89 | |
90 .. class:: warningmark | |
91 | |
92 Single quotes are not allowed in text inputs! | |
93 | |
94 @HELP_INPUT_DATA@ | |
95 | |
96 | |
97 Usage | |
98 ##### | |
99 | |
100 To run csvtk-sample, all you need is a valid (as defined above) CSV or TSV file. | |
101 | |
102 **Example** | |
103 | |
104 Input table: | |
105 | |
106 +-------+--------+ | |
107 | Group | Length | | |
108 +=======+========+ | |
109 | 1 | 1500 | | |
110 +-------+--------+ | |
111 | 2 | 1000 | | |
112 +-------+--------+ | |
113 | 1 | 1500 | | |
114 +-------+--------+ | |
115 | 3 | 2000 | | |
116 +-------+--------+ | |
117 | |
118 To sample a 0.5 proportion (50%) of the rows, set the proportion (-p) to 0.5 and choose a random seed (-s). | |
119 | |
120 The output could then look like this: | |
121 | |
122 +-------+--------+ | |
123 | Group | Length | | |
124 +=======+========+ | |
125 | 1 | 1500 | | |
126 +-------+--------+ | |
127 | 3 | 2000 | | |
128 +-------+--------+ | |
129 | |
130 If we used the same seed, input, and proportion with the "Create column with original line numbers of sampled data" option | |
131 set to yes, we would get the following table: | |
132 | |
133 +---+-------+--------+ | |
134 | n | Group | Length | | |
135 +===+=======+========+ | |
136 | 1 | 1 | 1500 | | |
137 +---+-------+--------+ | |
138 | 4 | 3 | 2000 | | |
139 +---+-------+--------+ | |
140 | |
141 -------- | |
142 | |
143 | |
144 @HELP_COLUMNS@ | |
145 | |
146 | |
147 @HELP_END_STATEMENT@ | |
148 | |
149 | |
150 ]]></help> | |
151 <expand macro="citations" /> | |
152 </tool> |
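
The sampling behaviour described in the tool's help text can be sketched in Python. This is a minimal illustration, not csvtk's implementation: it assumes each data row is kept independently with probability equal to the proportion, that the header row is always kept, and that line numbers count data rows from 1 (as in the help example). The function name `sample_rows` and its parameters are hypothetical. Python's random number generator differs from csvtk's, so the same seed will not select the same rows as the real tool.

```python
import csv
import io
import random


def sample_rows(text, proportion, seed, line_number=False, delimiter=","):
    """Keep each data row with probability `proportion` (hypothetical helper).

    Assumption: rows are selected independently; csvtk itself may differ.
    The first row is treated as a header and is always kept.
    """
    if not 0 <= proportion <= 1:
        raise ValueError("proportion must be between 0 and 1")

    rng = random.Random(seed)  # seeded, so repeated runs give the same sample
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")

    header = next(reader, None)
    if header is None:  # empty input: nothing to sample
        return ""
    writer.writerow(["n"] + header if line_number else header)

    # Number data rows from 1 so the optional "n" column refers to the
    # row's position in the original table.
    for n, row in enumerate(reader, start=1):
        if rng.random() < proportion:
            writer.writerow([str(n)] + row if line_number else row)

    return out.getvalue()


if __name__ == "__main__":
    table = "Group,Length\n1,1500\n2,1000\n1,1500\n3,2000\n"
    print(sample_rows(table, proportion=0.5, seed=11))
    print(sample_rows(table, proportion=0.5, seed=11, line_number=True))
```

Because the generator is seeded, calling the function twice with the same seed, input, and proportion returns the same rows, which mirrors the reproducibility the help text relies on when it compares output with and without the line-number column.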