comparison FCStxtMergeDownsample.xml @ 1:3c0e4179be7a draft default tip

"planemo upload for repository https://github.com/ImmPortDB/immport-galaxy-tools/tree/master/flowtools/merge_ds_flowtext commit 7858e5b085fc3c60c88fe87b2f343969d50d9b1e"
author azomics
date Mon, 22 Jun 2020 17:42:26 -0400
0:426650130311 1:3c0e4179be7a
1 <tool id="fcstxt_merge_downsample" name="Merge and downsample" version="1.1+galaxy0">
2 <description>txt-converted FCS files into one text file based on headers</description>
3 <requirements>
4 <requirement type="package" version="0.17.1">pandas</requirement>
5 </requirements>
6 <stdio>
7 <exit_code range="2" level="fatal" description="Non-numeric data. See stderr for more details." />
8 <exit_code range="3" level="warning" description="Selected columns do not exist in all files" />
9 <exit_code range="4" level="fatal" description="Run aborted - too many errors" />
10 <exit_code range="6" level="fatal" description="Please provide integers for columns you want to merge on." />
11 <exit_code range="7" level="fatal" description="Please provide a comma separated list of integers for columns you want to merge on." />
12 <exit_code range="8" level="fatal" description="Please provide a numeric value [0,1] for the downsampling factor." />
13 <exit_code range="9" level="fatal" description="There are no columns in common to all files." />
14 </stdio>
15 <command><![CDATA[
16 python '$__tool_directory__/FCStxtMergeDownsample.py' -o '${output_file}' -d '${factorDS}'
17 #if $columns
18 -c '${columns}'
19 #end if
20 #for $f in $input
21 -i '${f}'
22 #end for
23 ]]>
24 </command>
25 <inputs>
26 <param format="flowtext,txt,tabular" name="input" type="data_collection" collection_type="list" label="Text files Collection"/>
27 <param name="factorDS" type="text" label="Downsample by:" value="i.e.:0.1 or 10%" optional="true" help="1 by default (no downsampling)."/>
28 <param name="columns" type="text" label="Merge columns:" value="i.e.:1,2,5" optional="true" help="By default, will merge on the columns in common to all files.">
29 </param>
30 </inputs>
31 <outputs>
32 <data format="flowtext" name="output_file" label="Merge flowtext on ${input.name}"/>
33 </outputs>
34 <tests>
35 <test>
36 <param name="input">
37 <collection type="list">
38 <element name="input1.txt" value="test1/input1.txt"/>
39 <element name="input2.txt" value="test1/input2.txt"/>
40 <element name="input3.txt" value="test1/input3.txt"/>
41 </collection>
42 </param>
43 <param name="factorDS" value=".8"/>
44 <param name="columns" value="i.e.:1,2,5"/>
45 <output name="output_file" file="merge1.flowtext" compare="sim_size"/>
46 </test>
47 <test>
48 <param name="input">
49 <collection type="list">
50 <element name="input1.txt" value="test2/input1.txt"/>
51 <element name="input2.txt" value="test2/input2.txt"/>
52 <element name="input3.txt" value="test2/input3.txt"/>
53 </collection>
54 </param>
55 <param name="factorDS" value="i.e.:0.1 or 10%"/>
56 <param name="columns" value="1,2,3"/>
57 <output name="output_file" file="merge2.flowtext" compare="sim_size"/>
58 </test>
59 </tests>
60 <help><![CDATA[
61 This tool downsamples and merges multiple txt-converted FCS files into one text file.
62
63 -----
64
65 **Input files**
66
67 This tool requires collections of txt, flowtext or tabular files as input.
68
69 **Downsampling**
70
71 By default, files are not downsampled. If a downsampling factor is provided, each file in the input dataset collection will be downsampled randomly without replacement as follows:
72
73 - If the downsampling factor n is between 0 and 1, the size of the output will be n times that of the input files.
74 - If n is between 1 and 100, the size of the output will be n% that of the input files.
75
76 .. class:: infomark
77
78 Downsampling is implemented such that each file will contribute an equal number of events to the aggregate.
79
80 .. class:: warningmark
81
82 At this time, up-sampling is not supported. If the number provided is greater than 100, the tool will exit.
83
84 **Output file**
85
86 The output flowtext file is a concatenation of the input files, provided that all data after the header is numeric. By default, only columns present in all input files (as determined from the headers) are concatenated. The user can specify columns to merge, bypassing the header check. If a downsampling factor is provided, only the corresponding proportion of each input file will be read in (and checked for errors).
87
88 .. class:: warningmark
89
90 Potential errors are logged to stderr. If the number of errors reaches 10, the run will be aborted. If a file contains non-numeric data, the run will be aborted.
91
92 .. class:: infomark
93
94 Tip: Three tools in the Flow File Tools section can help prepare files for merging and/or downsampling:
95
96 - The Check headers tool lists the headers of all files in a collection of text, flowtext or tabular files.
97 - The Remove, rearrange and/or rename columns tool allows manipulation of the columns of a file or a set of files.
98 - The Check data tool identifies the lines in a file containing non-numeric data.
99
100 -----
101
102 **Example**
103
104 *File1*::
105
106 Marker1 Marker2 Marker3
107 34 45 12
108 33 65 10
109 87 26 76
110 24 56 32
111 95 83 53
112 74 15 87
113 ... ... ...
114
115 *File2*::
116
117 Marker4 Marker5 Marker3
118 19 62 98
119 12 36 58
120 41 42 68
121 76 74 53
122 62 34 45
123 93 21 76
124 ... ... ...
125
126 *Output*
127
128 .. class:: infomark
129
130 If run without specifying the columns::
131
132 Marker3
133 12
134 10
135 76
136 32
137 53
138 87
139 98
140 58
141 68
142 53
143 45
144 76
145 ...
146
147 .. class:: infomark
148
149 If run specifying columns 1,2,3::
150
151 Marker1 Marker2 Marker3
152 34 45 12
153 33 65 10
154 87 26 76
155 24 56 32
156 95 83 53
157 74 15 87
158 19 62 98
159 12 36 58
160 41 42 68
161 76 74 53
162 62 34 45
163 93 21 76
164 ... ... ...
165
166 .. class:: infomark
167
168 If run specifying columns 1,2,3 and with a downsampling factor of 0.5::
169
170 Marker1 Marker2 Marker3
171 34 45 12
172 24 56 32
173 95 83 53
174 19 62 98
175 12 36 58
176 62 34 45
177 ... ... ...
178 ]]>
179 </help>
180 <citations>
181 <citation type="doi">10.1038/srep02327</citation>
182 </citations>
183 </tool>
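
The behaviour described in the help text (a factor between 0 and 1 read as a fraction, a factor between 1 and 100 read as a percentage, and a default merge on the columns common to all headers) can be illustrated with a minimal sketch. This is not the code in `FCStxtMergeDownsample.py`: the function names `normalize_factor`, `common_columns` and `downsample` are hypothetical, acceptance of a trailing `%` is assumed from the parameter's placeholder text, and sampling each file by the same fraction is only one possible reading of the help.

```python
import random


def normalize_factor(raw):
    """Turn a user-supplied factor into a fraction in (0, 1].

    Assumption: a trailing '%' is tolerated, as the placeholder
    text "0.1 or 10%" suggests. The real script may differ.
    """
    text = str(raw).strip().rstrip("%")
    value = float(text)  # raises ValueError on non-numeric input
    if value <= 0 or value > 100:
        # Up-sampling (more than 100%) is not supported.
        raise ValueError("factor must be greater than 0 and at most 100")
    # Values above 1 are read as percentages, per the help text.
    return value / 100 if value > 1 else value


def common_columns(headers):
    """Return the column names present in every header, in first-file order."""
    shared = set(headers[0]).intersection(*headers[1:])
    return [name for name in headers[0] if name in shared]


def downsample(rows, fraction, seed=None):
    """Sample rows randomly without replacement, keeping the original order."""
    k = int(len(rows) * fraction)
    rng = random.Random(seed)
    keep = sorted(rng.sample(range(len(rows)), k))
    return [rows[i] for i in keep]
```

With the headers from the example in the help text, `common_columns([["Marker1", "Marker2", "Marker3"], ["Marker4", "Marker5", "Marker3"]])` yields `["Marker3"]`, which matches the output shown for a run without specified columns.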