comparison join_files_on_column_fuzzy.xml @ 3:22ec3c1a20cd draft default tip

planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/text_processing/join_files_on_column_fuzzy commit 3419a5a5e19a93369c8c20a39babe5636a309292
author bgruening
date Tue, 29 May 2018 15:34:31 -0400
parents f2068690addc
children
2:f2068690addc 3:22ec3c1a20cd
34 <param argument="--header" type="boolean" checked="false" truevalue="--header" falsevalue="" label="Does the input files contain a header line" /> 34 <param argument="--header" type="boolean" checked="false" truevalue="--header" falsevalue="" label="Does the input files contain a header line" />
35 <param argument="--add_distance" type="boolean" checked="false" truevalue="--add_distance" falsevalue="" label="Add an addional column with the calculated distance." /> 35 <param argument="--add_distance" type="boolean" checked="false" truevalue="--add_distance" falsevalue="" label="Add an addional column with the calculated distance." />
36 36
37 <param name="merge_mode_select" type="select" label="Choose the mode of merging."> 37 <param name="merge_mode_select" type="select" label="Choose the mode of merging.">
38 <option value="closest" selected="True">Best match (in case of multiple best matches, only the first one is reported)</option> 38 <option value="closest" selected="True">Best match (in case of multiple best matches, only the first one is reported)</option>
39 <option value="distance">Matching with a defined distance</option> 39 <option value="distance">All matches within the defined distance</option>
40 </param> 40 </param>
41 <param name="units" display="radio" type="select" value="ppm_value" label="Choose the metrics of your distance" 41 <param name="units" display="radio" type="select" value="ppm_value" label="Choose the metrics of your distance"
42 help="ppm is useful for very small differences"> 42 help="ppm is useful for very small differences">
43 <option value="absolute" selected="True">Absolute distance</option> 43 <option value="absolute" selected="True">Absolute distance</option>
44 <option value="ppm" >Distance in ppm</option> 44 <option value="ppm" >Distance in ppm</option>
115 </test> 115 </test>
116 </tests> 116 </tests>
117 <help> 117 <help>
118 <![CDATA[ 118 <![CDATA[
119 119
120 Join two files on a common column. It is possible to provide an allowed difference between both values (currently only numbers) 120 Join two files on a common column. It is necessary to provide an allowed difference between both values as the maximum absolute difference or maximum parts per million (ppm) for matching.
121 as the absolute differece or as PPM.
122 121
123 Two modes are available: 122 Two modes are available:
124 123
125 1. In the **best match** mode only the rows are merged for the most similar (or identical) values. In case of multiple best matches, only the first one is reported. 124 1) In the **best match** mode: For each value in file 1 only the best matching value of file 2 is reported. In case of multiple best matches, only the closest match is reported.
125 2) In the **all matches** mode: All matches within the defined distance are reported.
126
127 Be aware that file 1 is the template file and therefore the same value in file 2 can be matched to multiple values in file 1
126 128
127 2. The **Matching with a defined distance** option will offer you the possibility 129
128 to provide a distance between the two values of the columns. Is the calculates distance smaller or equal than the given distance the columns will be joined. You can specify the allowed distance as an absolute distance or as PPM. 130 ------
131
132 **Example**
133
134 **Input file 1** ::
135
136 1
137 2
138 3
139 4
140 5
141
142
143 **Input file 2** ::
144
145 1.1
146 1.2
147 2.2
148 3.3
149 4.4
150
151
152 **Joined file1 and 2** with best match and absolute distance 0.3::
153
154 1 1.1 0.1
155 2 2.2 0.2
156 3 3.3 0.3
157
158 **Joined file1 and 2** with all matches and absolute distance 0.3::
159
160 1 1.1 0.1
161 1 1.2 0.2
162 2 2.2 0.2
163 3 3.3 0.3
129 164
130 165
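The two modes in the example above can be sketched in Python as follows. This is only an illustration and is not taken from the tool's own script: the function names `distance` and `fuzzy_join`, the rounding of the reported distance, and the ppm formula (difference relative to the file 1 value, times one million) are all assumptions.

```python
from typing import List, Tuple


def distance(a: float, b: float, unit: str = "absolute") -> float:
    """Distance between a file 1 value (a) and a file 2 value (b)."""
    diff = abs(a - b)
    if unit == "ppm":
        # Assumption: ppm is computed relative to the file 1 value.
        return diff / abs(a) * 1e6
    return diff


def fuzzy_join(
    file1: List[float],
    file2: List[float],
    max_dist: float,
    mode: str = "closest",
    unit: str = "absolute",
) -> List[Tuple[float, float, float]]:
    """Join every file 1 value with file 2 values within max_dist."""
    joined = []
    for a in file1:
        # Keep only candidates within the allowed distance.
        hits = [(distance(a, b, unit), b) for b in file2]
        hits = [(d, b) for d, b in hits if d <= max_dist]
        if not hits:
            continue
        if mode == "closest":
            # min() keeps the first of several equally close matches.
            hits = [min(hits, key=lambda h: h[0])]
        # Round the distance only for display (hypothetical choice).
        joined.extend((a, b, round(d, 6)) for d, b in hits)
    return joined


file1 = [1, 2, 3, 4, 5]
file2 = [1.1, 1.2, 2.2, 3.3, 4.4]
best = fuzzy_join(file1, file2, 0.3, mode="closest")
all_matches = fuzzy_join(file1, file2, 0.3, mode="distance")
```

Because file 1 is the template, the loop runs over file 1 and never marks a file 2 value as used, so one file 2 value may be joined to several file 1 values.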
131 ]]> 166 ]]>
132 </help> 167 </help>
133 <citations> 168 <citations>