view awklike-filter.xml @ 0:20e8be1464f5 draft default tip
"planemo upload for repository https://github.com/shenwei356/csvtk commit 3a97e1b79bf0c6cdd37d5c8fb497b85531a563ab"
author | nml |
---|---|
date | Tue, 19 May 2020 17:14:07 -0400 |
<tool id="csvtk_awklike_filter" name="csvtk-advanced-filter" version="@VERSION@+@GALAXY_VERSION@">
    <description>rows by awk-like arithmetic/string expressions</description>
    <macros>
        <import>macros.xml</import>
    </macros>
    <expand macro="requirements" />
    <expand macro="version_cmd" />
    <command detect_errors="exit_code"><![CDATA[

###################
## Start Command ##
###################
csvtk filter2 --num-cpus "\${GALAXY_SLOTS:-1}"

    ## Add additional flags as specified ##
    #######################################
    $global_param.illegal_rows
    $global_param.empty_rows
    $global_param.header
    $global_param.lazy_quotes

    ## Set Tabular input/output flag if first input is tabular ##
    #############################################################
    #if $in_1.is_of_type("tabular"):
        -t -T
    #end if

    ## Set input files ##
    #####################
    $in_1

    ## Specify fields to filter ##
    ##############################
    -f '$in_text'

    ## Specific inputs ##
    #####################
    $line_number

    ## To output ##
    ###############
    > filtered

    ]]></command>
    <inputs>
        <expand macro="singular_input"/>
        <param name="in_text" type="text" optional="false" argument="-f" label="Awk-like arithmetic/string expression">
            <help>
                <![CDATA[
        Examples:
            - '$age>12'
            - '$1 > $3'
            - '$name=="abc"'
            - '$1 % 2 == 0'

        Enter the expression itself, without the surrounding single quotes.

        More info is available in the help section below.

        The ' character is invalid and will be replaced, so surround strings with double quotes ("string") instead.
                ]]>
            </help>
            <expand macro="text_sanitizer" />
        </param>
        <param name="line_number" type="boolean" checked="false" truevalue="-n" falsevalue="" argument="-n" label="Print initial line number as the first column" />
        <expand macro="global_parameters" />
    </inputs>
    <outputs>
        <data format_source="in_1" from_work_dir="filtered" name="filtered" label="${in_1.name} filtered by ${in_text}" />
    </outputs>
    <tests>
        <test>
            <param name="in_1" value="frequency.tsv" />
            <param name="in_text" value="$3>1" />
            <output name="filtered" file="filtered.tsv" ftype="tabular" />
        </test>
    </tests>
    <help><![CDATA[

Csvtk - Filter2 Help
--------------------

Info
####

Csvtk advanced filter (also called filter2) outputs the rows that satisfy the given awk-like arithmetic/string expression.

Please see the `documentation <https://github.com/Knetic/govaluate/blob/master/MANUAL.md>`_ for further details and examples on how to write expressions.

.. class:: warningmark

    Single quotes are not allowed in text inputs!

.. class:: note

    If the header of the column you want contains a space, use the column number instead. Example: use $1 if column 1 is called "Colony Counts".

Supported operators and types:

- Modifiers: + - / * & | ^ ** % >> <<
- Comparators: > >= < <= == != =~ !~
- Logical ops: || &&
- Numeric constants, as 64-bit floating point (12345.678)
- String constants (double quotes: "foobar")
- Date constants (double quotes)
- Boolean constants: true false
- Parentheses to control order of evaluation ( )
- Arrays (anything separated by , within parentheses: (1, 2, "foo"))
- Prefixes: ! - ~
- Ternary conditional: ? :
- Null coalescence: ??

----

@HELP_INPUT_DATA@


Usage
#####

**Ex1. Filter2 on one column:**

Suppose we had the following table:

+---------------+------------+----------+
| Culture Label | Cell Count | Dilution |
+===============+============+==========+
| ECo-1         | 2523       | 1000     |
+---------------+------------+----------+
| LPn-1         | 100        | 1000000  |
+---------------+------------+----------+
| LPn-2         | 4          | 1000     |
+---------------+------------+----------+

To find all samples with the label LPn, we could use the filter expression '$1 =~ "LPn*"' to get the following output:

+---------------+------------+----------+
| Culture Label | Cell Count | Dilution |
+===============+============+==========+
| LPn-1         | 100        | 1000000  |
+---------------+------------+----------+
| LPn-2         | 4          | 1000     |
+---------------+------------+----------+

Note that $1 was used to select column 1 because its header contains a space.

----

**Ex2. Filter2 with multiple conditions:**

Using the same input table:

+---------------+------------+----------+
| Culture Label | Cell Count | Dilution |
+===============+============+==========+
| ECo-1         | 2523       | 1000     |
+---------------+------------+----------+
| LPn-1         | 100        | 1000000  |
+---------------+------------+----------+
| LPn-2         | 4          | 1000     |
+---------------+------------+----------+

Filtering with the expression '$1 =~ "LPn*" && $Dilution > 1000' returns the only row that satisfies both conditions:

+---------------+------------+----------+
| Culture Label | Cell Count | Dilution |
+===============+============+==========+
| LPn-1         | 100        | 1000000  |
+---------------+------------+----------+

----

@HELP_COLUMNS@


@HELP_END_STATEMENT@


    ]]></help>
    <expand macro="citations" />
</tool>
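
As a rough illustration of which rows the Ex2 expression selects, the Python sketch below applies the same two conditions to the example table. It only mimics the outcome and is not how csvtk or govaluate evaluate expressions. The function name `filter_rows`, the inline `TABLE` constant, and the treatment of `=~` as an unanchored regular-expression search are assumptions made for this illustration.

```python
import csv
import io
import re

# Example table from the help text, written as tab-separated text.
TABLE = (
    "Culture Label\tCell Count\tDilution\n"
    "ECo-1\t2523\t1000\n"
    "LPn-1\t100\t1000000\n"
    "LPn-2\t4\t1000\n"
)


def filter_rows(text):
    """Keep rows where column 1 matches "LPn*" and Dilution > 1000.

    Hypothetical helper: it hard-codes the Ex2 conditions instead of
    parsing an expression string.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    first_column = reader.fieldnames[0]  # stands in for $1
    # Assumption: =~ behaves like an unanchored regex search.
    pattern = re.compile(r"LPn*")
    return [
        row
        for row in reader
        if pattern.search(row[first_column]) and float(row["Dilution"]) > 1000
    ]


rows = filter_rows(TABLE)
for row in rows:
    print("\t".join(row.values()))
```

Only the LPn-1 row passes both conditions, which matches the output table shown in Ex2. Selecting the first column by position mirrors the advice to use $1 when a header contains a space.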