Galaxy | Tool Preview

Filter Tabular (version 5.0.0)

Filter Tabular

Filter a tabular dataset by applying line filters as it is being read. Multiple filters may be chained; each filter operates on the output of the previous filter.

Inputs

A tabular dataset.

Outputs

A filtered tabular dataset.

Input Line Filters

As a tabular file is being read, line filters may be applied.

- skip leading lines              skip the first *number* of lines
- comment char                    omit any lines that start with the specified comment character
- by regex expression matching    *include/exclude* lines that match the regular expression
- select columns                  choose to include only selected columns in the order specified
- regex replace value in column   replace a field in a column using a regex substitution (good for date reformatting)
- prepend a line number column    each line has the ordinal value of the line read by this filter as the first column
- append a line number column     each line has the ordinal value of the line read by this filter as the last column
- prepend a text column           each line has the text string as the first column
- append a text column            each line has the text string as the last column
- normalize list columns          replicates the line for each item in the specified list *columns*
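
The chaining described above can be sketched with Python generators. This is a minimal illustration, not the tool's actual implementation: the names `skip_leading_lines`, `comment_char`, `append_line_number`, and `run_filters` are hypothetical, and rows are assumed to be tab-separated.

```python
from typing import Callable, Iterable, Iterator, List

Row = List[str]
Filter = Callable[[Iterable[Row]], Iterator[Row]]


def skip_leading_lines(count: int) -> Filter:
    """Hypothetical filter: drop the first `count` rows it reads."""
    def apply(rows: Iterable[Row]) -> Iterator[Row]:
        for index, row in enumerate(rows):
            if index >= count:
                yield row
    return apply


def comment_char(char: str) -> Filter:
    """Hypothetical filter: drop rows whose first field starts with `char`."""
    def apply(rows: Iterable[Row]) -> Iterator[Row]:
        for row in rows:
            if not row[0].startswith(char):
                yield row
    return apply


def append_line_number() -> Filter:
    """Hypothetical filter: append the ordinal of each row read by this filter."""
    def apply(rows: Iterable[Row]) -> Iterator[Row]:
        for number, row in enumerate(rows, start=1):
            yield row + [str(number)]
    return apply


def run_filters(lines: Iterable[str], filters: List[Filter]) -> List[Row]:
    """Split each line on tabs, then feed every filter the previous filter's output."""
    rows: Iterable[Row] = (line.rstrip("\r\n").split("\t") for line in lines)
    for line_filter in filters:
        rows = line_filter(rows)
    return list(rows)
```

Because each filter only sees what the previous one yields, a line number column added late in the chain counts the surviving lines, not the lines of the original file.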

Line Filtering Example

(Six filters are applied as the following file is read)

Input Tabular File:

#People with pets
Pets FirstName           LastName   DOB       PetNames  PetType
2    Paula               Brown      24/05/78  Rex,Fluff dog,cat
1    Steven              Jones      04/04/74  Allie     cat
0    Jane                Doe        24/05/78
1    James               Smith      20/10/80  Spot


Filter 1 - append a line number column:

#People with pets                                                 1
Pets FirstName           LastName   DOB       PetNames  PetType   2
2    Paula               Brown      24/05/78  Rex,Fluff dog,cat   3
1    Steven              Jones      04/04/74  Allie     cat       4
0    Jane                Doe        24/05/78                      5
1    James               Smith      20/10/80  Spot                6

Filter 2 - by regex expression matching [include]: '^\d+' (include lines that start with a number)

2    Paula               Brown      24/05/78  Rex,Fluff dog,cat   3
1    Steven              Jones      04/04/74  Allie     cat       4
0    Jane                Doe        24/05/78                      5
1    James               Smith      20/10/80  Spot                6
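
A rough Python equivalent of this step is shown here. The function name `filter_by_regex` is hypothetical, and the use of `re.search` is an assumption about matching semantics; with an anchored pattern such as `^\d+` the distinction from `re.match` does not matter.

```python
import re
from typing import Iterable, List


def filter_by_regex(lines: Iterable[str], pattern: str, include: bool = True) -> List[str]:
    """Keep lines that match `pattern` (include=True) or that do not match (include=False)."""
    regex = re.compile(pattern)
    return [line for line in lines if bool(regex.search(line)) == include]
```

Calling `filter_by_regex(lines, r"^\d+")` keeps only the lines that start with a digit, which is what drops the comment and header lines in the example.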

Filter 3 - append a line number column:

2    Paula               Brown      24/05/78  Rex,Fluff dog,cat   3  1
1    Steven              Jones      04/04/74  Allie     cat       4  2
0    Jane                Doe        24/05/78                      5  3
1    James               Smith      20/10/80  Spot                6  4

Filter 4 - regex replace value in column[4]: '(\d+)/(\d+)/(\d+)' '19\3-\2-\1' (convert DD/MM/YY dates to the YYYY-MM-DD format used by SQLite; the replacement assumes every year is in the 1900s)

2    Paula               Brown      1978-05-24  Rex,Fluff dog,cat   3  1
1    Steven              Jones      1974-04-04  Allie     cat       4  2
0    Jane                Doe        1978-05-24                      5  3
1    James               Smith      1980-10-20  Spot                6  4
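
The same substitution can be tried out with Python's `re.sub`, whose replacement syntax also uses `\1`, `\2`, `\3` for captured groups. The helper name `regex_replace_column` is hypothetical, and the 1-based column numbering is assumed from the example above.

```python
import re
from typing import List


def regex_replace_column(row: List[str], column: int, pattern: str, replacement: str) -> List[str]:
    """Return a copy of `row` with the regex substitution applied to one 1-based column."""
    index = column - 1
    new_row = list(row)
    if index < len(new_row):  # leave short rows untouched
        new_row[index] = re.sub(pattern, replacement, new_row[index])
    return new_row


row = ["2", "Paula", "Brown", "24/05/78", "Rex,Fluff", "dog,cat", "3", "1"]
converted = regex_replace_column(row, 4, r"(\d+)/(\d+)/(\d+)", r"19\3-\2-\1")
print(converted[3])  # 1978-05-24
```

Only the targeted column is rewritten; the other fields, including the list columns, pass through unchanged.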

Filter 5 - normalize list columns[5,6]:

2    Paula               Brown      1978-05-24  Rex       dog       3  1
2    Paula               Brown      1978-05-24  Fluff     cat       3  1
1    Steven              Jones      1974-04-04  Allie     cat       4  2
0    Jane                Doe        1978-05-24                      5  3
1    James               Smith      1980-10-20  Spot                6  4
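
One way to picture the normalization is the sketch below. The function name `normalize_list_columns` is hypothetical; the comma separator and the choice to pad a shorter list with empty strings are assumptions, since the example above only shows lists of equal length.

```python
from itertools import zip_longest
from typing import List


def normalize_list_columns(row: List[str], columns: List[int], separator: str = ",") -> List[List[str]]:
    """Replicate `row` once per list item, pairing items across the 1-based `columns` by position."""
    indexes = [column - 1 for column in columns]
    # An empty field splits to [""], so a row without list values yields one row.
    lists = [row[index].split(separator) for index in indexes]
    normalized = []
    for items in zip_longest(*lists, fillvalue=""):  # assumption: pad shorter lists
        new_row = list(row)
        for index, item in zip(indexes, items):
            new_row[index] = item
        normalized.append(new_row)
    return normalized
```

Applied to the Paula Brown row with columns 5 and 6, this yields one row for `Rex`/`dog` and one for `Fluff`/`cat`, while rows with a single or empty value come back as a single row.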

Filter 6 - append a line number column:

2    Paula               Brown      1978-05-24  Rex       dog       3  1  1
2    Paula               Brown      1978-05-24  Fluff     cat       3  1  2
1    Steven              Jones      1974-04-04  Allie     cat       4  2  3
0    Jane                Doe        1978-05-24                      5  3  4
1    James               Smith      1980-10-20  Spot                6  4  5