Galaxy | Tool Preview

Filter Tabular (version 3.3.0)

Filter Tabular

Filters a tabular dataset by applying line filters as it is read. Multiple filters may be chained; each filter operates on the output of the previous filter.

Inputs

A tabular dataset.

Outputs

A filtered tabular dataset.

Input Line Filters

As a tabular file is being read, line filters may be applied:

  • skip leading lines - skip the specified number of lines at the start of the file
  • comment char - omit any lines that start with the specified comment character
  • by regex expression matching - include/exclude lines that match the regular expression
  • select columns - choose to include only selected columns in the order specified
  • select columns by indices/slices - indices or slices of the columns to keep (Python list indexing)
  • regex replace value in column - replace a field in a column using a regex substitution (good for date reformatting)
  • regex replace value in column - add a new column using a regex substitution of a column value
  • prepend a line number column - each line has the ordinal value of the line read by this filter as the first column
  • append a line number column - each line has the ordinal value of the line read by this filter as the last column
  • prepend a text column - each line has the text string as the first column
  • append a text column - each line has the text string as the last column
  • prepend the dataset name - each line has the dataset name as the first column
  • append the dataset name - each line has the dataset name as the last column
  • normalize list columns - replicates the line for each item in the specified list columns
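
The chaining described above, where each filter consumes the output of the previous one, can be sketched with Python generators. This is only an illustration and not the tool's actual code: the names run_filters, skip_leading and append_text are hypothetical, and each line is assumed to be already split into a list of fields.

```python
from typing import Callable, Iterable, Iterator, List

# Assumption: a line is a list of field strings, and a filter maps
# a stream of lines to a new stream of lines.
Fields = List[str]
LineFilter = Callable[[Iterable[Fields]], Iterator[Fields]]


def skip_leading(count: int) -> LineFilter:
    """Hypothetical helper: drop the first `count` lines."""
    def apply(lines: Iterable[Fields]) -> Iterator[Fields]:
        for index, fields in enumerate(lines):
            if index >= count:
                yield fields
    return apply


def append_text(text: str) -> LineFilter:
    """Hypothetical helper: add `text` as the last column of every line."""
    def apply(lines: Iterable[Fields]) -> Iterator[Fields]:
        for fields in lines:
            yield fields + [text]
    return apply


def run_filters(lines: Iterable[Fields], filters: List[LineFilter]) -> List[Fields]:
    """Feed the output of each filter into the next one, in order."""
    stream: Iterable[Fields] = lines
    for line_filter in filters:
        stream = line_filter(stream)
    return list(stream)


rows = [["#People with pets"], ["Pets", "FirstName"], ["2", "Paula"]]
result = run_filters(rows, [skip_leading(1), append_text("pets.tsv")])
```

Because the filters are lazy generators, a line only passes to the next filter if the previous one yields it, which is why the order of the filters matters.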

Line Filtering Example

(Six filters are applied as the following file is read)

Input Tabular File:

#People with pets
Pets FirstName           LastName   DOB       PetNames  PetType
2    Paula               Brown      24/05/78  Rex,Fluff dog,cat
1    Steven              Jones      04/04/74  Allie     cat
0    Jane                Doe        24/05/78
1    James               Smith      20/10/80  Spot


Filter 1 - append a line number column:

#People with pets                                                 1
Pets FirstName           LastName   DOB       PetNames  PetType   2
2    Paula               Brown      24/05/78  Rex,Fluff dog,cat   3
1    Steven              Jones      04/04/74  Allie     cat       4
0    Jane                Doe        24/05/78                      5
1    James               Smith      20/10/80  Spot                6

Filter 2 - by regex expression matching [include]: '^\d+' (include lines that start with a number)

2    Paula               Brown      24/05/78  Rex,Fluff dog,cat   3
1    Steven              Jones      04/04/74  Allie     cat       4
0    Jane                Doe        24/05/78                      5
1    James               Smith      20/10/80  Spot                6
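
The include/exclude behaviour of Filter 2 can be approximated with Python's re module. This is a minimal sketch, not the tool's implementation: filter_by_regex is a hypothetical name, lines are assumed to be tab-separated strings, and re.search is assumed as the matching call (for an anchored pattern such as '^\d+' this is equivalent to matching at the start of the line).

```python
import re


def filter_by_regex(lines, pattern, include=True):
    """Keep lines that match `pattern` (include=True) or that do not (include=False)."""
    regex = re.compile(pattern)
    return [line for line in lines if bool(regex.search(line)) == include]


# A few lines from the example, after Filter 1 appended a line number.
lines = [
    "#People with pets\t1",
    "Pets\tFirstName\tLastName\tDOB\tPetNames\tPetType\t2",
    "2\tPaula\tBrown\t24/05/78\tRex,Fluff\tdog,cat\t3",
    "0\tJane\tDoe\t24/05/78\t\t\t5",
]

kept = filter_by_regex(lines, r"^\d+")                     # data lines only
dropped = filter_by_regex(lines, r"^\d+", include=False)   # comment and header
```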

Filter 3 - append a line number column:

2    Paula               Brown      24/05/78  Rex,Fluff dog,cat   3  1
1    Steven              Jones      04/04/74  Allie     cat       4  2
0    Jane                Doe        24/05/78                      5  3
1    James               Smith      20/10/80  Spot                6  4
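
The two line-number columns differ because each filter numbers only the lines that reach it: Filter 1 sees all six lines, while Filter 3 sees only the four that passed the regex filter. A small sketch of that behaviour follows; append_line_number is a hypothetical helper and the rows are shortened to two fields for brevity.

```python
def append_line_number(rows):
    """Yield each row with its 1-based position in this filter's input appended."""
    for number, fields in enumerate(rows, start=1):
        yield fields + [str(number)]


rows = [["#People with pets"], ["Pets", "FirstName"], ["2", "Paula"], ["1", "Steven"]]

# First pass numbers every line, including the comment and header lines.
first_pass = list(append_line_number(rows))

# Stand-in for the regex filter: keep lines whose first field is a number.
data_only = [fields for fields in first_pass if fields[0].isdigit()]

# Second pass restarts at 1 because it counts only the lines it receives.
second_pass = list(append_line_number(data_only))
```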

Filter 4 - regex replace value in column[4]: '(\d+)/(\d+)/(\d+)' '19\3-\2-\1' (convert DD/MM/YY dates to the YYYY-MM-DD format used by SQLite; the '19' prefix assumes 20th-century years)

2    Paula               Brown      1978-05-24  Rex,Fluff dog,cat   3  1
1    Steven              Jones      1974-04-04  Allie     cat       4  2
0    Jane                Doe        1978-05-24                      5  3
1    James               Smith      1980-10-20  Spot                6  4
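
The substitution in Filter 4 uses the same pattern and replacement syntax as Python's re.sub, so its effect on one line can be checked directly. In this sketch replace_in_column is a hypothetical helper, and the column number is treated as 1-based so that column 4 is the DOB field, as in the example.

```python
import re


def replace_in_column(fields, column, pattern, replacement):
    """Return a copy of `fields` with a regex substitution applied to one column.

    `column` is 1-based here (an assumption made to match the example above).
    """
    updated = list(fields)
    updated[column - 1] = re.sub(pattern, replacement, updated[column - 1])
    return updated


row = ["2", "Paula", "Brown", "24/05/78", "Rex,Fluff", "dog,cat", "3", "1"]

# \1, \2 and \3 are the day, month and two-digit year captured by the pattern.
converted = replace_in_column(row, 4, r"(\d+)/(\d+)/(\d+)", r"19\3-\2-\1")
```

Only the targeted column is rewritten; a value that does not match the pattern is left unchanged.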

Filter 5 - normalize list columns[5,6]:

2    Paula               Brown      1978-05-24  Rex       dog       3  1
2    Paula               Brown      1978-05-24  Fluff     cat       3  1
1    Steven              Jones      1974-04-04  Allie     cat       4  2
0    Jane                Doe        1978-05-24                      5  3
1    James               Smith      1980-10-20  Spot                6  4
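
One way to picture the normalization in Filter 5 is to split each list column on commas and emit one line per list position. The sketch below makes several assumptions that the text above does not confirm: the name normalize_list_columns is hypothetical, columns are 1-based, the separator is a comma, and a shorter list is padded with empty strings when the lists differ in length.

```python
def normalize_list_columns(fields, columns, separator=","):
    """Replicate a line once per item in the given list columns (1-based)."""
    indices = [column - 1 for column in columns]
    split_values = {index: fields[index].split(separator) for index in indices}
    count = max(len(values) for values in split_values.values())

    rows = []
    for position in range(count):
        row = list(fields)
        for index, values in split_values.items():
            # Assumption: pad with an empty string if this list is shorter.
            row[index] = values[position] if position < len(values) else ""
        rows.append(row)
    return rows


paula = ["2", "Paula", "Brown", "1978-05-24", "Rex,Fluff", "dog,cat", "3", "1"]
jane = ["0", "Jane", "Doe", "1978-05-24", "", "", "5", "3"]

paula_rows = normalize_list_columns(paula, [5, 6])
jane_rows = normalize_list_columns(jane, [5, 6])
```

A line whose list columns hold a single value, or are empty, yields exactly one output line, which matches the rows for Steven, Jane and James above.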

Filter 6 - select columns by indices/slices: '1:6'

Paula               Brown      1978-05-24  Rex       dog
Paula               Brown      1978-05-24  Fluff     cat
Steven              Jones      1974-04-04  Allie     cat
Jane                Doe        1978-05-24
James               Smith      1980-10-20  Spot
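
Since the selection uses Python list indexing, '1:6' is zero-based and excludes the stop index, so it keeps the fields at positions 1 through 5 and drops the Pets count and both line-number columns. A minimal sketch of parsing one such specification is given below; parse_selection and select_columns are hypothetical names, and handling of several comma-separated specifications is left out.

```python
def parse_selection(spec):
    """Turn 'start:stop[:step]' into a slice, or a bare number into an int index."""
    parts = spec.split(":")
    if len(parts) == 1:
        return int(parts[0])
    return slice(*(int(part) if part.strip() else None for part in parts))


def select_columns(fields, spec):
    """Apply a single index or slice specification to a list of fields."""
    selection = parse_selection(spec)
    if isinstance(selection, slice):
        return fields[selection]
    return [fields[selection]]


row = ["2", "Paula", "Brown", "1978-05-24", "Rex", "dog", "3", "1"]
selected = select_columns(row, "1:6")
```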