Galaxy | Tool Preview

Trim (version 0.0.1)
Trim content of this column only: 0 = process the entire line
Trim from the beginning up to this position: only positive positions are allowed. 1 = do not trim the beginning
Remove everything from this position to the end: use a negative position to count from the end of the line. 0 = do not trim the end
Is input dataset in FASTQ format? If set to 'Yes', the tool will not trim even-numbered lines (0, 2, 4, etc.). This allows trimming of the sequence and quality lines, provided that they are not spread over multiple lines (see warning below)
Lines beginning with these are not trimmed

What it does

Trims a specified number of characters from each line of a dataset, or from a single field of each line if the dataset is tab-delimited.


Example 1

Trimming this dataset:

1234567890
abcdefghijk

by setting 'Trim from the beginning up to this position' to 2 and 'Remove everything from this position to the end' to 6 will produce:

23456
bcdef
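
The position arithmetic in Example 1 can be sketched in Python. This is only an illustration of the behavior described above, not the tool's actual source: the function name trim and its parameters start and end are hypothetical, and positions are assumed to be 1-based and inclusive, as the example output suggests.

```python
def trim(text, start=1, end=0):
    """Keep characters from 1-based position `start` through position `end`.

    Assumed semantics (inferred from the examples, not from the tool's code):
    end == 0 means "do not trim the end"; a negative end counts from the right.
    """
    stop = None if end == 0 else end
    return text[start - 1:stop]


for line in ["1234567890", "abcdefghijk"]:
    print(trim(line, start=2, end=6))
```

With start=2 and end=6 this keeps positions 2 through 6 of each line, matching the output shown in Example 1.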

Example 2

Trimming column 2 of this dataset:

abcde 12345 fghij 67890
fghij 67890 abcde 12345

by setting 'Trim content of this column only' to 2, 'Trim from the beginning up to this position' to 2, and 'Remove everything from this position to the end' to 4 will produce:

abcde 234 fghij 67890
fghij 789 abcde 12345

Example 3

Trimming column 2 of this dataset:

abcde 12345 fghij 67890
fghij 67890 abcde 12345

by setting 'Trim content of this column only' to 2, 'Trim from the beginning up to this position' to 2, and 'Remove everything from this position to the end' to -2 will produce:

abcde 23 fghij 67890
fghij 78 abcde 12345
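
Column trimming, as in Examples 2 and 3, can be sketched the same way. The helper below is hypothetical (trim_column, col, start, end, and sep are illustrative names, not the tool's own) and assumes the dataset is tab-delimited, that columns are numbered from 1, and that column 0 means the entire line.

```python
def trim_column(line, col, start=1, end=0, sep="\t"):
    """Trim only the 1-based column `col`; col == 0 trims the whole line.

    Assumed semantics: `start` is a 1-based position, end == 0 keeps the
    end of the field, and a negative end counts from the right.
    """
    stop = None if end == 0 else end
    if col == 0:
        return line[start - 1:stop]
    fields = line.split(sep)
    fields[col - 1] = fields[col - 1][start - 1:stop]
    return sep.join(fields)


row = "abcde\t12345\tfghij\t67890"
print(trim_column(row, col=2, start=2, end=4))   # settings from Example 2
print(trim_column(row, col=2, start=2, end=-2))  # settings from Example 3
```

In this sketch the negative end position is passed straight through as a Python slice bound, so -2 drops the last two characters of the field and leaves 23 from 12345.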

Trimming FASTQ datasets

This tool can be used to trim sequences and quality strings in FASTQ datasets. This is done by selecting 'Yes' from the 'Is input dataset in FASTQ format?' dropdown. If set to 'Yes', the tool will skip all even-numbered lines, counting from 0 (see warning below). For example, trimming the last 5 bases of this dataset:

@081017-and-081020:1:1:1715:1759
GGACTCAGATAGTAATCCACGCTCCTTTAAAATATC
+
II#IIIIIII$5+.(9IIIIIII$%*$G$A31I&&B

can be done by setting 'Remove everything from this position to the end' to 31:

@081017-and-081020:1:1:1715:1759
GGACTCAGATAGTAATCCACGCTCCTTTAAA
+
II#IIIIIII$5+.(9IIIIIII$%*$G$A3

Note that headers are skipped.

WARNING: This tool will only work on properly formatted FASTQ datasets where (1) each read and each quality string occupies exactly one line and (2) the '@' (read header) and '+' (quality header) lines are even-numbered (counting from 0), as in the example above.
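
The FASTQ mode described in this section can be illustrated with a short Python sketch. It is not the tool's implementation: trim_fastq and its parameters are hypothetical names, and the code assumes a well-formed four-line record layout in which lines are numbered from 0 and the header lines fall on even numbers.

```python
def trim_fastq(lines, start=1, end=0):
    """Trim sequence and quality lines; leave '@' and '+' header lines alone.

    Assumption: lines are numbered from 0, so headers are the even-numbered
    lines and sequence/quality strings are the odd-numbered ones.
    """
    stop = None if end == 0 else end
    out = []
    for i, line in enumerate(lines):
        # Even-numbered lines (headers) pass through unchanged.
        out.append(line if i % 2 == 0 else line[start - 1:stop])
    return out


record = [
    "@081017-and-081020:1:1:1715:1759",
    "GGACTCAGATAGTAATCCACGCTCCTTTAAAATATC",
    "+",
    "II#IIIIIII$5+.(9IIIIIII$%*$G$A31I&&B",
]
for line in trim_fastq(record, end=31):
    print(line)
```

Because the sketch decides what to trim purely by line number, a read or quality string that wraps onto a second line would shift every later line and cause headers to be trimmed instead, which is the situation the warning describes.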