What it does
This tool performs find & replace operation on a specified column in a given file.
The pattern to find uses the extended regular expression syntax (same as running 'awk --re-interval').
TIP: If you need more complex patterns, use the awk tool.
Examples of Find Patterns
- HELLO The word 'HELLO' (case sensitive).
- AG.T The letters A,G followed by any single character, followed by the letter T.
- A{4,} Four or more consecutive A's.
- chr2[012]\t The words 'chr20' or 'chr21' or 'chr22' followed by a tab character.
- hsa-mir-([^ ]+) The text 'hsa-mir-' followed by one-or-more non-space characters. When using parenthesis, the matched content of the parenthesis can be accessed with 1 in the replace pattern.
Examples of Replace Patterns
- WORLD The word 'WORLD' will be placed whereever the find pattern was found.
- FOO-&-BAR Each time the find pattern is found, it will be surrounded with 'FOO-' at the begining and '-BAR' at the end. & (ampersand) represents the matched find pattern.
- \1 The text which matched the first parenthesis in the Find Pattern.
Example 1
Find Pattern: HELLO
Replace Pattern: WORLD
Every time the word HELLO is found, it will be replaced with the word WORLD. This operation affects only the selected column.
Example 2
Find Pattern: ^(.{4})
Replace Pattern: &\t
Find the first four characters in each line, and replace them with the same text, followed by a tab character. In practice - this will split the first line into two columns. This operation affects only the selected column.
Extened Regular Expression Syntax
The select tool searches the data for lines containing or not containing a match to the given pattern. A Regular Expression is a pattern descibing a certain amount of text.
- ( ) { } [ ] . * ? + ^ $ are all special characters. \ can be used to "escape" a special character, allowing that special character to be searched for.
- ^ matches the beginning of a string(but not an internal line).
- ( .. ) groups a particular pattern.
- { n or n, or n,m } specifies an expected number of repetitions of the preceding pattern.
- {n} The preceding item is matched exactly n times.
- {n,} The preceding item ismatched n or more times.
- {n,m} The preceding item is matched at least n times but not more than m times.
- [ ... ] creates a character class. Within the brackets, single characters can be placed. A dash (-) may be used to indicate a range such as a-z.
- . Matches any single character except a newline.
- * The preceding item will be matched zero or more times.
- ? The preceding item is optional and matched at most once.
- + The preceding item will be matched one or more times.
- ^ has two meaning:
- matches the beginning of a line or string.
- indicates negation in a character class. For example, [^...] matches every character except the ones inside brackets.
- $ matches the end of a line or string.
- | Separates alternate possibilities.
Note: AWK uses extended regular expression syntax, not Perl syntax. \d, \w, \s etc. are not supported.