What it does
This tool finds & replaces text in an input dataset.
The pattern to find can be a simple text string or a Perl regular expression, depending on whether the 'pattern is a regex' check-box is selected.
When using regular expressions, the replace pattern can contain back-references (e.g. \1).
This tool uses Perl regular expression syntax.
Examples of *regular-expression* Find Patterns
- HELLO The word 'HELLO' (case sensitive).
- AG.T The letters A and G, followed by any single character, followed by the letter T.
- A{4,} Four or more consecutive A's.
- chr2[012]\t The words 'chr20' or 'chr21' or 'chr22' followed by a tab character.
- hsa-mir-([^ ]+) The text 'hsa-mir-' followed by one-or-more non-space characters. When using parentheses, the content matched inside them can be accessed with \1 in the replace pattern.
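The find patterns above can also be tried outside the tool. The sketch below uses Python's `re` module as a stand-in, since its syntax agrees with Perl's for these particular patterns. The sample strings and the names `FIND_PATTERNS` and `first_match` are invented for this illustration and are not part of the tool.

```python
import re

# Hypothetical sample inputs, paired with the find patterns listed above.
FIND_PATTERNS = [
    ("HELLO", "say HELLO twice"),
    ("AG.T", "CCAGGTCC"),
    ("A{4,}", "GGAAAAAGG"),
    (r"chr2[012]\t", "chr21\t1000"),
    (r"hsa-mir-([^ ]+)", "gene hsa-mir-21 found"),
]

def first_match(pattern, text):
    """Return the first matched text, or None if the pattern is not found."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

for pattern, text in FIND_PATTERNS:
    print(repr(pattern), "->", repr(first_match(pattern, text)))
```

Matching is case sensitive here, as in the tool, so `first_match("HELLO", "hello")` finds nothing.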
Examples of Replace Patterns
- WORLD The word 'WORLD' will be placed wherever the find pattern was found.
- FOO-$&-BAR Each time the find pattern is found, it will be surrounded with 'FOO-' at the beginning and '-BAR' at the end. $& (dollar-ampersand) represents the matched find pattern.
- $1 The text which matched the first pair of parentheses in the Find Pattern.
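To see what these replace patterns produce, here is a minimal sketch in Python. Note that this is an approximation of the tool's behaviour: Python's `re.sub` does not understand Perl's `$&` and `$1`, so the sketch uses the Python equivalents `\g<0>` (whole match) and `\1` (first group). The sample line is made up.

```python
import re

line = "hsa-mir-21 and hsa-mir-155"   # hypothetical input line
find = r"hsa-mir-([^ ]+)"

# Plain text replacement: every match becomes 'WORLD'.
plain = re.sub(find, "WORLD", line)

# Perl's $& (the whole match) is written \g<0> in Python.
wrapped = re.sub(find, r"FOO-\g<0>-BAR", line)

# Perl's $1 (first parenthesized group) is written \1 in Python.
group_only = re.sub(find, r"\1", line)

print(plain)
print(wrapped)
print(group_only)
```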
Example 1
Find Pattern: HELLO
Replace Pattern: WORLD
Regular Expression: no
Replace what: entire line
Every time the word HELLO is found, it will be replaced with the word WORLD.
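As a rough sketch of Example 1, a non-regex find and replace behaves like a literal substring substitution. The function name `replace_literal` is hypothetical, and treating the find pattern as a plain substring (rather than a whole word) is an assumption about the tool.

```python
def replace_literal(line, find, replacement):
    """Replace every literal occurrence of `find` in `line` (no regex)."""
    return line.replace(find, replacement)

print(replace_literal("HELLO there, HELLO again", "HELLO", "WORLD"))
```

Because no regular expression is involved, characters such as `.` in the find pattern are matched literally.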
Example 2
Find Pattern: ^chr
Replace Pattern: (empty)
Regular Expression: yes
Replace what: column 11
If column 11 (of every line) begins with the letters 'chr', they will be removed. Effectively, it will turn "chr4" into "4" and "chrXHet" into "XHet".
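Example 2 can be sketched as follows. The function `replace_in_column` is hypothetical. It assumes tab-separated input and 1-based column numbers, and it leaves lines with too few columns unchanged, which may differ from what the tool actually does.

```python
import re

def replace_in_column(line, column, find, replacement):
    """Apply a regex replacement to one tab-separated column (1-based)."""
    fields = line.rstrip("\n").split("\t")
    if column <= len(fields):  # assumption: short lines pass through unchanged
        fields[column - 1] = re.sub(find, replacement, fields[column - 1])
    return "\t".join(fields)

# A made-up row with eleven columns; only the last one starts with 'chr'.
row = "\t".join(["x"] * 10 + ["chr4"])
print(replace_in_column(row, 11, r"^chr", ""))
```

Because the pattern is anchored with `^`, only a 'chr' at the very start of the column is removed, and other columns are never touched.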
Perl's Regular Expression Syntax
The Find & Replace tool searches the data for text that matches the given pattern. A Regular Expression is a pattern describing a certain amount of text.
- ( ) { } [ ] . * ? + \ ^ $ are all special characters. \ can be used to "escape" a special character, allowing that special character to be searched for.
- ^ matches the beginning of a string (but not an internal line).
- ( .. ) groups a particular pattern.
- {n}, {n,} or {n,m} specifies an expected number of repetitions of the preceding pattern:
  - {n} The preceding item is matched exactly n times.
  - {n,} The preceding item is matched n or more times.
  - {n,m} The preceding item is matched at least n times but not more than m times.
- [ ... ] creates a character class. Within the brackets, single characters can be placed. A dash (-) may be used to indicate a range such as a-z.
- . Matches any single character except a newline.
- * The preceding item will be matched zero or more times.
- ? The preceding item is optional and matched at most once.
- + The preceding item will be matched one or more times.
- ^ has two meanings:
  - matches the beginning of a line or string.
  - indicates negation in a character class. For example, [^...] matches every character except the ones inside the brackets.
- $ matches the end of a line or string.
- | Separates alternate possibilities.
- \d matches a single digit.
- \w matches a single letter or digit or an underscore.
- \s matches a single white-space character (such as a space or a tab).
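The syntax elements listed above can be checked with a short sketch. It uses Python's `re` module, which behaves like Perl for these ASCII samples. The sample texts and the names `CHECKS` and `failures` are invented for illustration.

```python
import re

# Each entry: (pattern, sample text, expected list of non-overlapping matches).
CHECKS = [
    (r"\$5", "cost: $5", ["$5"]),                       # \ escapes the special character $
    (r"A{2}", "AAAA", ["AA", "AA"]),                    # {n}: exactly n repetitions
    (r"[a-c]+", "abcxcab", ["abc", "cab"]),             # character class with a range
    (r"[^0-9]+", "12ab34", ["ab"]),                     # ^ inside brackets negates the class
    (r"colou?r", "color colour", ["color", "colour"]),  # ? makes the 'u' optional
    (r"cat|dog", "cat, dog, cow", ["cat", "dog"]),      # | separates alternatives
    (r"^chr", "chrchr", ["chr"]),                       # ^ anchors at the start of the string
    (r"T$", "TAT", ["T"]),                              # $ anchors at the end of the string
    (r"\d+", "chr21:100", ["21", "100"]),               # \d matches a digit
    (r"\w+", "a_b c", ["a_b", "c"]),                    # \w: letter, digit or underscore
]

def failures(checks):
    """Return the (pattern, text) pairs whose matches differ from the expected ones."""
    return [(p, t) for p, t, expected in checks if re.findall(p, t) != expected]

print(failures(CHECKS))
```

An empty list from `failures` means every pattern matched its sample text as described in the list above.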