Regular expression and escape character usage with the RXReplace() expression function in Spotfire

Regular expression and escape character usage with the RXReplace() expression function in Spotfire

book

Article ID: KB0070954

calendar_today

Updated On:

Products Versions
Spotfire Analyst All Versions

Description

When using regular expressions and the RXReplace() function, you will often need to use escape characters that have special meaning in the regular expression language. Escape characters are needed on both the regular expression level and on the Spotfire expression level. This can be difficult to manage since multiple escape characters may be required. The usual metacharacters that need to be escaped are {}[]()^$.|*+? and \
 

Issue/Introduction

This article describes regular expression and escape character usage with the RXReplace() expression function in Spotfire.

Resolution

Escape Character in Regular Expression:
For example, in regular expressions, a period "." is a special character which matches any single character. So if you want to match the literal period in this string:
String1.String2
You would need to escape "." in the regular expression with a \ character. The regular expression to match the literal period would be:
\.

Escape Characters in Spotfire expressions:
In order to match this same "." character in the Spotfire expression language, you would need to escape that \ escape character with an additional \ character. So your Spotfire expression with the RXReplace() function would be:
RXReplace([Column 1],"\\.","myReplacementText","g")

This would match all "." characters and replace it with 'myReplacementText'.


How to match the \ character in Spotfire expressions:

Since \ is a special character in regular expressions and in the RXReplace() function, this requires multiple levels of escape characters to match a literal \ character. If you want to match the literal \ in this string:

SomeText\What I want to capture\SomeOtherText

You would need to escape  \  in the regular expression with a \ character. In regular expressions, you can match the \ character with:

\\
In the Spotfire expression language, you would need to escape both of the \ characters that are required in the basic regular expression. The Spotfire expression with the RXReplace() function would be:
RXReplace([Column 1],"\\\\","myReplacementText","g")
This would match all \ characters and replace it with 'myReplacementText'.  \\\\ in the RXReplace() function matches a single literal \ character.

Here is a more complex example which replaces all text between the two \ characters in the above sample string:
RXReplace([Column 1],"^.*\\\\(.*)\\\\.*$","$1","")

Explanation:
^ - Matches the starting position within the string.
.* - Matches any number of any character (. = any character, * = matches the preceding element zero or more times).
\\\\ - Matches a literal \
(.*) - Matches any number of any characters and remembers the string so that we can return it with $1, $2, $3, etc.
\\\\ - Matches a literal \
.* - Matches any number of any character.
$ - Matches the ending position of the string.

For more information on using regular expressions in general, please refer to the many online resources on the topic, such as http://en.wikipedia.org/wiki/Regular_expression

Additional Information

Regular Expression:

Doc: Text Functions