BlackWasp™

Regular Expressions
.NET 1.1+

Escaping Text for Regular Expressions

The eighteenth part of the Regular Expressions in .NET tutorial continues to look at the methods provided by the Regex class. This article considers the process of escaping and unescaping text.

Escaping Text

Sometimes you will want to build a regular expression based upon user input. For example, you might wish to perform a search through multiple documents to find the ones that contain the entered text. This might be achieved with a pattern that contains the user's input and some wildcards and quantifiers.

In such a situation, the entered text must be treated as literal text. However, the user may type characters that the regular expression engine treats as metacharacters, such as full stops, asterisks or brackets, causing unexpected results when the pattern is matched. To avoid this, you can escape those characters.

So that you do not need to write your own code to escape the characters, the Regex class includes a static method for this purpose. You can call the Escape method, passing the string to be processed as its only argument. The escaped text is returned as a new string.
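The difference that escaping makes can be seen with a small sketch. The class and method names below (LiteralSearchDemo, ContainsUnescaped, ContainsEscaped) are hypothetical helpers invented for illustration; only Regex.IsMatch and Regex.Escape come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, used only to illustrate the idea.
public static class LiteralSearchDemo
{
    // Uses the user's text directly as a pattern, so a character
    // such as "+" is interpreted as a quantifier, not as a plus sign.
    public static bool ContainsUnescaped(string document, string userInput)
    {
        return Regex.IsMatch(document, userInput);
    }

    // Escapes the user's text first, so that every character
    // in it is matched literally.
    public static bool ContainsEscaped(string document, string userInput)
    {
        return Regex.IsMatch(document, Regex.Escape(userInput));
    }
}
```

With the document "1+1=2" and the input "1+1", ContainsUnescaped should return false, because the raw pattern looks for two or more consecutive "1" characters, whereas ContainsEscaped should return true.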

To demonstrate, try running the code below. As the output shows, the space and the full stop (period) are escaped with backslashes. The comma has no special meaning in a pattern, so it is left unchanged.

string input = "Hello, world.";
string escaped = Regex.Escape(input);

Console.WriteLine(escaped);

// Outputs: "Hello,\ world\."
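As a sketch of the search scenario described earlier, the escaped input can be combined with wildcards to find every line of a document that contains the user's text. The LineSearch class and FindLinesContaining method are hypothetical names chosen for this example, and the sample assumes that lines are separated by "\n" characters.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, used only to illustrate the idea.
public static class LineSearch
{
    // Returns each line of the text that contains the user's input,
    // treating the input as literal text.
    public static List<string> FindLinesContaining(string text, string userInput)
    {
        // The ".*" wildcards surround the escaped, literal input. With the
        // Multiline option, ^ and $ match at the start and end of each line.
        string pattern = "^.*" + Regex.Escape(userInput) + ".*$";

        List<string> lines = new List<string>();
        foreach (Match match in Regex.Matches(text, pattern, RegexOptions.Multiline))
        {
            lines.Add(match.Value);
        }
        return lines;
    }
}
```

For example, searching the text "Price: $5.00\nPrice: $5100" for "$5.0" should return only the first line. Without the call to Escape, the dollar sign would be treated as an end-of-line anchor and the full stop as a wildcard.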

Unescaping Text

You can reverse the process with the Unescape method. This looks for escaped characters and replaces them with the characters that they represent. In most cases this works perfectly. However, as some character sequences can be ambiguous, there are situations where the unescaped string is not identical to the original. In addition, if the string contains an escape sequence that the method does not recognise, which cannot happen for text created by Escape, an ArgumentException is thrown.

The following sample escapes a string, then unescapes the result. The output shows that the original text and the unescaped version match.

string input = "Hello, world.";
string escaped = Regex.Escape(input);
string unescaped = Regex.Unescape(escaped);

Console.WriteLine(unescaped);

// Outputs: "Hello, world."
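If the text to unescape may not have come from Escape, one option is to wrap the call so that a failure can be handled gracefully. The following is a minimal sketch; SafeUnescape and TryUnescape are hypothetical names, and it assumes that an unrecognised escape sequence is reported with an ArgumentException, as described above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, used only to illustrate the idea.
public static class SafeUnescape
{
    // Returns the unescaped text, or null if Regex.Unescape
    // rejects the input with an ArgumentException.
    public static string TryUnescape(string text)
    {
        try
        {
            return Regex.Unescape(text);
        }
        catch (ArgumentException)
        {
            return null;
        }
    }
}
```

Passing the escaped string from the previous sample should return "Hello, world.". A string such as @"\d+" is expected to return null, because \d is a character class in a pattern and not an escaped character.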
6 January 2016