BlackWasp™

Regular Expressions
.NET 1.1+

Regular Expression Character Escapes

The third part of the Regular Expressions in .NET tutorial starts to look at the special characters that can be used to build regular expression patterns for matching. This article describes character escapes, which allow matching of special items such as control characters.

Matching ASCII and Unicode Characters

The final group of character escapes allows you to match characters using their ASCII or Unicode values. The escape to use depends upon which kind of value you wish to match: ASCII codes can be specified in octal or hexadecimal, whilst Unicode codes are specified in hexadecimal.

To find an ASCII character using octal, use a backslash followed by three octal digits. For example, the following code finds all of the lower case letter "o" characters, using the octal code 157.

// Requires: using System; using System.Text.RegularExpressions;
string input = "Hello World!";

foreach (Match match in Regex.Matches(input, @"\157"))
{
    Console.WriteLine("Matched at index {0}", match.Index);
}

/* OUTPUT
 
Matched at index 4
Matched at index 7
 
*/
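
The octal code does not have to be looked up by hand. The following is a minimal sketch that builds the escape from a character and runs the same search. The class OctalEscapeDemo and the helper ToOctalEscape are illustrative names, not part of the .NET Framework, and the helper assumes that the character is in the ASCII range.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OctalEscapeDemo
{
    // Hypothetical helper: builds a three-digit octal escape such as \157.
    // Assumes c is an ASCII character (code 0 to 127).
    public static string ToOctalEscape(char c)
    {
        return @"\" + Convert.ToString((int)c, 8).PadLeft(3, '0');
    }

    public static void Main()
    {
        string pattern = ToOctalEscape('o');
        Console.WriteLine(pattern);   // \157

        foreach (Match match in Regex.Matches("Hello World!", pattern))
        {
            Console.WriteLine("Matched at index {0}", match.Index);
        }
    }
}
```

Padding to three digits keeps the escape the same length for every character, so a space becomes \040 rather than \40.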

To find an ASCII character using hexadecimal, use \x followed by exactly two hexadecimal digits. The following code repeats the previous example using the hexadecimal code 6F.

string input = "Hello World!";

foreach (Match match in Regex.Matches(input, @"\x6f"))
{
    Console.WriteLine("Matched at index {0}", match.Index);
}

/* OUTPUT
 
Matched at index 4
Matched at index 7
 
*/
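
The hexadecimal digits may be written in either case. The sketch below, using the illustrative names HexEscapeDemo, ToHexEscape and CountMatches (none of which are part of the .NET Framework), builds the escape from a character and shows that \x6f and \x6F find the same matches.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HexEscapeDemo
{
    // Hypothetical helper: builds a two-digit hexadecimal escape such as \x6f.
    // Assumes the character code fits in two hex digits (0 to 255).
    public static string ToHexEscape(char c)
    {
        return @"\x" + ((int)c).ToString("x2");
    }

    // Hypothetical helper: counts the matches of a pattern in the input.
    public static int CountMatches(string input, string pattern)
    {
        return Regex.Matches(input, pattern).Count;
    }

    public static void Main()
    {
        Console.WriteLine(ToHexEscape('o'));                        // \x6f
        Console.WriteLine(CountMatches("Hello World!", @"\x6f"));   // 2
        Console.WriteLine(CountMatches("Hello World!", @"\x6F"));   // 2
    }
}
```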

As the Unicode character set is much larger than ASCII, more digits are required to specify a code. You can search for a Unicode character using four hexadecimal digits after the \u character escape. The following code searches for the same letter "o" using its Unicode value, 006F.

string input = "Hello World!";

foreach (Match match in Regex.Matches(input, @"\u006f"))
{
    Console.WriteLine("Matched at index {0}", match.Index);
}

/* OUTPUT
 
Matched at index 4
Matched at index 7
 
*/
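
The \u escape is most useful for characters outside the ASCII range. As a minimal sketch, the following code looks for the letter "é" (Unicode value 00E9) in the word "café". The class UnicodeEscapeDemo and the helper FirstIndex are illustrative names, not part of the .NET Framework.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnicodeEscapeDemo
{
    // Hypothetical helper: returns the index of the first match, or -1 if none.
    public static int FirstIndex(string input, string pattern)
    {
        Match match = Regex.Match(input, pattern);
        return match.Success ? match.Index : -1;
    }

    public static void Main()
    {
        // In a regular string literal, "\u00e9" is the C# escape for é,
        // so the input is the four-character string "café".
        string input = "caf\u00e9";

        // In the verbatim string the backslash reaches the regex engine,
        // which interprets \u00e9 itself.
        Console.WriteLine(FirstIndex(input, @"\u00e9"));   // 3
        Console.WriteLine(FirstIndex(input, @"\u006f"));   // -1
    }
}
```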
8 September 2015