Regular Expressions
.NET 1.1+

Regular Expressions Instance Methods

The nineteenth and final part of the Regular Expressions in .NET tutorial looks at an alternative to the static methods employed earlier in the series. Instance methods provide the same functionality but with different caching features.

Regex Class

Throughout the regular expressions tutorial we've used static methods to perform matching operations, substitutions and string splitting. You can perform the same operations using instances of the Regex class. The results are exactly the same, but the usage and the caching of processed patterns differ somewhat.

To create an instance of the Regex class you call one of several constructors. Each requires that you provide a string containing the regular expression that will be used for matching. You can also add a second parameter of the RegexOptions type that specifies the options that should be applied by the Regex. NB: Once instantiated, Regex objects are immutable; you cannot modify either the pattern or the options.

Regex pattern1 = new Regex("[A-Z]{1,3}[0-9]{1,3}");
Regex pattern2 = new Regex("[A-Z]{1,3}[0-9]{1,3}", RegexOptions.IgnoreCase);
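
As a rough illustration of the immutability note, the following sketch reads the pattern and options back from an instance. The RegexInspection class and its Describe method are hypothetical names used only for this example. The ToString method and the Options property are the standard members for reading the values, and neither provides a way to change them.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexInspection
{
    // Hypothetical helper: reports the pattern and options of an instance.
    public static string Describe(Regex regex)
    {
        // ToString() returns the pattern passed to the constructor.
        // Options is read-only, so the settings cannot be changed later.
        return string.Format("{0} [{1}]", regex.ToString(), regex.Options);
    }

    public static void Main()
    {
        Regex pattern2 = new Regex("[A-Z]{1,3}[0-9]{1,3}", RegexOptions.IgnoreCase);
        Console.WriteLine(Describe(pattern2));
    }
}
```

If you need a different pattern or different options, you create a new Regex object.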

The instance methods for matching, substitution and splitting use the same names as their static counterparts. As the Regex object already defines the regular expression and the options, these arguments are omitted.
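
To make the correspondence concrete, here is a minimal sketch that calls IsMatch, Replace and Split on an instance, with the equivalent static call shown in a comment beside each one. The InstanceVersusStatic class, the digit pattern and the sample input are assumptions made for illustration and do not come from earlier articles in the series.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InstanceVersusStatic
{
    // Hypothetical pattern for this sketch: one or more digits.
    private const string Digits = @"\d+";

    // Static form: the pattern is passed with every call.
    public static string MaskStatic(string input)
    {
        return Regex.Replace(input, Digits, "#");
    }

    // Instance form: the pattern was given to the constructor.
    public static string MaskInstance(string input)
    {
        Regex digits = new Regex(Digits);
        return digits.Replace(input, "#");
    }

    public static void Main()
    {
        Regex digits = new Regex(Digits);
        string input = "Order 66 shipped in 3 boxes";

        // Static equivalent: Regex.IsMatch(input, Digits)
        Console.WriteLine(digits.IsMatch(input));

        // Static equivalent: Regex.Replace(input, Digits, "#")
        Console.WriteLine(digits.Replace(input, "#"));

        // Static equivalent: Regex.Split(input, Digits)
        Console.WriteLine(string.Join("|", digits.Split(input)));
    }
}
```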

The following sample code provides the same functionality as the IP address matching example in the article that describes quantifiers. The only difference is the use of a Regex instance instead of static method calls.

string input = @"Some of these are valid IP addresses:

0.1.2.3
10.20.30.40
100.200.300.400
192.0.0.256
192.0.0.255
192.0.0.249
10.1.30.150
1.01.1.1
99.00.99.00";

string ipAddress =
    @"((2[0-4]\d|25[0-5]|1\d{2}|[1-9]\d|\d)\.){3}(2[0-4]\d|25[0-5]|1\d{2}|[1-9]\d|\d)";
Regex ipRegex = new Regex(ipAddress);

foreach (Match match in ipRegex.Matches(input))
{
    Console.WriteLine("Matched '{0}' at index {1}", match.Value, match.Index);
}

/* OUTPUT (the indices assume CRLF line endings within the verbatim string)
     
Matched '0.1.2.3' at index 41
Matched '10.20.30.40' at index 50
Matched '192.0.0.25' at index 80
Matched '192.0.0.255' at index 93
Matched '192.0.0.249' at index 106
Matched '10.1.30.150' at index 119
              
*/

Regular Expression Caching

As mentioned in the article describing regular expression options, you can compile your regular expressions using RegexOptions.Compiled, or you can allow the engine to convert them to custom opcodes. When using the static methods, the processed regular expressions are cached centrally, so that later calls using the same pattern and options can reuse them instead of processing the pattern again.

When you are using Regex instances, the same conversion or compilation occurs but the processed regular expression is held within the instance. This is important because when the object goes out of scope and is discarded, the processed version is lost with it. If you create multiple objects for the same pattern, each one processes the pattern individually. You should be aware of this, particularly when using Regex instances within loops, where you should create the instance outside of the loop and reuse it wherever possible.
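
The advice about loops might be applied as in the sketch below, which builds the Regex once and reuses it for every item. The LoopReuse class, the CountMatchingLines method and the sample lines are hypothetical and exist only to show where the instance is created.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LoopReuse
{
    // Counts the lines that contain a match, building the Regex once.
    public static int CountMatchingLines(string[] lines, string pattern)
    {
        // Created outside the loop so the pattern is processed a single time.
        Regex regex = new Regex(pattern);
        int count = 0;

        foreach (string line in lines)
        {
            // Avoid "new Regex(pattern)" here, as each new instance
            // would have to process the pattern again.
            if (regex.IsMatch(line))
            {
                count++;
            }
        }

        return count;
    }

    public static void Main()
    {
        string[] lines = { "AB12", "no code here", "XYZ999" };
        Console.WriteLine(CountMatchingLines(lines, "[A-Z]{1,3}[0-9]{1,3}"));
    }
}
```

For an object that lives longer than a single method call, a common alternative is to hold the instance in a static readonly field so that every caller shares it.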

This difference in caching behaviour may lead you to think that the static methods always give better performance. You should note, however, that the cache size is limited. By default, only fifteen regular expressions will be cached. When you know how many items you need to retain, you can modify the size using the static CacheSize property. You can also read this property to find the current size of the cache. If you use a greater number of regular expressions with static methods, older cache items will be removed to make space for the newer ones and some repeated calls may require recompilation. This problem does not exist when using instance methods.
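
Reading and changing the cache size could look something like the following sketch. The CacheSizeDemo class and its Resize method are hypothetical, and the figure of fifty is an arbitrary value chosen for the example, not a recommendation.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CacheSizeDemo
{
    // Hypothetical helper: sets the static cache size and
    // returns the value that was in place beforehand.
    public static int Resize(int newSize)
    {
        int previous = Regex.CacheSize;
        Regex.CacheSize = newSize;
        return previous;
    }

    public static void Main()
    {
        Console.WriteLine("Cache size: {0}", Regex.CacheSize);

        // Fifty is an arbitrary figure used for illustration.
        int previous = Resize(50);
        Console.WriteLine("Cache size: {0}", Regex.CacheSize);

        // Restore the earlier setting.
        Regex.CacheSize = previous;
    }
}
```

The setting applies to the whole application, so it is usually changed once during start-up, not around individual calls.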

11 January 2016