 .NET 2.0+Interpreter Design PatternThe interpreter pattern is a design pattern that is useful when developing domain-specific languages or notations. The pattern allows the grammar for such a notation to be represented in an object-oriented fashion that can easily be extended.
What is the Interpreter Pattern?
The interpreter pattern is a Gang of Four design pattern. This is a behavioural pattern as it defines a manner for controlling communication between classes or entities. The interpreter pattern is used to define the grammar for instructions that form part of a language or notation, whilst allowing the grammar to be easily extended.
The interpreter pattern performs activities base upon a hierarchy of expressions. Each expression is terminal, meaning that it is a standalone structure that can be immediately evaluated, or non-terminal, meaning that it is composed of one or more expressions. The tree structure is similar to that defined by the composite design pattern, with terminal expressions being leaf objects and non-terminal expressions being composites. The tree contains the expressions to be evaluated and is usually generated by a parser. The parser itself is not a part of the interpreter pattern.
The interpreter design pattern is useful for simple languages where performance is not critical. As the grammar becomes more complex, the number of different expression types, each represented by its own class, can become unwieldy and lead to unmanageable class hierarchies. This can also slow the processing of the expressions. For these reasons, the pattern is considered to be inefficient and is rarely used. However, it should not be discounted for some situations.
An example of the use of the interpreter design pattern could be the processing of mathematical problems provided in a simplified Polish notation. This notation defines a mathematical operator followed by two values, for example "+ 5 6". In this case, the + symbol indicates that the two following values should be summed, giving 11. The notation allows multiple operators and values to be included in the string, for example "+ - 6 5 7". In this case, the subtraction would be applied to the 6 and 5 to give 1. The addition would then be applied to the calculated 1 and the 7 for a final result of 8. Polish notation is useful because it does not require the use of parentheses to avoid ambiguity.
A simple parser could be used to convert integer values from the Polish notation into terminal expressions and the operators into non-terminal expressions. The operator classes would contain two expressions, each of which may be terminal or non-terminal. Once the parser has created this hierarchical structure, the interpreter design pattern could be used to calculate the result of the original Polish notation problem. The hierarchy of expressions for "+ - 6 5 7" would be as follows:

Implementing the Interpreter Pattern

The UML class diagram above shows an implementation of the interpreter design pattern. The items in the diagram are described below:
- Client. The client class represents the consumer of the interpreter design pattern. Client objects build the tree of expressions that represent the commands to be executed, often with the help of a parser class. The Interpret method of the top item in the tree is then called, passing any context object, to execute all of the commands in the tree.
- Context. The context class is used to store any information that needs to be available to all of the expression objects. If no global context is required this class is unnecessary.
- ExpressionBase. This abstract class is the base class for all expressions. It defines the Interpret method, which must be implemented for each subclass.
- TerminalExpression. Terminal expressions are those that can be interpreted in a single object. These are created as concrete subclasses of the ExpressionBase class.
- NonterminalExpression. Non-terminal expressions are represented using a concrete subclass of ExpressionBase. These expressions are aggregates containing one or more further expressions, each of which may be terminal or non-terminal. When a non-terminal expression class' Interpret method is called, the process of interpretation includes calls to the Interpret method of the expressions it holds.
The following shows the basic code of the interpreter design pattern implemented using C#. In this case the client builds the expression tree without a parser. The code uses C# 3.0 automatically implemented property syntax. For earlier versions of the language you should expand these property declarations to include a backing variable.
public class Client
{
public void BuildAndInterpretCommands()
{
Context context = new Context("the context");
NonterminalExpression root = new NonterminalExpression();
root.Expression1 = new TerminalExpression();
root.Expression2 = new TerminalExpression();
root.Interpret(context);
}
}
public class Context
{
public string Name { get; set; }
public Context(string name)
{
Name = name;
}
}
public abstract class ExpressionBase
{
public abstract void Interpret(Context context);
}
public class TerminalExpression : ExpressionBase
{
public override void Interpret(Context context)
{
Console.WriteLine("Terminal for {0}.", context.Name);
}
}
public class NonterminalExpression : ExpressionBase
{
public ExpressionBase Expression1 { get; set; }
public ExpressionBase Expression2 { get; set; }
public override void Interpret(Context context)
{
Console.WriteLine("Nonterminal for {0}.", context.Name);
Expression1.Interpret(context);
Expression2.Interpret(context);
}
}
Example Interpreter
Earlier in this article an example of the interpreter pattern was described. This explained that the pattern could be used to process Polish notation mathematical functions. In the remainder of this article we will implement the grammar required for a simplified Polish notation that permits the use of integer values and the addition (+) and subtraction (-) operators. We will also create a parser that converts a string containing Polish notation into a hierarchy of expressions.
Creating the Grammar
The grammar for the Polish notation interpreter will be very simple. There will be one terminal expression that can be used to represent an integer. Two non-terminal expressions will be created to represent the addition and subtraction operators. You could easily add further operators with additional classes. No context class will be necessary as no global information will be required.
To create the base class, open a new console application project and add the following code:
public abstract class ExpressionBase
{
public abstract int Evaluate();
}
The base class includes one method, which evaluates an expression and returns the calculated value as an integer. As there is no Context class, the method needs no parameters.
To add the terminal expression class that will be used to represent integers, add the following code:
public class IntegerExpression : ExpressionBase
{
int _value;
public IntegerExpression(int value)
{
_value = value;
}
public override int Evaluate()
{
return _value;
}
public override string ToString()
{
return _value.ToString();
}
}
As you can see, when an IntegerExpression object is created, the value that it represents is passed into the constructor. When the expression is evaluated this value is returned. To make debugging easier and to allow the expression to be output to the console the ToString method has been overridden. This returns a string representation of the value.
Next we can add the two non-terminal expression classes:
public class AdditionExpression : ExpressionBase
{
ExpressionBase _expr1;
ExpressionBase _expr2;
public AdditionExpression(ExpressionBase expr1, ExpressionBase expr2)
{
_expr1 = expr1;
_expr2 = expr2;
}
public override int Evaluate()
{
int value1 = _expr1.Evaluate();
int value2 = _expr2.Evaluate();
return value1 + value2;
}
public override string ToString()
{
return string.Format("({0} + {1})", _expr1, _expr2);
}
}
public class SubtractionExpression : ExpressionBase
{
ExpressionBase _expr1;
ExpressionBase _expr2;
public SubtractionExpression(ExpressionBase expr1, ExpressionBase expr2)
{
_expr1 = expr1;
_expr2 = expr2;
}
public override int Evaluate()
{
int value1 = _expr1.Evaluate();
int value2 = _expr2.Evaluate();
return value1 - value2;
}
public override string ToString()
{
return string.Format("({0} - {1})", _expr1, _expr2);
}
}
The non-terminal expression classes are quite simple. When creating a new object two expressions are passed to the constructor. These are held within the class for use during evaluation. Each may represent an integer value or another operator with further expressions to be interpreted recursively. When the Evaluate method is called the two expressions are evaluated before the addition or subtraction occurs and the result is returned. Again, the ToString method is overridden to allow the expression to be viewed easily. The ToString method may also include recursive calls.
The code so far completes the key classes of the interpreter design pattern. However, in this case we will also create a parser that can convert strings such as "+ 5 6" into the grammar understood by the interpreter. The parser will be very simple and will not including any validation of the input string. It will split the string into a list of commands and recursively convert each symbol into an expression.
The code for the parser is as follows:
public class Parser
{
public ExpressionBase Parse(string polish)
{
List<string> symbols = new List<string>(polish.Split(' '));
return ParseNextExpression(symbols);
}
public ExpressionBase ParseNextExpression(List<string> symbols)
{
int value;
if (int.TryParse(symbols[0], out value))
{
symbols.RemoveAt(0);
return new IntegerExpression(value);
}
else
{
return ParseNonTerminalExpression(symbols);
}
}
private ExpressionBase ParseNonTerminalExpression(List<string> symbols)
{
string symbol = symbols[0];
symbols.RemoveAt(0);
ExpressionBase expr1 = ParseNextExpression(symbols);
ExpressionBase expr2 = ParseNextExpression(symbols);
switch (symbol)
{
case "+":
return new AdditionExpression(expr1, expr2);
case "-":
return new SubtractionExpression(expr1, expr2);
default:
string message = string.Format("Invalid Symbol ({0})", symbol);
throw new InvalidOperationException(message);
}
}
}
Testing the Interpreter
To test the interpreter and its parser we can evaluate some Polish notation strings from the Main method of the program. Add the following code to the Main method and execute the program to see the results. Try modifying the command strings to see the results. Remember that all values and operators should be separated with a single space.
Parser parser = new Parser();
string[] commands = new string[]
{
"+ 5 6",
"- 6 5",
"+ - 4 5 6",
"+ 4 - 5 6",
"+ - + - - 2 3 4 + - -5 6 + -7 8 9 10"
};
foreach (string command in commands)
{
ExpressionBase expression = parser.Parse(command);
Console.WriteLine("{0} = {1}", expression, expression.Evaluate());
}
/* OUTPUT
(5 + 6) = 11
(6 - 5) = 1
((4 - 5) + 6) = 5
(4 + (5 - 6)) = 3
(((((2 - 3) - 4) + ((-5 - 6) + (-7 + 8))) - 9) + 10) = -14
*/
|