.NET 2.0+

Interpreter Design Pattern

by Richard Carr, published at http://www.blackwasp.co.uk/Interpreter.aspx

The interpreter pattern is a design pattern that is useful when developing domain-specific languages or notations. The pattern allows the grammar for such a notation to be represented in an object-oriented fashion that can easily be extended.

What is the Interpreter Pattern?

The interpreter pattern is a Gang of Four design pattern. This is a behavioural pattern as it defines a manner for controlling communication between classes or entities. The interpreter pattern is used to define the grammar for instructions that form part of a language or notation, whilst allowing the grammar to be easily extended.

The interpreter pattern performs activities based upon a hierarchy of expressions. Each expression is either terminal, meaning that it is a standalone structure that can be immediately evaluated, or non-terminal, meaning that it is composed of one or more expressions. The tree structure is similar to that defined by the composite design pattern, with terminal expressions being leaf objects and non-terminal expressions being composites. The tree contains the expressions to be evaluated and is usually generated by a parser. The parser itself is not a part of the interpreter pattern.

The interpreter design pattern is useful for simple languages where performance is not critical. As the grammar becomes more complex, the number of different expression types, each represented by its own class, can become unwieldy and lead to unmanageable class hierarchies. This can also slow the processing of the expressions. For these reasons, the pattern is considered to be inefficient and is rarely used. However, it should not be discounted for some situations.

An example of the use of the interpreter design pattern could be the processing of mathematical problems provided in a simplified Polish notation. This notation defines a mathematical operator followed by two values, for example "+ 5 6". In this case, the + symbol indicates that the two following values should be summed, giving 11. The notation allows multiple operators and values to be included in the string, for example "+ - 6 5 7". In this case, the subtraction would be applied to the 6 and 5 to give 1. The addition would then be applied to the calculated 1 and the 7 for a final result of 8. Polish notation is useful because it does not require the use of parentheses to avoid ambiguity.
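To make the notation concrete, the following is a minimal sketch that evaluates a simplified Polish notation string directly, without building an expression tree. It is not part of the interpreter pattern itself. The class name PrefixCalculator is hypothetical, and the sketch assumes space-separated tokens, integer values and only the + and - operators. It scans the tokens from right to left, pushing values onto a stack and applying each operator to the two most recently produced results.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not from the original article.
public static class PrefixCalculator
{
    // Evaluates input such as "+ - 6 5 7" by scanning tokens right to left.
    public static int Evaluate(string input)
    {
        string[] tokens = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        Stack<int> operands = new Stack<int>();

        for (int i = tokens.Length - 1; i >= 0; i--)
        {
            string token = tokens[i];
            if (token == "+" || token == "-")
            {
                if (operands.Count < 2)
                    throw new FormatException("Operator '" + token + "' is missing an operand.");

                // The value on top of the stack is the operator's first operand.
                int first = operands.Pop();
                int second = operands.Pop();
                operands.Push(token == "+" ? first + second : first - second);
            }
            else
            {
                // Anything that is not an operator is assumed to be an integer.
                operands.Push(int.Parse(token));
            }
        }

        if (operands.Count != 1)
            throw new FormatException("The input is not a single complete expression.");

        return operands.Pop();
    }
}
```

Calling PrefixCalculator.Evaluate("+ - 6 5 7") applies the subtraction to 6 and 5 first and then adds 7, matching the result of 8 described above.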

A simple parser could be used to convert integer values from the Polish notation into terminal expressions and the operators into non-terminal expressions. The operator classes would contain two expressions, each of which may be terminal or non-terminal. Once the parser has created this hierarchical structure, the interpreter design pattern could be used to calculate the result of the original Polish notation problem. The hierarchy of expressions for "+ - 6 5 7" would be as follows:

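[Figure: Polish Notation Expression Tree. The root is the addition operator. Its first operand is the subtraction operator, which holds the values 6 and 5, and its second operand is the value 7.]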
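The parsing step described above could be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the names PolishExpression, IntegerExpression, AdditionExpression, SubtractionExpression and PolishParser are hypothetical, the Interpret method returns the calculated value, and no context object is used because this small grammar needs no shared state. Tokens are assumed to be separated by spaces.

```csharp
using System;

// Base class for every node in the tree (hypothetical name).
public abstract class PolishExpression
{
    public abstract int Interpret();
}

// Terminal expression: a single integer value.
public class IntegerExpression : PolishExpression
{
    private readonly int _value;

    public IntegerExpression(int value) { _value = value; }

    public override int Interpret() { return _value; }
}

// Non-terminal expression: sums the results of its two operands.
public class AdditionExpression : PolishExpression
{
    private readonly PolishExpression _first;
    private readonly PolishExpression _second;

    public AdditionExpression(PolishExpression first, PolishExpression second)
    {
        _first = first;
        _second = second;
    }

    public override int Interpret() { return _first.Interpret() + _second.Interpret(); }
}

// Non-terminal expression: subtracts the second operand from the first.
public class SubtractionExpression : PolishExpression
{
    private readonly PolishExpression _first;
    private readonly PolishExpression _second;

    public SubtractionExpression(PolishExpression first, PolishExpression second)
    {
        _first = first;
        _second = second;
    }

    public override int Interpret() { return _first.Interpret() - _second.Interpret(); }
}

// Simple parser that builds the tree. It is not part of the pattern itself.
public static class PolishParser
{
    public static PolishExpression Parse(string input)
    {
        string[] tokens = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        int position = 0;
        PolishExpression root = ParseNext(tokens, ref position);
        if (position != tokens.Length)
            throw new FormatException("Unexpected tokens after the expression.");
        return root;
    }

    // Reads one complete expression starting at the given position.
    private static PolishExpression ParseNext(string[] tokens, ref int position)
    {
        if (position >= tokens.Length)
            throw new FormatException("The expression ended unexpectedly.");

        string token = tokens[position++];
        if (token == "+" || token == "-")
        {
            // An operator is followed by two expressions, terminal or not.
            PolishExpression first = ParseNext(tokens, ref position);
            PolishExpression second = ParseNext(tokens, ref position);
            if (token == "+") return new AdditionExpression(first, second);
            return new SubtractionExpression(first, second);
        }

        return new IntegerExpression(int.Parse(token));
    }
}
```

With these classes, PolishParser.Parse("+ - 6 5 7") produces the hierarchy described above, and calling Interpret on the returned root gives 8. Supporting another operator would mean adding one more non-terminal class and one more branch in the parser, leaving the existing expression classes unchanged.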

Implementing the Interpreter Pattern

[Figure: Interpreter Design Pattern UML]

The UML class diagram above shows an implementation of the interpreter design pattern. The items in the diagram are described below:

Client. The client class represents the consumer of the interpreter design pattern. Client objects build the tree of expressions that represent the commands to be executed, often with the help of a parser class. The Interpret method of the top item in the tree is then called, passing any context object, to execute all of the commands in the tree.
Context. The context class is used to store any information that needs to be available to all of the expression objects. If no global context is required this class is unnecessary.
ExpressionBase. This abstract class is the base class for all expressions. It defines the Interpret method, which must be implemented by each concrete subclass.
TerminalExpression. Terminal expressions are those that can be interpreted in a single object. These are created as concrete subclasses of the ExpressionBase class.
NonterminalExpression. Non-terminal expressions are represented using a concrete subclass of ExpressionBase. These expressions are aggregates containing one or more further expressions, each of which may be terminal or non-terminal. When a non-terminal expression class's Interpret method is called, the process of interpretation includes calls to the Interpret method of the expressions it holds.

The following shows the basic code of the interpreter design pattern implemented using C#. In this case the client builds the expression tree without a parser. The code uses C# 3.0 automatically implemented property syntax. For earlier versions of the language you should expand these property declarations to include a backing variable.

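using System;

// The client builds the expression tree by hand (no parser) and interprets it.
public class Client
{
    public void BuildAndInterpretCommands()
    {
        // Shared information that every expression in the tree can read.
        Context context = new Context("the context");

        // A root non-terminal expression holding two terminal expressions.
        NonterminalExpression root = new NonterminalExpression();
        root.Expression1 = new TerminalExpression();
        root.Expression2 = new TerminalExpression();

        // Interpreting the root interprets every expression beneath it.
        root.Interpret(context);
    }
}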


public class Context
{
    public string Name { get; set; }
    
    public Context(string name)
    {
        Name = name;
    }
}


public abstract class ExpressionBase
{
    public abstract void Interpret(Context context);
}


public class TerminalExpression : ExpressionBase
{
    public override void Interpret(Context context)
    {
        Console.WriteLine("Terminal for {0}.", context.Name);
    }
}


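// Non-terminal expression: holds two child expressions, each of which
// may itself be terminal or non-terminal.
public class NonterminalExpression : ExpressionBase
{
    public ExpressionBase Expression1 { get; set; }

    public ExpressionBase Expression2 { get; set; }

    // Interprets this node first, then delegates to each child in turn.
    public override void Interpret(Context context)
    {
        Console.WriteLine("Nonterminal for {0}.", context.Name);
        Expression1.Interpret(context);
        Expression2.Interpret(context);
    }
}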
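The Context class in the listing above carries only a name. As a rough sketch of the role described for it earlier, storing information that all expression objects need, the following example keeps named variable values in the context so that the same tree can be interpreted against different data. All of the names here (VariableContext, ValueExpression, ConstantExpression, VariableExpression and SumExpression) are hypothetical, and returning an int from Interpret is an assumption made for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical context holding variable values shared by all expressions.
public class VariableContext
{
    private readonly Dictionary<string, int> _variables = new Dictionary<string, int>();

    public void SetVariable(string name, int value) { _variables[name] = value; }

    public int GetVariable(string name)
    {
        int value;
        if (!_variables.TryGetValue(name, out value))
            throw new KeyNotFoundException("Unknown variable: " + name);
        return value;
    }
}

public abstract class ValueExpression
{
    public abstract int Interpret(VariableContext context);
}

// Terminal expression: a fixed value that ignores the context.
public class ConstantExpression : ValueExpression
{
    private readonly int _value;

    public ConstantExpression(int value) { _value = value; }

    public override int Interpret(VariableContext context) { return _value; }
}

// Terminal expression: looks its value up in the shared context.
public class VariableExpression : ValueExpression
{
    private readonly string _name;

    public VariableExpression(string name) { _name = name; }

    public override int Interpret(VariableContext context) { return context.GetVariable(_name); }
}

// Non-terminal expression: passes the same context on to both operands.
public class SumExpression : ValueExpression
{
    private readonly ValueExpression _first;
    private readonly ValueExpression _second;

    public SumExpression(ValueExpression first, ValueExpression second)
    {
        _first = first;
        _second = second;
    }

    public override int Interpret(VariableContext context)
    {
        return _first.Interpret(context) + _second.Interpret(context);
    }
}
```

For example, a tree built as new SumExpression(new VariableExpression("x"), new ConstantExpression(7)) gives 8 when interpreted with a context in which x is 1, and 13 when x is 6. The tree itself is built once and is not modified between the two calls.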


27 July 2009