Attribute Grammars

Before there was Ohm, people came up with different ways to associate semantics with syntax. One technique involved attaching properties to grammar symbols, and this got the silly name “attribute grammars.” Let’s see what these are.

Types of Attribute Grammars

We can write attribute grammars with either:

These will be explained by example. We’ll use examples that we have seen previously in Ohm.

Example: Infix to Postfix Conversion

Here is an attribute grammar that produces a postfix string from an infix string, using semantic functions. We'll use this grammar for the source language (the infix):

E → E "+" T | E "-" T | T
T → T "*" F | T "/" F | F
F → "-" F | "(" E ")" | intlit | id

For the target language, we’ll use "~" for unary negation in the postfix formulation in order to avoid parentheses. As long as all operators have a fixed arity, parentheses are not necessary.

Grammar RulesSemantic Rules
E → E1 "+" T E.post := E1.post • T.post • "+"
E → E1 "-" T E.post := E1.post • T.post • "-"
E → T E.post := T.post
T → T1 "*" F T.post := T1.post • F.post • "*"
T → T1 "/" F T.post := T1.post • F.post • "/"
T → F T.post := F.post
F → "-" F1 F.post := F1.post • "~"
F → ( E ) F.post := E.post
F → intlit F.post := image(intlit)
F → id F.post := image(id)

See how it works? Each symbol gets some properties (called attributes) as necessary, and we make rules that show how to assign attribute values. There’s a lot of theory here that we won’t cover, like whether attributes are synthezied or inherited, but you should work on gaining a basic understanding of what attribute grammars look like.

Now here the specification using action routines. For these, it's best to change up the grammar just a bit.

E → T (addop T)*
T → F (mulop F)*
F → "-" F | "(" E ")" | intlit | id
E     → T {E.post := T.post} (addop T {E.post •= T.post • addop.text})*
T     → F {T.post := F.post} (mulop F {T.post •= F.post • mulop.text})*
F     → - F1 {F.post := F1.post • "~"}
F     → "(" E {F.post := E.post} ")"
F     → id {F.post := image(id)}
F     → intlit {F.post := image(intlit)}
addop → "+" {addop.text := "+"} | "-" {addop.text := "-"}
mulop → "*" {mulop.text := "*"} | "/" {mulop.text := "/"}

There are a number of existing tools that make use of attribute grammars. One is JavaCC. If interested, here’s a complete JavaCC program that can do infix to postfix conversion.

InfixToPostfix.jj
// This spec generates a class called InfixToPostfix.java that contains a
// method that takes in a string representing an infix expression and returns
// a string representing the equivalent postfix expression.  We assume the
// usual conventions of left associativity and higher precedence of
// multiplicative operators over the additive ones, with unary negation
// having the highest precendence of all.

options {STATIC = false;}
PARSER_BEGIN(InfixToPostfix)
import java.io.*;

public class InfixToPostfix {

    public static String toPostfix(String infix) throws Exception {
        InfixToPostfix converter = new InfixToPostfix(new StringReader(infix));
        return converter.S();
    }

    // Throwaway "exerciser" for developing.  Reads lines from standard input
    // and writes the postfix form of each to standard output.
    public static void main(String[] args) throws Exception {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        while (true) {
            String line = in.readLine();
            if (line == null) break;
            System.out.println(toPostfix(line));
        }
    }
}

PARSER_END(InfixToPostfix)

// E -> T (addop T)*
// T -> F (mulop F)*
// F -> - F | id | intlit | "(" E ")"

SKIP: {" " | "\t" | "\n" | "\r"}
TOKEN: {
  <ADDOP: "+" | "-"> | <MULOP: "*" | "/"> | "~"
  | "x" | <INTLIT: (["0"-"9"])+>
}

String S(): {String s;}
{
  s=E() <EOF> {return s;}
}

String E(): {String left, right; Token op;}
{
  left=T() (op=<ADDOP> right=T() {left = left + right + op.image + " ";})*
  {return left;}
}

String T(): {String left, right; Token op;}
{
  left=F() (op=<MULOP> right=F() {left = left + right + op.image + " ";})*
  {return left;}
}

String F(): {String s; Token t;}
{
  "x" {return "x ";}
  | t=<INTLIT> {return t.image + " ";}
  | "(" s=E() ")" {return s;}
  | "~" s=F() {return s + "~ ";}
}

Example: Postfix to Infix Conversion

An attribute grammar that gives the highly parenthesized form of a postfix expression in infix, using semantic functions:

Grammar RulesSemantic Rules
E → E1 E2 binop E.infix := "(" • E1.infix • binop.text • E2.infix • ")"
E → E1 "~" E.infix := "(-" • E1.infix • ")"
E → id E.infix := image(id)
E → intlit E.infix := image(intlit)
binop → + binop.text := "+"
binop → - binop.text := "-"
binop → * binop.text := "*"
binop → / binop.text := "/"

Now, with action routines. For these we switch to an LL grammar, by removing the left recursion. In place of

  E → E E binop | E uop | id | intlit

we have

  E → (id | intlit)(E binop | uop)*
E → (id {E.infix = image(id)} | intlit {E.infix = image(intlit)})
     (
       E1 binop {E.infix = "(" • E.infix • binop.text • E1.infix • ")"}
     |
       "~" {E.infix = "(~" + E.infix • ")"}
     )*
binop → "+" {binop.text := "+"}
binop → "-" {binop.text := "-"}
binop → "*" {binop.text := "*"}
binop → "/" {binop.text := "/"}

Now here it is in JavaCC

PostfixToInfix.jj
// This spec generates a class called PostfixToInfix.java that contains a
// method that takes in a string representing an postfix expression and returns
// a string representing the equivalent infix expression.

options {STATIC = false;}
PARSER_BEGIN(PostfixToInfix)
import java.io.*;

public class PostfixToInfix {

    public static String toInfix(String postfix) throws Exception {
        PostfixToInfix converter = new PostfixToInfix(new StringReader(postfix));
        return converter.S();
    }

    // Throwaway "exerciser" for developing.  Reads lines from standard input
    // and writes the infix form of each to standard output.
    public static void main(String[] args) throws Exception {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        while (true) {
            String line = in.readLine();
            if (line == null) break;
            System.out.println(toInfix(line));
        }
    }
}

PARSER_END(PostfixToInfix)

// E -> (id | intlit) (E binop | "~")*

SKIP: {" " | "\t" | "\n" | "\r"}
TOKEN: {
  <BINOP: "+" | "-" | "*" | "/"> | "~" | "x" | <INTLIT: (["0"-"9"])+>
}

String S(): {String s;}
{
  s=E() <EOF> {return s;}
}

String E(): {String e1, e2; Token b, t;}
{
  (t=<INTLIT> {e1 = t.image;} | t="x" {e1 = "x";})
  (
    e2=E() b=<BINOP> {e1 = "(" + e1 + b.image + e2 + ")";}
  |
    "~" {e1 = "(~" + e1 + ")";}
  )*
  {return e1;}
}

Example: Polynomial Differentiation

An attribute grammar for finding the derivative of polynomials, using semantic functions:

Grammar RulesSemantic Rules
P → P1 + T P.deriv := P1.deriv • "+" • T.deriv
P → P1 - T P.deriv := P1.deriv • "-" • T.deriv
P → T P.deriv := T.deriv
P → - T P.deriv := "-" • T.deriv
T → C T.deriv := "0"
T → C x T.deriv := C.val
T → C x ^ E T.deriv := product(E.val,C.val) • "x^" • pred(E.val)
T → x T.deriv := "1"
T → x ^ E T.deriv := E.val • "x^" • pred(E.val)
C → const C.val := valueof(const)
E → const E.val := valueof(const)
E → - const E.val := "-" • valueof(const)

With action routines:

P → {P.deriv := ""} ["-" {P.deriv := "-"}]
     T {P.deriv •= T.deriv}
     (
         ("+" {P.deriv •= "+"} |"-" {P.deriv •= "-"})
         T {P.deriv •= T.deriv}
     )*

T → {c := 1; e := 1} [N {c := N.value}]
     "x" ["^" ["-" {e := -1}] N {e *= -N.value}]
     {T.deriv := product(c,e) • "x^" • pred(e)}
  |  N {T.deriv := "0"}