A lambda calculus interpreter with arithmetic

NEM 10 Sept 2006: The lambda calculus is one of the earliest and most elegant models of computation. Its simplicity makes it an ideal language for a little interpreter, yet it is equivalent in power to a universal Turing machine. The original lambda calculus consists of just three syntactic constructs:

 Variables:    x, y, z, etc.
 Abstraction:  (λ<var>.<expr>)
 Application:  (<expr> <expr>)

Abstraction creates a lambda term, which denotes a single-parameter anonymous function (similar to a proc with no name and only one parameter). Application supplies an argument to a lambda term: variables in the body of the lambda are replaced by that argument. We will omit the details of how evaluation normally takes place (beta-reduction and alpha-conversion), and instead use a slightly different evaluation strategy that works for what we want. We will also add primitive support for integers and arithmetic to our interpreter. You can encode integers and basic arithmetic using just lambda abstractions (via Church numerals), but the encoding is very inefficient and a bit clumsy to use.
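
For example, applying the identity function (λx.x) to a variable y substitutes y for x in the body, yielding y:

 ((λx.x) y)  →  y

By contrast, the Church numeral for 2 is the term (λf.(λx.(f (f x)))), which gives a taste of why built-in integers are the more convenient choice here.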

In order to define our interpreter succinctly, we will reuse the simple algebraic types that I used in a little learning decision tree (slightly altered):

 # datatype: declare an algebraic type; each constructor becomes a proc that
 # simply returns its own invocation (a tagged list), via [info level 0].
 proc datatype {name = args} { uplevel 1 [linsert $args 0 |] }
 proc | {name args} { proc $name $args { info level 0 } }
 # func: define one clause of a function by matching on a constructor. Each
 # clause becomes a proc $name:$constructor, and $name itself dispatches on
 # the constructor tag at the head of its first argument.
 proc func {name cons args} {
     set params [lrange $args 0 end-1]   ;# extra formal parameters
     set body   [lindex $args end]       ;# the clause body
     set chead  [lindex $cons 0]         ;# constructor name
     set ctail  [lrange $cons 1 end]     ;# constructor fields
     proc $name:$chead [concat $ctail $params] $body
     proc $name {con args} [format {
         uplevel 1 %s:$con $args
     } $name]
 }
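
To get a feel for what this machinery generates, here is a small throwaway illustration (the Shape type and area function are purely hypothetical and play no part in the interpreter): each constructor becomes a proc returning a tagged list, and each func clause becomes a proc named after its constructor.

 datatype Shape = Circle Radius
                | Rect Width Height
 func area [Circle r] { expr {3.14159 * $r * $r} }
 func area [Rect w h] { expr {$w * $h} }

With these definitions, [area [Circle 2]] returns 12.56636 and [area [Rect 3 4]] returns 12.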

With this in place, we can now define our datatype of terms in the language we wish to interpret. This is the abstract syntax of our language:

 datatype Term = Lambda String Term        ;# abstraction: param body
               | App Term Term             ;# application: lambda arg
               | Var String                ;# variable reference
               | IntLit Integer            ;# integer literals
               | Add Term Term             ;# addition
               | Sub Term Term             ;# subtraction
               | Mul Term Term             ;# multiplication
               | Div Term Term             ;# division
               | Closure Env String Term   ;# closures...
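
Each constructor simply returns its own invocation, so a term is nothing more than a nested Tcl list:

 % Add [IntLit 1] [Var x]
 Add {IntLit 1} {Var x}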

The first three of these constructors are all that is required for the traditional lambda calculus. We have added integer literals and some primitive arithmetic operations. We've also added a constructor for closures, which form part of our evaluation strategy: instead of performing alpha-conversion and beta-reduction, we capture the environment (i.e. the mapping from variable names to values) when we evaluate a lambda abstraction, and then restore and extend this environment when we apply the abstraction. This achieves the same effect as normal lambda calculus evaluation, provided we never evaluate an unbound variable reference. To ensure that, we employ call-by-value semantics, making sure that all arguments are evaluated before being passed to a lambda. Anyway, let's see what the core of our interpreter looks like -- our "Eval" function:

 # Eval :: Term -> Env -> Value   (a value is an integer or a Closure)
 func Eval [Lambda param body] env  { Closure $env $param $body }
 func Eval [App f x] env            { Apply [Eval $f $env] [Eval $x $env] }
 func Eval [Var name] env           { dict get $env $name }
 func Eval [IntLit i] env           { return $i }
 func Eval [Add a b] env            { + [Eval $a $env] [Eval $b $env] }
 func Eval [Sub a b] env            { - [Eval $a $env] [Eval $b $env] }
 func Eval [Mul a b] env            { * [Eval $a $env] [Eval $b $env] }
 func Eval [Div a b] env            { / [Eval $a $env] [Eval $b $env] }
 func Eval [Closure e v b] env      { Closure $e $v $b }

The advantages of our datatype machinery should be fairly clear from this example, and if we look at the cases, the evaluation strategy becomes clear. Evaluating a lambda abstraction forms a closure by capturing the current environment (which is just a dictionary, passed as the second argument to Eval). Evaluating an application calls a helper function Apply, after first evaluating both of its arguments. We always need to evaluate the $f argument here, in order to resolve the lambda term that is being applied. However, we could leave the argument unevaluated -- this is call-by-name evaluation order (*). That would run the risk of name clashes in our interpreter, though, as free variables might be accidentally captured. Evaluating a variable reference simply looks up the value of the variable in the environment. Integers and closures evaluate to themselves (they are in normal form), and the arithmetic operations perform the appropriate calculation.

This covers all of the basic evaluation. Let's look at the other main function in our interpreter: Apply. This is the function that supplies an argument to a lambda abstraction. The only valid term here is a closure:

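 # Apply :: Closure -> Value -> Value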
 func Apply [Closure env param body] arg {
     Eval $body [dict set env $param $arg]
 }

Here we see how our use of closures makes things simple once again. All we do to apply a lambda function is to grab the environment from the closure, extend it by binding the argument given to the parameter name, and finally evaluate the body in this new environment.
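
To see how Eval and Apply cooperate, here is a hand trace of squaring 4 (the term is only an illustration; the arithmetic helpers used in the final steps are defined in a moment):

 Eval [App [Lambda x [Mul [Var x] [Var x]]] [IntLit 4]] {}
   → Apply [Eval [Lambda x {Mul {Var x} {Var x}}] {}] [Eval [IntLit 4] {}]
   → Apply [Closure {} x {Mul {Var x} {Var x}}] 4
   → Eval [Mul [Var x] [Var x]] {x 4}
   → * [Eval [Var x] {x 4}] [Eval [Var x] {x 4}]
   → * 4 4
   → 16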

We now have the core of our interpreter implemented, consisting of the somewhat magical pair of Eval/Apply so beloved of SICP. The simplicity and power of this pair of mutually recursive functions is truly one of the gems of computer science.

All that remains is to define our arithmetic helper functions and our interpreter is complete. This is a simple one-liner in Tcl:

 foreach op {+ - * /} { proc $op {a b} "expr {\$a $op \$b}" }

We can now start interpreting some expressions. Firstly, let's try some simple arithmetic:

 # 2 + 3
 % Eval [Add [IntLit 2] [IntLit 3]] {}
 5
 % Eval [Mul [IntLit 2] [IntLit 3]] {}
 6

OK, that seems to work. Now, let's create a function that squares its argument:

 % set square [Lambda x [Mul [Var x] [Var x]]]
 Lambda x {Mul {Var x} {Var x}}
 % Eval [App $square [IntLit 4]] {}
 16
 % Eval [App $square [IntLit 20]] {}
 400

OK, let's get a bit more complicated -- we will create a curried function that takes a number and returns a function that adds the original number to its argument. In other words, [adder 1] will return a function that adds 1 to its argument:

 % set adder [Lambda x [Lambda y [Add [Var x] [Var y]]]]
 Lambda x {Lambda y {Add {Var x} {Var y}}}
 % Eval [App [App $adder [IntLit 1]] [IntLit 12]] {}
 13
 % set add1 [Eval [App $adder [IntLit 1]] {}]
 Closure {x 1} y {Add {Var x} {Var y}}
 % Eval [App $add1 [IntLit 12]] {}
 13
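
Lambda terms are themselves values, so they can be passed as arguments too. For instance, a "twice" function applies its argument two times; combined with the square function from earlier, it computes square(square(3)):

 % set twice [Lambda f [Lambda x [App [Var f] [App [Var f] [Var x]]]]]
 Lambda f {Lambda x {App {Var f} {App {Var f} {Var x}}}}
 % Eval [App [App $twice $square] [IntLit 3]] {}
 81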

This is really quite sophisticated behaviour, and our interpreter is only a dozen or so lines!

(*) Actually, our evaluation order isn't quite call-by-value, and the variant isn't call-by-name, but I'll leave the precise definitions for another time. For the curious, there are several good books on the lambda calculus, and many functional programming textbooks cover it. I recommend L. C. Paulson's "ML for the Working Programmer", which develops a nice lambda calculus interpreter supporting various evaluation strategies, as well as a wonderful tactical theorem prover for predicate logic.


[ Category Example | Category Language | Functional Programming ]