Combinator Engine

KBK: After burning myself on hot curry, I started wasting entirely too much time on an engine that builds up recursive function theory from a minimalist set of primitives.

In this page, we'll be developing an implementation of a function that computes Fibonacci numbers. We'll do it in a language with:

  • no if statement.
  • no while loops. In fact, no control structures at all, except as shown below.
  • no named variables.
  • no named user functions (Well, we'll use named functions for shorthand, but they're solely string substitutions. They don't take arguments. We *could* just expand them recursively.)
  • not even any arithmetic! (Well, hardly any, we cheat a little bit on input and output...)

The only operators are S and K, defined as

S a b c = a c { b c }
K a b   = a

And that's all. We'll allow for simple expansion of named functions, so that we can write

define I S K K

but that's just shorthand, we could simply spell things out everywhere.


KBK 2002-12-17: reworked most of the discussion. Introduced the Church numerals and the reaching combinators, in an effort to bring things down to irreducible basics.

In order to keep my sanity, I occasionally found myself needing to trace what the engine did, so my source file begins with

proc traceP args {
    foreach arg $args {
        puts [format %*s%s [info level] {} [uplevel 1 [list subst $arg]]]
    }
}
proc traceP args {}

The fundamental problem with hot curry is that eager evaluation gets in the way of searching for fixed points. Hence we do our own lazy evaluator. The evaluator can be extremely simple, since every function that it recognizes has a single argument! (To Do: Rework this discussion for eager (or strict) evaluation, by replacing the Y combinator below with Schoenfinkel's U combinator, which works equally well in a strict evaluator.)

proc lazyEval args {
    while { [llength $args] > 1 || [string match {{*}} $args] } {
        traceP {reducing $args}
        set nargs [eval [list lreplace $args 0 1] \
            [uplevel 1 [lrange $args 0 1]]]
        traceP {result is $nargs}
        if { [string equal $nargs $args] } break
        set args $nargs
    }
    return $args
}

We borrow some code from hot curry to allow functions to be called with fewer arguments than their formal definitions, The only change here is that function arguments are not evaluated before passing them to the functions; a strict function (that is, one that requires its arguments evaluated) must call lazyEval to evaluate them.

proc curry { name arglist body } {
    interp alias {} $name {} _curry_impl $name [llength $arglist]
    proc ${name}_impl $arglist $body
    trace add command $name delete [list _curry_nuke $name]
}

proc _curry_impl { name len args } {
    set nArgs [llength $args]
    set L_1 [expr { $len - 1 }]
    if { $nArgs >= $len } {
        set toEval [linsert [lrange $args 0 $L_1] 0 ${name}_impl]
        set rest [lrange $args $len end]
        set result [eval $toEval]
        traceP {[linsert [lrange $args 0 $L_1] 0 $name] -> $result}
        set retval [eval [list linsert $rest 0] $result]
        return $retval
    } else {
        list [linsert $args 0 ${name}]
    }
}

proc _curry_nuke { name args } {
    catch {
        rename ${name}_impl {}
    }
}
proc unknown args [string map [list @ [info body unknown]] {
    set arg0 [lindex $args 0]
    if { [llength $arg0] > 1 } {
        return [uplevel 1 $arg0 [lrange $args 1 end]]
    }
@}]

We are often going to want to illustrate the results of lazy evaluation, so let's define a conventional Tcl procedure to do it.

proc demonstrate { string } {
    puts [list $string = [lazyEval $string]]
}

And we define the basic two combinators:

 S f g x = f x { g x }
 K x y = x
curry S {f g x} { 
    list $f $x [list $g $x]
}
curry K { x y } { list $x }

Now, if we've done everything right, we should see that S K K is the identity operation.

demonstrate {S K K x}
demonstrate {S K K { S K K } { S K K x }}
#                      Both of the above expressions evaluate to x.

Let's define a macro facility, so that we can remember these things.

proc macro { name value } {
    curry $name {} $value
}
macro I { S K K }

Exercise 1: Try to prove this yourself, it isn't hard. If you get stuck, see Combinator engine: answers to exercises

The composition operator B f g x = f (g x)

macro B { S { K S } K }

The alternative curry C f x y = f y x

macro C { S { B B S } { K K } }

Apply a function to a constant T x f = f x

macro T { C I }

Duplicate an argument to a function W f x = f x x

macro W { C S I }

Apply a function to itself: M x x = x x

macro M { S I I }

Exercise 2: Prove all of these. If you get stuck, the answers are over in Combinator engine: answers to exercises.

Note that all the above functions consist simply of strings of combinators, and none of the definitions depends on a later definition; it's possible simply to replace all occurrences of I, B, C, T and W with the corresponding strings of S and K. The definitions given above are for convenience only.

Also, since we have proofs of correctness in hand, we can replace these strings of S and K with Tcl functions ("supercombinators"). [This gives us much better performance, which we'll need!] We can set fast to 0 if you want to reasssure yourself that everything works without extra procedures. (It just runs ten times slower.)

set fast 1;
if { $fast } {
    curry I x { set x }
    curry B { f g x } { list $f [list $g $x] }
    curry C { f x y } { list $f $y $x }
    curry T { x f } { list $f $x }
    curry W { f x } { list $f $x $x }
}

While it isn't strictly necessary, it's helpful to have a compiler to abstract free variables from expressions. Essentially, this is translating the lambda calculus to the algebra of S and K.

  • lambda x (f g) ==> S (lambda x f) (lambda x g)
  • lambda x a ==> K a
  • lambda x x ==> I

How does this work? Well, let's start at the bottom of the table and work up. The last line says, "if we want a function that accepts a single argument x and returns it, then I will do the job. (Ix=x.)" Similarly, the next-to-last says, "if we want a function that accepts a single argument x and returns some constant a, then Ka is just the ticket. (Kax=a.)"

The top line is the hardest, but even it isn't too bad. It says, "We want a function that accepts an argument x, and returns the result of some expression f applied to some argument g. How can we proceed? Come up with a function F such that Fx=f, and another function G such that Gx=g. Combine those two functions with S (now you know why it's called a combinator!): SFGx=Fx(Gx)=fg. This recursive decomposition allows us to build up any functional form from its components.

In addition, we'll introduce some optimization. We remove redundant brackets, and we simplify expressions using the following identities:

 * S {K p}              ==> B p
 * B p {K q}            ==> K { p q }
 * B p I                ==> p
 * S p I                ==> W p
 * S p {K q}            ==> C p q
 * C I                  ==> T
 * W I                  ==> M

Exercise 3: Prove these identities. Combinator engine: answers to exercises

set ::patterns {
    { S { K %p } }                     { B %p }
    { B %p { K %q } }                  { K { %p %q } }
    { B %p I }                         %p
    { S %p I }                         { W %p }
    { S %p { K %q } }                  { C %p %q }
    { C I }                            T
    { W I }                            M
}

Here, then, is our compiler for lambda-expressions:

proc lambda { x f } {
    traceP {[info level 0]}
    if { [llength $f] > 1 } {
       # lambda x.f g = S (lambda x.f) (lambda x.g)
       set retval [list S \
           [lambda $x [lrange $f 0 end-1]] \
           [lambda $x [lindex $f end]]]
    } elseif { [string equal $f $x] } {
        # lambda x.x = I
        set retval I
    } else {
        # lambda x.f = K f
        set retval [list K $f]
    }
    return [opt $retval]
}

The following procedure strips redundant braces from an expression

proc unbrace list {
    while { [llength $list] == 1 && [string match {{*}} $list] } {
        set list [lindex $list 0]
    }
    return $list
}

If braces surround the first element of an expression to be reduced, they, too are redundant. The following procedure removes them as well.

proc unbrace0 list {
    set list [unbrace $list]
    while { [llength [lindex $list 0]] > 1
        || [string match {{*}} [lindex $list 0]] } {

        set list [eval [list lreplace $list 0 0] [unbrace [lindex $list 0]]]
    }
    return $list
}

Here is the optimizer. It runs all the match patterns against an expression, and applies any that match. It does this repeatedly until nothing further matches.

proc opt { list } {
    traceP {[info level 0]}
    set improved 1
    while { $improved } {
        set list [unbrace0 $list]
        set improved 0
        foreach { pattern replacement } $::patterns {
            array unset matchVars
            if { [matches $list $pattern matchVars] } {
                set list [change $list $pattern $replacement matchVars]
                set improved 1
                break
            }
       }
   }
   return $list
}

This procedure tests whether a given expression matches a given pattern and puts the substituents in the array matchArray

proc matches { list pattern matchArray } {
    upvar 1 $matchArray matchVars
    traceP {[info level 0]}
    if { [llength $pattern] > [llength $list] } {
        return 0
    }
    foreach component $pattern candidate $list {
        if { [string equal {} $component] } {
            break
        }
        if { [llength $component] > 1 } {
            set candidate [unbrace0 $candidate]
            if { [llength $candidate] != [llength $component]
                || ! [matches $candidate $component matchVars] } {

                return 0
            }
        } elseif { [string match {%*} $component] } {
            if { [info exists matchVars($component)] } {
                if { [string compare $candidate $matchVars($component)] } {
                    return 0
                }
            } else {
                set matchVars($component) $candidate
            }
        } else {
            if { [string compare $candidate $component] } {
                return 0
            }
        }
    }
    traceP YES!!!
    return 1
}

This procedure takes an expression that matches a given pattern and substitutes it with the given replacement; matchArray holds the substituents.

proc change { list pattern replacement matchArray } {
    upvar 1 $matchArray matchVars
    traceP {[info level 0]}
    set retval [eval [list lreplace $list 0 [expr { [llength $pattern] - 1}]] \
        [makeReplacement $replacement matchVars]]
    traceP {[info level 0] returns $retval}
    return $retval
}

This procedure recursively expands the replacement pattern and plugs in the substituents.

proc makeReplacement { replacement matchArray } {
    traceP {[info level 0]}
    upvar 1 $matchArray matchVars
    if { [llength $replacement] > 1 } {
        set retval {}
        foreach item $replacement {
            lappend retval [makeReplacement $item matchVars]
        }
    } elseif { [string match {%*} $replacement] } {
        set retval $matchVars($replacement)
    } else {
        set retval $replacement
    }
    traceP {[info level 0] returns [unbrace0 $retval]}
    return [unbrace0 $retval]
}

Now, let's use our handy new [lambda] to make some more combinators. We want to define the reaching combinators:

# S' p q r s  =  p { q s } { r s }
set sPrime [lambda p [lambda q [lambda r [lambda s { p { q s } { r s } }]]]]
puts "S' = $sPrime"
# We'll define the function using the definition printed above.
macro S' { B { B S } B }

# And we can add another optimization rule:
lappend ::patterns { S { B %p %q } }     { S' %p %q }

# C' p q r s  = p q s r

set cPrime [lambda p [lambda q [lambda r [lambda s { p q s r }]]]]
puts "C' = $cPrime"
macro C' { B C }
lappend ::patterns { S' %p %q { K %r } } { C' %p %q %r }
lappend ::patterns { B C }               C'

# B* p q r s = p { q { r s } }

set bStar [lambda p [lambda q [lambda r [lambda s { p { q { r s } } }]]]]
puts "B* = $bStar"
macro B* { C { B B { B B B } } B }
lappend ::patterns { B %p { B %q %r } }   { B* %p %q %r }

# Just as with the other combinators, we squeeze out a bit more performance by writing Tcl functions instead:

if { $fast } {
    curry S' { p q r s } { list $p [list $q $s] [list $r $s] }
    curry C' { p q r s } { list $p $q $s $r }
    curry B* { p q r s } { list $p [list $q [list $r $s]] }
}

Our functions dispense with the [if] statement; instead, a Boolean value is a combinator with two arguments. true evaluates its first argument and discards the second; false discards the first and evaluates the second. (If we had no if).

set true [lambda x [lambda y x]]
set false [lambda x [lambda y y]]
puts "true = $true; false = $false"
macro true K
macro false { K I }

if 0 {
    We can check this on paper: Consider:
        true a b == K a b 
                 == a
        false a b == K I a b
                  == I b
                  == b

}

#We can also check it electronically:
demonstrate {true yes no}
#                      prints yes
demonstrate {false yes no}
#                      prints no

if 0 {
    Now, believe it or not, we have enough machinery, starting from just S and
    K and using nothing else, to define a form of arithmetic.  We will define a
    family of combinators, called the ''Church numerals'' after the logician
    Alonzo Church. 

    A Church numeral ''n'' is a combinator that applies a given function ''n''
    times to a given argument:

        0 f x = x
        1 f x = f x
        2 f x = f ( f x )
        n f x = f ( f ( f ( ... ( x ) ... ) ) )
                \______  ______/
                       \/
                    n times
        Since 

            K I f x = I x = x

        , we can say that 

            0 == K I
}

macro 0 { K I }

if 0 {
    Now comes the cheating: we need something to apply ''n'' times.  Let's
    define a function 'inc' that adds one to its argument.  This function will
    be used only to generate output. We aren't going to cheat and use it to do
    arithmetic.
}


curry inc {n} { expr {[lazyEval $n] + 1} }

# And we can increment 0 zero times, showing that it's, of course still zero.

demonstrate {0 inc 0}
#                      prints 0

if 0 {
    We can add one to a number without cheating.  We can define an operation,
    ''+1'', that adds one to a Church numeral by applying the target function
    one more time.

     +1 = S B
     +1 f n x = S B f n x
              = B f { n f } x
              = f { n f x }
}

macro +1 { S B }

# And in doing so, we've learned to count!

macro 1 {+1 0}
macro 2 {+1 1}
macro 3 {+1 2}

demonstrate {3 inc 0}
#                      prints 3

if 0 {
    Knowing how to count, we can figure out how to add.  Adding ''m'' to ''n''
    means that we increment ''n'' ''m'' times.  In other words, we apply the
    ''+1'' function ''m'' times.  We'll start using our [[lambda]] procedure to
    help come up with appropriate definitions.
}

set vPlus [lambda m [lambda n { m +1 n }]]
puts [list vPlus = $vPlus]
#                      prints vPlus = {T +1}
macro + {T +1}

if 0 {
    Let's prove by hand that + does what we want:

        T +1 m n = m +1 n = ( +1 applied to n m times )

    And we can test it out with some actual integers:
}

macro 4 {+ 2 2}
macro 5 {+ 2 3}
macro 8 {+ 4 4}
demonstrate {4 inc 0}
demonstrate {5 inc 0}
demonstrate {8 inc 0}
#                      prints 4, 5, and 8, rather unsurprisingly.

if 0 {
    Wow, we know how to add, starting from only ''S'' and ''K''. Let's add on
    multiplication.  Just as we add by repeated increments, we multiply by
    repeated addition; if we start with zero and add ''n'' ''m'' times, we
    should get ''mn'':
}

set vTimes [lambda m [lambda n {m {+ n} 0}]]
puts [list vTimes = $vTimes]
#                      prints vTimes = {C {C' {C B +}} 0}
macro * { C { C' { C B + } } 0 }

# I'm not going to bother proving this one; let's just assume that it works and try a few multiplications:

demonstrate {* 2 3 inc 0}
demonstrate {* 3 3 inc 0}

if 0 {
    Subtraction is trickier, so much so that I'll digress for a moment into
    some more basics.  We are going to need to form ''ordered pairs'' (''x'',
    ''y'') and take them apart again.  We have enough stuff about to do this:
    an ordered pair (''x'', ''y'') is a function that accepts a Boolean ''s''
    and returns ''x'' if ''s'' is true, ''y'' otherwise.
}

set pair [lambda x [lambda y [lambda s { s x y }]]]
puts "pair = $pair"
macro pair { C' T }

# The head of a pair is the result of passing 'true' to it as a parameter;
# the tail of the pair is the result of passing 'false'.

set hd [lambda pair {pair true}]
set tl [lambda pair {pair false}]
puts "hd = $hd; tail = $tl"
macro hd { T true }
macro tl { T false }

# A few quick demonstrations:

demonstrate {hd {pair a b}}
demonstrate {tl {pair a b}}
demonstrate {hd {pair a {pair b c}}}
demonstrate {hd {tl {pair a {pair b c}}}}
demonstrate {tl {tl {pair a {pair b c}}}}

if 0 {
    And now we can define more interesting functions, like factorial. 
    Factorial 0 is 1; factorial ''n'' is ''n'' times factorial ''n''-1. 
    Let's define the iteration step as a function that accepts the ordered pair (''n'', ''n''!)
    and returns the next ordered pair (''n''+1, [[''n''+1]]! ). 
    The factorial function extracts the tail of the ''n''th ordered pair.
}

set factIter [lambda p { factIter2 { + 1 { hd p } } { tl p } }]
puts "factIter = $factIter"
set factIter2 [lambda n [lambda f { pair n { * f n } }]]
puts "factIter2 = $factIter2"
set factorial [lambda n {tl {n factIter {pair 0 1}}}]
puts "factorial = $factorial"
macro factIter { S { B* factIter2 { + 1 } hd } tl }
macro factIter2 { S' B pair { C * } }
macro factorial { B tl { C { T factIter } { pair 0 1 } } }
demonstrate { factorial 0 inc 0 }
demonstrate { factorial 1 inc 0 }
demonstrate { factorial 2 inc 0 }
demonstrate { factorial 3 inc 0 }
# The next one is just TOO slow.
# demonstrate { factorial 4 inc 0 }

if 0 {
    Now that we have pairs (and therefore lists), we can start on subtraction,
    using a trick invented by the logician Stephen Kleene. 
    We are going to construct the list of numbers ''n'', ''n''-1, ..., 0. 
    ''n''-''k'' is the ''k''th element of this list.
}

# The list of zero such numbers has a head of 0:

set list0 [lazyEval { pair 0 nil }]
puts "list0 = $list0"
macro list0 { C' T 0 nil }

# We can make the ''n''th list inductively from the ''n''-1st, by tacking ''n'' on the front:

set nextList [lambda list { pair { +1 { hd list } } list }]
puts "nextList = $nextList"
macro nextList { W { B* pair +1 hd } }
 
# The list of integers from ''n'' down to zero is simply ''nextList'' applied ''n'' times to ''list0'':

set downFrom [lambda n { n nextList list0 }]
puts "downFrom = $downFrom"
macro downFrom { C { T nextList } list0 }

# And, as promised, ''n-k'' is the ''k''th element of the list descending from ''n'':

set minus [lambda n [lambda k { hd { k tl { downFrom n } } }]]
puts "minus = $minus"
macro - { B* { B hd } { C { T tl } } downFrom }

# Does minus work?

demonstrate { - 8 3 inc 0 }
#  prints { - 8 3 inc 0 } = 5

# Let's also use another trick to test a number for zero.

set isZero [lambda n { n { K false } true }]
puts "isZero = $isZero"
macro zero? {C {T {K false}} true}
demonstrate {zero? 0 yes no}
demonstrate {zero? 1 yes no}
demonstrate {zero? 8 yes no}

Exercise 4: Why does this trick work? (Combinator engine: answers to exercises)

if 0 {
    OK, now we're going to start on the Fibonacci numbers.  We could use a
    trick like the one we used for factorial to iterate, but let's try to do it
    recursively, with the rules:

        fib 0 = 0
        fib 1 = 1
        fib n = fib( n-1 ) + fib( n-2 )

    Wait a second.  How on Earth are we going to do that recursion without
    named functions?  We need some sort of trick to pass a function to itself.

    What if we had a combinator Y that had the property:

        Y f x = f ( Y f ) x ?
    With this combinator, we can do a recursive function by having it accept
    itself as a parameter; it actually gets (Yf) instead of f, but that's ok,
    because Yf will just be expanded again.

    Say again? Well, let's do the recursive definition of factorial as a warm-up:
}

set rFact [lambda f [lambda n {
   zero? n 
   1
   {
        * n { f { - n 1 } }
   }
}]]
puts "rFact = $rFact"
macro rFact { B* { S { C zero? 1 } } { S * } { C B { C - 1 } } }

if 0 {
    Now,
        Y rfact 3 = rfact ( Y rfact ) 3
                  = * 3 ( Y rfact 2 )
                  = * 3 ( rfact ( Y rfact ) 2 )
                  = * 3 ( * 2 ( Y rfact 1 ) )
                  = * 3 ( * 2 ( rfact ( Y rfact ) 1 ) )
                  = * 3 ( * 2 ( * 1 ( Y rfact 0 ) ) )
                  = * 3 ( * 2 ( * 1 ( rfact ( Y rfact ) 0 ) ) )
                  = * 3 ( * 2 ( * 1 1 ) )
                  = * 3 ( * 2 1 )
                  = * 3 2
                  = 6

    But where can we find this magical Y combinator? Well, it turns out that
    there is such a beast:
}


curry Y {} { M { B { S I } M } }

if 0 {
    What??? Well, let's run through it:

        Y f x = M { B { S I } M } f x
              = B { S I } M { B { S I } M } f x
              = S I { M { B { S I } M } f x
              = S I Y f x                       <- because Y = M { B { S I } M }
              = I f { Y f } x
              = f { Y f } x

    and this is ''exactly'' what we want.
}

demonstrate { Y rFact 3 inc 0 }

if 0 {
    And the 'rFact' here is just a convenience, not really a named function.
    It still works if we expand it in line.
}

demonstrate { Y { B* { S { C zero? 1 } } { S * } { C B { C - 1 } } } 3 inc 0 }

# And now, as promised, the Fibonacci sequence, defined recursively:

set fib [lambda fib [lambda n {
    zero? n 
    0 {
        zero? { - n 1 }
        1 { 
            + { fib { - n 1 } } { fib { - n 2 } }
        }

    }
}]]
puts "fib = $fib"
curry fib {} $fib
for { set i 0 } { $i < 5 } { incr i } {
    demonstrate [list Y fib $i inc 0]
}

# All these Church numerals are horribly slow to evaluate, of course. 
# Let's define more conventional integers:

for { set i 0 } { $i < 9 } { incr i } {
    catch { rename $i {} }
}
curry + { x y } { [expr { [lazyEval $x] + [lazyEval $y] }] }
curry - { x y } { [expr { [lazyEval $x] - [lazyEval $y] }] }
curry * { x y } { [expr { [lazyEval $x] * [lazyEval $y] }] }
curry +1 { x } { [expr { 1 + [lazyEval $x] } ] }
curry zero? { x y n } { 
    if { [lazyEval $x] == 0 } {
        return $y
    } else {
        return $n
    }
}

if 0 {
    We want integers to continue to iterate, so that all the macros we defined
    above will still work.  Another patch to [[unknown]] does so:
}

proc unknown args [string map [list @ [info body unknown]] {
    set arg0 [lindex $args 0]
    if { [llength $arg0] == 1 && [string is integer $arg0] } {
        if { [llength $args] == 1 } {
            return $arg0
        } else {
            return [linsert $args 0 n-times]
        }
    }
@}]

curry n-times { n f x } {
    set n [lazyEval $n]
    for { set i 0 } { $i < $n } { incr i } {
        set y $f
        lappend y $x
        set x $y
    }
    return $x
}

# And we see that Tcl's integers behave as Church numerals with our new 'unknown':

for { set i 0 } { $i < 10 } { incr i } {
    demonstrate [list Y fib $i]
    demonstrate [list Y rFact $i]
    demonstrate [list factorial $i]
}

Exercise 5. How would you compare two Church numerals to see if one is greater than the other?

Exercise 6. With the language we've developed so far, write a procedure to compute the greatest common divisor of two numbers.

(Combinator engine: answers to exercises)


Other stuff:

  • The Unlambda language is based on S, K, and an apply operator.