Brute force meets Goedel

Richard Suchenwirth 2004-08-12 - Never afraid of anything (as long as everything is a string), a discussion in the Tcl chatroom brought me to try the following: let the computer write ("discover") its own software, only given specifications of input and output. In truly brute force, up to half a million programs are automatically written and (a suitable subset of them) tested to find the one that passes the tests.

To make things easier, this flavor of "software" is in a very simple RPN language similar to, but much smaller than, the one presented in Playing bytecode: stack-oriented like Forth, each operation being one byte (ASCII char) wide, so we don't even need whitespace in between. Arguments are pushed on the stack, and the result of the "software", the stack at end, is returned. For example, in

 ebc ++ 1 2 3

execution of the script "++" should sum its three arguments (1+(2+3)), and return 6.

Here's the "bytecode engine" (ebc: execute byte code), which retrieves the implementations of bytecodes from the global array cmd:

proc ebc {code argl} {
    set ::S $argl
    foreach opcode [split $code ""] {
        eval $::cmd($opcode)
    }
    set ::S
}

Let's now populate the bytecode collection. The set of all defined bytecodes will be the alphabet of this little RPN language. It may be interesting to note that this language has truly minimal syntax - the only rule is: each script ("word") composed of any number of bytecodes is well-formed. It just remains to check whether it does what we want.

Binary expr operators can be treated generically:

foreach op {+ - * /} {
    set cmd($op) [string map "@ $op" {swap; push [expr {[pop] @ [pop]}]}]
}
#-- And here's some more hand-crafted bytecode implementations
set cmd(d) {push [lindex $::S end]} ;# dup
set cmd(q) {push [expr {sqrt([pop])}]}
set cmd(^) {push [swap; expr {pow([pop],[pop])}]}
set cmd(s) swap
 
#-- The stack routines imply a global stack ::S, for simplicity
interp alias {} push {} lappend ::S
proc pop {}  {K [lindex $::S end] [set ::S [lrange $::S 0 end-1]]}
proc K {a b} {set a}
proc swap {} {push [pop] [pop]}

Instead of enumerating all possible bytecode combinations beforehand (which grows exponentially by alphabet and word length), I use this code from Mapping words to integers to step over their sequence, uniquely indexed by an increasing integer. This is something like the Goedel number of the corresponding code. Note that with this mapping, all valid programs (bytecode sequences) correspond to one unique non-negative integer, and longer programs have higher integers associated:

proc int2word {int alphabet} {
    set word ""
    set la [llength $alphabet]
    while {$int > 0} {
        incr int -1
        set word  [lindex $alphabet [expr {$int % $la}]]$word
        set int   [expr {$int/$la}]
    }
    set word
}

Now out for discovery! The toplevel proc takes a paired list of inputs and expected output. It tries in brute force all programs up to the specified maximum Goedel number and returns the first one that complies with all tests:

proc discover0 args {
    set alphabet [lsort [array names ::cmd]]
    for {set i 1} {$i<10000} {incr i} {
        set code [int2word $i $alphabet]
        set failed 0
        foreach {inputs output} $args {
            catch {ebc $code $inputs} res
            if {$res != $output} {incr failed; break}
        }
        if {!$failed} {return $code}
    }
}

But iterating over many words is still pretty slow, at least on my 200 MHz box, and many useless "programs" are tried. For instance, if the test has two inputs and wants one output, the stack balance is -1 (one less out than in). This is provided e.g. by one the binary operators +-*/. But the program "dd" (which just duplicates the top of stack twice) has a stack balance of +2, and hence can never pass the example test. So, on a morning dogwalk, I thought out this strategy:

  • measure the stack balance for each bytecode
  • iterate once over very many possible programs, computing their stack balance
  • partition them (put into distinct subsets) by stack balance
  • perform each 'discovery' call only on programs of matching stack balance

Here's this version. Single bytecodes are executed, only to measure their effect on the stack. The balance of longer programs can be computed by just adding the balances of their individual bytecodes:

proc bc'stack'balance bc {
    set stack {1 2} ;# a bytecode will consume at most two elements
    expr {[llength [ebc $bc $stack]]-[llength $stack]}
}
proc stack'balance code {
    set res 0
    foreach bc [split $code ""] {incr res $::balance($bc)}
    set res
}

The partitioning will run for some seconds (depending on nmax - I tried with several ten thousand), but it's needed only once. The size of partitions is further reduced by excluding programs which contain redundant code, that will have no effect, like swapping the stack twice, or swapping before an addition or multiplication. A program without such extravaganzas is shorter and yet does the same job, so it will have been tested earlier anyway.

proc partition'programs nmax {
    global cmd partitions balance
    #-- make a table of bytecode stack balances
    set alphabet [array names cmd]
    foreach bc $alphabet {
        set balance($bc) [bc'stack'balance $bc]
    }
    array unset partitions ;# for repeated sourcing
    for {set i 1} {$i<=$nmax} {incr i} {
        set program [int2word $i $alphabet]
        #-- "peephole optimizer" - suppress code with redundancies
        set ok 1
        foreach sequence {ss s+ s*} {
            if {[string first $sequence $program]>=0} {set ok 0}
        }
        if {$ok} {
            lappend partitions([stack'balance $program]) $program
        }
    }
    set program ;# see how far we got
}

The discoverer, Second Edition, determines the stack balance of the first text, and tests only those programs of the same partition:

proc discover args {
    global partitions
    foreach {in out} $args break
    set balance [expr {[llength $out]-[llength $in]}]
    foreach code $partitions($balance) {
        set failed 0
        foreach {input output} $args {
            catch {ebc $code $input} res
            if {$res != $output} {incr failed; break}
        }
        if {!$failed} {return $code}
    }
}

But now for the trying. The partitioning helps very much in reducing the number of candidates. For the 1000 programs with Goedel numbers 1..1000, it retains only a fraction for each stack balance:

  • -2: 75
  • -1: 155 (this and 0 will be the most frequently used)
  • 0: 241
  • 1: 274
  • 2: 155
  • 3: 100

Simple starter - discover the successor function (add one):

% discover 5 6  7 8
dd/+

Not bad: duplicate the number twice, divide by itself to get the constant 1, and add that to the original number. However, it fails to work if we add the successor of 0 as another test case:

% discover 5 6  7 8  0 1

Nothing coming - because zero division made the last test fail. If we give only this test, another solution is found:

% discover 0 1
d^

"Take x to the x-th" power" - pow(0,0) gives indeed 1, but that's not the generic successor function.

More experiments to discover the hypot() function:

% discover {4 3} 5
d/+

Hm - the 3 is duplicated, divided by itself (=1), which is added to 4. Try to swap the inputs:

% discover {3 4} 5
q+

Another dirty trick: get square root of 4, add to 3 - presto, 5. The correct hypot() function would be

d*sd*+q

but my program set (nmax=30000) ends at 5-byte codes, so even by giving another test to force discovery of the real thing, it would never reach a 7-byte code. OK, I bite the bullet, set nmax to 500000, wait 5 minutes for the partitioning, and then:

% discover {3 4} 5  {11 60}  61
sd/+

Hm.. cheap trick again - it was discovered that the solution is just the successor of the second argument. Like in real life, test cases have to be carefully chosen. So I tried with another a^2+b^2=c^2 set, and HEUREKA! (after 286 seconds):

% discover {3 4} 5  {8 15} 17  
d*sd*+q

After partitioning, 54005 programs had the -1 stack balance, and the correct result was on position 48393 in that list...

And finally, with the half-million set of programs, here's a solution for the successor function too:

% discover  0 1  4711 4712
ddd-^+

"d-" subtracts top of stack from itself, pushing 0; the second duplicate to the 0-th power gives 1, which is added to the original argument. After some head-scratching, I find it plausible, and possibly it is even the simplest possible solution, given the poorness of this RPN language.


aa suggests that the simplest solution is

dd/+

RS: Yes, but as discussed above, it doesn't work for the successor of 0 (zero division)...


DKF: If you really want a challenge, add something that can operate like a while loop... j


Lessons learned:

  • Brute force is simple, but may demand very much patience (or faster hardware)
  • The sky, not the skull is the limit what all we can do with Tcl :)

RHS: I added the following operators:

  • <, > left, right shift by 1
  • &, | logical and, or
  • ~ logical not (unary)

I also made a few optimizations to the code. Specifically, I chanced the peephole optimization you had to use regexp instead of foreach (since I had more optimizations to add, like <>, ~~, s|, s&). Anyways, I came up with the following amusing "formulas"

  • Calculate if a number is odd: dd*&
  • Calculate is a number is even: d><s/ (I would guess it would have found dd*&~ eventually too :)

Larry Smith 2005-03-05 - Now, if you were to add a genetic algorithm to replace the brute-force generator, you'd have a very interesting little tool to find algorithms. I've often wondered if such a program could find a reverse algorithm for crypto trapdoor schemes - e.g. could it use the input and output of a known crypto algorithm to come up with a code breaker?