Decision trees as expressions

if 0 {Richard Suchenwirth 2003-08-01 - Decision trees allow classification of an input in few steps. One can represent trees as nested lists, but it is also possible to use expr's peculiar x?y:z operator ("if x then y else z"), which can be nested, to represent a decision tree in such a way that it can be traversed in a single call to expr, and hence presumably more efficient than recursive Tcl code traversal of a nested list.

Assume we want to classify an integer between 0 and 3 with the two "Boolean" features

  • even: divisible by 2 with zero remainder
  • prime: >1 and divisible only by 1 and itself

The decision tree as nested list looks like this ("yes" branches come first):

 {even {prime 2 0} {prime 3 1}}

which might be evaluated boringly as

   if {$even} {
      if {$prime} {
         return 2
      } else {
         return 0
      }
   } else {
      if {$prime} {
         return 3
      } else {
         return 1
      }
   }

Now the following recursive procedure converts it into an expression in expr language, which looks like this:

 $even? $prime? "2": "0": $prime? "3": "1"

The conditions (features) have been prefixed with a dollar sign for variable substitution; the result strings have been quoted so they can have other than numeric values. }

 proc dtree2expr dtree {
    if {[llength $dtree]==3} {
        foreach {a b c} $dtree break
        return "\$$a? [dtree2expr $b]: [dtree2expr $c]"
    } else {
        return \"$dtree\"
    }
 }

if 0 {Testing the thing goes with this silly framework, where the prime test was done very cheaply, but sufficient for the task at hand. If all went well, it should report its input back if called with 0..3, and make errors on other input... }

 proc classifyNumber x {
    #-- "feature extraction"
    set even [expr {$x%2==0}]
    set prime [expr {[lsearch {2 3 5 7 11 13 ..} $x]>=0}]

    #-- "classification"
    set dtree {even {prime 2 0} {prime 3 1}}
    expr [dtree2expr $dtree]
 }

Arts and crafts of Tcl-Tk programming