Version 23 of dicthash: Yet another lightweight object system

Updated 2009-04-25 17:34:47 by tcleval

I was using unknown today to add some syntax sugar to dicts and somehow came up with a simple, lightweight object system that has the feel of javascript objects.

The basic idea is that everything's a value (how very tclish ;-). Methods are simply lambda expressions stored as elements in a dict. Also, I started by stealing the object syntax from Tk widget hierarchies but ended up with a very javascript-like syntax.
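
To make the idea concrete, here is a minimal sketch that uses nothing but plain dict and apply (Tcl 8.5 or later), without any of the dicthash sugar. The names obj and greet are made up for illustration:

```tcl
# A plain dict whose "greet" element is a lambda: {argument-list body}
set obj {
  name World
  greet {{who} {return "Hello, $who!"}}
}

# Fetch the lambda with dict get and invoke it with apply:
set greeting [apply [dict get $obj greet] [dict get $obj name]]
puts $greeting ;# Hello, World!
```

Everything that follows is syntax sugar over this pattern.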

Usage Summary

  # start with a dict:
  set foo {
    location {x 0 y 0}
    heading 0
    move {{distance} {
      %this.location.x = [expr {
        [%this.location.x]+($distance*cos([%this.heading]))
      }]
      %this.location.y = [expr {
        [%this.location.y]+($distance*sin([%this.heading]))
      }]
    }}
  }

  puts [%foo.location.x] ;# get value
  %foo.heading = 1       ;# assign value
  %foo.move.apply 100    ;# call method

  set newfoo [%foo + {name "tortoise"}] ;# dict merge
  %foo += {name "hare"}                 ;# dict merge

Syntax

Any command whose name begins with % is assumed to refer to a variable holding a proper dict. The basic syntax of a dicthash command is:

  %varname.key.key

Where varname is the variable name pointing to a dict and the keys are nested keys referring to elements in the dict. Since this is tcl, the fact that $ substitution stops at "." can be exploited to substitute key names:

  %varname.$key
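
A small illustration of why this works, assuming ordinary Tcl substitution rules: a variable name after $ consists of letters, digits, underscores and namespace separators, so the "." ends it. The variable names key, sub and cmdname are only for illustration:

```tcl
set key location
set sub x

# Each $name stops at the following ".", so both are substituted separately:
set cmdname %foo.$key.$sub
puts $cmdname ;# %foo.location.x
```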

When a dicthash command is passed two arguments and the first argument is = then a dict set is applied to the dict:

  %varname.key.key = $value

When a dicthash command is passed two arguments and the first argument is + then a dict merge is applied and the merged dict is returned; the variable itself is left unchanged:

  set newvar [%varname + $somedict]

When a dicthash command is passed two arguments and the first argument is += then a dict merge is applied to the dict and the result is stored back into the dict:

  %varname += $somedict

If the last key is apply then the value of the next-to-last key is assumed to be a lambda expression and is invoked automatically by calling the apply command:

  %varname.method.apply $argument $argument

I toyed with the idea of implementing method calls as %foo.move($value) but that felt very un-tclish to me. So I opted to use apply as the keyword for method invocation.

Within dicthash methods (lambda expressions which are elements of the dict), this is a keyword referring to the dicthash object, in other words the current dict, just as you'd expect:

  set mydict {
    set {{key value} {
      %this.$key = $value
    }}
    lappend {{key args} {
      dict lappend this $key {*}$args
    }}
    dump {{} {
      puts $this
    }}
  }
  %mydict.set.apply somekey "some value"
  %mydict.lappend.apply anotherkey 1 2 3 4
  %mydict.dump.apply
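
One way to picture how this gets wired up is the stripped-down sketch below. The helper call_method and the counter object are hypothetical and exist only for this illustration; the sketch isolates the upvar trick and leaves out the % syntax entirely:

```tcl
# Hypothetical helper: look up a lambda in a dict variable and run it
# with "this" linked to that variable.
proc call_method {dictVarName method args} {
  upvar 1 $dictVarName var
  set lambda [dict get $var $method]
  # Prepend an upvar so that "this" inside the body aliases our "var":
  lset lambda 1 "upvar 1 var this;[lindex $lambda 1]"
  apply $lambda {*}$args
}

set counter {
  n 0
  bump {{by} {dict incr this n $by}}
}
call_method counter bump 5
puts [dict get $counter n] ;# 5
```

Because this is a linked variable rather than a copy, changes made inside the method show up in the caller's dict.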

Here's the implementation

  # syntax sugar for dict:
  proc dicthash {cmd args} {
    uplevel 1 [string map "CMD $cmd ARGS {{$args}}" {
      set path [split [string range CMD 1 end] .]
      set varname [lindex $path 0]
      set path [lrange $path 1 end]
      
      upvar 1 $varname var
      
      if {[lindex $path end] == "apply"} {
        set path [lrange $path 0 end-1]
        set script [dict get $var {*}$path]
        set body [lindex $script 1]

        # Improved implementation of "this":
        set body "upvar 1 var this;$body"

        lset script 1 $body
        if {ARGS == ""} {
          return [apply $script]
        } else {
          return [apply $script {*}ARGS]
        }
      } else {
        switch -- [llength ARGS] {
          0 {
            return [dict get $var {*}$path]
          }
          2 {
            set op [lindex ARGS 0]
            set val [lindex ARGS 1]
            
            if {$op == "="} {
              return [dict set var {*}$path $val]
            } elseif {$op == "+"} {
              if {$path == ""} {
                return [dict merge $var $val]
              } else {
                error "invalid dict merge"
              }
            } elseif {$op == "+="} {
              if {$path == ""} {
                return [set var [dict merge $var $val]]
              } else {
                error "invalid dict merge"
              }
            }
          }
        }
      }
      error "unsupported operation on CMD"
    }]
  }
  proc unknown {cmd args} {
    if {[string index $cmd 0] == "%"} {
      return [dicthash $cmd {*}$args]
    } else {
      error "unknown: $cmd $args"
    }
  }

EIAS

A dicthash "object" is nothing more than a dict. Therefore an object also has a natural string representation. This somehow feels extremely tclish to me :-)

It also means that an object can be modified after instantiation, just like javascript. So for example, taking the example of the "foo" object above, to add a new method to draw it on a canvas you simply assign a lambda expression to it:

  # add a new method:
  dict set foo draw {{canvas} {
    set x [%this.location.x]
    set y [%this.location.y]
    $canvas create oval \
      [expr {$x-5}] [expr {$y-5}] \
      [expr {$x+5}] [expr {$y+5}] \
      -fill red
  }}

  # or using dicthash sugar:
  %foo.draw = {{canvas} {
    set x [%this.location.x]
    set y [%this.location.y]
    $canvas create oval \
      [expr {$x-5}] [expr {$y-5}] \
      [expr {$x+5}] [expr {$y+5}] \
      -fill red
  }}

  # now you can draw foo:
  pack [canvas .c] -fill both -expand 1
  %foo.draw.apply .c

Inheritance

Originally I didn't think this supported inheritance. After all, I wrote this and I haven't implemented inheritance yet! It turns out that dicthash is a pure prototype-based object system. Much more so than javascript thanks to tcl's strict value semantics (also known as everything is a string).

In a prototype based object system you don't inherit. Instead you clone from your parent object and then extend yourself. In tcl this is trivial. In the foo example above I've already shown how newfoo "inherits" from foo. So in dicthash (since objects are simply dicts) inheritance is simply:

  # dict2 "inherits" from dict1:
  set dict2 $dict1
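
A short sketch of what cloning by assignment means in practice, using made-up keys: the clone can be extended or changed without touching the parent, because dicts are values.

```tcl
set dict1 {species unknown legs 4}

# "Inherit" by copying, then specialise the copy:
set dict2 $dict1
dict set dict2 species dog

puts [dict get $dict1 species] ;# unknown
puts [dict get $dict2 species] ;# dog
puts [dict get $dict2 legs]    ;# 4
```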

Also, as a bonus and because of the excellent design of the dict API, multiple "inheritance" is simply:

  # dict2 "inherits" from dict1 and dict0:
  set dict2 [dict merge $dict1 $dict0]

  # or in dicthash notation:
  set dict2 [%dict1 + $dict0]
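
One detail worth keeping in mind when merging: if the same key appears in more than one dict, dict merge keeps the value from the last dict in the argument list. A minimal sketch with made-up keys:

```tcl
set dict1 {name one colour red}
set dict0 {name zero size 10}

# dict0 comes last, so its "name" overrides the one from dict1:
set dict2 [dict merge $dict1 $dict0]

puts [dict get $dict2 name]   ;# zero
puts [dict get $dict2 colour] ;# red
puts [dict get $dict2 size]   ;# 10
```

So the order of the arguments decides which "parent" wins a conflict.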

A more elaborate example of "inheritance":

  set mammal {
    class mammal
    species unknown
    speak {{} {}}
  }

  # dogs are a type of mammal:
  set dog [%mammal + {
    species dog
    speak {{} {puts Bark!}}
  }]

  # so are humans:
  set human [%mammal + {
    species human
    name ""
    speak {{} {puts "Hello. My name is [%this.name]."}}
  }]

  # create instances:
  set fido [%dog + {name Fido}]
  set charlie [%human + {name Charlie}]

  # hear them speak:
  %fido.speak.apply    ;# Bark!
  %charlie.speak.apply ;# Hello. My name is Charlie.

Further Developments/Discussion

Martyn Smith This looks very interesting. Have you looked at RS's Let unknown know page, especially the namespace unknown command in 8.5, which would be perfect for your unknown patch? I think just replacing unknown could cause problems later. Excellent idea.

slebetman Yeah, should use a better method of patching unknown. This is very experimental at this stage. There are some things about the syntax I don't quite like. I'm just putting things down on this page to try and flesh out my ideas and play around with what's possible and of course to preempt my forgetfulness.
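
For reference, a rough sketch of what the namespace unknown approach could look like in Tcl 8.5 or later. This is an assumption about how one might wire it up, not code from this page: the handler name percent_unknown is hypothetical, and the body returns a placeholder string where a real version would call dicthash.

```tcl
# Hypothetical handler: only commands starting with % are intercepted.
proc percent_unknown {cmd args} {
  if {[string index $cmd 0] eq "%"} {
    # Stand-in for the real work; dicthash would be called here.
    return "dicthash would handle: $cmd $args"
  }
  # Everything else goes to the standard handler in the caller's context:
  uplevel 1 [list ::unknown $cmd {*}$args]
}

# Install it for the global namespace instead of redefining ::unknown:
namespace unknown ::percent_unknown

puts [%foo.heading = 1] ;# dicthash would handle: %foo.heading = 1
```

The standard unknown proc stays untouched, so auto-loading and the other default behaviours keep working for commands that don't start with %.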


Method Call Syntax

slebetman 23 April 2009: I'm not so happy with the method call syntax. %foo.bar.apply just looks ugly. The syntax makes me expect "apply" to be an element in bar. There are several options I'm considering:

  1. [%foo->bar $argument] Dot gets/sets values; arrow makes method calls.
  2. [%foo.(bar) $argument] Parens around method name calls the method.
  3. [%foo bar $argument] Treating method name as a subcommand calls the method.

All of the above make clear distinctions between accessing the value of the dict element and making a method call. And all look better to me than the clumsy [%foo.bar.apply] syntax.

Personally I'm partial to number 3, [%foo bar]. First because it's already an established convention in tcl (subcommands call methods). Second, it opens the door to unifying methods and operators (=,+,+= can simply be considered built-in methods), which can potentially simplify the implementation. Also, imagine being able to override = simply by redefining it.
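
A speculative sketch of what unifying methods and operators might look like. None of this is from the implementations on this page: the builtins dict and the invoke proc are hypothetical names, and the object is passed by value as an explicit self argument to keep the example short.

```tcl
# Hypothetical: built-in operators stored as lambdas, same shape as methods.
set builtins {
  + {{self other} {dict merge $self $other}}
}

# Hypothetical dispatcher: the object's own elements win over the built-ins.
proc invoke {obj name args} {
  global builtins
  if {[dict exists $obj $name]} {
    set lambda [dict get $obj $name]
  } elseif {[dict exists $builtins $name]} {
    set lambda [dict get $builtins $name]
  } else {
    error "no such method: $name"
  }
  apply $lambda $obj {*}$args
}

puts [invoke {a 1} + {b 2}] ;# a 1 b 2

# Overriding + is just a matter of defining an element named +:
set custom [dict create a 1 + {{self other} {return overridden}}]
puts [invoke $custom + {b 2}] ;# overridden
```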

Note: In my own defence, it should be noted that I "invented" this purely by accident. As I said above, I started by wanting some syntax sugar for dicts. That's why the syntax and implementation are a bit clumsy (I can already see that the uplevel can be programmed out).


Alternative/Improved implementation

This changes the method call syntax to:

  %object.key.key method $argument

Also removed the uplevel. All I needed was [upvar 2] to get the dict from the caller's context.
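
A minimal sketch of why the level is 2 rather than 1 when a wrapper proc sits between the caller and the proc doing the work (here the wrapper plays the role of unknown). The proc names inner and wrapper are made up for illustration:

```tcl
proc inner {varname} {
  # Level 1 would be wrapper's frame; level 2 is wrapper's caller:
  upvar 2 $varname v
  incr v
}

proc wrapper {varname} {
  inner $varname
}

set counter 0
wrapper counter
puts $counter ;# 1
```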

Added a mechanism to re-base "this" in case of nested objects. Javascript does this slightly differently by assuming the method belongs to the second last element of the dict/object, i.e. the nested dict (actually, that's not a bad idea). Still experimenting. I didn't go the javascript/self route because I kind of like the idea of being able to categorise my method definitions in different nested dicts.

Usage Summary

  # it's mostly similar to the original implementation
  
  # start with a hash:
  set foo {
    location {x 0 y 0}
    heading 0
    speed 0
    step {{} {
      %this.location.x = [expr {
        [%this.location.x]+([%this.speed]*cos([%this.heading]))
      }]
      %this.location.y = [expr {
        [%this.location.y]+([%this.speed]*sin([%this.heading]))
      }]
    }}
  }
  
  # objects are nestable:
  set runners [dict create \
    tortoise [%foo + {speed 5}] \
    hare     [%foo + {speed 10}]
  ]
  
  %runners.%tortoise step ;# call step method belonging to %tortoise
  %runners.%hare step     ;# call step method belonging to %hare

Syntax

When the first argument to the dicthash command happens to be an element belonging to the innermost nested dict then it is assumed to be a lambda expression and is called via apply:

  %varname.methodname           ;# returns the lambda expression
  %varname methodname $argument ;# applies the lambda expression

When a key begins with % then the this variable in methods refers to the nested dict pointed to by that key. This is done by scanning the dicthash command from left to right. Therefore, this refers to the final % in the dicthash command:

  %varname.key.%key method ;# call method belonging to %key

Because only the last % matters, you can use % as syntax sugar to indicate which nested dict is actually a dicthash object: %var.key.%object.key.key.%object.
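
The reason re-basing needs an explicit write-back step can be seen with plain dict commands. In this sketch (the data is made up), pulling out a nested dict yields a copy, so changes to it have to be stored back into the outer dict by hand:

```tcl
set runners {tortoise {speed 5} hare {speed 10}}

# Extracting the nested dict gives an independent value:
set this [dict get $runners hare]
dict set this speed 12
puts [dict get $runners hare speed] ;# 10, the outer dict is unchanged

# Write the modified copy back under the same key path:
dict set runners hare $this
puts [dict get $runners hare speed] ;# 12
```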

Implementation:

  package provide dicthash 1.0

  proc dicthash {cmd args} {
    set path [split [string range $cmd 1 end] .]
    set varname [lindex $path 0]
    set path [lrange $path 1 end]

    #figure out if we need to re-base our "this"
    set hashindex -1
    for {set i 0} {$i<[llength $path]} {incr i} {
      set  key [lindex $path $i]
      if {[string match %* $key]} {
        set hashindex $i
        lset path $i [string trimleft $key %]
      }
    }

    if {$hashindex == -1} {
      # upvar 2 because uplevel 1 is [unknown]. We want the caller's context:
      upvar 2 $varname var
    } else {
      # We need to re-base "this". This is done by repointing $path and $var
      # So we need to save the path that leads to this nested dict: $prepath
      set prepath [lrange $path 0 $hashindex]
      set path [lrange $path [expr {$hashindex+1}] end]
      
      # Also need to save the base dict somewhere: $basedict
      upvar 2 $varname basedict
      set var [dict get $basedict {*}$prepath]
    }

    if {[llength $args] == 0} {
      return [dict get $var {*}$path]
    } else {
      set subcommand [lindex $args 0]
      set args [lrange $args 1 end]
      
      # check to see if $subcommand is a method:
      if {[catch {dict get $var {*}[list {*}$path $subcommand]} script] == 0} {
      
        # "this" magic:
        set body [lindex $script 1]
        set body "upvar 1 var this;$body"
        lset script 1 $body
        
        set ret [apply $script {*}$args]
        
        # If we are rebased we need to merge $var back 
        # into the base dict since tcl has value semantics:
        if {$hashindex != -1} {
          dict set basedict {*}$prepath $var
        }
        return $ret
      } else {
        # default built in subcommands:
        switch -- $subcommand {
          "=" {
            if {[llength $path]} {
              if {[llength $args]} {
                set ret [dict set var {*}$path {*}$args]
                
                # If we are rebased we need to merge $var back 
                # into the base dict since tcl has value semantics:
                if {$hashindex != -1} {
                  dict set basedict {*}$prepath $var
                }
                return $ret
              } else {
                error "value not specified"
              }
            } else {
              error "key not specified"
            }
          }
          "+" {
            if {[llength $path] == 0} {
              return [dict merge $var {*}$args]
            } else {
              error "invalid dict merge"
            }
          }
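          "+=" {
            if {[llength $path] == 0} {
              set ret [set var [dict merge $var {*}$args]]
              
              # If we are rebased we need to merge $var back 
              # into the base dict since tcl has value semantics:
              if {$hashindex != -1} {
                dict set basedict {*}$prepath $var
              }
              return $ret
            } else {
              error "invalid dict merge"
            }
          }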
        }
      }
    }
    error "unsupported operation on $cmd"
  }

  if {[info procs dicthash.unknown] == ""} {
    rename unknown dicthash.unknown
    proc unknown {cmd args} {
      if {[string index $cmd 0] == "%"} {
        return [dicthash $cmd {*}$args]
      } else {
        dicthash.unknown $cmd {*}$args
      }
    }
  }

Related Stuff

RS: See also TOOT, though this system has fancier syntax :^)

NEM: Or neo/Using namespace ensemble without a namespace for how to make a dict/slot based OO system into an efficient command using the -map option of namespace ensemble.

slebetman: Also related: self (not surprising, because javascript (the inspiration for this) is itself based on self). And on things. What's funny is I'm also seeing similarities with Lua and CLOS (with the exception that they are both class-based). And Perl of all things. I'm beginning to suspect that treating (or being able to treat) an associative array as an object (or the basis of an object system) is an emergent attribute of any language which has associative arrays (dict/hash/table etc.) and first class functions.

AMG: Another simple object system: sproc. Yup, it uses associative arrays. It wasn't originally designed to be an object system, but it turns out that's what you get when you have closures (if indeed that is what I have created): code plus data.

tcleval: something similar on stooop sugar (derived from this page)