Version 26 of memoizing

Updated 2006-01-14 15:08:07

I came up with this little proc while lamenting the confusing way that cache functions need to be called in Mason: because Perl has no way (that I'm aware of) (anon: it can be done with 'die') for a subroutine to cause its caller to return, cache functions must be called as

  return if $m->cache_self();

RLH 2006-01-14: Perl has had memoization for a while now: http://search.cpan.org/~mjd/Memoize-1.01/Memoize.pm

In Tcl this can be done much more elegantly. Just call this memoize proc at the beginning of another proc that is expensive to run, and it will save the return value so it doesn't need to be recomputed. This makes use of [info level] to examine the stack as well as [return -code] to cause the calling proc to return.

  proc memoize {args} {
          global memo
          set cmd [info level -1]
          set d [info level]
          if {$d > 2} {
            set u2 [info level -2]
            if {[lindex $u2 0] == "memoize"} {
                    return
            }
          }
          if {[info exists memo($cmd)]} {
                  set val $memo($cmd)
          } else {
                  set val [eval $cmd]
                  set memo($cmd) $val
          }
          return -code return $val
  }
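As a minimal sketch of the two mechanisms memoize relies on, the following standalone example may help. The procs bail, demo, whoCalledMe and greet are hypothetical names invented for illustration only; they are not part of the code above.

```tcl
# Hypothetical helper: makes its *caller* return immediately with the given value.
proc bail {value} {
    return -code return $value
}

proc demo {x} {
    if {$x < 0} {bail "negative"}
    return "non-negative"
}

# Hypothetical helper: reports the full command line that invoked its caller.
proc whoCalledMe {} {
    return [info level -1]
}

proc greet {name} {
    return [whoCalledMe]
}

puts [demo -5]        ;# prints: negative
puts [demo 5]         ;# prints: non-negative
puts [greet world]    ;# prints: greet world
```

memoize combines the two: [info level -1] gives it the caller's name and arguments to use as the cache key, and [return -code return] lets it hand the cached value back on the caller's behalf.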

A classic use of this is the recursive Fibonacci function:

  proc fibonacci {x} {
          if {$x <= 1} {return 1}
          return [expr [fibonacci [expr $x - 1]] + [fibonacci [expr $x - 2]]]
  }

Because this recomputes all lower values for every number, the running time grows exponentially, roughly O(2^n).

  proc fibonacci-memo {x} {
          memoize
          if {$x <= 1} {return 1}
          return [expr [fibonacci-memo [expr $x - 1]] + [fibonacci-memo [expr $x - 2]]]
  }

Simply calling memoize saves values that have already been computed, so the running time becomes O(n).
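One way to see the difference is to count how often each proc body actually runs. This is only a sketch: the global counter calls and the names fib-plain and fib-cached are assumptions made up for the measurement, and memoize is repeated here (in the compact form from this page) so the snippet runs on its own.

```tcl
# memoize, repeated so the snippet is self-contained.
proc memoize {} {
    global memo
    set cmd [info level -1]
    if {[info level] > 2 && [lindex [info level -2] 0] eq "memoize"} return
    if {![info exists memo($cmd)]} {set memo($cmd) [eval $cmd]}
    return -code return $memo($cmd)
}

# "calls" is a hypothetical global counter, incremented each time a body runs.
proc fib-plain {x} {
    incr ::calls
    if {$x <= 1} {return 1}
    expr {[fib-plain [expr {$x - 1}]] + [fib-plain [expr {$x - 2}]]}
}

proc fib-cached {x} {
    memoize
    incr ::calls   ;# only reached when the value was not served from the cache
    if {$x <= 1} {return 1}
    expr {[fib-cached [expr {$x - 1}]] + [fib-cached [expr {$x - 2}]]}
}

set calls 0; fib-plain 15;  set plainCalls $calls
set calls 0; fib-cached 15; set cachedCalls $calls
puts "plain: $plainCalls bodies run, memoized: $cachedCalls bodies run"
# prints: plain: 1973 bodies run, memoized: 16 bodies run
```

The memoized version runs its body once per distinct argument (0 through 15), while the plain version runs it once per node of the call tree.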


RS: See also Result caching - but this solution here appears more elegant (though a bit brain-twisting) to me. My only proposal to make it simpler is to inline once-used variables, and use eq for string comparison:

 proc memoize {} {
     global memo
     set cmd [info level -1]
     if {[info level] > 2 && [lindex [info level -2] 0] eq "memoize"} return
     if { ! [info exists memo($cmd)]} {set memo($cmd) [eval $cmd]}
     return -code return $memo($cmd)
 }
 proc fib x {expr {$x <=1? 1 : [fib [expr {$x-1}]] + [fib [expr {$x-2}]]}}
 proc fibm x {memoize; expr {$x <=1? 1 : [fibm [expr {$x-1}]] + [fibm [expr {$x-2}]]}}
 % fib 20
 10946
 % fibm 20
 10946
 % time {fib 32}
 8559000 microseconds per iteration
 % array unset memo
 % time {fibm 32}
 0 microseconds per iteration

But maybe I'm a bit too much on the FP trip that variables are evil :)


male - 2004/23/01: "FP trip" and evil variables?

RS: FP is functional programming. At least in some FP circles, variables that "can vary", i.e. that are reassigned values, are considered as harmful as goto is in procedural languages. Some FPers take great Joy in the like-named Forth-related language, where you don't even have (named) arguments to functions: "everything's on the stack".


NEM: Am I right in assuming that this memoize function will only work with functions in the strict sense? In other words, if your procedure relies on (or generates) side effects, then the cache will not be valid? Generally, avoiding side effects is a good thing, but many built-in Tcl commands produce side effects, and almost all Tk commands do.

RS: Right. Memoizing caches the result of f(x,y,...) for later calls with the same arguments, and returns the same result. So, e.g., gets stdin had better not be memoized :)
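A small sketch of what goes wrong with side effects. The proc next-id and the global counter are hypothetical, invented only to show the effect; memoize is repeated from above so the snippet runs on its own.

```tcl
# memoize, repeated so the snippet is self-contained.
proc memoize {} {
    global memo
    set cmd [info level -1]
    if {[info level] > 2 && [lindex [info level -2] 0] eq "memoize"} return
    if {![info exists memo($cmd)]} {set memo($cmd) [eval $cmd]}
    return -code return $memo($cmd)
}

# Hypothetical impure proc: its result depends on hidden state (counter),
# not only on its argument.
set counter 0
proc next-id {prefix} {
    memoize
    global counter
    return "$prefix[incr counter]"
}

set first  [next-id item]
set second [next-id item]
puts "$first $second"   ;# prints: item1 item1
puts $counter           ;# prints: 1
```

The second call is answered from the cache, so the counter is never incremented again and the "fresh" id is a stale one.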


While looking at speeding up the code I got sidetracked; see Memoizing - Is it a good idea.


Strick 2004-02-25 -- Here is how I've been memoizing. It's actually two slightly different ways, one called "memo" used at call time, and one called "memoproc" used at proc definition time. (: Both tweak with your syntax a bit, in ways you can't do in most languages. :)

First I define "memo", which is used by inserting it in front of the command to be memoized *when it is called*:

   # memoize a function call
   proc memo args {
       if {[info exists ::MEMO($args)]} {
           set ::MEMO($args)
       } else {
           set ::MEMO($args) [uplevel 1 $args]
       }
   }

# Then based on the idea in that code, I define "memoproc", which replaces the word "proc" when a function is defined. The function *must not* use "return" -- so get out your K Combinator and write functional functions!

   # auto-memoize a function -- it should not use return
   proc memoproc {name argv body} {
       set b "set _k_ \[list [list $name]\]; "
       foreach pair $argv {
           append b "lappend _k_ \$[list [lindex $pair 0]]; "
       }

       append b "
       if {\[info exists ::MEMO(\$_k_)\]} {
           set ::MEMO(\$_k_)
       } else {
           set ::MEMO(\$_k_) \[ $body \]
       }
       "

       proc $name $argv $b
   }
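Why the body must not use "return" can be seen in a small experiment. This is a sketch under the assumption that memoproc is exactly as defined above (it is repeated here so the snippet runs on its own); square and square-r are hypothetical test procs.

```tcl
# memoproc, repeated so the snippet is self-contained.
proc memoproc {name argv body} {
    set b "set _k_ \[list [list $name]\]; "
    foreach pair $argv {
        append b "lappend _k_ \$[list [lindex $pair 0]]; "
    }
    append b "
    if {\[info exists ::MEMO(\$_k_)\]} {
        set ::MEMO(\$_k_)
    } else {
        set ::MEMO(\$_k_) \[ $body \]
    }
    "
    proc $name $argv $b
}

# Returnless body: its value flows through the generated "set".
memoproc square {x} {expr {$x * $x}}

# Body with return: it leaves the generated proc before "set" can store anything.
memoproc square-r {x} {return [expr {$x * $x}]}

square 4
square-r 4
puts [info exists {::MEMO(square 4)}]     ;# prints: 1
puts [info exists {::MEMO(square-r 4)}]   ;# prints: 0
```

Both procs give the right answer, but only the returnless one is ever cached, because the body is pasted inside the brackets of the generated set command.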

# Here's a returnless functional fibonacci to play with, based on the one above:

    proc fibonacci {x} {
             if {$x <= 1} {
                     expr 1
             } else {
                     expr {[fibonacci [expr $x - 1]] + [fibonacci [expr $x - 2]]}
             }
   }

# And here it is named "m-fibonacci", defined with memoproc (the recursive calls must go to m-fibonacci too, or they bypass the cache):

   memoproc m-fibonacci {x} {
             if {$x <= 1} {
                     expr 1
             } else {
                     expr {[m-fibonacci [expr $x - 1]] + [m-fibonacci [expr $x - 2]]}
             }
   }

# Now try it straight:

   foreach n {1 2 3 4 5 6 7 8} {
           puts $n...[fibonacci $n]
           puts $n...[time "fibonacci $n" 10]
   }

# And with memo, inserting 'memo' before the command:

   foreach n {1 2 3 4 5 6 7 8} {
           puts $n...[memo fibonacci $n]
           puts $n...[time "memo fibonacci $n" 10]
   }

# And with memoproc, calling "m-fibonacci" instead:

   foreach n {1 2 3 4 5 6 7 8} {
           puts $n...[m-fibonacci $n]
           puts $n...[time "m-fibonacci $n" 10]
   }

Setok -- For the interested, I've had a Cacheable class for XOTcl available which does the same thing on an object level. Because of the XOTcl filter mechanism you can just dynamically attach it to any class you want. Found it pretty useful ;-)


Zarutian 18. april 2005: Probably only a little related, but here goes: is there any way to determine whether a procedure is a pure function, i.e. has no side effects?

One way is to check whether a procedure contains a command that has a side effect. Which would be more efficient: to list all core commands that have side effects, or all core commands that don't? (This would probably be better off on a separate page, perhaps called pure function.)

RHS: One problem is that every command in Tcl can be redefined at any time. Even if you manage to completely analyze everything called from within a proc and make sure there are no side effects (which would be amazingly difficult, I think), there's nothing preventing one of the commands it uses from being redefined before the next time it's called.

I think an easier option would be to be able to define a command as a "pure function", and then make it clear that things will not work "as expected" if that isn't true.
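One possible reading of that suggestion, as a sketch only: a hypothetical pureproc command (the name and the approach are assumptions, not an existing Tcl command) that is used in place of proc and simply prepends a call to memoize. Nothing is checked; the author merely promises purity.

```tcl
# memoize, repeated so the snippet is self-contained.
proc memoize {} {
    global memo
    set cmd [info level -1]
    if {[info level] > 2 && [lindex [info level -2] 0] eq "memoize"} return
    if {![info exists memo($cmd)]} {set memo($cmd) [eval $cmd]}
    return -code return $memo($cmd)
}

# Hypothetical: same arguments as proc, but declares the body to be pure
# and therefore safe to memoize. No analysis is attempted.
proc pureproc {name argv body} {
    proc $name $argv "memoize\n$body"
}

pureproc fibp {x} {
    expr {$x <= 1 ? 1 : [fibp [expr {$x - 1}]] + [fibp [expr {$x - 2}]]}
}

puts [fibp 20]   ;# prints: 10946
```

If the promise is broken, results silently go stale, which is exactly the "will not work as expected" caveat.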


CMcC adds a wrinkle to RS' version (prompted by Lurch on the chat) such that setting an element memo() to some number will delete the memo array in its entirety if its size exceeds $memo() - this means the memo size is limited, fairly cheaply, to some arbitrary maximum:

 proc memoize {} {
    global memo
    if {[info exists memo()]
        && ([array size memo] > $memo())
    } {
        # limit size of memo array
        set max $memo()
        unset memo
        set memo() $max
        return
    }

    set cmd [info level -1]
    if {[info level] > 2 && [lindex [info level -2] 0] eq "memoize"} return
    if { ![info exists memo($cmd)]} {set memo($cmd) [eval $cmd]}
    return -code return $memo($cmd)
 }
 set memo() 100000        ;# set an arbitrary (large) size limit on memoizing
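A sketch of the limit in action, with a deliberately tiny maximum. The proc sq and the variable peak are hypothetical names used only for this demonstration; memoize is CMcC's version, repeated so the snippet runs on its own.

```tcl
# CMcC's size-limited memoize, repeated so the snippet is self-contained.
proc memoize {} {
    global memo
    if {[info exists memo()] && ([array size memo] > $memo())} {
        # limit size of memo array
        set max $memo()
        unset memo
        set memo() $max
        return
    }
    set cmd [info level -1]
    if {[info level] > 2 && [lindex [info level -2] 0] eq "memoize"} return
    if {![info exists memo($cmd)]} {set memo($cmd) [eval $cmd]}
    return -code return $memo($cmd)
}

proc sq {x} {memoize; expr {$x * $x}}

set memo() 5      ;# tiny limit, for demonstration only
set peak 0        ;# largest array size observed (the count includes memo() itself)
for {set i 1} {$i <= 20} {incr i} {
    sq $i
    if {[array size memo] > $peak} {set peak [array size memo]}
}
puts "peak entries: $peak, entries now: [array size memo]"
# prints: peak entries: 6, entries now: 3
```

Note that the call which triggers the flush is computed normally but not cached. For non-recursive procs like this one the array never grows past the limit plus one; with deeply recursive procs it looks as though several results can be stored between two size checks, so the limit is better thought of as approximate.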

Category Algorithm | Category Concept