Result caching

Richard Suchenwirth 2002-07-19 - If you repeatedly do the same time-consuming calculations (e.g. string similarity, or especially the Q function), it may be a good idea to keep the previous results in a cache and, for every call of the same command after the first, just retrieve the cached value.

This implementation uses an array in caller's scope (he might make it global, or not) whose name is explicitly given, indexed by the command. This way, you could have multiple caches in parallel, and selectively unset some when needed. Note that variable values are not substituted in a braced command - to tie the cached result to the variable values, use list to group the command:

 set result [cached c [list myLongCommand $i $j]]

 proc cached {cacheName command} {
    upvar 1 $cacheName cache
    if {![info exists cache($command)]} {
        set cache($command) [uplevel 1 $command]
    } else {
        set cache($command)
    }
 }
 # ---------------------------------------- testing:
 % time {cached c {stringSimilarity "Tool Command Language" "Tool Command Languages"}}
 422000 microseconds per iteration
 % time {cached c {stringSimilarity "Tool Command Language" "Tool Command Languages"}}
 0 microseconds per iteration
 % array get c
 {stringSimilarity "Tool Command Language" "Tool Command Languages"} 0.953488372093

The second call saves you more than 0.4 seconds...(on second try with cleared cache, it was only 63 ms, though). Note however that caching results makes only sense if the command always returns the same value - caching commands like

 gets $fp
 expr rand()
 clock seconds

is certainly a bad idea... And side effects are of course not produced, as the command isn't executed after the first time.

DKF notes: The above timings aren't very good (there's granularity problems) but on Solaris8 (on a not-very-fast processor) I get:

 % time {cached c {stringSimilarity "Tool Command Language" "Tool Command Languages"}}
 97151 microseconds per iteration
 % time {cached c {stringSimilarity "Tool Command Language" "Tool Command Languages"}} 200
 44 microseconds per iteration

Philip Greenspun calls this concept "memoization"... A heavyweight XOTcl version is on Cacheable class. See also memoizing.


NaviServer has the commands ns_cache_eval[L1 ] and ns_memoize[L2 ], which are thread friendly. The cache-data is global, shared by all threads; the interlocking is transparent. Also, if one thread is computing expensive_function because the cache is empty/stale, and then another thread also checks the cache, the second thread will suspend until the first thread completes the cache update. Timeouts can be specified.


See Arrays as cached functions where you can call $f($x,$y)... or A size-bounded cache.


To capture the values of variables referenced by the command:

proc cached {cacheName command} {
    set index [uplevel 1 [list subst $command]]
    upvar 1 $cacheName cache
    if {![info exists cache($index)]} {
        set cache($index) [uplevel 1 $command]
    } else {
        set cache($index)
    }
}

So we can then call:

set a 1
cache c {expr {$a + 1}}
set a 2
cache c {expr {$a + 1}}

And not get erroneous results.

PYK 2020-01-18: First, I've replaced

"subst {$command}"

, which is incorrect, with

[list subst $command]

, which is correct. See Tcl Quoting, list, and eval for explanations.

Second, even with that correction made, this routine is problematic. subst is a poor stand-in for either a proper Tcl parser or a parser of the expressions that expr understands. It's going to make substitutions that weren't intended, and fail to make substitutions that were intended. Also, the implementation results in each substitution being performed twice, which will have unexpected effects in some cases.

A more correct way to go about it is to implement caching schemes that are customized to the routines they serve.