Dumping interpreter state

Richard Suchenwirth 2002-10-30: A need that comes up every now and then (see Persistent Tcl and Tk applications) is to dump the state of an interpreter to a file so that it can later be restored by sourcing that file. Here is a simple generic test for serializing global variables (scalar or array), interpreter aliases, and procedures, taking care to skip $::env and Tcl internals. Feel free to improve on this ;-)

See Also

serializable Safe Slave Interp
DMTCP (Distributed MultiThreaded Checkpointing)
A tool to transparently checkpoint the state of multiple simultaneous applications, including multi-threaded and distributed applications. It operates directly on the user binary executable, without any Linux kernel modules or other kernel modifications. Tclsh is among the explicitly supported executables.
tkcon
Provides a dump, which captures procedures and variables.

Description

As mentioned on comp.lang.tcl, this is not a complete state dump, just a reasonable facsimile, because it only recreates namespaces, the procedures and variables they hold, and traces on those things. Things like open file descriptors, sockets, commands that are not procedures, running daemons, the internal state of extensions like Tk widgets and Snack, etc., are not captured.

Packages are sort of taken care of in the code below; namespaces would require traversal of the namespace tree, repeating variable and procedure dumping.
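
The namespace traversal mentioned above could look roughly like the following sketch. The helper name ns'walk and the demo namespaces are made up for illustration, and the {*} syntax assumes Tcl 8.5 or later.

```tcl
# Hypothetical helper (not part of the original code): return the fully
# qualified names of a namespace and all of its descendants, depth-first.
proc ns'walk {{ns ::}} {
    set res [list $ns]
    foreach child [lsort [namespace children $ns]] {
        lappend res {*}[ns'walk $child]
    }
    return $res
}

# playing material (assumed names)
namespace eval ::demo { variable x 1 }
namespace eval ::demo::inner { proc hi {} {return hi} }

# list the variables and procedures found in each namespace of the subtree
foreach ns [ns'walk ::demo] {
    puts "$ns: vars=[info vars ${ns}::*] procs=[info procs ${ns}::*]"
}
```

A dumper would run its variable and procedure loops once per namespace returned by the walk, wrapping each part of the output in namespace eval.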

proc interp'dump {} {
    set res "\# interpreter status dump\n"
    catch {package require "a non-existing one"}
    foreach package [lsort [package names]] {
        if {![catch {package present $package} version]} {
            append res "package require $package $version" \n
        }
    }
    foreach i [lsort [info globals]] {
        if {$i eq {env}} continue  ;# don't dump environment..
        if {[string match tcl_* $i]} continue ;# ..or Tcl system vars
        if {[string match auto_index $i]} continue ;# ..or Tcl system vars
        if {[array exists ::$i]} {
            append res [list array set $i [array get ::$i]]\n
        } else {
            append res [list set $i [set ::$i]]\n
        }
    }
    foreach proc [lsort [info procs]] {
        if {[string match auto_* $proc] || $proc eq "unknown"} {
            continue ;# prevents most of the init.tcl procs from dumping
        }
        if {[string match pkg_* $proc]} continue ;# ..or package index helpers
        if {[string match tcl* $proc]} continue ;# ..or Tcl system procs
        # rebuild the argument list, pairing each argument with its default
        set arglist {}
        foreach i [info args $proc] {
            if {[info default $proc $i value]} {
                lappend arglist [list $i $value]
            } else {
                lappend arglist $i
            }
        }
        # [list] quotes name, arguments and body so they survive sourcing
        append res [list proc $proc $arglist [info body $proc]]\n
    }
    foreach alias [lsort [interp aliases {}]] {
        append res "interp alias {} $alias {} [interp alias {} $alias]\n"
    }
    set res
}


if {[info exists argv0] 
    &&
        [namespace qualifiers [namespace current]] eq [namespace qualifiers [namespace which -variable argv]]
    &&
        [file dirname [file normalize [info script]/...]]
        eq
        [file dirname [file normalize $argv0/...]]
} {
   # prepare some playing material
   set scalar hello
   array set arry {foo 1 bar 2 grill 3}
   proc foo {bar} {puts grill-$bar}
   interp alias {} print {} puts stdout
   puts [interp'dump]
}
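
The dump is meant to be written to a file and sourced later. Here is a minimal sketch of that round trip. It assumes a small hand-built dump script in place of the output of interp'dump, and a hypothetical file name, state.dump.tcl, in the current directory.

```tcl
# Stand-in for the string returned by [interp'dump] (an assumption, so that
# this sketch runs on its own).
set dump ""
append dump [list set scalar hello] \n
append dump [list array set arry {foo 1 bar 2}] \n
append dump [list proc foo {bar} {return grill-$bar}] \n

# save the dump; the file name is hypothetical
set path [file join [pwd] state.dump.tcl]
set f [open $path w]
puts -nonewline $f $dump
close $f

# later, possibly in another process: restore by sourcing the file
set i [interp create]
$i eval [list source $path]
set restored [$i eval {foo $scalar}]
set names [$i eval {lsort [array names arry]}]
puts $restored   ;# prints: grill-hello
puts $names      ;# prints: bar foo

# clean up
interp delete $i
file delete $path
```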

Smalltalk provides a mechanism for dumping state into a so-called image. A Smalltalk image contains all global variables, classes, and compiled methods. The image is the internal (byte-coded) state of the interpreter.

The example on this page instead tries to create one big Tcl script that resets variables after an application starts. (But what about procedures created at run time? Tcl is dynamic.)

I would think more about dumping the interpreter's internal (binary) state and loading it back later. Loading byte code should be much faster than compiling the sources anew. The Tcl interpreter could provide such functionality if it could serialize Tcl_Obj values (the internal Tcl representation), but I think the current interpreter does not allow this (or perhaps no one has thought about it). The biggest problem is Tcl extensions, which would then also have to serialize their state. In principle, if everything in Tcl were a string, it would be easy. In practice, these strings are often the names of handles (file handles, Tk window handles, etc.), and it is not easy to tell what can simply be dumped and what needs to be recreated, or, for that matter, how to recreate it.

I think TclPro has the ability to save byte-coded procedures, but I do not know how it could be used for such dumping.


Recently Karl Lehenbauer mentioned that TclX has a procedure called showproc which extracts named procedures or all procedures. He also mentioned that it would be neat for a Tcl hacker with some spare time to create a new interp clone command that creates a new interpreter, copying in all of the original interpreter's procedures, variables, open file descriptors, namespaces, packages, etc. This, if done quickly, could be used as something lighter-weight than fork.


XOTcl has a class called Serializer that can dump objects, classes, and the whole workspace (in the right order) into a string which can be used to recreate them at some later time. This is used to generate the blueprint of the interpreter state in AOLserver and, for example, for object migration between threads.


Zarutian 2005-05-31: Several procedures to 'capture' the state of a slave interpreter:

Zarutian 2005-06-01: slightly updated with corrections

proc interp_ns {interp namespace script} {
    return [$interp eval [list namespace eval $namespace $script]]
}


proc lexclude {b a} {
    # returns list a where all elements of list b have been removed.
    set tmp [list]
    foreach item $a {
        # -exact: compare literally; lsearch defaults to glob matching
        if {[lsearch -exact $b $item] == -1} {
            lappend tmp $item
        }
    }
    return $tmp
}


proc list_vars {interp namespace} {
    set globals [interp_ns $interp $namespace {info globals}]
    if {$namespace == {}} { set globals {} }
    set vars [interp_ns $interp $namespace {info vars}]
    return [lexclude $globals $vars]
}


proc capture_vars {interp {namespace {}}} {
    set vars [list_vars $interp $namespace]
    set tmp "# variables: \n"
    append tmp "namespace eval [list $namespace] \{\n"
    foreach var $vars {
        if {[interp_ns $interp $namespace [list array exists $var]]} {
            append tmp "array set [list $var] [list [interp_ns $interp $namespace [list array get $var]]]\n"
        } else {
            append tmp "set [list $var] [list [interp_ns $interp $namespace [list set $var]]]\n"
        }
    }
    append tmp "\}"
    return $tmp
}


proc capture_varTraces {interp {namespace {}}} {
    set vars [list_vars $interp $namespace]
    set tmp "# traces on variables: \n"
    append tmp "namespace eval [list $namespace] \{\n"
    foreach var $vars {
        set traces [interp_ns $interp $namespace [list trace info variable $var]]
        foreach trace $traces {
            append tmp "trace add variable [list $var] [list [lindex $trace 0]] [list [lindex $trace 1]]\n"
        }
    }
    append tmp "\}"
    return $tmp
}


proc capture_procs {interp {namespace {}}} {
    set procs [interp_ns $interp $namespace {info procs}]
    set tmp "# procedures: \n"
    append tmp "namespace eval [list $namespace] \{\n"
    foreach proc $procs {
        # dangerous assumption: expect that no variable will be named: {}
        # why: because it's the only way to squeeze data out of [info default]
        # proposed alt: add an -withDefaults to [info args]
        set args [list]
        foreach arg [interp_ns $interp $namespace [list info args $proc]] {
            if {[interp_ns $interp $namespace [list info default $proc $arg {}]]} {
                lappend args [list $arg [interp_ns $interp $namespace [list set {}]]]
            } else {
                lappend args $arg
            }
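            # remove the scratch variable again; no extra brackets, which
            # would try to run the (empty) result as a command
            catch {interp_ns $interp $namespace [list unset {}]}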
        }
        set body [interp_ns $interp $namespace [list info body $proc]]
        append tmp "proc [list $proc] [list $args] [list $body]\n"
    }
    append tmp "\}"
    return $tmp
}


proc capture_all {interp {namespace {}}} {
    set tmp {}
    if {$namespace eq {}} {
        append tmp "# Facsimile of interp state -BEGIN- \n"
    }
    append tmp "[capture_vars  $interp $namespace]\n"
    append tmp "[capture_procs $interp $namespace]\n"
    append tmp "[capture_varTraces $interp $namespace]\n"
 
    set children [$interp eval [list namespace children $namespace]]
    if {[llength $children] > 0} {
        foreach child $children {
            append tmp [capture_all $interp $child]
        }
    }
    if {$namespace eq {}} {
        append tmp "# Facsimile of interp state -END- \n"
    }
    return $tmp
}
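
The comment in capture_procs notes that using a variable named {} in the target interpreter is a dangerous assumption. One possible way around it, sketched below, is to run the whole argument query inside an anonymous procedure in the target interpreter, so that the scratch variable is local to that call. The helper name args_with_defaults is made up for illustration, and apply assumes Tcl 8.5 or later.

```tcl
# Hypothetical helper: return the argument list of a proc that lives in
# another interpreter, defaults included, without creating any variable
# in that interpreter's namespaces.
proc args_with_defaults {interp proc} {
    $interp eval [list apply {{p} {
        set res {}
        foreach a [info args $p] {
            # d is local to this anonymous procedure
            if {[info default $p $a d]} {
                lappend res [list $a $d]
            } else {
                lappend res $a
            }
        }
        return $res
    }} $proc]
}

# playing material
set i [interp create]
$i eval {proc greet {name {greeting hello} args} {return "$greeting $name"}}
puts [args_with_defaults $i greet]   ;# prints: name {greeting hello} args
interp delete $i
```

The anonymous procedure runs in the global namespace of the target interpreter, so procedures in other namespaces would have to be passed by their fully qualified names.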

The following is a short little scriptlet I use to export the procs I've created during a tclsh session --xk2600

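proc dumpprocs {} {
  foreach p [info procs] {
     # rebuild the argument list so that default values are kept
     set arglist {}
     foreach a [info args $p] {
        if {[info default $p $a d]} {
           lappend arglist [list $a $d]
        } else {
           lappend arglist $a
        }
     }
     # [list] quotes name, arguments and body safely
     puts [list proc $p $arglist [info body $p]]
     puts "\n"
  }
}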