Garbage collection is the cleanup of resources that are no longer needed.
Garbage collection is used in other languages to address a problem Tcl doesn't have: the disposal of objects that are no longer reachable because all references to them have been destroyed. In a system that produces no garbage, i.e., one with no explicit references to objects, a garbage collector becomes pointless. Tcl is such a system. Variables and routines cannot themselves be passed as arguments to other routines (only their names can), so it becomes a simple matter of deleting a variable, namespace or routine when one is finished with it.
In Tcl, data structures are not built by directly referencing other structures. Instead, values are used directly, and copy-on-write is used internally so that a value is stored only once even though it may appear in multiple places at the script level.
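A minimal sketch of what this looks like at the script level. The variable names are made up for illustration, and the comments about storage describe the usual copy-on-write behaviour rather than anything the script itself can observe:

```tcl
# Two variables start out sharing one stored value.
set original {alpha beta gamma}
set copy $original        ;# no list data is duplicated at this point

# Modifying the copy leaves the original untouched; the copy gets
# its own storage only when it is changed.
lappend copy delta

puts $original            ;# prints: alpha beta gamma
puts $copy                ;# prints: alpha beta gamma delta
```

From the script's point of view there are simply two independent values, so there is nothing to track or collect.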
The only place at the script level where a reference is created is when upvar or namespace upvar is used to create a local reference in a routine to a variable outside the routine. This reference is naturally cleaned up when the routine returns.
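As a small illustration (the proc name bump and the variable names are invented for this sketch), the link made by upvar exists only while the routine is running:

```tcl
proc bump {varName} {
    # "counter" is a local alias for the caller's variable
    upvar 1 $varName counter
    incr counter
}

set hits 0
bump hits
bump hits

puts $hits                    ;# prints: 2
puts [info exists counter]    ;# prints: 0, the alias vanished with the call frame
```

Nothing had to be released by hand: the alias lived in the procedure's call frame and went away with it.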
The following "AI koan" (see The Jargon File for more) points out a fundamental difference between the Tcl and Lisp approaches to when unused memory is reclaimed and the implications this has for what can be a value.
One day a student came to Moon and said: "I understand how to make a better garbage collector. We must keep a reference count of the pointers to each cons."
Moon patiently told the student the following story:
"One day a student came to Moon and said: `I understand how to make a better garbage collector ...
(Editorial note: Pure reference-count garbage collectors have problems with circular structures that point to themselves. On the primitive level Tcl avoids that problem by the principle that everything is a string, since strings don't point to anything.)
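A short sketch of why a cycle cannot be formed with plain values: appending a list to itself stores a copy of its current value, not a pointer back to the list. The variable name node is only for illustration.

```tcl
set node {a b}

# "Insert the list into itself": the current value is appended,
# so the result is a finite nested list, not a cycle.
lappend node $node
puts $node              ;# prints: a b {a b}
puts [llength $node]    ;# prints: 3

# Changing the nested element does not touch the outer elements.
lset node 2 0 changed
puts $node              ;# prints: a b {changed b}
```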
At the C level a count of references to each Tcl_Obj is kept within the object itself. Each time the address of the object is stored somewhere the reference count is incremented, and each time the address is released the reference count is decremented.
An extension can implement and register its own Tcl_Obj internal representation in order to more efficiently handle the data it uses, and the standard reference-counting scheme will take care of the storage management.
An alternative is to provide an interface that returns a unique identifier or allows the user to provide a unique name as a handle when creating a new set of resources.
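The handle approach can be sketched at the script level as follows. Everything here (the stack namespace, its procs and the handle format) is hypothetical and only meant to show the shape of such an interface; the price of the design is that the caller must remember to call destroy.

```tcl
namespace eval stack {
    variable nextId 0
    variable data              ;# array: handle -> list of items

    # Create a new stack and return a unique handle for it
    proc create {} {
        variable nextId
        variable data
        set handle stack[incr nextId]
        set data($handle) {}
        return $handle
    }
    proc push {handle item} {
        variable data
        lappend data($handle) $item
        return
    }
    proc pop {handle} {
        variable data
        set item [lindex $data($handle) end]
        set data($handle) [lrange $data($handle) 0 end-1]
        return $item
    }
    # Explicit cleanup: nothing is reclaimed until this is called
    proc destroy {handle} {
        variable data
        unset data($handle)
        return
    }
}

set s [stack::create]
stack::push $s first
stack::push $s second
puts [stack::pop $s]     ;# prints: second
stack::destroy $s
```

Because the handle is just a string, copying it around does not keep the resources alive, and forgetting to call destroy leaks them. That is exactly the situation the experiments further down this page try to improve on.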
Arjen Markus: There has been much discussion about references to data in Tcl, in order to build complex data structures and such. Inevitably, garbage collection pops up.
This page is meant to show that at some mundane level you can let Tcl do the job for you. The script below will create a counter "object" that keeps its internal state hidden. It juggles a bit with the counter object and then throws it away.
The essence: the internal state is stored via the interp alias mechanism, which allows extra arguments to be attached to a command, and the counter itself is destroyed via an unset trace on the variable that holds its name.
    namespace eval Counter {
        variable nextid 0

        proc makecounter {name initial} {
           upvar $name vname
           variable nextid

           set vname [namespace current]::$nextid
           uplevel [list trace add variable $name unset [
               list [namespace current]::deletecounter $vname]]

           interp alias {} $vname {} [
               namespace current]::increasecounter $vname $initial

           incr nextid
        }
        proc increasecounter {cmdname current} {
           set result [expr {$current+1}]
           interp alias {} $cmdname {} [
               namespace current]::increasecounter $cmdname $result
           return $result
        }
        proc deletecounter {aliasname counter dummy op} {
           interp alias {} $aliasname {}
           # puts "deleted: $counter"
        }
    } ;# End of namespace

    #
    # Create a counter
    #
    Counter::makecounter count1 0
    # puts [trace info variable count1]
    puts "1? [$count1]"
    puts "2? [$count1]"

    #
    # Copy the counter (not a reference, just a harmless alias)
    #
    set count2 $count1
    puts "3? [$count2]"

    #
    # Deleting the alias has no effect
    #
    unset count2
    puts "4? [$count1]"

    #
    # Deleting the true counter does!
    #
    set count3 $count1
    unset count1
    puts "5? [$count3]"
Result:
    1? 1
    2? 2
    3? 3
    4? 4
    invalid command name "::Counter::0"
        while executing
    "$count3"
        invoked from within
    "puts "5? [$count3]"
    "
        (file "counter.tcl" line 52)
I was pondering the http package; the need to call http::cleanup when done with a token and the potential for leaking memory just seems wrong. So I was thinking about a Tcl-level garbage collector, and came up with the following. I suppose it's a mark & sweep collector of sorts, although it doesn't do any marking or recursive sweep.
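    proc gc-find pattern {
        # Candidate tokens: every variable matching the pattern
        set vars [info vars $pattern]
        set searchspace [uplevel 1 {info vars}]
        foreach var $searchspace {
            if {[uplevel 1 [list array exists $var]]} {
                # Read the array in the caller's scope, not in gc-find's
                foreach {k v} [uplevel 1 [list array get $var]] {
                    check-item $v vars
                }
            } elseif {[uplevel 1 [list info exists $var]]} {
                # Skip variables that are declared but have no value
                check-item [uplevel 1 [list set $var]] vars
            }
        }
        # Whatever is left is not mentioned in any variable of the caller
        return $vars
    }

    proc check-item {item vars} {
        upvar 1 $vars vlist
        # catch guards against values that are not well-formed lists
        catch {
            foreach el $item {
                set s [lsearch -exact $vlist $el]
                if {$s > -1} {
                    set vlist [lreplace $vlist $s $s]
                }
            }
        }
    }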
One would periodically call it as
    foreach tok [gc-find {::http::[0-9]*}] {
        ::http::cleanup $tok
    }
It assumes that any tokens will be either an individual item in a list, or a variable by itself, and it doesn't search namespaces other than the root.
RLE 2011-09-22: Has anyone considered that, with the dict of 8.5+, the http package could return a result dict instead of a handle to an array? Garbage collection would then happen automatically when the dict's reference count fell to zero.
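A rough sketch of what such an interface might look like. The proc fake_fetch is a hypothetical stand-in that performs no network access; it is not part of the http package, and the dict keys are invented for the example.

```tcl
# Hypothetical stand-in for a fetch routine that returns a plain
# dict value instead of a token naming an array.
proc fake_fetch {url} {
    return [dict create url $url status ok body "<html>placeholder</html>"]
}

proc report {url} {
    set result [fake_fetch $url]
    # No cleanup call is needed: the dict is an ordinary value and is
    # released when the local variable goes away at the end of the proc.
    set size [string length [dict get $result body]]
    return "[dict get $result status]: $size bytes"
}

puts [report http://example.invalid/]
```

The trade-off is that a value cannot be updated in place while a transfer is still running, which is presumably one reason an asynchronous interface hands out a token instead.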
AM 2007-12-17: In response to a thread on the comp.lang.tcl group, I experimented a bit with procedure traces. The idea I had was that the usual way of creating objects is to create a new procedure/command. If you want to create a local object, i.e. an object that should only exist during the lifetime of a procedure call, however, there is no way for Tcl to know that that is what you intended. So there is no way to actually remove it when the procedure returns.
Unless you help it a bit.
And that is what is done in the slightly silly script below:
    # gc.tcl --
    #     An experiment with garbage-collecting "objects"

    # localobj --
    #     Create a _local_ object
    #
    # Arguments:
    #     name       Name of the object/command
    #
    # Result:
    #     None
    #
    # Side effects:
    #     Creates a new command and a trace on the _calling_
    #     procedure if needed
    #
    proc localobj name {
        global local_objects

        # Create the object
        proc $name {cmd} {
            if {$cmd eq {run}} {
                puts "[lindex [info level 0] 0]: Boo!"
            } else {
                return -code error "Command unknown: $cmd"
            }
        }

        # Administration:
        # - Store the command for later GC
        # - Add a trace to the caller, if this was not done yet
        #   (Take care of global objects though!)
        if {[info level] > 1} {
            set caller [lindex [info level -1] 0]
            if {![info exists local_objects($caller)]} {
                trace add execution $caller leave [list localobj_destroy $caller]
            }
            lappend local_objects($caller) $name
        }
    }

    # localobj_destroy --
    #     Destroy the caller's local objects
    #
    # Arguments:
    #     caller     Name of the caller
    #     command    Original command (ignored)
    #     code       Return code (ignored)
    #     result     Result of the command (ignored)
    #     ops        Operation (ignored)
    #
    # Result:
    #     None
    #
    # Side effects:
    #     Destroys all objects created in the caller procedure
    #
    proc localobj_destroy {caller command code result ops} {
        global local_objects

        foreach obj $local_objects($caller) {
            rename $obj {}
        }
        unset local_objects($caller)
    }

    # main --
    #     Test this
    #
    proc myproc {} {
        localobj cow1
        puts Myproc
        cow1 run
        cow2 run
        myproc2
    }
    proc myproc2 {} {
        # localobj cow1 ;# Hm, would override the other one
        puts Myproc2
        cow1 run ;# cow1 was created by the calling procedure - it is still available. This is a slight flaw ;)
        cow2 run ;# cow2 was created as a _global_ object, is this a flaw?
    }

    localobj cow2
    myproc
    puts Main
    cow1 ;# Now object "cow1" no longer exists, so we get an error message
    cow2
KD: Wouldn't it be better to call localobj with the name of a local variable, in which the name of the object will then be stored? In this way, Tcl's inherent rules for destroying local variables can be used to destroy the object itself too:
    proc localobj_destroy {name args} {
        puts "Destroying $name"
        rename $name {}
    }

    proc localobj &name {
        global handlecounter
        if {![info exists handlecounter]} {set handlecounter 0}
        upvar 1 ${&name} name
        set name handle#[incr handlecounter]
        puts "Creating variable ${&name} = proc $name"
        proc $name {args} {puts "executing: [info level 0]"}
        trace add variable name unset [list [
            namespace which localobj_destroy] $name]
    }

    #Testing
    proc myproc2 {} {
        localobj foo
        $foo testlocal2
    }
    proc myproc1 {} {
        localobj foo
        $foo testlocal1
        myproc2
    }

    localobj foo  ;# this one is in fact global
    $foo testglobal
    myproc1
Result:
    Creating variable foo = proc handle#1
    executing: handle#1 testglobal
    Creating variable foo = proc handle#2
    executing: handle#2 testlocal1
    Creating variable foo = proc handle#3
    executing: handle#3 testlocal2
    Destroying handle#3
    Destroying handle#2
AM 2007-12-18: Some discussion on this solution was lost, due to a problem with the disks. However, consider the following fragment:
    proc myproc {} {
        localobj foo
        $foo testlocal1
        set bar $foo
        unset foo
        $bar testlocal2
    }
If I understand the code correctly, then this won't work as expected: unsetting foo will cause the associated object to disappear, leaving bar to pick up the pieces.
KD: Yes, that's right. By declaring localobj foo, you are signing a contract that the lifetime of the object is tied to the lifetime of the variable foo. Usually that's also the lifetime of the procedure call, unless foo is unset manually.
AM: Hm, it is not a perfect solution, but it does have its attractive points - mine was inspired by a partial/incorrect understanding of incr Tcl. Your solution restricts objects to the procedure that created them (unless you pass them to a called procedure). Good :)
DKF: One solution is to give objects a method that instructs them to "return themselves to the caller", which gives them a chance to manage their reference counting/traces. Another is to allow creators of an object to specify what variable to couple the lifetime to, which allows for management via upvar; NAP does this via the as method IIRC.