Version 3 of copy-on-write

Updated 2006-09-13 18:32:56

[explain importance of this notion of immutability to proper understanding of Tcl's semantics]

Those who have a little knowledge of C (or other similar languages) have learned to pass arguments to a function by value or by reference.

Tcl scripts offer both. To pass a variable by value, put a $ before the variable name: the proc receives the value, and nothing it does with that value modifies the original variable.
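As a minimal sketch of the by-value case (the proc name double_all and the variable names are invented for illustration), the caller's variable keeps its content even though the proc rewrites its own argument:

```tcl
# Hypothetical helper: doubles every element of the list it receives.
proc double_all {values} {
    set i 0
    foreach v $values {
        # lset changes the proc's local variable "values" only
        lset values $i [expr {$v * 2}]
        incr i
    }
    return $values
}

set original {1 2 3}
set doubled [double_all $original]   ;# $original is substituted: pass by value

puts "original: $original"   ;# original: 1 2 3
puts "doubled : $doubled"    ;# doubled : 2 4 6
```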

But when you need to access a variable that belongs to the caller, pass just the name of the variable as the argument and use upvar in the proc you call. The proc can then read and write a variable defined in the upper stack level. This feature works especially well with arrays, which cannot be passed by value.
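A short sketch of the by-reference case with an array (the proc name array_add and the array name stock are made up for this example):

```tcl
# Hypothetical helper: adds "amount" to every element of the caller's array.
proc array_add {arrayName amount} {
    upvar 1 $arrayName arr   ;# "arr" is now an alias for the caller's array
    foreach key [array names arr] {
        incr arr($key) $amount
    }
    return
}

array set stock {apples 3 pears 5}
array_add stock 10              ;# no $: the name is passed, not the content

puts "apples: $stock(apples)"   ;# apples: 13
puts "pears : $stock(pears)"    ;# pears : 15
```

Because only the name travels to the proc, the caller's array itself is updated and nothing has to be returned.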

But when you come to the Tcl C API (the internals required to write C extensions), you must know how Tcl passes values to commands, which relies on copy-on-write semantics.


Basically the copy-on-write semantics are:

  • when a proc receives an argument, what it gets is in fact a pointer to the original value, which is stored in a Tcl_Obj structure.
  • when all you need is to read the value, you do not need to copy the data (1). The value is shared.
  • when you need to work with a modified version of this value, you need (2) to check whether it is shared or not
  • if the value is shared, then you need to duplicate its content (3) and modify the duplicate.
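At the script level the built-in commands apply these rules for you. The sketch below uses illustrative variable names; the comments describe what the interpreter is expected to do internally, which the script itself cannot observe directly. In a C extension, steps (2) and (3) would typically be written with Tcl_IsShared and Tcl_DuplicateObj.

```tcl
set a [list 1 2 3]
set b $a          ;# no copy yet: a and b refer to the same shared value
lappend b 4       ;# the value is shared, so it is duplicated before the write

puts "a: $a"      ;# a: 1 2 3
puts "b: $b"      ;# b: 1 2 3 4
```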

A surprise

Here are two procs: the first gets a list passed by value, the second by reference via upvar.

 # Proc 1
 proc lmul-byval {list mult} {
     set out {}
     foreach i $list {
         lappend out [expr {$i*$mult}]
     }
     return $out
 }

 # Proc 2
 proc lmul-byref {list mult} {
     upvar 1 $list l
     for {set i 0} {$i<[llength $l]} {incr i} {
         lset l $i [expr {[lindex $l $i] * $mult}]
     }
     return
 }

 # Timings
 proc timeit {lsize val mult} {
     for {set i 0} {$i<$lsize} {incr i} {lappend list1 $val;lappend list2 $val}
     puts "By value : [time {set list1 [lmul-byval $list1 $mult]} 10]"
     puts "By reference : [time {lmul-byref list2 $mult} 10]"
 }
 foreach size {100 1000 10000} {
     puts "TIME : $size elements"
     timeit $size 2 2
 }

Results:


Example in C [and consequences for extension design]

   ???