Block-local variables

Summary

Richard Suchenwirth 2001-03-01: After a weekend without a fun project, here's a little midnight study on local variables, inspired by reading about Scheme's let construct. I preferred the name with, which sounded even more commonsensical to me.

See Also

Local procedures
Locally-scoped command aliases are fun!

Description

Tcl variables are either global, or local to a procedure. You can shorten a variable's lifetime with [unset]. In LISP/Scheme as well as in C (different as they otherwise are), you can have variables local to a block of code - in LISP enclosed by let, in C by just bracing a block. And you can have that in Tcl too, as shall presently be demonstrated. The [with] command takes a non-empty list of variable names (with optional initial values, like a [proc] arglist) and a body which must refer to those variables, executes the body in the caller's scope, and unsets the "local" variables afterwards.

CLN 2005-03-29: Why "must refer to those variables"? I can see that it would be pointless to have a [let] without referring to the local variables but is such reference enforced?

RS: I think I meant "must set them to a value", because in the cleanup they are [unset], which raises an error if they don't exist. But see below "Years later..." for the current, much simpler version...

proc with {locals body} {
   set cleanup [list unset]
   foreach i $locals {
       set name [lindex $i 0]
       if {[uplevel 1 [list info exists $name]]} {
           return -code error "local variable $name shadows outer var."
       }
       lappend cleanup $name
       if {[llength $i]==2} {
             uplevel 1 set $i ;# uplevel works like concat, so no eval
       }
   }
   set res [uplevel 1 $body]
   uplevel 1 $cleanup
   set res
}
# usage example and test code:
set x 17
with {xa xb {xc 5}} {
    set xa [expr {$xc*$xc}]
    set xb [incr xa $x]
    puts xb=$xb
}
puts "now we have [info vars x*]" ;# should not show xa, xb, xc

Discussion

The handling of local variables with same name as one in outer scope is more strict than in C (where it may raise a warning) or Lisp (where it's no problem at all - the local temporarily supersedes the outer, which is later available unchanged). Since body is just upleveled, any other treatment seemed dangerous or overly difficult - but feel free to improve on that... Likewise, specifying a local variable which is not set to a value in body, or not giving any locals at all, raises an error on unsetting. As usual, another error is raised when retrieving a variable's value that has not been set before.

How useful this is, is another question. Good efficient code should be written inside procedures, and variables inside procedures are local anyway unless explicitly globalized.


Anonymous: Here's my take on let, where local variables in the block do supersede the ones at higher levels.

proc let {vars body} {
    uplevel [subst -nocommands {
        namespace eval __tmp__ {
            variable __vv {} __var {} __value {}
            if {[namespace parent] != "::"} {
                foreach __vv [info vars [namespace parent]::*] {
                    # variable [namespace tail \$__vv] [set \$__vv]
                    # using variable just copies the var, we want it to reference the other var
                    upvar #0 \$__vv [namespace tail \$__vv]
                }
            }
            foreach __vv [list $vars] {
                foreach {__var __value} \$__vv break
                variable \$__var \$__value
            }
            unset __vv __var __value
            $body
            namespace delete [namespace current]
        }
    }]
}

As a bonus, since it uses namespace you get block-local procedures also. I imagine this still isn't too efficient, creating and destroying the namespace each time through.

The quoting isn't too ugly here, especially considering what it's doing... but is there a clean way to get out of Quoting hell in this case?

Oops, there was a bug - using [variable] would copy the variables in the enclosing namespace to the current one rather than making the same variables visible. [upvar] fixes that.


[kruzalex] or

proc let {vars body} {
    uplevel [subst -nocommands {
        set cleanup "unset "
        foreach vv [list $vars] {
            foreach {var value} \$vv {append cleanup "[set var] ";}
            set \$var \$value
        }
        $body
        eval \$cleanup
        unset var value
    }]
}

set x 17
let {xa xb {xc 5}} {
    set xa [expr {$xc*$xc}]
    set xb [incr xa $x]
    puts xb=$xb
}

# After let has run, its variables are gone; each of these raises an error:
puts "xb: $xb"
puts $var
puts $value

RS 2005-03-28: Years later, revisiting this page, I'm surprised how over-engineered my code above (and also the anonymous reply) was. That's partly because they try to emulate more complex scoping, as seen in Lisp or C.

I've come to think that Tcl's strict scoping rule (in a proc, every variable is local, except if declared otherwise) is a good thing. So I thought up lightweight lambdas in two flavors (different only by the setting of the environment) which "play by the rules", and therefore are of course very simple:

proc with {argl body args} {
    if {[llength $argl]!=[llength $args]} {
        error "wrong #args, must be: $argl"
    }
    foreach $argl $args {}
    eval $body
}
#-- Testing:
% with {x y} {expr {hypot($x,$y)}} 3 4
5.0
% with list {lindex $list 0} {a b c d}
a

Lars H: Can't help but optimising the above slightly. This should allow byte-compilation of $body, as well as naming an argument body, but OTOH requires explicitly [return]ing any result.

proc with2 {argl body args} {
    if {[llength $argl]!=[llength $args]} {
        error "wrong #args, must be: $argl"
    }
    foreach $argl $args $body
}

RS: Hm... as the test cases below show, an explicit [return] would make N bodies uglier N times... implicitly returning the result is better style at least in functional programming, I'd say.

Lars H: OK, another attempt. No return necessary, body will be byte-compiled, and "body" can appear in the argl.

proc with3 {argl body args} {
    if {[llength $argl]!=[llength $args]} {
        error "wrong #args, must be: $argl"
    }
    if {"" eq [foreach $argl $args ""]} then $body
}

RS: But isn't that needless convolution? [foreach] is documented to return {}, so the test is a tautology - which leads to the observation that if 1 $body is the canonical tautology (but causes body to be compiled), so we get:

proc with4 {argl body args} {
    if {[llength $argl]!=[llength $args]} {
        error "wrong #args, must be: $argl"
    }
    foreach $argl $args {}
    if 1 $body
}

Lars H: You skipped a point.

interp alias {} wrap {} with3 {prefix suffix body} {return $prefix$body$suffix} <TAG> </TAG>

will work as expected, but

interp alias {} wrap {} with4 {prefix suffix body} {return $prefix$body$suffix} <TAG> </TAG>

will not. (Try to figure out why without wrapping anything.)
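
For readers who would rather see the difference than puzzle it out, here is a small sketch of the two aliases side by side. The alias names wrap3 and wrap4 are made up for this illustration (the text above uses wrap for both), and the error message shown assumes the code is run as a script in a plain tclsh, not at an interactive prompt.

```tcl
proc with3 {argl body args} {
    if {[llength $argl]!=[llength $args]} {
        error "wrong #args, must be: $argl"
    }
    # $body is substituted before [if] runs, i.e. before foreach can
    # overwrite the variable named body
    if {"" eq [foreach $argl $args ""]} then $body
}
proc with4 {argl body args} {
    if {[llength $argl]!=[llength $args]} {
        error "wrong #args, must be: $argl"
    }
    # foreach assigns the argument named body first...
    foreach $argl $args {}
    # ...so this evaluates the argument value, not the original script
    if 1 $body
}

# wrap3 and wrap4 are hypothetical names, used only to compare the two
interp alias {} wrap3 {} with3 {prefix suffix body} {return $prefix$body$suffix} <TAG> </TAG>
interp alias {} wrap4 {} with4 {prefix suffix body} {return $prefix$body$suffix} <TAG> </TAG>

puts [wrap3 hello]              ;# <TAG>hello</TAG>
puts [catch {wrap4 hello} msg]  ;# 1
puts $msg                       ;# invalid command name "hello"
```

In with4 the argument named body has already replaced the script by the time if 1 $body is reached, so the word hello is evaluated as a command.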


RS: This is not a real lambda, as it does not create a function object for later use - but then again, it doesn't have to worry about garbage collection, the argl and body are just cleaned up automatically after use. And [interp alias] can actually use such a quasi-lambda like a real one:

A concrete use case for [with] might be where you want to curry function calls, but the order of arguments does not put the really wanted one in the end. This demonstrates that it "quacks like a lambda", because it provides something that you can give a name to (with interp alias), and then walks like a function:

% interp alias {} first {} with L {lindex $L 0}
first
% first {x y z}
x
% interp alias {} hypot {} with {x y} {expr {hypot($x,$y)}}
hypot
% hypot 3 4
5.0

Come to think, we can now code functions without [proc] - turning every command we write into an [interp alias]ed with... but advantages of byte-code compilation may of course get lost in this radical way.

Oh, and should we have auto-expansion of first word, little [with] is coming ever closer to being a "pure-value" lambda:

set tail {with x {lrange $x 1 end}}
{*}$tail $myList ;# possible from Tcl 8.5
$tail $myList         ;# possible when first words of commands are expanded 

Auto-expansion can be had today, if we just let unknown know:

proc know what {proc unknown args $what\n[info body unknown]}
know {
    if {[llength [lindex $args 0]]>1} {
        return [uplevel 1 [lindex $args 0] [lrange $args 1 end]]
    }
}

Testing:

% $tail {a b c d e}
b c d e

Functional composition (even of partial scripts, like "string toupper") also just works:

% with {f g x} {$f [$g $x]} lsort "string toupper" {h e l l o}
E H L L O

Better (composition as a single executable value) with this "composition generator":

proc o {f g} {
    list with {f g x} {$f [$g $x]} $f $g
}
% [o lsort "string toupper"] {T c l}
C L T

The 'let' variation is more in Lisp style. It just takes an alternating {var val var val..} list ("environment"), which in the future might well be a [dict]:

proc let {bindings body} {
    foreach {var val} $bindings {
        set $var $val
    }
    eval $body
}
#-- Testing:
% let {x 3 y 4} {expr {hypot($x,$y)}}
5.0
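
If the environment really is a [dict], the same idea can be sketched with [dict with]. This is only an illustration, not part of the code above: the name letd is made up here to avoid clashing with the let variants on this page, and Tcl 8.5 or later is assumed.

```tcl
# letd is a hypothetical name; assumes Tcl 8.5+ for [dict with]
proc letd {bindings body} {
    # Unpack every key of the dict into a local variable of the same
    # name, evaluate body, and return the result of body
    dict with bindings $body
}

puts [letd {x 3 y 4} {expr {hypot($x,$y)}}]   ;# 5.0
```

As with the foreach version, keys named bindings or body would collide with the proc's own variables.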

aspect notes that none of the above constructions of with or let are recursive. That is, you can't write:

% let {x 3} {let {y 4} {expr {$x*$y}}}

.. which is a pretty significant difference from Lisp's let. Here's my attempt at making it closer .. though the handling of shadowed variables leaves a bit to be desired, when you consider [trace]s.

proc empty s { expr {[llength $s] == 0} }

proc let {bindings body} {
    foreach {name val} $bindings {
        if {![empty [uplevel 1 [list info locals $name]]]} {
            set _$name [uplevel 1 [list set $name]]   ;# save shadowed value
        }
        uplevel 1 [list set $name $val]
    }
    uplevel 1 $body
    foreach {name val} $bindings {
        if {![empty [info locals _$name]]} {
            uplevel 1 [list set $name [set _$name]]   ;# restore outer value
            unset _$name
        } else {
            uplevel 1 [list unset $name]
        }
    }
}

.. testing:

proc test-let {} {
    set x 1
    set y 2
    let {x 3} {
        puts [expr {$x*$y}]         ;# 6
        let {y 4} {
            puts [expr {$x*$y}]     ;# 12
            set x 5
            set y 6
            puts [expr {$x*$y}]     ;# 30
        }
        puts [expr {$x*$y}]         ;# 10
    }
    puts [expr {$x*$y}]             ;# 2
}

wdb: All this is made obsolete by [apply]. Here is a true one-liner:

% apply {{x y} {uplevel expr "hypot($x,$y)"}} 3 4
5.0
% 

Note that the argument for expr is inside double quotes, not braces, such that the values of x, y are expanded before [uplevel]ling the call.

aspect: still doesn't compose (I'm not sure what the [uplevel] was in aid of, so removed it):

% apply {{x} {apply {{y} {expr {hypot($x,$y)}}} 4}} 3
can't read "x": no such variable

wdb: Tcl has no lexical scope. A closure cannot see the variables of calling level. Instead, I expanded x, y first, then used [uplevel] to execute it on calling level such that the script {expr ...} can see the caller's variables.
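
A small sketch of that point, with a made-up proc demo and caller variable z: the lambda's own x and y are expanded into the expression text, while $z is left for [expr] to resolve at the calling level.

```tcl
# demo and z are hypothetical names, used only for this illustration
proc demo {} {
    set z 10
    # x and y are substituted inside the lambda; \$z stays literal and
    # is resolved by expr in demo's frame, reached through uplevel 1
    apply {{x y} {uplevel 1 [list expr "hypot($x,$y) + \$z"]}} 3 4
}

puts [demo]   ;# 15.0
```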

NEM: You need closures to get proper composition. See A brief introduction to closures and continuations for event-based programming for one approach. Locally-scoped command aliases are fun! also has some code for Scheme-like lexically scoped commands.

DKF: To expand, almost everything about lexical scoping is A Simple Matter Of Programming. The difficult bits relate to clearing up afterwards and dealing with the string-representation of lambda terms that are created within a lexical scope (it would force them to stop being first-class values according to Tcl's value scheme).

NEM: Depends on what you close over. Immutable closures with snapshot semantics can be easily represented as lambda+dict. The difficulty is in capturing actual variables and not just their current values. That means allowing mutation, aliasing, traces and so forth, which is tricky (and mostly introduces more problems than it solves). There are two limited approaches to this. The first is to keep the environment separate from the lambda and use [dict with]: see dictutils for code that does this. The second is to use some form of first-class references, where a reference is a uniquely-named global variable, and you store the name of the variable in the closure rather than the variable itself. The latter approach is more awkward to deal with and also leads to resource cleanup issues and the question of effective garbage collection.
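
The lambda+dict idea might be sketched as follows. This is not the dictutils code; closure and make_adder are hypothetical names made up for this illustration, and Tcl 8.5 or later is assumed for [apply], [dict with] and {*}.

```tcl
# closure is a hypothetical helper: it returns a command prefix that
# carries a snapshot of the named caller variables in a dict
proc closure {names params body} {
    set env [dict create]
    foreach name $names {
        dict set env $name [uplevel 1 [list set $name]]
    }
    # The lambda takes the environment as an extra first parameter and
    # unpacks it into local variables before running the body
    set lambda [list [linsert $params 0 __env] "dict with __env {}\n$body"]
    return [list apply $lambda $env]
}

proc make_adder {n} {
    closure {n} {x} {expr {$x + $n}}
}

set add3 [make_adder 3]
puts [{*}$add3 4]   ;# 7
```

Since only the values are copied, later changes to the original variable are not seen by the closure, which is the snapshot semantics described above.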

Note also that the question of scoping is distinct from the question of closures. Tcl's anonymous procedures (via [apply]) are closures insofar as they capture exactly the same scope that is captured by a normal [proc]: i.e., the current namespace.
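
A minimal sketch of the namespace part: the optional third element of the lambda term names the namespace in which the body is evaluated. The namespace ::demo and its contents are made up for this example.

```tcl
# ::demo, counter and helper are hypothetical names for illustration
namespace eval ::demo {
    variable counter 10
    proc helper {} { return "helper in [namespace current]" }
}

# The third element of the lambda term is the namespace context
set lambda {{} {
    variable counter            ;# links to ::demo::counter
    list [helper] $counter      ;# helper resolves to ::demo::helper
} ::demo}

puts [apply $lambda]   ;# {helper in ::demo} 10
```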