Version 0 of Tilde Substitution

Updated 2011-04-02 06:46:11 by SEH

Tilde substitution considered harmful

SEH: Many Tcl programmers have written code that takes lists of arbitrary file names as input, watched it work perfectly well for a considerable time, and then suddenly seen it start throwing errors for no evident reason. On investigation, the programmer discovers that the failures occur when the first character of an input path name is a tilde, and realizes the horror of tilde substitution: when a path name starts with a tilde, the interpreter may attempt to replace the tilde with a guess at a home directory, which may or may not actually exist, whether or not such a substitution makes sense in context.

Fixing a tilde substitution problem in a single instance might be straightforward, but if you don't want it to happen again you may find it surprisingly difficult to make your code really bulletproof against it. This is not just frustrating and time-consuming; it feels non-Tclish.
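As an illustration of the single-instance fix, here is a minimal sketch of the usual workaround, which is also the one tglob applies to -directory and -path values: prefix "./" to a name that starts with a tilde so that it is read as an ordinary relative path. The helper name protectTilde is hypothetical and not part of Tcl, and the sketch assumes the name is meant to be relative to the current directory.

```tcl
# Hypothetical helper (not part of Tcl): make a relative file name that
# starts with "~" safe to pass to file-system commands by prefixing "./".
proc protectTilde {name} {
    if {[string index $name 0] eq "~"} {
        return ./$name
    }
    return $name
}

# Example: apply it to every element of an arbitrary list of file names.
set names [list notes.txt ~backup data/~tmp]
set safe [list]
foreach name $names {
    lappend safe [protectTilde $name]
}
# $safe now holds: notes.txt ./~backup data/~tmp
```

The difficulty described above is that every place where a file name enters the program has to remember to do this.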

One of the great strengths of Tcl is the fact that its straightforward evaluation rules and the list data type usually make it trivially easy to handle arbitrary data without ambiguity. Tilde substitution undermines this strength and makes it impossible to treat file name information with the kind of non-ambiguity a Tcl programmer comes to expect.

It may be impossible to eliminate tilde substitution behavior in the interpreter for reasons of backward compatibility, but I think it's enough of a problem to justify adding an easy way to turn it off, such as a global- or namespace-level configuration option.

Until that day, I hope to use this page to collect and contribute techniques and code to combat tilde substitution problems. For example, below is my attempt to write a wrapper for the glob command that neutralizes tilde substitution behavior and thus allows arbitrary option values and pattern input with predictable results:

# Should work the same as glob, but with tilde substitution disabled.
# Pity one has to go to such lengths.  Suggestions for briefer
# solutions welcomed.

proc tglob args {
        set pathedit 0
        set argIndex [llength $args]
        set newArgs [list]
        for {set i 0} {$i < $argIndex} {incr i} {
                set arg [lindex $args $i]
                switch -glob -- $arg {
                        -d* -
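                        -p* {
                                # -directory and -path take a value.  A leading "~" in
                                # that value is hidden behind "./", which is removed
                                # from the results again further down.
                                lappend newArgs $arg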
                                incr i
                                set arg [lindex $args $i]
                                if {[string index $arg 0] eq "~"} {
                                        set arg "./$arg"
                                        set pathedit 1
                                }
                                lappend newArgs $arg
                        }
                        -j* -
                        -n* -
                        -ta* {
                                lappend newArgs $arg
                        }
                        -ty* {
                                lappend newArgs $arg
                                incr i
                                set arg [lindex $args $i]
                                lappend newArgs $arg
                        }
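                        -- {
                                # Keep the end-of-options marker so that patterns
                                # beginning with "-" are not parsed as options.
                                lappend newArgs $arg
                                incr i
                                break
                        }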
                        default break
                }
                
        }
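        set args [lrange $args $i end]
        # Rewrite a leading "~" in each pattern as the brace group "{~}", which
        # glob matches as a literal tilde rather than a home directory reference.
        foreach tindex [lsearch -all $args ~*] {
                set arg [lindex $args $tindex]
                set arg \{~\}[string range $arg 1 end]
                set args [lreplace $args $tindex $tindex $arg]
        }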
        set newArgs [concat $newArgs $args]
        # On Tcl 8.5 or later, "glob {*}$newArgs" can replace the eval form below.
        if {[catch {eval glob $newArgs} result]} {error $result}
        # Remove the leading "./" from results if it was added to a -directory
        # or -path value above.
        if {$pathedit} {
                foreach path [lsearch -all $result ./*] {
                        set arg [lindex $result $path]
                        set arg [string range $arg 2 end]
                        set result [lreplace $result $path $path $arg]
                }
        }
        return $result
}
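
The pattern-rewriting step inside tglob can also be used on its own when only the patterns, and not the -directory or -path values, come from untrusted input. A minimal sketch follows; the helper name quoteTildePattern is hypothetical and not part of Tcl.

```tcl
# Hypothetical helper: rewrite a leading "~" as the brace group "{~}",
# which glob treats as a literal tilde character.
proc quoteTildePattern {pattern} {
    if {[string index $pattern 0] eq "~"} {
        return "{~}[string range $pattern 1 end]"
    }
    return $pattern
}

# Example: look in the current directory for files whose names start
# with "~".  The "--" keeps a pattern that begins with "-" from being
# read as an option.
set matches [glob -nocomplain -- [quoteTildePattern "~*"]]
```

This covers less than tglob does, since option values are passed through untouched, but it may be enough when the caller controls the options.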