[AMG]: [Everything is a string] is nice and all, but for many applications it's important to have a special value that's outside the allowable domain. If the domain of values is numbers, any non-numeric string (e.g., "") will do, so "" can be used to signify that the user didn't specify a number. [C] strings can't contain '''NUL''' and therefore are free to reserve '''NUL''' as a terminator or field separator. [Unix] filenames reserve '''/''' and '''NUL''', so '''/''' is available to separate path components and '''NUL''' can be used with '''find -print0''', '''xargs -0''', and '''cpio -0''' to separate filenames in a list. (The more common practice of separating filenames with whitespace breaks whenever whitespace is used in filenames.)

But if the allowable domain of values is any string at all, no string can be reserved for a special purpose. Since [Tcl] has nothing that is not a string, the only remaining solution is to have a separate, out-of-band way of tracking the special case. Returning to the C example, if a program needs to support '''NUL''' in the middle of a string, it must either encode the string using a possibly fragile quoting scheme or use a separate variable to track the string's length. As for the Unix filename example, if a filename needs to contain a '''/''', it absolutely must be encoded, for instance as '''%2F''', but then the escape character '''%''' must also be encoded (as '''%25'''). This is because Unix filenames have no room for an out-of-band channel. (By the way, [KDE] uses this encoding scheme to support '''/''' in filenames.)

In Tcl, a separate variable can be used, such as a boolean variable that's false when the user didn't specify a string. This can be very cumbersome and isn't always viable (again, when the domain is all strings). Two examples are default arguments and SQL nulls. Foolproof tracking of the former requires the [proc] to accept [args] and do its own defaulting; '''[[[llength] $args]]''' serves as the out-of-band channel.
Tracking the latter may require asking the database to prepend a special character to all non-null string results; basically the first character is the out-of-band communication channel identifying the nullity of the result. A more straightforward option is to '''SELECT''' the '''NOTNULL''' of the string columns whose values could be null.

----

[jhh] proposes a possible solution in [TIP] 185 [http://tip.tcl.tk/185]. Basically, '''{null}!''' is recognized by the parser as a null, which is ''not'' a string; it is distinct from all possible strings. '''"{null}!"''' is, of course, a seven-character-long string, and it's also a one-element list whose sole element is a null.

I ([AMG]) have several strong comments regarding the TIP:

   * I prefer to say "null" instead of "null string" because I feel that a null is not a string at all. It's the one thing that isn't a string! I guess we'll need to change our motto. :^)
   * Likewise, I'd rather not tack the null management functionality onto the [[[string]]] command.
   * I think I'd prefer a '''[[null]]''' command for generating nulls and testing for nullity. It's best not to use the '''==''' and '''!=''' [expr] operators for this purpose; null isn't equal to anything, not even null.
   * We can ditch the '''{null}!''' syntax in favor of using the '''[[null]]''' command to generate nulls, but then '''[[null]]''' cannot be implemented in pure script. This might be an important concern for [safe interps].
   * Automatic compatibility with "null-dumb" commands is a mistake; it's the responsibility of the script to perform this interfacing.
   * When passed a null, the '''Tcl_GetType()''' and '''Tcl_GetTypeFromObj()''' functions should return '''TCL_ERROR''' or '''NULL''' (in the case of '''Tcl_GetString()''' and '''Tcl_GetStringFromObj()''').
   * Most commands should be "null-dumb". Only make a command handle nulls when it is clear how they should be interpreted.
   * The non-object Tcl commands can probably represent nulls as null pointers ('''(void*)0''' or '''NULL'''). If for some reason that can't work, reserve a special address for nulls by creating a global variable.

Feel free to argue. :^)

----

[AMG]: Here's a silly and inefficient proc to help me play around with the ideas presented above:

 # Hypothetical: relies on the proposed {null}! syntax and [null] command.
 proc foobar {varname {value {null}!}} {
     upvar 1 $varname var
     if {![null $value]} {
         set var $value
     }
     return $var
 }

This proc should behave the same as [[[set]]]. You will notice that I used '''{null}!''' even though in my above comments I suggested removing it in favor of always using '''[[null]]''' to obtain nulls. But it turns out that's not feasible in the above code; it would only result in '''$value''' defaulting to the string '''"[[null]]"'''. To get the desired behavior, I'd have to write '''[[[list] varname [[list value [[null]]]]]]''', which is far from readable. (With [Tcl 9.0 Wishlist] #67, it becomes '''(varname (value [[null]]))''', which I can live with.) That's one black mark against my idea...

A more worrying problem is that '''[[foobar]]''' can't be used to set a variable to null! Why? Because the domain of '''$value''' ''includes'' all strings ''and'' null, there is (once again) no possible value outside the domain that can be used to indicate that a special condition occurred and that cannot be "forged" by the caller. So what are nulls good for again? I'm up to two black marks now. It's not looking good.

It seems nulls aren't as useful as originally hoped. (Notice the use of the passive voice.) But are they still good for something? The reason '''[[foobar]]''' doesn't work in the above case is that it is being driven by the script, and the script is capable of producing nulls. If its input instead came from a file or socket, it would be just fine because reading from a channel will never result in a null. Of course, at this point I'm reminded of [taint]ing, which might be a better solution.
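
For comparison, here is a minimal sketch of the out-of-band approach mentioned at the top of this page, which works in current Tcl without any null value. The proc accepts [args] and uses '''[[llength $args]]''' to tell "no value supplied" apart from every possible string. The name '''foobar2''' and its error message are made up for illustration.

```tcl
# Sketch only: foobar2 is a hypothetical name, not part of Tcl.
# Like [set]: with one argument it reads the variable, with two it
# writes it. The number of extra arguments, not their content, says
# whether a value was supplied.
proc foobar2 {varname args} {
    upvar 1 $varname var
    if {[llength $args] > 1} {
        return -code error "wrong # args: should be \"foobar2 varname ?value?\""
    }
    if {[llength $args] == 1} {
        set var [lindex $args 0]
    }
    return $var
}

foobar2 x ""   ;# sets x to the empty string
foobar2 x      ;# reads x back without changing it
```

Because a caller cannot forge "zero extra arguments" by passing some special value, this version can store any string at all, including the empty string.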

----

[wdb] When switching from Lisp to Tcl, the lack of a special value such as ''NULL'' was one of the drawbacks I decided I could live with. It is the price of simplicity, and I am willing to pay it.

There is more than one case where something similar is resolved by a trade-off:

   * In the [switch] statement, the word [default] impacts the '''value''' "default".
   * In [proc]'s arg list, the word [args] impacts the choice of argument names.
   * In [Snit] and [Itcl], the argument #auto or %AUTO% impacts the choice of instance name.
   * And so on.

Extending the value range of the string type means abandoning the [eias] principle. It is possible, and sometimes even desirable, to extend it. If so, ask yourself whether Tcl is still the right choice for you.

If you ask me: I prefer the ''state as is''. The drawbacks are known, and as mentioned above, I can live with them.

[AMG]: [switch] can select on the value "default" if "default" is not the last option given. [proc] can accept an argument named "args" if it's not the last one in the list (although see [Tcl 9.0 Wishlist] #77). I'm just pointing out that these "keywords" only have special meaning in combination with some other out-of-band data, which in these cases is list position.

One more example is the use of '''-''' to signify an option. To disambiguate, we have '''--''' to partition the argument list into options and non-options (see ['--' in Tcl]).

Yes, it's totally true we can live without nulls. The real problem comes when interfacing with systems that ''do'' have nulls. Tcl has no easy and safe way to represent them. Reserving a string will work most of the time, but the Tcl script becomes confused when the reserved string collides with valid data. This may happen by accident or as part of a malicious attack, which means even nonsense strings like "ßÿÑâŖΊ" aren't safe.

All the other stuff I said about nulls is just cute, sugary things we could do with them if they were added.
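
To illustrate the collision problem and one out-of-band alternative, here is a minimal sketch in current Tcl. It assumes that rows arrive as dicts in which a null column is simply left out, so '''[[dict exists]]''' acts as the out-of-band channel. The proc names, the reserved string "NULL", and the sample rows are invented for the example; no real database interface is involved.

```tcl
# Sketch only: show_sentinel, show_dict and the sample data are hypothetical.

# In-band approach: the string "NULL" is reserved to mean "no value".
proc show_sentinel {value} {
    if {$value eq "NULL"} {
        return "(no value)"
    }
    return "value: $value"
}

# Out-of-band approach: a null column is absent from the row dict,
# so every string, including "NULL", remains usable as data.
proc show_dict {row column} {
    if {![dict exists $row $column]} {
        return "(no value)"
    }
    return "value: [dict get $row $column]"
}

set row1 [dict create name "NULL"]   ;# the data really is the string NULL
set row2 [dict create]               ;# the name column is null

puts [show_sentinel [dict get $row1 name]]   ;# misreads the data as null
puts [show_dict $row1 name]                  ;# keeps the string intact
puts [show_dict $row2 name]                  ;# detects the missing column
```

The trade-off is that every consumer of the row has to check for the key before reading it, which is the cumbersome bookkeeping described at the top of this page.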

----

[wdb] (again) But if really necessary, it is possible to introduce typed data to Tcl. Just put each value in a two-element list whose first element is the type and whose second element is the data, as follows:

 set typed_value1 {allowed {hello world}}
 set typed_value2 {disallowed {bye bye}}

This example shows the use of two data types, ''allowed'' and ''disallowed''. It makes it easy to construct a null value by choosing the type ''disallowed''.

----

[[ [Category Language] ]]