dict discussion

Kroc - 2 Dec 2008 : Could you please explain to me whether this is a bug or a feature:

 1% dict set tclers zolli nick kroc
 zolli {nick kroc}
 2% dict set tclers zolli country france
 zolli {nick kroc country france}
 3% dict get $tclers zolli nick
 kroc
 4% dict set tclers zolli nick
 zolli nick
 5% dict get $tclers zolli nick
 missing value to go with key
 6% dict get $tclers zolli country
 missing value to go with key
 7% dict set tclers zolli nick kroc
 missing value to go with key

I understand that I destroy my dict structure and content on line 4, but I don't understand why line 7 doesn't repair it.

NEM 2008-12-02: Because on line 4 you set the 'zolli' key to contain a scalar value ("nick"). Then on line 7 you try to treat what is in the zolli key as a dictionary again, but it cannot be converted to one. Dict set is roughly equivalent to:

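proc dict_set {dVar args} {
    upvar 1 $dVar d
    set path [lrange $args 0 end-2]
    set key [lindex $args end-1]
    set val [lindex $args end]
    # Fetch the dictionary stored under the key path; this is the step
    # that fails when an intermediate value cannot be read as a dict.
    set inner [dict replace [dict get $d {*}$path] $key $val]
    # Store the updated inner dictionary back under the same path.
    if {[llength $path]} {dict set d {*}$path $inner} else {set d $inner}
}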

Kroc - 3 Dec 2008 : I understand why my dict is broken. The things I don't understand are:

why dict lets me break the structure of an existing dict on line 4
why it complains about the same thing on line 7

In my mind, if line 4 is possible then line 7 should be possible. But it would be better still if line 4 raised an error, as line 7 does.

NEM - But what error do you expect line 4 to raise? "Cannot replace dict with non-dict"? That would be quite surprising. Line 4 is fine because it is just replacing one string with another. Line 7 needs to treat the contents as a dict, and so errors.

Lars H: The command Kroc was looking for is probably

 8% dict set tclers zolli {nick kroc}
 zolli {nick kroc}

In a way it's not that line 7 fails that should be surprising, but that the same command succeeded on line 1. Apparently dict set is helpful and initialises missing intermediate dictionaries to being empty when setting an entry, but it can't be that smart when one of the intermediate things is a non-dictionary.

DKF: The empty string is a dictionary; the empty dictionary.
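The distinction drawn above (an empty string is a dict, the scalar "nick" is not) can be checked directly. This is only a sketch: is_dict is a hypothetical helper name, not a Tcl built-in, and it assumes that a value counts as a dict exactly when dict size accepts it.

```tcl
# Hypothetical helper: report whether a value can be read as a dictionary.
proc is_dict {value} {
    expr {![catch {dict size $value}]}
}

puts [is_dict {}]            ;# the empty string is the empty dictionary
puts [is_dict {nick kroc}]   ;# two elements: one key, one value
puts [is_dict nick]          ;# one element: no value to go with the key
```

The last case is the situation Kroc created on line 4: the value stored under zolli has an odd number of elements, so line 7 cannot treat it as a dictionary.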

NEM 12 Feb 2007 Moved all this from the dict page in a fit of wiki gnoming.

LES: Hmmm... How is this dict thing different from the current array thing? Why is Tcl copying stuff from Python?

Answer: A dictionary is a value. An array is an indexed collection of variables. So they can be used differently.

Dictionaries can be passed to and returned from procs as values (while arrays have to be faked with passing the name and using upvar). On the other hand, since arrays are variables, they can make use of variable features (like traces on reading, writing, and unsetting).
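A small sketch of that difference; the proc names are made up for illustration. The dict is handed over as an ordinary value, while the array has to be reached by name through upvar.

```tcl
# A dict is a value, so it is passed like any other argument.
proc total_of_dict {d} {
    set sum 0
    dict for {k v} $d {incr sum $v}
    return $sum
}

# An array is a collection of variables, so the proc receives its name.
proc total_of_array {arrName} {
    upvar 1 $arrName arr
    set sum 0
    foreach {k v} [array get arr] {incr sum $v}
    return $sum
}

set d [dict create a 1 b 2]
array set arr {a 1 b 2}
puts [total_of_dict $d]
puts [total_of_array arr]
```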

CN: I wonder if this command wouldn't be even more useful if 'dict get' and 'dict set' accepted a -nocomplain switch...? I would prefer to write

  set result [dict get -nocomplain $mydict $nonExistingKey]

rather than

  if {[dict exists $mydict $nonExistingKey]} {
     set result [dict get $mydict $nonExistingKey]
  } else {
     set result ""
  }

(The real benefit begins to be felt with multiple keys...)

Similarly, the reference implementation of 'dict set' currently insists that values of subkeys are already present:

  set myvar [dict create]
  dict set myvar alpha beta gamma

gives the error no such key: alpha. The hypothetical -nocomplain version should then behave like

  set myvar [dict create alpha [dict create beta gamma]]

Any opinions...?

DKF: A -nocomplain option? Excuse me while I throw up.

CN: So maybe your stomach could use a -nocomplain option, too... doesn't that in itself prove the usefulness of some such switch?

PWQ 29/01/04 And yet, you can incr a dict key that does not exist and not throw an error. Yet another rule to learn for when errors are thrown and when they are not.
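The asymmetry PWQ describes is easy to reproduce with the released dict command: the updating subcommands quietly initialise a missing key, while dict get raises an error.

```tcl
set counts [dict create]
dict incr counts hits          ;# missing key is treated as 0, then incremented
dict lappend counts seen a b   ;# missing key is treated as an empty list
puts $counts

# Reading a missing key is an error, unlike updating one.
if {[catch {dict get $counts misses} msg]} {
    puts "error: $msg"
}
```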

KPV 2010/07/11 : I've just written a package that extracts data from XML files (GPX). Since GPX files have many optional elements, I thought using dictionaries to store the data that is present as name/value pairs made a lot of sense. But when I actually tried using the result, it turned out that dict get was very unwieldy: what should be just a one-line get turns into five. tdom, on the other hand, handles optional attributes really nicely with a default value parameter to use if the attribute is missing. If no default is given and the attribute is missing, then an error is thrown. So how about:

  dict get ?-default defaultValue? dict ?key ...?

schlenk After dkf fixed a bug in the dict code, there isn't any real benefit felt with multiple keys anymore.

 if {[dict exists $mydict $key1 $key2 $key3]} {
     set result [dict get $mydict $key1 $key2 $key3]
 } else {
     set result 0
 }

This works, and if CN is too lazy to write it, why not just code a trivial proc?

 proc dict_getifexists {dict args} {
       if {![dict exists $dict {*}$args]} {return}
       dict get $dict {*}$args
 }

CN: Of course, I'm already using such a wrapper, but I don't really like it:

It looks so inefficient: both dict exists and dict get search through the same (possibly long) cascade of sub-dictionaries for the same thing.
I'd think that accessing a potentially non-existing key could be quite common, so why shouldn't it be 'officially supported'?
For aesthetic reasons, the wrapper ought to be a sub-part of the dict ensemble, but I can't do this (currently).
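On the last point: in Tcl 8.6, where dict is a namespace ensemble, a script-level proc can be added to the ensemble map. The following is a sketch under that assumption; the subcommand name getifexists is made up here and is not part of Tcl.

```tcl
# The wrapper from above, fully qualified so the ensemble can find it.
proc ::dict_getifexists {d args} {
    if {![dict exists $d {*}$args]} {return}
    dict get $d {*}$args
}

# Add it to the dict ensemble's subcommand map (assumes dict is an ensemble).
set map [namespace ensemble configure dict -map]
dict set map getifexists ::dict_getifexists
namespace ensemble configure dict -map $map

puts [dict getifexists {a {b 1}} a b]
puts "missing: '[dict getifexists {a {b 1}} a c]'"
```

This addresses only the aesthetic objection; the double lookup through dict exists and dict get remains.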

But what about 'dict set'? The TIP isn't very explicit about the handling of non-existing keys (if my memory is correct). When I do

  dict set myvar a b c d e f

shouldn't it be clear then, that I really want to create all of the necessary sub-dicts? I think with the given reference implementation (from the link on the top of this page) I have to do something awkward, like this

  # first make sure all sub dict's exist

  if {![dict exists $myvar a]} {
     dict set myvar a [dict create]
  } 
  if {![dict exists $myvar a b]} {
     dict set myvar a b [dict create]
  } 

  # .. and so on ...

  if {![dict exists $myvar a b c d]} {
     dict set myvar a b c d [dict create]
  }

  # now set value 

  dict set myvar a b c d e f
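For comparison, the dict set in released Tcl (8.5 and later) behaves the way Lars H describes near the top of this page: missing intermediate dictionaries are created on the way down, so the single command is enough.

```tcl
set myvar [dict create]
dict set myvar a b c d e f
puts $myvar   ;# a {b {c {d {e f}}}}
```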

schlenk You have a point here. One could argue that dict set should behave like similar commands (namespace eval comes to mind) and create the nested dictionaries as needed. The TIP states that it is an error for dict get (and that makes sense, you don't want Perl-like cleverness to trick you) to handle nonexistent keys, but makes no statement about dict set.

It is fully in line with the usual Tcl style: if you try to read an array element $foo(bar) where bar does not exist, this fails with an error. lindex with multiple indices behaves differently, yes, I missed the point about it, but this may have historical reasons. If you see dicts as analogous to arrays, the current (and TIP'ed) behaviour of dict get makes sense; if you see the analogy to lists, it doesn't.

I discussed with dkf that dict exists should handle missing dicts in the key list not as an error but simply return 0, so maybe one could discuss the issue of dict set also, as 8.5 is not even in alpha. I see that it is a reasonable request with no really negative side effects. So probably file a documentation bug/Feature Request at SF, as there isn't any statement in the docs about the behaviour of dict set.

fixed lindex statement in above statement and removed now unnecessary counterexample

CN: Given that there's supporting evidence from the list commands ... shouldn't we also appreciate a more tolerant version of dict get, along the line of lindex?

schlenk Maybe it's just me, but i like being able to do input argument checking without much trouble.

If "dict get" returned an empty result for nonexisting keys, i would have to do the following:

 proc foo {dict} {
    set bar [dict get $dict bar]
    if {![dict exists $dict bar]} {error "Bar does not exist"}
 }

If it behaves like it does now:

 proc foo {dict} {
    if {[catch {dict get $dict bar} bar]} { error "Bar does not exist" }
 }

So in the first case one has to do explicit checks to ensure one gets a valid and existing result, as an empty result exists like magic from nowhere. In the second case it blows up without the catch, which is a good thing.

One could now take the lindex analogy and argue. There you have to check with llength all the time, so you don't shoot yourself in the foot.

dict get really does reasonable things. If you are sure about your arguments and think you do not need checking because you know what you pass to your procs, OK. Use it, it will work, as you know your data. If you don't know what gets passed to your procs, you're much better off finding out early that you have missed argument checking.

PWQ 29/01/04 The contra for using catch is that dict also throws errors if $dict is not valid. So you have to demux the not found errors from the program errors. Not to mention that you need to rethrow the error in case the application has a higher-order catch in effect. So the two-line example you gave above is too simplistic to be considered a fair case. Given that there is already a dict exists, those people who want to distinguish not found from null have a valid way of doing so.
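With the try command of Tcl 8.6 the demuxing can be left to the error code. This is a sketch that assumes the error codes used by the 8.6 dict implementation, where a missing key is reported under TCL LOOKUP DICT; dict_lookup is a hypothetical name.

```tcl
# Return the default only for a missing key; any other error
# (for example a malformed dictionary) propagates to the caller unchanged.
proc dict_lookup {d default args} {
    try {
        return [dict get $d {*}$args]
    } trap {TCL LOOKUP DICT} {} {
        return $default
    }
}

puts [dict_lookup {a 1} none a]
puts [dict_lookup {a 1} none b]
puts [catch {dict_lookup {a} none b} msg]   ;# the malformed dict is still an error
```

No explicit rethrow is needed, because errors that do not match the trap pattern are never caught in the first place.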

CN: Just for clarity: my suggestion is to support both variants of dict get in a sufficiently official way. It wouldn't help Tcl if a whole variety of mutually incompatible, package-dependent wrappers (like dict_tryget, dict_getifexists, mydictfuncs::dget, etc...) were created in the end.

I also don't buy the analogy with $foo(bar); this blows up because you try to access a non-existing variable (and rightly so, since this is a programming error). But dicts are supposedly (very) different from arrays, and I can think of a million cases where a missing key is definitely not a mistake. For example, would it be (very) bad style to pass named arguments to a proc in this fashion:

  proc myproc {widget-path args} {
     set bg [dict get -default white $args -bg]
     set dir [dict get -default . $args -homedirectory]
     # etc .. 
  }

But thanks, anyway! I guess I'll just file two separate feature requests for this and be prepared that at least one will be rejected. ;-)

schlenk I don't understand your example proc above ... And yes I think it is not good style to pass named args like this. After all there is a semantical difference between an empty value and no value at all. Only in cases where there is no semantic difference would your proposed solution be worth it. So if you get an empty value, does that mean use the default or does that mean use an empty value? You cannot decide without a check with dict exists so what's the point?

Supporting both variants as you propose would be possible, but looks like too much trouble for quite a small result to me.

CN: I don't think your criticism applies to the myproc given above: the hypothetical

   dict get -default white $args -bg

would return "white" if the -bg key is not present in $args; when you have a reasonable default you usually don't want or need to know whether it was explicitly given or not. This is a common situation where people just don't care about the semantic difference that you describe.

schlenk Ok, so your hypothetical dict works like:

   proc dict_getwithdefault {dict default key} {
   expr {[catch {dict get $dict $key} val] ? $default : $val}
   }

So you're right, my criticism would not apply, because the -default would implicitly take care of the problem I mentioned. But you're wrong to say people don't care about the difference. They do, otherwise they would not set a default value. A default is semantically identical (at least for the caller) to a given value, in contrast to an undefined value. If dict merge makes it into the core (and I think it will), you would do it like this, which looks a lot cleaner to me:

   proc myproc {w opts} {
     set default [dict create -bg white -homedirectory .]
     set options [dict merge $default $opts]
     # now options are either default values or the given options
   }
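Since dict merge did make it into the core, the idea can be tried as written. The variant below takes the options through args so that it can be called like a widget command; returning the merged options is only there to make the result visible.

```tcl
proc myproc {w args} {
    set defaults [dict create -bg white -homedirectory .]
    # Later dictionaries win, so caller-supplied options override the defaults.
    set options [dict merge $defaults $args]
    return $options
}

puts [myproc .w]
puts [myproc .w -bg pink]
```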

CN: Sorry, but I'm not sure I understand your last paragraph: how can something be "identical ... in contrast to something else"? You seem to distinguish

explicit value, which happens to agree with the default value,
no value, interpreted as default,
empty value, as explicit indication that default behaviour is wanted.

Though perfectly reasonable, that's somewhat off topic ... ?

And though I'm all for "dict merge", I still think this need for "explicit initialisation" of keys is unusually patronizing for a Tcl command. It will also separate the initialisation of the values (at the beginning of the proc) from their use (somewhere (far) below)...

<uninformed_remark category="oops"> BTW: your dict_getwithdefault should probably be

   proc dict_getwithdefault {dict default key} {
   expr {[catch {set val [dict get $dict $key]}] ? $default : $val}
   }

</uninformed_remark>

schlenk Nope, the set val is unneeded as catch returns the value in its second argument. I didn't make myself clear in the last paragraph, I just meant the two options you list first. Explicit initialization is common in Tcl: think about set, creating procs in non-existing namespaces, interp eval'ing in non-existing interpreters, getting array values for non-existing keys. Most commands throw errors when you use non-existing or non-initialised values, but you're right, some list commands actually work.

DKF: What about this:

 proc dict_getwithdefault {dict default key args} {
    expr {[catch {dict get $dict $key {*}$args} val] ? $default : $val}
 }

BTW, I have explicit initialisation of keys because otherwise there's no way to distinguish whether an empty value was stored in there explicitly or by default. This distinction is often important, or at least it is in code that I write. :^)

CN: I do agree that this is sometimes important. But what if you want to use a dict to keep input that comes from a graphical user interface? An entry widget only gives you the choice of entering a non-trivial text, or leaving it blank. "Unsetting" the value (ie. removing the key) is not a sensible option in this case.

Maybe I should elaborate on this a bit. The point is that (at least to me) a dictionary seems to be a convenient way to represent the entire state of a full hierarchy of menus. For example, a browser-like application might keep its configuration state in a dictionary that initially looks like this

   { identity { lastname Smith firstname Joe email [email protected] } 
     preferences { appearance { showtooltips 0 fontsize 16 } } }

So this initial dictionary is "clean", meaning: no empty values here, and all defaults are represented by non-existing keys. But after doing some user interaction the dict's value could have changed to

   { identity { lastname Smith firstname Joe email "" } 
     preferences { appearance { showtooltips 0 fontsize 16 backgroundcolor "" } 
                   security { enablejava 0 } } }

Here Joe Smith has deliberately removed his email address, and has tried out a background color of "light pink"; however, that didn't look good and so he went back to the default, indicated by "". And the conclusion is that here some defaults are represented by empty values, others by non-existing keys, and the application doesn't care about the difference.
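A lookup that treats both situations alike might look as follows. This is a sketch of CN's use case only; the proc name setting and the sample defaults are assumptions made for the example.

```tcl
# Return the stored value, or the default when the key is missing
# or when the stored value is the empty string.
proc setting {cfg default args} {
    if {[dict exists $cfg {*}$args]} {
        set value [dict get $cfg {*}$args]
        if {$value ne ""} {return $value}
    }
    return $default
}

set cfg {
    identity {lastname Smith firstname Joe email ""}
    preferences {appearance {showtooltips 0 fontsize 16 backgroundcolor ""}}
}
puts [setting $cfg white preferences appearance backgroundcolor]
puts [setting $cfg 12 preferences appearance fontsize]
puts [setting $cfg 1 preferences security enablejava]
```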

CN: I've just filed corresponding feature requests at sourceforge, but want to add two more comments, nonetheless:

-nocomplain switches do already exist in a few other commands: glob (which IMHO is practically unusable without it), and unset.
A switch can never be mistaken for a dictionary, so there would be no need for the usual "--" meta-switch that marks the end of switches.

DGP Every -nocomplain option in Tcl is a signal that the wrong interface was originally chosen, and by the time that was figured out, we were stuck with keeping the wrong interface for the sake of existing code. The dict interface isn't frozen yet (won't be until 8.5.0) so we still have time to correct any mistakes in it, rather than resort to any -nocomplain ugliness.

If the dict interface is wrong, just propose fixing it and leave any -nocomplain out of it.

Lars H: I'd like to throw in a reason why dict get-with-default for argument processing is not made superfluous by dict merge --- sometimes the default is not known from the start. Suppose we have a proc that reads some tables from an existing file. This proc takes the name of the file and some extra options as arguments. Among these options are -encoding (for specifying the encoding of the file, defaults to utf-8) and -what (for giving the list of tables to read, default is to read all tables in the file). The problem is now that the -encoding option must be interpreted before the default for the -what option is known. Using early defaults and merging the dictionaries then requires the use of a placeholder for the default value, which leads to awkwardness like

 proc read_data_from_file {filename args} {
    set defaults_dict [dict create -encoding utf-8 -what everything]
    set opt_dict [dict merge $defaults_dict $args]
    # ...
    fconfigure $file -encoding [dict get $opt_dict -encoding]
    # ...
    if {[dict get $opt_dict -what] eq "everything"} then {
       set tableL [array names table_indexA]
    } else {
       set tableL [dict get $opt_dict -what]
    }
    foreach table $tableL {
       # ...
    }
 }

Of course, it would be possible to not merge a default for this -what option and instead use dict exists to check for a value, but I find it more elegant to be able to write

 proc read_data_from_file {filename args} {
    # ...
    fconfigure $file -encoding [dict getwithdefault $args -encoding utf-8]
    # ...
    foreach table [dict getwithdefault $args -what [array names table_indexA]] {
       # ...
    }
 }

i.e., one would have a subcommand

 dict getwithdefault $dictionaryValue ?$key ...? $default

A reason for this order of arguments is that the natural language equivalent sort of is

 DICT $dictionaryValue, GET $key WITH DEFAULT $default

Addendum: See also TIP#342.
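As far as I know, that TIP is what later produced dict getdef (with dict getwithdefault as an alternative name) in Tcl 8.7 and 9.0, using the keys-first, default-last order proposed here. For older interpreters the same argument order can be sketched in a plain proc; the name getwithdefault below is a script-level stand-in, not the built-in subcommand.

```tcl
# Keys first, default last, as in the proposed subcommand.
proc getwithdefault {dictionaryValue args} {
    set keys [lrange $args 0 end-1]
    set default [lindex $args end]
    if {[dict exists $dictionaryValue {*}$keys]} {
        return [dict get $dictionaryValue {*}$keys]
    }
    return $default
}

set opts {-encoding iso8859-1}
puts [getwithdefault $opts -encoding utf-8]
puts [getwithdefault $opts -what everything]
```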

FM Moreover, using "dict merge" is a trick, so clear semantics are missing. But the "getwithdefault" subcommand doesn't satisfy me either. If there are default values for keys in a dict, we may intend that a "dict remove" of those keys only resets the key to its default value. Then a key with a default value may not have the same meaning as other keys. So I prefer an interface like this:

 proc read_data_from_file {filename args} {
     dict setdefault args -encoding utf-8
     dict setdefault args -what [array names table_indexA] 
     dict setdefault args -operate command
     # ...
     fconfigure $file -encoding [dict get $args -encoding]
     # ...
     foreach table [dict get $args -what] {
          # ...
          if {[catch {[dict get $args -operate] $table}]} {
              # 2nd chance : retry with the default operation
              if {[catch {[dict setdefault $args -operate] $table}]} {
                  # No luck !
                  error "Can't figure out how to operate on table $table"
              }
          }
     }
 }

The interest of such an interface is that the semantics are clear, that all default values can be set in the same place, and that a default value won't disappear after hazardous data manipulation. The default value of a key is the default value of a key which must exist. We need it, but just want it implicit in most cases. As we need it, we can no longer remove such a key, so it's useful to keep a trace of it as a special kind of key.
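Part of this proposal can be approximated in script. The sketch below covers only the "set the value unless the key is already present" half; it does not remember defaults, so a later dict remove really removes the key, which is exactly the part FM wants changed. dict_setdefault is a hypothetical name.

```tcl
# Store value under key only if the key is not present yet;
# return whatever is stored under the key afterwards.
proc dict_setdefault {dVar key value} {
    upvar 1 $dVar d
    if {![info exists d]} {set d [dict create]}
    if {![dict exists $d $key]} {
        dict set d $key $value
    }
    return [dict get $d $key]
}

set opts {-encoding iso8859-1}
dict_setdefault opts -encoding utf-8
dict_setdefault opts -what everything
puts $opts
```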

male - 2004/27/01: To restart the discussion about TIP #29 and the handling of [dict get ...]: arrays and dictionaries are very similar and the handling is nearly the same, so what speaks against the usage of ...

 set dictionary [dict create a 1 b 2 c 3 d 4];
 puts $dictionary(a);

It is about luxury and laziness and about reducing code! I don't wanna type every time I access a dictionary:

 puts [dict get $dictionary a];

Any comments and opinions?

The next suggestion is to "normalize" the usage of patterns with the dict command to the usage of patterns with the array command.

Currently there are only glob patterns to be used with dict. But with the array command we can decide to use glob, regexp or exact patterns!

So, what about extending the following calls to dict?

 dict keys   dictionaryValue ??mode? pattern? # mode is element of {-exact -glob -regexp}
 dict values dictionaryValue ??mode? pattern?
 dict filter dictionaryValue key   ??mode? pattern?
 dict filter dictionaryValue value ??mode? pattern?
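Until something like that exists, a regexp match on keys can be had from the script form of dict filter. A minimal sketch; the proc name dict_keys_regexp is made up for the example.

```tcl
# Return the keys of d that match the regular expression pattern.
proc dict_keys_regexp {d pattern} {
    # The filter script runs in this proc's scope, so it can see $pattern.
    dict keys [dict filter $d script {k v} {regexp -- $pattern $k}]
}

puts [dict_keys_regexp {alpha 1 beta 2 gamma 3} {^(al|ga)}]
```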

And at last ... why the command:

 dict for {keyVar valueVar} dictionaryValue body

Isn't it the same as:

 foreach {keyVar valueVar} $dictionary $body

And what about an iterator?

schlenk 27.01.2004

About dict get behaving like short array references...

This would be a possible way, but what about nested dicts, the one major feature dicts have that arrays do not have?

 dict get $dict foo bar baz

How would you translate it to array syntax?

 $dict("foo bar baz")  ;# wrong, as there could be a key {foo bar baz} in the dictionary

 $dict([list foo bar baz])   ;# could work but i'm not sure

About dict for: I think it is an optimization to save a conversion to a list.

 foreach {key value} $dict $body  ;# this converts the dictionary to a list value

 dict for {key value} $dict $body ;# this does not destroy the internal dict representation

But you're right, it's inconsistent and the better way would be to enhance foreach so it deals better with dictionaries.

About the iterator: You think about a thing like the iterator for arrays? Should be trivial to do, but has nearly no extra value for dicts (in terms of memory savings)

DKF: Actually, dict for has advantages over foreach for iterating over dictionaries in that it is considerably more efficient. Or would be if I BCCed it (on the to-do list, but probably not while Tcl is still alpha, since it is just a performance opt.) In theory you could make foreach know about dicts directly, but I'd rather not. But then I've seen the implementation of foreach which might make me a bit biased... :^/

PWQ 29/01/04 Just why is the subcommand called for and not foreach? For implies a range, of which there is none for dict. We also already have a foreach that iterates over all members of a list.

Donald Arseneau: I think introducing dict for is terribly wrong-headed. If you want efficiency, by leaving the dict object intact, then that should be built into foreach, whenever it gets a list of length two followed by a dict object for arguments.

CN: 27-jan-2004: just two quick comments:

$dict([list foo bar baz]) should not work (for the same reason as $dict("foo bar baz"): there could be a key of that name) (DKF: It actually depends on how rigorous you are about key management. Dicts have the advantage of "just working" in this area without special actions.) (CN: you've lost me here... I somehow think "{a key} is {a key} is {a key}", no matter whether the first is internally represented as a list, the second as a dictionary, and the third as a plain string.. since 1 == [string eq [list foo bar baz] "foo bar baz"] any difference at this point would be very confusing (to the casual Tcl user at least)..!?!)
the difference between dict for and foreach is the ordering: the dict version doesn't have to respect it. (DKF: The rendering of the dictionary into a list - required for input to foreach - imposes an arbitrary ordering on dictionaries too.)

Example:

   foreach {x y} {high noon alpha beta 2 1} { puts $y }

is guaranteed to print "noon", "beta", "1" in this ordering, whereas

   dict for {x y} {high noon alpha beta 2 1} { puts $y }

can give you "beta", "1", "noon" (or any other ordering).

DKF: Similarly

   foreach {x y} [dict create high noon alpha beta 2 1] {puts $y}

might also pick any ordering. (It'll actually currently turn out to always be the same and always the same as the dict for example above, but this is an implementation detail that is pretty much certain to change in Tcl 9.0 for fairly complex security reasons.)
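Ordering aside, one difference between the two loops can be checked directly: when the input is a plain list with a repeated key, foreach visits every pair, while dict for first reads the value as a dictionary and so visits each key once.

```tcl
set pairs {high noon alpha beta high tide}

set viaForeach {}
foreach {k v} $pairs {lappend viaForeach $k=$v}

set viaDictFor {}
dict for {k v} $pairs {lappend viaDictFor $k=$v}

puts [llength $viaForeach]   ;# 3: every pair in the list
puts [llength $viaDictFor]   ;# 2: one entry per distinct key
puts [dict get $pairs high]  ;# the later value wins
```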

male - 2004/27/01:

nested dictionaries: I never thought of addressing nested dictionaries. And IMHO the "easy" way to dereference dictionary keys doesn't need to address nested dictionaries! It's a shortcut to request the value of a key of a dictionary, nothing more. But addressing nested dictionaries would look like this (couldn't that be a way?):

 % set d [dict create A [dict create A [dict create a 0 b 1 c 2 d 3] b 2] b 2 c 3];
 A {A {a 0 b 1 c 2 d 3} b 2} b 2 c 3
 % set d(A)
 A {a 0 b 1 c 2 d 3} b 2
 % set [set d(A)](A)
 a 0 b 1 c 2 d 3

DKF: That's unification between dictionaries and arrays. That's a very messy topic.

male: But that unification (without "nesting") is IMHO worth discussing! Especially because I don't really see a big difference between arrays and dictionaries! So the only reason for not using them as a replacement would be the complicated way to access dictionary data with dict get ....

DKF: It's messy because of the differences between variables and values. If we had some kind of magical reference value that we could implement a variable with, making arrays and dictionaries be the same would just be syntactic stuff.

I'd just like to note here that there are some funky ideas floating around for unifying arrays with other kinds of container types (lists, BLT vectors, and metakit databases have been mentioned.) None of these ideas has really been pinned down yet enough.

PWQ 29/01/04 To my mind, another case for Meta programming. If one could redefine the syntax for $, they could use array notation to access lists and both sides would be happy. Personally I would take a performance hit for using $dict(key ...) over the typing for the current implementation. While some typing can be saved by using command aliases, I suspect that they won't get the same benefit of byte coding as dict get (eventually) will. Due to the way traces work, you also cannot fake a dict with an array without creating the key values in the array.

(DKF thinks this was also written by male on 2004/27/01, but separated it for clarity)

dict for: I think it would be better to enhance foreach to take a dict too without converting it to a list, if the variable list contains two variable names. I don't understand the thing about ordering mentioned by CN. Could someone explain? (CN: sorry: I should have given an example right away.)

AJD Is it too late to change the interface to dict keys? I think that the optional pattern argument is a far less common use than finding the keys of a sub dict. It would also be more consistent to the other dict subcommands. ie. Make

  dict keys dictValue key1 key2 key3

the same as the currently necessary

  dict keys [dict get dictValue key1 key2 key3]

And similarly for dict values. I'll write up a TIP if anyone else thinks this has merit...

PWQ 29/01/04 Three parting comments:

Why do we have to have yet another object type? It would have seemed preferable (in my mind) to extend the listobj to allow indexing via an installable handler so that all manner of processing functions. It would have only required adding a few new List Obj commands for handling the indexing of the list items, rather than creating a whole new infrastructure.
Will the changes to API for Array break existing applications that expect to get a list obj and not a dict obj?
Dict does nothing to address the desire to have more expressive array handling in TCL.

Given that there are already a number of extensions for keyed lists, it seems overkill to add it to the core as a new object type.

schlenk 30/01/04 to 1: The dict infrastructure was more or less already there (the internal changes to the Tcl_HashTable functions), only the access functions are new. Providing a generic list interface with pluggable implementations is an idea floating around, but as said above isn't done yet. Could be done with Tcl 9, because it would probably break the current List API. to 2: No, it will not break, as dicts are autoconverted to lists if accessed as lists, only extensions playing tricks with internal non public Tcl_Obj details will probably be affected. to 3: what do you mean by more expressive array handling? Dicts do some things, they can be passed as values, which arrays cannot, they have some great features (dict values for example is something that should be ported to arrays).

On one side you propose proactive changes, on the other you say: No, don't put it in the core, an extension does it already. If arrays had been done right from the beginning (as first class objects) there would be no need for dicts.

JMN 2006-10-20 I think the whole interface to dict could be nicer if 'key' was always a list. I've been experimenting with a basic Tcl wrapper over dict.. that allows easier nested operations like this: e.g

 %dictn set data {slot1 item1 a} 1
 slot1 {item1 {a 1}}
 %dictn set data {slot1 item1 b} 2
 slot1 {item1 {a 1 b 2}}
 %dictn set data {slot2 item1 a} 3
 slot2 {item1 {a 3}} slot1 {item1 {a 1 b 2}}

 %dictn keys $data
 slot1 slot2
 %dictn keys $data {slot1 item1}
 a b
 %dictn keys $data {slot1 item1} a*
 a

This also allows in place incr & lappend without having to unpack/repack your dicts.

 %dictn incr data {slot1 item1 b} 
 slot2 {item1 {a 3}} slot1 {item1 {a 1 b 3}}
 %dictn incr data {slot1 item1 b} 100
 slot2 {item1 {a 3}} slot1 {item1 {a 1 b 103}}

Compare the one liner "dictn incr data {slot1 item1 b}" with:

 %set sub [dict get $data slot1 item1]
 %dict incr sub b
 %dict set data slot1 item1 $sub

Or I suppose you can still have a one liner with dict as it is, but yuck:

 %dict set data slot1 item1 b [expr {[dict get $data slot1 item1 b]+1}]
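For illustration only, a keylist-style incr can be sketched in a few lines of Tcl. This is not JMN's dictn implementation; the name dictn_incr and the choice to start missing entries at 0 are assumptions of the example.

```tcl
# Increment the value found under a list of keys, creating it if needed.
proc dictn_incr {dVar keylist {increment 1}} {
    upvar 1 $dVar d
    if {![info exists d]} {set d [dict create]}
    set current 0
    if {[dict exists $d {*}$keylist]} {
        set current [dict get $d {*}$keylist]
    }
    dict set d {*}$keylist [expr {$current + $increment}]
}

set data {slot1 {item1 {a 1 b 2}}}
dictn_incr data {slot1 item1 b}
dictn_incr data {slot1 item1 b} 100
puts $data
```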

Now clearly, a Tcl wrapper over a core datastructure like dict is unappealing for performance reasons, so why does the dict interface have separate key arguments? To avoid clumsiness for in-place operations such as incr & lappend; a mechanism to separate keys from further arguments is necessary.

I can see there's a certain symmetry with lindex & lset in having keys as separate arguments.. but is this symmetry justified? Was this the main reason for this choice of interface?

I dunno.. despite the long delay in between Tcl major releases.. sometimes I feel it's all moving so fast ;)

Lars H: I quite agree, unwrapping and wrapping is a bore and eyesore. Allowing multiple keys (as in dict get and dict set) might be more appropriate in general than always requiring a list of keys, but it's hard to retrofit dict incr (and dict lappend etc.) with either. Using dict with should help, however:

  dict with data slot1 item1 {incr b}

If you need auto-initialisation, it'd probably rather have to be

  dict with data slot1 {dict incr item1 b}

however, and if the dictionary keys are not constant (so you know they don't clash with variables) then you'd have to use dict update.
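Applied to JMN's data from above, the two approaches might look like this. The variable names s and i in the dict update variant are arbitrary choices for the sketch.

```tcl
set data {slot1 {item1 {a 1 b 2}}}

# dict with maps the keys of the innermost dict (a and b) onto
# variables of the same name and writes them back afterwards.
dict with data slot1 item1 {incr b}
puts $data

# dict update binds a chosen variable name to each key, one level at a time,
# so keys cannot clash with existing variables.
dict update data slot1 s {
    dict update s item1 i {
        dict incr i b
    }
}
puts $data
```

Note that the dict with form leaves variables a and b behind in the calling scope, which is the clash the dict update form avoids.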

jmn hmm.. 'dict with' is interesting but a little scary. I really like the sqlite approach in iterating resultsets (a vaguely analogous situation), where you can optionally supply the name of an array in which to collect the values as opposed to the default of spewing potentially clashing variables into the calling context. 'dict update' has a similar approach with its variable mapping, but unfortunately brings us right back to the nesting problem in that it only operates at the 1st level of keys.

The more I look at it, the more it seems that the entire set of access functions could be useable with a 'keylist' in place of each currently specified 'key' argument. I don't imagine such change would be easy, particularly for 'dict with' & 'dict update' - but if simplified access to nested dict structures is a desirable goal, then presumably it'd be far better performance-wise for the dict internals to handle it than attempting such by wrapping in Tcl.

2006-10-21 dictn - a wrapper over dict to see what an interface focused on nesting support feels like.

2006-11-20 SS - Are there plans to add syntax sugar in order to access/set dict elements in a less verbose fashion? p.s. I would like to see in this wiki the ability to set an email address in order to get notified when pages on a specific list are edited.

2006-11-21 DKF: I have no plans at the moment. Other people might. See The L Programming Language for an example of some of the things that people have been thinking about.

slebetman 21 Nov 2006 - I completely disagree with the assertion that every -nocomplain option is a signal for a wrong interface design. While people like DKF may puke at the idea, there is nothing preventing them from ignoring it and using the thrown error instead. Different people just have different tastes and ideas on how to handle such conditions.

Take glob for example. If you assert that its interface is wrong and that the -nocomplain behavior should be the default, then along come people who'll say otherwise and say that it's easier for them to process thrown errors. OK, fine with me. I'll use -nocomplain and you can use catch. But then some of those who love using catch so much have the nerve to suggest that -nocomplain should be eliminated! That's NOT OK with me. I don't force you to use empty strings as special values so you shouldn't force me to use catch.

Donal, if you want to throw an error, fine, but then give those of us who don't care a -nocomplain switch. Even compared to supplying defaults I still prefer -nocomplain, since other commands use it and it is familiar to me.

DKF: My problem with -nocomplain is that it is usually implemented to neglect all errors, including ones that indicate problems in the program as opposed to "expected" errors. This leads to unfixed bugs and programs that cannot possibly work. They're too blunt a tool (and it is easy to use catch in just as bad a way).

slebetman: I guess that's just a difference in how we see our programs should run. For most of the programs I write, the specified behavior of handling such errors is to ignore the error, skip processing the step and continue. This is somewhat like the browser-server interoperability requirement of HTTP and most other internet standards. For example if I receive an HTTP request, I am supposed to simply skip processing any malformed or unrecognised headers. I use HTTP here as an example but most of the protocols I handle also have similar interoperability requirements.

So for my applications -nocomplain is the right thing to do. Which is why I'm so thankful that the list functions don't throw up errors. Now I am less than happy that dicts will throw up errors.

Besides, in this case -nocomplain doesn't neglect errors. It is just I don't allow empty strings to be a valid value in my programs. So for me it is much less awkward to detect an empty string than it is to catch an error.

AMG's dict gripes

AMG: I'm having problems with [dict] that tie into JMN's above 2006-10-20 comment. Rather than inject my comments into old discussion, I'll start fresh on the bottom of this page. I toyed with the notion of creating a separate "dict gripes" page, but I don't think it's necessary... yet. :^) Well, here goes:

Many [dict] subcommands are incompatible with nested dictionaries. The workaround is to use [dict with], but [dict with] only updates the dictionary variable's value after it finishes evaluating the script, which means the script or its subprocedures can't access the "current value" of the dictionary as a whole if it has been "changed" earlier in the script. Also, it is impossible to add keys to a dictionary inside a [dict with] script.
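
Both limitations can be seen in a small sketch (the variable names d, seenInside and extra are made up for illustration):

```tcl
set d {count 1}

dict with d {
    incr count                           ;# changes the variable count, not d
    set seenInside [dict get $d count]   ;# d itself has not been updated yet
    set extra new                        ;# not a key of d, so never written back
}

puts $seenInside               ;# 1
puts [dict get $d count]       ;# 2
puts [dict exists $d extra]    ;# 0
```

Only the keys that were present when the body started are copied back into the dictionary once the body finishes.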

To help make the case that consistent support for nested dictionaries is lacking or at least warty, I now divide all [dict] subcommands into three groups:

Subcommands that cannot be used for nested dictionaries without [dict with] or unpacking/repacking:

[dict append], [dict incr], [dict lappend], [dict update]

Subcommands requiring the aid of nested calls to [dict] to operate on nested dictionaries:

[dict create], [dict filter], [dict for], [dict info], [dict keys], [dict merge], [dict remove], [dict replace], [dict size], [dict values]

Subcommands that directly support nested dictionaries:

[dict exists], [dict get], [dict set], [dict unset], [dict with]

Only 26% of [dict] subcommands directly support nested dictionaries; the rest need help from the script*. And I thought one of the main advantages of [dict] over [array] is nesting! Most of the time I get better support for hierarchy with [array] by means of constructing key names with [list], and this works with all existing Tcl commands, even [trace]--- there's no need for [array lappend] or any such thing.

(*) Subcommands in the first list make changes to variables, so when using nested dicts they need to be surrounded by code that somehow connects them to the correct (sub)elements of the correct dictionary. Subcommands in the second list access dictionary values, so when using nested dicts their value arguments need to be returned by nested calls to [dict get] (or [dict create], to produce a nested dict in the first place). I'm primarily concerned about the first list, by the way, but the second list is still bothersome to me due to its inconsistency.
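
As an illustration of the "surrounding code" needed for the first list, here are three ways to append to a list stored two levels down. This is only a sketch; the dictionary layout and the names db, users, alice and tags are hypothetical:

```tcl
set db {users {alice {tags {x y}}}}

# 1. Unpack, modify, repack.
set tags [dict get $db users alice tags]
lappend tags z1
dict set db users alice tags $tags

# 2. Nested dict update, one level per call.
dict update db users u {
    dict update u alice a {
        dict lappend a tags z2
    }
}

# 3. dict with and a key path; the key tags becomes a variable
#    (overwriting the variable of the same name set in step 1).
dict with db users alice {lappend tags z3}

puts [dict get $db users alice tags]   ;# x y z1 z2 z3
```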

A possibility is encoding any given key path as a list passed as a single argument to [dict]. For which [dict] subcommands is this applicable? Definitely everything in the first and third lists above, and of the second list the following:

[dict create], [dict remove], [dict replace]

That is, the subcommands that presently take individual keys (i.e. paths of one element). Probably also change the remaining second-listers to accept key paths, eliminating the need for calling [dict get] to get the dictionary value to pass to other [dict] subcommands.

Incompatibility aside, the main drawback to this approach is that it requires [list] all over the place, which is annoying.

In this case it mostly helps:

 # Old version
 set d [dict create "John Doe" [dict create phone 555-5555] "Jane Roe" [dict create phone 555-5556]]
 set name "Jane Roe"
 puts [dict get $d $name phone]
 # New version
 set d [dict create [list "John Doe" phone] 555-5555 [list "Jane Roe" phone] 555-5556]
 set name "Jane Roe"
 puts [dict get $d [list $name phone]]

But in this case it hurts:

 # Old version
 set d [dict create "John Doe" 555-5555 "Jane Roe" 555-5556]
 set name "Jane Roe"
 puts [dict get $d $name]
 # New version
 set d [dict create [list "John Doe"] 555-5555 [list "Jane Roe"] 555-5556]
 set name "Jane Roe"
 puts [dict get $d [list $name]]

So much for that idea... By the way #2, I probably would be fine with this change if Tcl recognized (...) as shorthand for [list ...]. But of course that would break arrays, and arrays are already a valid alternative to nested dicts. :^) And if we're going to make fundamental changes to Tcl, we might as well just go ahead and unify arrays and dicts, [trace]s and all. I think that's the only real solution here; I suspect it might also be the most backward-compatible one, if we do it right.

(Sorry if I come off as being rude, ranty, and sarcastic, but I'm disappointed in [dict] and I'm venting my frustration.)

jcw 2011-03-26 - It would be useful to support "dict exists $dict", i.e. without any key arguments, as always returning true. That way, you can use "dict exists $dict {*}$keys" where keys is a path into the dict structure. The empty path being the entire dict. This might also be applicable to other members of the dict ensemble.
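
Until something like that exists in the core, the behaviour can be approximated with a thin wrapper. This is a sketch only; dict_exists_path is a hypothetical name, and it takes the path as a single list rather than as separate arguments:

```tcl
# Hypothetical wrapper: like dict exists, but an empty key path refers to
# the whole dictionary and therefore always "exists".
proc dict_exists_path {dict keys} {
    if {[llength $keys] == 0} {
        return 1
    }
    return [dict exists $dict {*}$keys]
}

set conf {server {host example.org port 80}}
puts [dict_exists_path $conf {}]               ;# 1
puts [dict_exists_path $conf {server port}]    ;# 1
puts [dict_exists_path $conf {server user}]    ;# 0
```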

tombert 2011-07-25: Found this discussion searching for a -nocomplain option in dict ... and I want to support this. In Tcl 8.4 I used my own custom dict-like commands: if the key does not exist, simply nothing happens.

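 proc getOption {list args} {
    foreach arg $args {
        # Start from a clean array at each level; otherwise keys left over
        # from an outer level could be found by mistake.
        array unset optlist
        array set optlist $list
        set list {}
        if {[info exists optlist($arg)]} {set list $optlist($arg)}
    }
    return $list
 }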

 proc confOption {list option value} {
    array set optlist $list
    if {[llength $option] == 1} {
        set optlist($option) $value
        return [array get optlist]
    } elseif {[info exists optlist([lindex $option 0])]} {
        set temp $optlist([lindex $option 0])
    } else {
        set temp {}
    }
    set optlist([lindex $option 0]) [confOption $temp [lrange $option 1 end] $value]
    return [array get optlist]
 }

 proc remOption {list args} {
    array set optlist $list
    # Extract the first key with lindex so that keys containing spaces are
    # not looked up in their braced list form.
    set key [lindex $args 0]
    if {[llength $args] == 1} {
        if {[info exists optlist($key)]} {unset optlist($key)}
        return [array get optlist]
    } elseif {[info exists optlist($key)]} {
        set optlist($key) [eval [list remOption $optlist($key)] [lrange $args 1 end]]
    }
    return [array get optlist]
 }

Lars H: Hmm… confOption and remOption are dict set and dict unset respectively, except they take the keys as a list and operate functionally rather than on a variable. getOption is dict get defaulting to the empty string when the value is missing. Is that correct?
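
If that reading is right, rough 8.5 equivalents could look like the sketch below. The names confOptionDict and remOptionDict are made up here, and the code assumes every intermediate value on the path is itself a valid dictionary:

```tcl
# Functional dict set: keys passed as one list, modified copy returned.
proc confOptionDict {d keys value} {
    dict set d {*}$keys $value    ;# d is a local copy; the caller's value is untouched
    return $d
}

# Functional dict unset that silently does nothing when the path is missing.
proc remOptionDict {d args} {
    if {[dict exists $d {*}$args]} {
        dict unset d {*}$args
    }
    return $d
}

set opts {a {b 1}}
set opts2 [confOptionDict $opts {a c} 2]
set opts3 [remOptionDict $opts2 a b]
set opts4 [remOptionDict $opts3 x y]     ;# missing path: returned unchanged
puts [dict get $opts2 a c]               ;# 2
puts [dict exists $opts3 a b]            ;# 0
```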

The point is they are nearly as fast as dict, but when I first need to check whether a key exists and only then use dict get, the dict performance seems worse. That's why I vote for -nocomplain. A similar option is -strict in the string command.

Lars H: I find it highly unlikely that these procs would be nearly as fast as the 8.5 dict command counterparts, but they may well be fast enough for what you're using them for. At present, the dict exists+dict get combo should be way faster (even if ugly).
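
The dict exists + dict get combination can be wrapped once so that the ugliness stays in one place. A minimal sketch; dict_get_default is a hypothetical name, and the default is passed before the keys so that any number of keys can follow:

```tcl
# Hypothetical helper: dict get that returns a default instead of raising
# an error when the key path is missing.
proc dict_get_default {dict default args} {
    if {[dict exists $dict {*}$args]} {
        return [dict get $dict {*}$args]
    }
    return $default
}

set tclers {zolli {nick kroc country france}}
puts [dict_get_default $tclers "" zolli nick]     ;# kroc
puts [dict_get_default $tclers "?" zolli email]   ;# ?
```

With an empty string as the default this behaves like getOption above; measuring it with time against the array-based procs would settle the speed question for a given workload.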

I agree with many posters that the dict command is missing something. I especially give jcw's idea a thumbs up.

Lars H: It is certainly in the spirit of TIP #323 and the dict get special case for no key arguments. Would probably be uncontroversial if TIPped.

One thing I have no idea about: why does dict need to be order-preserving? I always understood dict as a database. It should not matter if the dictionary looks like:

 -range {a b c} -myval {1 2 3}

 -myval {1 2 3} -range {a b c}

Is there really a use-case?

Lars H: Yes, there are cases where it is useful, though I can't recall one right now. Most of the time it indeed doesn't matter.
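
For what it's worth, the ordering is at least easy to observe. A small sketch (the option names are taken from the example above, plus a made-up -extra key):

```tcl
set opts [dict create -range {a b c} -myval {1 2 3}]
dict set opts -extra yes       ;# a new key is appended at the end
dict set opts -range {x y}     ;# updating an existing key keeps its position

puts [dict keys $opts]         ;# -range -myval -extra
```

One situation where this may matter is reading an option list, changing a single value, and writing it back: the result keeps the layout the author of the list chose.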

Category Discussion