Tcl Tutorial Lesson 23a

Dictionaries as an alternative to arrays

Tcl arrays are collections of variables, rather than values. This has advantages in some situations (e.g., you can use variable traces on them), but also has a number of drawbacks:

  • They cannot be passed directly to a procedure as a value. Instead you have to use the array get and array set commands to convert them to a value and back again, or else use the upvar command to create an alias of the array.
  • Multidimensional arrays (that is, arrays whose index consists of two or more parts) have to be emulated, for instance by using element names that contain commas. The comma used here is not a special piece of syntax, but just part of the string key. In other words, we are using a one-dimensional array, with keys like "foo,2" and "bar,3". This works, but it can become very clumsy (for instance, the key must not contain stray spaces: "foo, 2" is a different key from "foo,2").
 set array(foo,2) 10
 set array(bar,3) 11
  • Arrays cannot be included in other data structures, such as lists, or sent over a communications channel, without first packing and unpacking them into a string value.
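
The first drawback can be illustrated with a minimal sketch. The procedure name sumValues and the array name scores are made up for this illustration; only array get, array set and array names are standard commands.

```tcl
# Hypothetical helper: it cannot receive the array itself,
# only a key/value list produced by "array get".
proc sumValues {pairs} {
    # Rebuild a local array from the list
    array set local $pairs
    set total 0
    foreach key [array names local] {
        incr total $local($key)
    }
    return $total
}

set scores(foo,2) 10
set scores(bar,3) 11

# Convert the array to a value before passing it
puts [sumValues [array get scores]]
```

The array is converted to a list on the way in and back to an array inside the procedure. That conversion is the overhead that dictionaries avoid.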

The alternative is the dict command. This provides efficient access to key-value pairs, just like arrays, but these dictionaries are pure values. This means that you can pass them to a procedure just like a list or a string, without the need for array get, array set or upvar. Tcl dictionaries are therefore much more like Tcl lists, except that they represent a mapping from keys to values, rather than an ordered sequence.
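
As a small sketch of what "pure value" means in practice, the following passes a dictionary straight into two procedures. The names describe, birthday and the keys name and age are invented for this illustration.

```tcl
# Hypothetical procedures that take a dictionary as an ordinary argument
proc describe {person} {
    return "[dict get $person name] is [dict get $person age]"
}

proc birthday {person} {
    # This changes only the local copy held in "person"
    dict incr person age
    return $person
}

set p [dict create name Alice age 30]
puts [describe $p]

set older [birthday $p]
puts "Original age: [dict get $p age]"
puts "Returned age: [dict get $older age]"
```

Because the procedure works on its own copy of the value, the caller's dictionary is left untouched unless the result is assigned back.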

Unlike arrays, you can nest dictionaries, so that the value for a particular key consists of another dictionary. That way you can elegantly build complicated data structures, such as hierarchical databases. You can also combine dictionaries with other Tcl data structures. For instance, you can build a list of dictionaries that themselves contain lists.
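
For instance, a list of dictionaries that themselves contain lists could look like the sketch below. The variable name orders and the keys customer and items are assumptions made for this illustration only.

```tcl
# A list of dictionaries; each dictionary holds a list under "items"
set orders [list \
    [dict create customer Joe  items {apples pears}] \
    [dict create customer Anne items {bread}]]

foreach order $orders {
    set customer [dict get $order customer]
    set items    [dict get $order items]
    puts "$customer ordered [llength $items] item(s): [join $items {, }]"
}
```

Each level is handled with the commands that belong to it: foreach and llength for the lists, dict get for the dictionaries.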

Here is an example (adapted from the man page):

#
# Create a dictionary:
# Two clients, known by their client number,
# with forenames, surname
#
dict set clients ID1 forenames Joe
dict set clients ID1 surname   Schmoe
dict set clients ID2 forenames Anne
dict set clients ID2 surname   Other

#
# Print a table
#
puts "Number of clients: [dict size $clients]"
dict for {id info} $clients {
    puts "Client $id:"
    dict with info {
       puts "   Name: $forenames $surname"
    }
}
  • What happens in this example is: We fill a dictionary, called clients, with the information we have on two clients. The dictionary has two keys, "ID1" and "ID2", and the value for each of these keys is itself a (nested) dictionary, again with two keys: "forenames" and "surname". The dict set command accepts a sequence of key names which act as a path through the dictionary. The last argument to the command is the value that we want to set. You can supply as many key arguments to the dict set command as you want, leading to arbitrarily complicated nested data structures. Be careful though! Flat data structure designs are usually better than nested ones for most problems.
  • The dict for command then loops through each key and value pair in the dictionary (at the outermost level only). dict for is essentially a version of foreach that is specialised for dictionaries. We could also have written this line as: foreach {id info} $clients { ... }. This takes advantage of the fact that, in Tcl, every dictionary is also a valid Tcl list, consisting of a sequence of key and value pairs representing the contents of the dictionary. The dict for command is preferred when working with dictionaries, however, as it is more efficient and makes it clear to readers of the code that we are dealing with a dictionary and not just a list.
  • To get at the actual values that are stored with the client IDs, we use the dict with command. This command takes the name of a variable that holds a dictionary (note that we write info, not $info) and unpacks the dictionary into a set of variables in the current scope. For instance, in our example, the "info" variable on each iteration of the outer loop will contain a dictionary with two keys: "forenames" and "surname". The dict with command unpacks these keys into variables with the same name as the key and with the associated value from the dictionary. This allows us to use a more convenient syntax when accessing the values, instead of having to use dict get everywhere. A related command is dict update, which allows you to specify exactly which keys you want to convert into variables, and under which variable names. Be aware that any changes you make to these variables will be copied back into the dictionary variable when the dict with command finishes.
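
The points above about key paths, dict update and the copy-back behaviour can be tried out with a short sketch. It rebuilds part of the clients dictionary from the example; the variable name last and the replacement values are chosen only for this illustration.

```tcl
dict set clients ID1 forenames Joe
dict set clients ID1 surname   Schmoe

# dict get accepts a key path, just like dict set
puts [dict get $clients ID1 surname]

# dict update: map the key "surname" to a variable named "last".
# The change is copied back into the info variable afterwards.
set info [dict get $clients ID1]
dict update info surname last {
    set last [string toupper $last]
}
puts [dict get $info surname]

# info was a copy, so clients itself has not changed
puts [dict get $clients ID1 surname]

# dict with also accepts a key path; the change to forenames
# is written back into the nested dictionary inside clients
dict with clients ID1 {
    set forenames Joseph
}
puts [dict get $clients ID1 forenames]
```

Note that dict with creates a variable for every key it finds, so a key that happens to share its name with an existing variable will overwrite that variable. dict update avoids this by letting you choose the names.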

The order in which elements of a dictionary are returned during a dict for loop is defined to be the order in which the keys were first added to the dictionary; setting a new value for an existing key does not change its position. If you need to access the keys in some other order, then it is advisable to explicitly sort the keys first. For example, to retrieve all elements of a dictionary in alphabetical order, based on the key, we can use the lsort command:

foreach name [lsort [dict keys $mydata]] {
    puts "Data on \"$name\": [dict get $mydata $name]"
}
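
To see the difference between insertion order and sorted order, here is a small sketch. The contents of mydata are invented for this illustration.

```tcl
dict set mydata pear   3
dict set mydata apple  1
dict set mydata cherry 2

# Keys come back in the order in which they were added
puts "Insertion order: [dict keys $mydata]"

# Sorting the keys gives alphabetical order
puts "Sorted order:    [lsort [dict keys $mydata]]"

# Updating an existing key leaves its position unchanged
dict set mydata pear 30
puts "After update:    [dict keys $mydata]"
```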

Example

In this example, we convert the simple database we used before to work with dictionaries instead of arrays.

#
# The example database revisited - using dicts.
#

proc addname {dbVar first last} {
    upvar 1 $dbVar db

    # Create a new ID (stored in the dictionary for easy access)
    dict incr db ID
    set id [dict get $db ID]

    # Create the new record
    dict set db $id first $first
    dict set db $id last  $last
}

proc report {db} {

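    # Loop over the records and build a temporary dictionary
    # mapping from last name to ID, for reverse lookup.
    # Start with an empty dictionary, so that the sorting loop
    # below also works for a database without any records.
    set tmp [dict create]
    dict for {id name} $db {
        # Skip the ID counter, which is not a record
        if {$id eq "ID"} { continue }
        set last [dict get $name last]
        dict set tmp $last $id
    }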

    #
    # Now we can easily print the names in the order we want!
    #
    foreach last [lsort [dict keys $tmp]] {
        set id [dict get $tmp $last]
        puts "   [dict get $db $id first] $last"
    }
}

#
# Initialise the dictionaries and add a few names
#
dict set fictional_name ID 0
dict set historical_name ID 0

addname fictional_name Mary Poppins
addname fictional_name Uriah Heep
addname fictional_name Frodo Baggins

addname historical_name Rene Descartes
addname historical_name Richard Lionheart
addname historical_name Leonardo "da Vinci"
addname historical_name Charles Baudelaire
addname historical_name Julius Caesar

#
# Some simple reporting
#
puts "Fictional characters:"
report $fictional_name
puts "Historical characters:"
report $historical_name

Note that in this example we use dictionaries in two different ways. In the addname procedure, we pass the name of the dictionary variable and use upvar to make a link to it, as we did previously for arrays. We do this so that changes to the database are reflected in the calling scope, without having to return a new dictionary value. (Try changing the code to avoid using upvar.) In the report procedure, however, we pass the dictionary as a value and use it directly. Compare the dictionary and array versions of this example (see More Array Commands) to see the differences between the two data structures and how they are used.
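
One possible way to avoid upvar, sketched here under the assumption that the caller is willing to assign the result back, is to pass the dictionary as a value and return the modified copy. The procedure name addnameValue and the variable name fictional are made up for this illustration.

```tcl
# Variant of addname that takes the dictionary as a value
# and returns the updated dictionary instead of using upvar
proc addnameValue {db first last} {
    # Create a new ID (stored in the dictionary for easy access)
    dict incr db ID
    set id [dict get $db ID]

    # Create the new record in the local copy
    dict set db $id first $first
    dict set db $id last  $last

    return $db
}

set fictional [dict create ID 0]
set fictional [addnameValue $fictional Mary Poppins]
set fictional [addnameValue $fictional Uriah Heep]

puts "Records: [dict get $fictional ID]"
puts "Second last name: [dict get $fictional 2 last]"
```

The trade-off is that the caller must remember to store the returned value; forgetting to do so silently discards the new record.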