Version 134 of array

Updated 2015-12-15 11:51:59 by hkassem

array is a built-in ensemble of commands that manipulates Tcl's array variables. Array variables can also be manipulated using arrayName(key) syntax.

Synopsis

array anymore arrayName searchId
array donesearch arrayName searchId
array exists arrayName
array get arrayName ?pattern?
array names arrayName ?mode? ?pattern?
array nextelement arrayName searchId
array set arrayName list
array size arrayName
array startsearch arrayName
array statistics arrayName
array unset arrayName ?pattern?

Documentation

man page

Description

A Tcl array is a container of variables. Tcl's two-component $arrayName(key) syntax can be used to substitute individual values of an array. The name of a variable in an array may be any string.

Unlike a dictionary, an array is not a single value. Instead, the name of the array is used as a handle, and is passed to array commands in order to perform some operation on the array such as reading the value of some variable in the array. The array itself, being a container, does not have a value that can be read. However, array get returns a dict representing part or all of the array.
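As a minimal sketch of the difference, assuming Tcl 8.5 or later (for dict) and an illustrative array named balloon, the following takes a snapshot of an array with array get and then handles it as an ordinary value. The proc describe is a hypothetical helper, not part of Tcl.

```tcl
# Illustrative data; the array name "balloon" follows the other examples here.
array set balloon {color red size large}

# array get returns a flat key/value list, which is also a valid dict.
set snapshot [array get balloon]
puts [dict get $snapshot color]          ;# red

# The snapshot is a plain value, so it can be passed to a proc directly.
# (describe is a hypothetical helper.)
proc describe {d} {
    return "a [dict get $d size] [dict get $d color] balloon"
}
puts [describe $snapshot]                ;# a large red balloon

# Changing the copy does not touch the array it came from.
dict set snapshot color blue
puts $balloon(color)                     ;# red
```

The array itself can only be handed to a proc by name, whereas the snapshot travels by value.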

Internally, Tcl uses a hash table to implement an array. An array and a dictionary are similar in functionality, but each has qualities that distinguish it. trace and upvar can both operate on array member variables, but not on elements in a dictionary.

Neither array nor dict is the direct script-level equivalent of Tcl's internal hash tables, but a dictionary is the more minimal interface to them. Prior to the introduction of dict, arrays were the Tcl script-level facility bearing the closest resemblance to such structures, so they were often conscripted to that end.

Array keys are not ordered. It isn't straightforward to get values out of an array in the same order in which they were set. One common alternative is to get the names and then sort them. In contrast, values in a dict are ordered.
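A small sketch of the "get the names, then order them" approach, using illustrative color data. lsort is applied to the result of array names so that the traversal order no longer depends on the hash table.

```tcl
array set colors {
    red   #ff0000
    green #00ff00
    blue  #0000ff
}

# array names returns the keys in no particular order; sorting them
# gives a stable, predictable traversal.
set ordered {}
foreach name [lsort [array names colors]] {
    lappend ordered $name $colors($name)
    puts "$name is $colors($name)"
}
# Prints the lines for blue, green and red, in that order.
```

If the original insertion order matters, it has to be recorded separately, for instance in a list of keys maintained alongside the array.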

Traces may be set on either an array container or an individual array variable.
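The two kinds of trace can be sketched as follows. The proc record and the global log variable are hypothetical names chosen for this illustration; trace add variable itself appends the variable name, the element name and the operation to the command prefix it is given.

```tcl
# record is a hypothetical callback that notes which trace fired.
set log {}
proc record {tag name1 name2 op} {
    lappend ::log [list $tag $name1 $name2 $op]
}

array set balloon {}

# Trace on the whole array: fires when any element is written.
trace add variable balloon write {record whole}
# Trace on a single element only.
trace add variable balloon(size) write {record element}

set balloon(color) red      ;# fires only the whole-array trace
set balloon(size) large     ;# fires both traces

foreach entry $log {
    puts $entry
}
```

After the two set commands the log holds three entries: one for the write to color and two for the write to size.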

See Also

Arrays / hash maps: more details about arrays
A simple database
container: a guide to containers in Tcl
parray
Arrays as cached functions
Arrays of function pointers
Memory costs with Tcl: measurement of array/list element consumption in bytes
Persistent arrays
Procedures stored in arrays
array name string matching extension
GUI for editing a Tcl array
Fitting a new Rear-End to Arrays
foreach: iterating through an array

Creating an Array

To create an array, set a variable within the array using the arrayName(key) form:

set balloon(color) red

or use array set:

array set balloon {color red}

To create multiple array keys and values in one operation, use array set.

To create an empty array:

array set myArray {}

Array names have the same restrictions as any other Tcl variable.

When using the braces syntax of variable substitution, include the parentheses and the name of the member variable within the braces:

array unset {this stuff}
set {this stuff(one)} 1
parray {this stuff}
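To read such an element back, the substitution form looks like the sketch below. The variable key is an illustrative name. Because no substitution is performed inside ${...}, an index held in a variable needs the one-argument form of set instead.

```tcl
set {this stuff(one)} 1

# The parentheses and the member name go inside the braces.
puts ${this stuff(one)}            ;# 1

# Nothing inside ${...} is substituted, so for a key stored in a
# variable, build the name in a quoted word and read it with set.
set key one
puts [set "this stuff($key)"]      ;# 1
```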

A common beginner mistake is to over-quote the name of the member variable:

#warning, bad code ahead!
set a("key") value ;#-> value
array get a ;#-> {"key"} value

unset a

#better:
set a(key) value ;#-> value
array get a ;#-> key value

In Tcl, everything is a string. Quoting strings is mostly not necessary, and can even be a problem, as in the previous example.

In the following example the double quotes are not needed because the values don't contain any special characters:

#unnecessary double quotes
array set myArray {"element" "value"}

Better syntax would be:

array set myArray {element value} ;# or:
set myArray(element) value

More examples:

unset x                        ; # x doesn't exist at all anymore
unset x ; array set x {}       ; # x exists as an array but has no elements
array unset x                  ; # x doesn't exist at all anymore

foreach idx [array names x] {
   set x($idx) {}
}                              ; # the array still exists, and so do all of
                                 # its elements, but the value of each
                                 # element is now empty

array set colors {
    red   #ff0000
    green #00ff00
    blue  #0000ff
}
foreach name [array names colors] {
    puts "$name is $colors($name)"
}

Retrieving the Value of an array key

# use a literal key
set value1 $balloon(color)
# use a key within a variable
set value2 $balloon($key)

Iain B. Findleton 2004-06 asked whether there were easier ways to read an array element, given that the name of the array was in a variable and the array key was in a variable. His example was:

eval {set ${key}($item)}

DKF writes, Just remove eval from the outside (it just confuses things) and it'll be fine:

puts [set ${key}($item)]        ;# Read
set ${key}($item) $val          ;# Write

If you're in a procedure, use upvar to create a local reference to the array so you get something like this:

upvar $key v
puts $v($item)
set v($item) $val

It's also possible using upvar to link to an array element, but I don't recommend it (for example, it fails if you decide to set key equal to ::env since env-var management is done via a whole-array trace).
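Putting the upvar advice together, a minimal sketch might look like this. The procs getElement and setElement and the variable which are hypothetical names used only for illustration.

```tcl
# Hypothetical helpers: the array is passed by name and linked
# to the local variable arr with upvar.
proc getElement {arrayName key} {
    upvar 1 $arrayName arr
    return $arr($key)
}

proc setElement {arrayName key value} {
    upvar 1 $arrayName arr
    set arr($key) $value
}

array set balloon {color red}
set which balloon                 ;# the array name held in a variable

puts [getElement $which color]    ;# red
setElement $which size large
puts $balloon(size)               ;# large
```

Linking the whole array rather than a single element keeps this working for arrays that rely on whole-array traces.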

Determine whether a Key Exists

info exists array(key)

See info exists
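For instance, with an illustrative balloon array (the variable key is likewise just a name for this sketch):

```tcl
array set balloon {color red}

puts [info exists balloon(color)]   ;# 1
puts [info exists balloon(size)]    ;# 0

# The key may also come from a variable.
set key color
puts [info exists balloon($key)]    ;# 1
```

Checking first avoids the error raised by reading an element that does not exist, such as $balloon(size) here.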

Unset an Array

unset balloon

array unset provides a way to unset only those keys that match a pattern.
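A short sketch of both forms, using illustrative data. The pattern is a glob-style pattern of the kind used by string match.

```tcl
array set data {foo,x ecks foo,y why bar,x ECKS}

# Remove only the keys that match the pattern.
array unset data foo,*
puts [array names data]     ;# bar,x

# Without a pattern, the whole array is removed.
array unset data
puts [array exists data]    ;# 0
```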

Incrementing an Array Element, Creating it if it doesn't Exist

As of Tcl 8.5, incr already behaves this way, treating a variable that doesn't exist as if it held 0. With older versions:

proc incrArrayElement {var key {incr 1}} {
    upvar 1 $var a
    if {[info exists a($key)]} {
        incr a($key) $incr
    } else {
        set a($key) $incr
    }
}
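With Tcl 8.5 or later, where incr creates a missing variable, a counting loop needs no helper at all. A minimal sketch, with counts as an illustrative array name:

```tcl
unset -nocomplain counts

# incr treats a missing element as 0 and creates it (Tcl 8.5+).
foreach word {apple pear apple} {
    incr counts($word)
}

puts "$counts(apple) $counts(pear)"   ;# 2 1
```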

Simulating Multiple Dimensions

There are no multi-dimensional arrays in Tcl but they can be simulated by a naming convention:

set a(1,1) 0 ;# set element 1,1 to 0

This works if the keys used do not contain the ',' character. If the keys can be arbitrary strings then one can use the list of the indices as name of the variable in the array:

set a([list $i1 $i2 $i3]) 0; # set element (i1,i2,i3) of array a to 0

This is completely unambiguous, but might look a bit uglier than the comma solution. Also remember that

set a([list 1 2 3]) 0

is equivalent to

set {a(1 2 3)} 0

but not to

#wrong # args
set a(1 2 3) 0

because the last example passes four arguments to set.
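The ambiguity that the comma convention allows can be seen in a short sketch. The array names commas and lists and the index variables are illustrative only.

```tcl
# Two different index pairs whose parts happen to contain commas.
set i1 "a,b"; set i2 "c"
set j1 "a";   set j2 "b,c"

# With the comma convention both pairs collapse to the key "a,b,c".
set commas($i1,$i2) first
set commas($j1,$j2) second      ;# overwrites the first element
puts [array size commas]        ;# 1

# With list keys the two pairs stay distinct.
set lists([list $i1 $i2]) first
set lists([list $j1 $j2]) second
puts [array size lists]         ;# 2

# The indices can be recovered from a list key with lindex.
puts [lindex [list $i1 $i2] 0]  ;# a,b
```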

AMG: To implement multidimensional arrays, I often use the convention given above (commas, not list, but that's a good idea), but it prevents me from easily getting a list of elements in any one dimension. For the following array:

array set data {
    foo,x ecks    foo,y why    foo,z zed
    bar,x ECKS    bar,y WHY    bar,z ZED
}

I'd like some means to get a list foo bar. How is this useful? I have written many server programs that use multidimensional arrays to keep track of state for all connected clients. To get a list of all client IDs, I have another variable or special array element listing the client IDs, but I have to always keep it in sync with the rest of the array. I dislike this.

What if multidimensional arrays were accessed using $name(dim1)(dim2)(dim3) syntax? Thanks to a bug, we once had multidimensional arrays, but the syntax was of course very very weird (I think it used uplevel 0). This is a bit cleaner-looking. But it has very bad interactions with array. How would the following be converted to use array set?

set data(foo)(x) ecks; set data(foo)(y) why; set data(foo)(z) zed
set data(bar)(x) ECKS; set data(bar)(y) WHY; set data(bar)(z) ZED

What should array get data return?

Lars H: Well, why don't you ask Tcl? :-) It would tell you that after the above commands, array get data returns

bar)(z ZED foo)(x ecks bar)(x ECKS foo)(y why bar)(y WHY foo)(z zed

and (as an aid to help overcome one's prejudices about how the above should be interpreted)

join [array names data] \n

returns

bar)(z
foo)(x
bar)(x
foo)(y
bar)(y
foo)(z

This is a recurring problem with attempts to extend Tcl syntax: the "new syntaxes" people come up with usually already mean something, even if that "something" looks rather silly.

AMG: in response to Lars: Wow, I didn't realize Tcl would accept such syntax! It turns out that I'm simply using )( as my dimension delimiter.

Alright, now let's think about how to get a list of all elements in a given dimension. This is easiest to do if the array indices are proper lists:

array set data [list                                         \
    [list foo x] ecks    [list foo y] why    [list foo z] zed\
    [list bar x] ECKS    [list bar y] WHY    [list bar z] ZED\
]

proc array_dimnames {array_var dim_index} {
    upvar 1 $array_var array
    set result [list]
    foreach name [lsort -unique -index $dim_index [array names array]] {
        lappend result [lindex $name $dim_index]
    }
    return $result
}

% array_dimnames data 0
bar foo
% array_dimnames data 1
x y z

That works. For other delimiters, each element of array names needs to be split before the list can be passed to lsort. Another job for lcomp I guess.
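For a delimiter such as the comma, the splitting step might be sketched as below. The proc array_dimnames_sep and its sep parameter are hypothetical, written here with plain foreach and split rather than lcomp, and the sketch assumes that the delimiter never occurs inside an index.

```tcl
# Hypothetical variant of array_dimnames for delimiter-separated keys.
proc array_dimnames_sep {array_var dim_index {sep ,}} {
    upvar 1 $array_var array
    set parts [list]
    foreach name [array names array] {
        # Split each key on the delimiter and keep the requested dimension.
        lappend parts [lindex [split $name $sep] $dim_index]
    }
    return [lsort -unique $parts]
}

array set data {
    foo,x ecks    foo,y why    foo,z zed
    bar,x ECKS    bar,y WHY    bar,z ZED
}

puts [array_dimnames_sep data 0]   ;# bar foo
puts [array_dimnames_sep data 1]   ;# x y z
```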

For really big arrays such as the enormous MV catchall array used in OpenVerse, I wonder if this costs too much, so much that it's worth it to separately maintain element lists rather than extract that information from the array names.

AMG: Continued from before: array names data should return foo bar, but $data(foo) wouldn't be valid, breaking old assumptions. Should set data(foo) dummy unset data(foo)(*)? And so on.

If array notation could be applied to dicts we'd be in great shape. Doesn't Jim do this?

Lars H: Why don't you just use nested dicts? It seems those will do precisely what you ask for above.
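A minimal sketch of the nested-dict approach, assuming Tcl 8.5 or later and reusing the example data from above:

```tcl
set data [dict create]
dict set data foo x ecks
dict set data foo y why
dict set data foo z zed
dict set data bar x ECKS
dict set data bar y WHY
dict set data bar z ZED

# The first dimension is simply the keys of the outer dict.
puts [dict keys $data]                   ;# foo bar

# The second dimension is the keys of an inner dict.
puts [dict keys [dict get $data foo]]    ;# x y z

# A single cell is read by giving the whole key path.
puts [dict get $data bar y]              ;# WHY
```

Since a dict keeps its keys in insertion order, the names come back in the order in which they were first set.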

AMG: I can do some things with arrays that I can't do with dicts: namely, traces and upvars and everything else that uses those features. So, I often use arrays when I need to use elements as -textvariables. Perhaps I should be using namespaces instead, preferably wrapped by snit.


LV: Over on comp.lang.tcl, 2007-02, Fredderic provides the following proc in response to someone who wanted to declare an empty array at the start of a Tcl script.

proc declare_array arrayName {
    upvar 1 $arrayName array
    catch {unset array}
    array set array {}
} 

The idea here is to catch the unset in case the variable was not already declared. Then, array set makes the variable an array, but without any members. That way, a later reference to the name in a non-array setting generates a "variable is array" error.
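The effect can be checked with a short sketch. The proc is repeated so that the example runs on its own, and settings is an illustrative variable name.

```tcl
proc declare_array arrayName {
    upvar 1 $arrayName array
    catch {unset array}
    array set array {}
}

declare_array settings
puts [array exists settings]    ;# 1
puts [array size settings]      ;# 0

# Using the name as a scalar now fails instead of silently succeeding.
if {[catch {set settings 1} msg]} {
    puts $msg                   ;# can't set "settings": variable is array
}
```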