Version 0 of Better Arrays for Tcl9

Updated 2001-01-30 10:10:03

(Pulled out of Tcl 9.0 WishList in order to keep that page shorter and easier to read...)


Better Arrays. Arrays should be objects that can be passed around and returned.

DKF - Some of this might in fact find its way into the core before 9.0; cf. dictionary values.

MSJ - Is the idea here to use the syntax of present-day arrays, but to store the value as in Jean-Claude Wippler's Dictionary? If so, I think it would be a huge improvement: not only would one get more functionality, but the Tcl "everything is a string" philosophy would also apply to arrays (I've never really understood why arrays had these limitations in the first place). The way I understand it, the following would apply:

 set a(fruit) apple
 ==> apple
 set a(vegetable) tomato
 ==> tomato
 set a
 ==> fruit apple vegetable tomato
 set basket(contents) $a
 ==> fruit apple vegetable tomato
 set basket
 ==> contents {fruit apple vegetable tomato}
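
For comparison, here is a minimal sketch of how the same data has to be handled with arrays as they work today. It only uses the existing [array set] and [array get] commands; the variable name contents is an assumption introduced for the illustration. Note that [array get] does not promise any particular element order.

```tcl
# Build the array, then flatten it to a key/value list with [array get].
array set a {fruit apple vegetable tomato}
set basket(contents) [array get a]

# Reading an inner element back means unpacking the list into a
# second array first ("contents" is a name made up for this sketch).
array set contents $basket(contents)
puts $contents(fruit)
```

Under the proposal above, the [array get] and [array set] round trips would disappear, because $a itself would already be the key/value list.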

I would like to add a few requests so that slightly more complex data structures can be handled. I would like to be able to do:

 set basket(contents)(fruit) pear
 ==> pear
 set basket
 ==> contents {fruit pear vegetable tomato}
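
As a point of reference, the dictionary values mentioned above did later appear as the [dict] command in Tcl 8.5, and [dict set] with a key path gives roughly this nested update. This is a sketch with plain variables, not the array syntax requested here.

```tcl
# basket must be a plain variable here, not an array.
unset -nocomplain basket
set basket [dict create contents [dict create fruit apple vegetable tomato]]

# Update the nested key "fruit" inside "contents" in one step.
dict set basket contents fruit pear
puts $basket
# prints: contents {fruit pear vegetable tomato}
```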

It would also be nice if lists could be easier to use than at present, e.g.:

 set xx [list apples bananas cherries]
 ==> apples bananas cherries
 set xx((1)) [list BANANAS BROCOLI]
 ==> BANANAS BROCOLI
 set xx
 ==> apples {BANANAS BROCOLI} cherries
 puts $xx((2))
 ==> cherries
 puts $xx((1))((0))
 ==> BANANAS
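
Much of this list handling is available without new syntax in Tcl 8.4 and later, where [lset] replaces an element in place and [lindex] accepts several indices. A short sketch of the same session:

```tcl
set xx [list apples bananas cherries]

# Replace element 1 with a nested list.
lset xx 1 [list BANANAS BROCOLI]
puts $xx                ;# apples {BANANAS BROCOLI} cherries

# Plain and nested element access.
puts [lindex $xx 2]     ;# cherries
puts [lindex $xx 1 0]   ;# BANANAS
```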

DKF - Maybe. The problem with doing something like this is that it obscures a number of other useful functionalities (arbitrary array names come to mind), and there is code about that tests for the presence, absence and/or type of variables by using [catch], which would cease to work under such a change. That isn't to say that this is a bad idea, but the side effects are quite far-reaching and may not all be obvious. I have also corrected a few minor errors in your examples above; Tcl's association lists - those accepted by [array set] - are not the same as TclX's keyed lists; they have fewer layers of quoting...
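
To make the [catch] concern concrete, here is a sketch of the kind of test that exists in scripts today. The proc name isArray is made up for this illustration. It relies on the fact that reading an array as if it were a scalar raises an error, which is exactly the behaviour the proposal would remove.

```tcl
# Hypothetical helper: report whether a variable is an array by
# checking that it exists but cannot be read as a scalar.
proc isArray {varName} {
    upvar 1 $varName v
    expr {[info exists v] && [catch {set v}]}
}

set scalar 1
array set arr {key value}
puts [isArray scalar]   ;# 0
puts [isArray arr]      ;# 1
```

If [set arr] returned the key/value list instead of failing, this test would report 0 for every variable.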

MSJ - I suppose what I am saying is "in the process of reforming arrays, please don't forget arrays & lists embedded in arrays". I do realise that the double parenthesis proposal would require changing the status of parentheses to reserved symbols that would need to be escaped in certain sequences, much like the $ symbol. This would of course break some existing code, but would be beautiful - everything would be a string, but with the symbols $, ( and ) one would be able to reduce the use of functions such as set, lindex, lreplace, array etc. for basic tasks such as saving/retrieving a value in an array/list. One could of course also use a less common symbol (e.g. @ or $$) instead of parentheses, but this would not look as good. BTW, was your answer to my first question a yes with the changes that you made?

DKF - Yes. Are there any overlaps with BLT's vectors?

MSJ - Don't know. I just had another brainstorm that I think will fix the present difficulty with embedded data structures without backwards compatibility problems: one function, let's call it struct for the purposes of demonstration:

 The first argument is the variable name.
 After that, the arguments come in pairs.
 The first argument in each pair is a character that determines how the arguments just before and just after it are interpreted:
 A . has the same meaning as in C (the argument before it is a Tcl array, like a C struct, and the argument after it is the element name)
 An @ means that the argument before it is a list and the argument after it is the index
 An = means that the next argument should be assigned to the previous one
 % foreach idx {a b c d} {struct x . $idx = $idx$idx}
 % set x
 ==> a aa b bb c cc d dd
 % foreach idx {1 2 3 4 5} {struct x . vec @ $idx = 4$idx}
 % set x
 ==> a aa b bb c cc d dd vec {{} 41 42 43 44 45}
 % struct x . vec @ end
 ==> 45
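
The struct idea can be prototyped as an ordinary proc. What follows is a rough sketch, not a definitive implementation: it assumes Tcl 8.5 or later, stores the structure as a dict value in a plain variable rather than in a real array, and only accepts integer indices after @ when assigning. The helper names structGet and structSet are made up for this sketch.

```tcl
# Sketch of the proposed "struct" command on top of dict and list values.
proc struct {varName args} {
    upvar 1 $varName var
    if {![info exists var]} { set var {} }
    # Split the arguments into a path of {op key} steps and an optional value.
    set path {}
    set assign 0
    foreach {op arg} $args {
        switch -- $op {
            . - @   { lappend path $op $arg }
            =       { set assign 1; set value $arg }
            default { error "unknown operator \"$op\"" }
        }
    }
    if {$assign} {
        set var [structSet $var $path $value]
        return $value
    }
    return [structGet $var $path]
}

# Walk the path: "." looks up a dict key, "@" takes a list index.
proc structGet {container path} {
    foreach {op key} $path {
        if {$op eq "."} {
            set container [dict get $container $key]
        } else {
            set container [lindex $container $key]
        }
    }
    return $container
}

# Return a copy of container with the element at path set to value.
proc structSet {container path value} {
    if {[llength $path] == 0} { return $value }
    set rest [lassign $path op key]
    if {$op eq "."} {
        set child {}
        if {[dict exists $container $key]} { set child [dict get $container $key] }
        dict set container $key [structSet $child $rest $value]
    } else {
        if {![string is integer -strict $key]} {
            error "this sketch only assigns to integer indices, got \"$key\""
        }
        # Pad with empty elements so that the index exists.
        while {[llength $container] <= $key} { lappend container {} }
        lset container $key [structSet [lindex $container $key] $rest $value]
    }
    return $container
}

foreach idx {a b c d} {struct x . $idx = $idx$idx}
foreach idx {1 2 3 4 5} {struct x . vec @ $idx = 4$idx}
puts $x                    ;# a aa b bb c cc d dd vec {{} 41 42 43 44 45}
puts [struct x . vec @ end] ;# 45
```

Building a modified copy in structSet keeps the sketch short; it makes no attempt to be efficient for large structures.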

One could of course also allow more complicated indices after @ characters, e.g. "end+1", "2*$i+5", "1..end". Instead of ".", "@" or "=", one might also write "delete" to delete an item.

RS -- Can't the deficiency of

 set foo(x) 1
 1
 set foo
 can't read "foo": variable is array
 set foo {bar grill}
 can't set "foo": variable is array

be overcome even before 9.0 by just calling array set/get in the positions where these two errors are thrown? Performance may suffer if huge arrays are recast to lists and back, but in the object representation one bit might indicate array state, so converting $foo to a list would be needed only for accesses like lindex $foo 1.
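
The fallback can be approximated at script level today. The sketch below assumes two wrapper procs, getvar and setvar (names made up for this illustration), that stand in for what [set] itself would do in the two error cases; the real change would of course live inside the core.

```tcl
# Hypothetical wrapper: read a variable, falling back to [array get]
# where [set] would fail with "variable is array".
proc getvar {varName} {
    upvar 1 $varName v
    if {[array exists v]} { return [array get v] }
    return $v
}

# Hypothetical wrapper: write a variable, falling back to [array set]
# where [set] would fail with "variable is array".
proc setvar {varName value} {
    upvar 1 $varName v
    if {[array exists v]} {
        array set v $value
        return $value
    }
    set v $value
}

set foo(x) 1
puts [getvar foo]        ;# x 1
setvar foo {bar grill}
puts $foo(bar)           ;# grill
```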