This page was written by [DKF].

----

What is good about [Tcl]'s arrays? Their compact syntax for common operations.

What is not so good about [Tcl]'s arrays? You have to use a hash table for the mapping from keys to values.

----

[LV] For the more newbie reader, why is the hash table not so good?

----

What would make things extra cool? If we could take arrays and put a new back-end on them based on [list]s, [dict]ionaries, or even something more exotic. There have been attempts to do this in the past ([BLT]'s vectors spring to mind here).

Why do this to arrays instead of putting magic conversion on values? We'd have a proper named location to store the metadata describing how the array is implemented.

What is going to be hard? Well, [trace]s are definitely tricky, as are [upvar] and friends, since both (currently) require that your mapping mechanism hands out references to a '''Var''' structure, and that's not something that you can really map nicely onto a [Tcl_Obj], as a Var is an updatable structure.

----

'''What Operations are Needed?'''

   get:   Retrieve a value from the mapping given a key. This operation should succeed if the key was one returned by the '''list keys''' operation below; behaviour for other keys is undefined in general.
   set:   Update or insert a value into the mapping given a key.
   unset:   Update the mapping so as to remove an element.
   list keys:   Get the list of keys that map to data values. Note that an array might be permitted to support other keys with "magic" names, but this operation should only list the keys that map in a straightforward fashion.
   serialize:   Convert the entire array to a string (preferably in the order listed by the '''list keys''' operation).
   deserialize:   Convert a string to an array.

[[...]]

----

[RS] Don't [array]s (and [dict]s) represent a string -> value mapping, while vectors (and [list]s) do (0<=int<length) -> value?

[DKF]: So what if vectors/lists have a restricted language of keys? :^)

''13may04 [jcw] - Cool!
The serialize/deserialize operations are "slightly less primitive operations" IMO, in that they can be implemented with the first four. The get/set/unset/list combo is the core. They offer all sorts of interesting new options, similar to Perl's "tie". My first goal would probably be to map this to hashed Metakit views, i.e. persistent memory-mapped arrays. Gdbm is another obvious candidate. More advanced uses may need some more machinery, but I think most of that can be done in Tcl.''

I'd like to describe an idea which unifies keyed access (arrays), indexed access (lists), scalars, and more - but let me just point to [http://www.equi4.com/39] and [http://www.equi4.com/179] for now.

----

[RHS] ''08June2004'' I've been thinking a lot about arrays and how they aren't "first class citizens" when it comes to being data objects. Other than the fact that it would require a lot of work to make the change, what are the reasons for not converting arrays to be TclObjects that could be shimmered to/from other types of TclObjects? By way of an example, I think it would be very handy to be able to do:

 set bob {a 1 b 2}
 puts "bob(a) = $bob(a) -> should be 1"
 puts "bob = $bob -> should be {a 1 b 2}"
 set bob(c) 3
 puts "bob = $bob -> should be {a 1 b 2 c 3} or {b 2 a 1 c 3} or {c 3 b 2 a 1} or etc"

My thought is that, by converting Tcl arrays to TclObjects and allowing them to shimmer to/from other types, we would be able to return arrays from procs, pass them in by value, etc. Other than the obviously huge amount of work the conversion would require, what are the major problems with this approach? ''Traces perhaps?''

[RS]: As soon as we have [dict]s, these can take over all the pure-value/first-class usages, and [array]s will be used less, I expect.

[NEM] There is a major problem with this approach (overloading array syntax). Let's rearrange the items in your example:

 set bob {1 a 2 b}
 puts "bob(2) = $bob(2) -> should be... err 'b' if it's a dict/array, or '2' if it's a list..."

As you can see, this introduces an inherent ambiguity. The return value depends on the current underlying Tcl_ObjType of the Tcl_Obj (value) - thus exposing a previously hidden implementation detail, and radically changing Tcl's script-level semantics. I wrote some notes on a related subject at [http://mod3.net/~nem/tcl/interfaces.xml] a while back. A possible way out is to introduce type-tagging at the script level (so that the current type becomes a part of the string rep) where you want this sort of polymorphism. See [TOOT] for my experiments in that direction (no use of array syntax, but I think the idea could carry across, with some work).

[DKF]: My idea is that the [array] command will get a new subcommand which allows you to declare a new array with an alternative implementation inside it. This might work like this:

 array declare bob -type list  ;# Support more options (e.g. database username/password)
 array set bob {a b c d}       ;# Set up the array contents
 puts "bob(2) = $bob(2)"       ;# which is the value "c" if it is a list, of course.

[RHS] I disagree with NEM's conclusion above. The notation ''bob(2)'' refers, as Tcl is defined now, to the element ''2'' of array ''bob''. As such,

 set bob {1 a 2 b}
 puts "bob(2) = $bob(2) -> should be b"

would be the only interpretation. Only if we decide to state that ''bob(2)'' can represent

   * element ''2'' of array ''bob''
   * lindex $bob 2 (for a list)
   * some other reference into bob, for some other form of bob

do we run into ambiguity. I propose that sticking with the current definition of ''bob(2)'' as an index into an array is just fine.

As an added note, if both the keys and values of arrays can be TclObjects, then we can have arrays of arrays of arrays (much like in 7.6), which I'm a big fan of for representing tree structures and the like...
And the shimmering to other forms should work fine:

 set bob {a 1 b {A 8 B 7} c 3}
 set joe $bob(b)
 puts "bob(b)(A) = $joe(A) -> should be 8"
 puts "bob(b)(A) = [set ${bob(b)}(A)] -> should be 8"
 puts "bob(b)(A) = [set $bob(b)(A)] -> should be 8 too? Not sure on the parser here"

[NEM] Ah, ok - I misunderstood what you were proposing. So you are after accessing [Dictionaries as Arrays] (i.e. using array syntax to access [dict]s - coming in Tcl 8.5, see TIP 111 [http://www.tcl.tk/cgi-bin/tct/tip/111])?

I haven't quite made up my mind about [DKF]'s proposal for an [[array declare]] command - static typing makes me nervous. Actually, it's a weird sort of typing that we have had since array variables were introduced -- special variables which hold a value (an ''array'') which is opaque and can never be replaced (the array variable cannot be assigned to) -- instead, the array is mutable (it contains scalar variables, which can be assigned to as usual). With Donal's proposal, AIUI, this changes, and we now enforce type-checking on variables. The array variables declared become normal variables, and can presumably be assigned to. However, they are now type-checked, or at least cause automatic shimmering of items assigned to them. Some examples, needing clarification:

 array declare bob -type list
 set a "This is \{not a list"
 catch {llength $a} err; puts $err  ;# --> unmatched open brace in list
 array set bob $a                   ;# 1
 puts $bob(2)                       ;# 2

What happens now? An error? At 1 or 2? This is presumably an error, and is indeed an error now, as [[array set]] expects a valid list anyway. But what if we had other types?

 array declare foo -type myspecialtype
 array set foo [myothertype create]
 puts $foo(jimmy)

Now suppose that values created by [[myothertype create]] are valid lists, but not valid ''myspecialtype''s... What happens? As a further example:

 array declare foo -type dict
 array declare bob -type list
 array set bob {a 1 b}          ;# 1
 array set foo [array get bob]  ;# 2

What happens here at 1 and 2?
Is 1 valid - it is a valid list, but not a valid dict or array (odd number of elements)? If 1 is valid, what about 2? Note, I've not made up my mind on this completely yet, just trying to think about the implications.

[Lars H]: NEM's claim that an array can never be replaced needs to be refined. You cannot replace an array with an ordinary variable using [set], but if you [unset] it then that variable name is ready to be reset as any kind of variable -- ordinary or array.

As I see it, one important benefit of fitting a new rear end to arrays is that one can store special sets of data more efficiently. As an extreme example one could take a long "vector of bits" array type, where the data is stored as [Judy arrays] or some such. For such an array, accessing any index which is not a (32-bit) integer would be an error, as would trying to assign a non-boolean value to an array element. [array get] would still be possible, but would run a significant risk of out-of-memory panics.

[RHS] ''09June2004'' Personally, I'm not a big fan of either dicts or typed arrays. I'm of the opinion that normal arrays, if they were TclObjects and could be treated as such (i.e. how I described above), would be adequate for all the tasks I would want to use them for. Admittedly, it wouldn't address Lars' point of storing specialized data more efficiently, but it would be fine for most complex data types that I can think of (i.e. trees, etc).

As far as NEM's comments about malformed lists and the like, the way I see it they would cause an error. My thought is that any form that ''array set'' accepts (i.e. a list with an even number of elements) would be acceptable as a list form that could be shimmered to an array. If ''array set'' wouldn't accept it, neither would it be possible to shimmer it to an array from a list.

----

[[ [Category Internals] | [Category Discussion] ]]
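----

A minimal sketch of the get/set/unset/list keys operations from '''What Operations are Needed?''', written for a hypothetical list-backed store in plain Tcl. The lb_* proc names are invented for illustration and are not part of any proposal on this page. The store is an ordinary list variable rather than a real array back-end, and allowing unset only on the last element is an assumption made here to keep the keys contiguous. serialize and deserialize are built purely from the other four operations, which is what [jcw] suggests should be possible.

```tcl
# Hypothetical list-backed mapping: valid keys are the integers 0..length-1.
# All lb_* names are illustrative assumptions, not an existing Tcl API.

proc lb_get {varName key} {
    upvar 1 $varName data
    if {![string is integer -strict $key] || $key < 0 || $key >= [llength $data]} {
        return -code error "no such element: $key"
    }
    return [lindex $data $key]
}

proc lb_set {varName key value} {
    upvar 1 $varName data
    if {![info exists data]} { set data [list] }
    set n [llength $data]
    # A key equal to the current length appends; anything beyond is rejected.
    if {![string is integer -strict $key] || $key < 0 || $key > $n} {
        return -code error "cannot set element: $key"
    }
    if {$key == $n} {
        lappend data $value
    } else {
        lset data $key $value
    }
    return $value
}

proc lb_unset {varName key} {
    upvar 1 $varName data
    set last [expr {[llength $data] - 1}]
    # Assumption: only the last element may be removed, so keys stay contiguous.
    if {$last < 0 || $key ne $last} {
        return -code error "can only unset the last element"
    }
    set data [lrange $data 0 end-1]
    return
}

proc lb_keys {varName} {
    upvar 1 $varName data
    set keys [list]
    for {set i 0} {$i < [llength $data]} {incr i} {
        lappend keys $i
    }
    return $keys
}

# The two "slightly less primitive" operations, built only from the four above.
proc lb_serialize {varName} {
    upvar 1 $varName data
    set out [list]
    foreach k [lb_keys data] {
        lappend out $k [lb_get data $k]
    }
    return $out
}

proc lb_deserialize {varName serialized} {
    upvar 1 $varName data
    set data [list]
    # Relies on keys arriving in the order produced by lb_serialize.
    foreach {k v} $serialized {
        lb_set data $k $v
    }
    return
}

# Usage, mirroring DKF's list-typed example.
lb_deserialize bob {0 a 1 b 2 c 3 d}
puts "bob(2) = [lb_get bob 2]"   ;# prints: bob(2) = c
puts [lb_serialize bob]          ;# prints: 0 a 1 b 2 c 3 d
```

Real array syntax such as $bob(2), together with [trace] and [upvar] support, is out of scope for this sketch; those are the hard parts described at the top of the page.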