Multi-assign

This page was split out from the Tcl 9.0 WishList


Larry Smith - Byte-compiled let


MSW @ 18 Mar 2003 multiple assignments via set

Include something similar in the core for set:

 rename set theset
 proc set {args} {
   switch [llength $args] {
     0 { return -code error {wrong # args: should be "set varname ?newvalue? ?varname ?newvalue?? ..."} }
     1 { return [uplevel 1 [list theset [lindex $args 0]]] }
     2 { return [uplevel 1 [list theset [lindex $args 0] [lindex $args 1]]] }
     default {
       # assign the first pair in the caller's scope, then recurse on the rest;
       # building the commands with list keeps values containing spaces intact
       uplevel 1 [list theset [lindex $args 0] [lindex $args 1]]
       return [uplevel 1 [linsert [lrange $args 2 end] 0 set]]
     }
   }
 }

i.e. allow set to set multiple variables at once, where the last one potentially gets no newvalue. The value of the last varname is returned. I've really seen enough of

 foreach {a b c} {1 2 3} {}

Don't you think

 set a 1 b 2 c 3

looks much nicer? :)
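For readers who want to try the idea without touching set itself, here is a minimal sketch under a separate name. The name mset is hypothetical and is not part of any proposal on this page; it only mimics the semantics described above (pairs of name and value, an optional trailing name with no value, and the value of the last variable as the result).

```tcl
# mset: hypothetical helper, not a core command.
# Assigns name/value pairs in the caller's scope and returns the
# value of the last variable named.
proc mset {args} {
    set n [llength $args]
    if {$n == 0} {
        return -code error {wrong # args: should be "mset varName ?value? ?varName ?value?? ..."}
    }
    set result {}
    for {set i 0} {$i < $n} {incr i 2} {
        set name [lindex $args $i]
        if {$i + 1 < $n} {
            # a value follows: write the variable
            set result [uplevel 1 [list set $name [lindex $args [expr {$i + 1}]]]]
        } else {
            # trailing name without a value: just read the variable
            set result [uplevel 1 [list set $name]]
        }
    }
    return $result
}

puts [mset a 1 b 2 c 3]   ;# prints 3
puts "$a $b $c"           ;# prints 1 2 3
```

Because the real set is left alone, this avoids the rename question raised in the replies, at the cost of a new command name.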

RS Agreed - and it matches the way the variable command can be called. Just a little nit: it seems to become an idiom to rename original Tcl commands into the tcl namespace, so I'd write (testing, so I don't lose the original on repeated sourcing)

 if {[info commands tcl::set] == ""} {rename set tcl::set}

MSW Good point - I've added it like this as an example to the set page.

DKF - I favour adding TclX's lassign command. It's well known already, and it doesn't tinker with the behaviour of a very basic core command. TIPs #57 and #58 cover this sort of area as well.

MSW I disagree. lassign's syntax is no better than abusing foreach, and that's exactly what it looks like. Having a TIP there (#58 looks fine) is a good reason that this entry is not needed. And I don't think that leaving a core command in a crappy state is a good argument against enhancements. lassign's and variable's semantics are different. It would be better if whatever you use for multiple assignments (and I do favour set, since no incompatibility with its current use would arise!) followed the same conventions as variable, rather than changing variable to follow lassign's convention and breaking tons of scripts.

RS: I think the foreach...break idiom is no abuse. foreach has two functionalities:

  • assign values from one or more lists to one or more variables;
  • iterate with these variables over a code body.

Both functionalities can be used by themselves, and I think these are no abuses:

 foreach {x y z} $threeValues break
 foreach _ _ {#some code that runs once and may be left with [break]}

MSW I think it's an abuse because I'd prefer not to have foreach's iteration variables in the calling scope, and thus I treat foreach as if it didn't clobber its calling environment. But that's personal preference. Both the uses you cited as 'valid' uses of foreach are nice obfuscators imho...


FW: Optimize the idiom

  foreach {a b} {1 2} break

to be byte-compiled automatically into a series of set operations, making it no longer a major slow-down for small utility procedures that are run repeatedly.
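A rough way to see whether the idiom costs anything on a given interpreter is to time it against plain set calls. The proc names below are hypothetical, and the numbers depend entirely on the Tcl version and machine, so none are quoted here.

```tcl
# Two hypothetical procs doing the same work, one with the
# foreach/break idiom and one with a series of set commands.
proc viaForeach {} {
    foreach {a b} {1 2} break
    return [expr {$a + $b}]
}
proc viaSet {} {
    set a 1
    set b 2
    return [expr {$a + $b}]
}

# [time] reports the average microseconds per iteration.
puts "foreach/break: [time {viaForeach} 100000]"
puts "plain set:     [time {viaSet} 100000]"
```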

jcw - How about extending "set" a teeny bit? Perhaps like this:

   % set a b c {1 2 3}
   1 2 3
   % set a
   1
   % set b
   2
   % set c
   3
   %

FW: I was gonna suggest that, but that's a more dramatic change. In an ideal world, both would happen, but Tcl gets changed extremely slowly, so it'd be idealistic to suggest both at once ;)

rmax: I think both changes would even be allowed in a minor release, as the foreach optimization doesn't change behaviour at all, and the set extension only makes a backwards compatible change.

DKF: There's currently a Draft TIP on extending the set command, and it is really unpopular with the TCT. We're kind-of hoping the author will withdraw the TIP sometime so we don't have to be mean and vote the suggestion down. FWIW, the consensus seems to be that the set command is inviolate, even across major versions.

DKF: The TIP in question is #58 and it looks like it will be rejected outright (as I write this, the vote is in progress but there have been many votes against and none for.)

DKF: Yep. Utter rejection. Anyone wanting to modify set had better not rely on core mods to do it!

FW: Does the TCT have an opinion on the foreach mod idea? :)

DKF: Try TIP #57 instead, though perhaps recognizing the foreach-as-multi-set idiom and getting smart with it may be done at some point too.

FW: The thing is, the current idiom is just as understandable, already in use, and doesn't add a new command just for sugar.

DKF: If we put TclX's lassign in the core, I think you'll find that that does rather more than the foreach idiom. :^)

PWQ 12 Sep 2003: How about we implement this by thinking outside the square? On the one hand, most people just want to remove the trailing {} from the foreach command. On the other, people want to introduce another 30 C functions to TCL's API.

How about this.

We add to interp alias a switch -trailing which adds trailing args to the aliased command. Then lassign becomes:

 interp alias -trailing {{}} {} lassign {} foreach

I know, too radical for the TCT, like TIP #57, it actually solves the problem simply without a lot of fuss.

BTW, If this is incompatible with the byte-code compiler, then it just adds to my argument that the BC should not be there.

FW: You don't like the bytecoding? What's so bad about it that's worth sacrificing the 10x performance advantage?

PWQ When changes are proposed, a lot of them are hard to do because implementing the change requires a major change to the byte-code compiler. My argument is that the BC should not influence changes to either the core or syntax (as in argument expansion).

And yes, I would rather TCL be 10x as slow, but have the features I want.

DKF: The reason why lassign is preferred is that it doesn't just handle the use-case that the foreach-idiom handles, but also does something sensible when the number of values in the list doesn't match up with the number of variables to assign to. That is something that the version based on foreach cannot possibly handle.

PWQ It seems to me that you want to add an error trap for when they are not the same: Big Brother watching out for the dumb programmer. Is this of real benefit?

OTOH, the -trailing option to interp alias is an interesting idea, though in the specific case you're looking at it would be better to do this (so that the effects are more predictable when the number of values exceeds the number of variables):

 interp alias -trailing break {} lassign {} foreach

I still prefer the TclX version of lassign though, as it makes the result of the command be a list of those values which were not assigned, which can be extremely useful indeed. One use might be a shift-analog, with the following putting the first three arguments into the arg array and setting argv to those arguments that are left (if any):

 set argv [lassign $argv arg(1) arg(2) arg(3)]

Doing such things with foreach is not really that practical (and doubly so when you start to deal with lassigning the results of a command-substitution...)
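The mismatch cases discussed above can be tried directly. This sketch assumes an interpreter where lassign is available, either built in (Tcl 8.5 and later) or through TclX; the variable names are only illustrative.

```tcl
# Fewer values than variables: the leftover variable becomes empty.
lassign {1 2} a b c
puts "a=$a b=$b c=<$c>"        ;# prints a=1 b=2 c=<>

# More values than variables: the unassigned tail is the result.
set rest [lassign {1 2 3 4 5} x y]
puts "x=$x y=$y rest=$rest"    ;# prints x=1 y=2 rest=3 4 5

# foreach with an empty body keeps iterating, so the last group wins
# and the surplus values silently overwrite the earlier ones.
foreach {p q} {1 2 3 4 5} {}
puts "p=$p q=<$q>"             ;# prints p=5 q=<>
```

Using break as the body stops foreach after the first group, which avoids the overwriting but still gives no access to the remaining values.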

rmax: That's a nice feature of lassign I didn't notice before. It reminds me of Prolog where the only way to access a list is to take a finite number of elements from the head and assign the tail to another variable.

A practical example for this feature is replacing all but the first of multiple files with identical content by links to the first one. When I have a list with the names of such a set of files, I currently do it this way:

 set first [lindex $files 0]
 foreach file [lrange $files 1 end] {
     # replace $file by a link to $first
 }

With lassign it would become even smaller and more elegant:

 foreach file [lassign $files first] {
     # replace $file by a link to $first
 }
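A runnable version of the same pattern, again assuming lassign is available (Tcl 8.5 or later, or TclX). The file names are made up, and instead of creating links the loop only records which file would point to which.

```tcl
# Hypothetical names standing in for a set of identical files.
set files {a.txt b.txt c.txt}

set plan {}
foreach file [lassign $files first] {
    # a real script would replace $file by a link to $first here;
    # this sketch just records the intended pairs
    lappend plan [list $file -> $first]
}
puts $plan   ;# prints {b.txt -> a.txt} {c.txt -> a.txt}
```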

It would be great to have something like this in the core, although I don't particularly like the name "lassign".

DKF: That's the name used in TclX, and we've really no good reason to choose another. That'd just be gratuitous incompatibility.


KJN: Multiple assignment is most convenient if the values are supplied as a list. Then it is simple to explode a list that is the return value of a proc. Also, I prefer the syntax

  multiset {a b c} [proc_that_returns_a_list $args]

to

  lassign [proc_that_returns_a_list $args] a b c

on the grounds that

  lassign $a $b $c $d

will have me reaching for the manual to remind me which arg is which, but

  multiset "$b $c $d" $a

or a different combination such as

  multiset "$a $b" "$c $d"

is more "obvious" (at least to me).
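A command with this argument order can be sketched in a couple of lines on top of the foreach idiom. This is only one possible reading of multiset, not an existing command; it returns nothing useful and leaves surplus values unassigned.

```tcl
# multiset: hypothetical helper taking the variable names as one list
# and the values as another; assigns in the caller's scope.
proc multiset {names values} {
    # break stops after the first group, so extra values are ignored
    uplevel 1 [list foreach $names $values break]
}

multiset {a b c} {1 2 3}
puts "$a $b $c"   ;# prints 1 2 3
```

If fewer values than names are supplied, the remaining variables are set to the empty string, which is foreach's normal behaviour.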