K is short for the K combinator , which, along with S, is one of the fundamental functional operators in Combinatory Logic from which all function composition can be done.
K is trivial:
proc K {x y} {set x}
An inline version of K is more efficient in 8.5a2 but not so hot in previous versions:
$x[<do y stuff here>; list]
K merely returns its first argument, discarding the second. The benefit comes from taking advantage of command substitution to do the real work. The expanded value of $x is the return value of the command, and although $y is discarded, it can be used as a finalizer for the command via side effects of any command substitution that may occur: Consider popping a global stack (top at end):
proc pop1 {} { set top [lindex $::Stack end] set ::Stack [lrange $::Stack 0 end-1] return $top }
or, implemented with K:
proc pop2 {} { K [lindex $::Stack end] [set ::Stack [lrange $::Stack 0 end-1]] }
One command less, one variable less, a slick one-liner (from: A different FORTH). The inline version looks like this:
lindex $::Stack end[set ::Stack [lrange $::Stack 0 end-1]; list]
AMG: Postincrement:
set x 5 set y [K $x [incr x]]
Implemented inline:
set x 5 set y $x[incr x; list]
See Hot curry and Combinator engine for more details than you'd expect ;-)
RS:
Besides use in performance enhancement, I mostly like K because it very simply gives us sort of the PROG1 functionality of Lisp - do this and that, and return the result of "this" :) This often helps to avoid an intermediate variable.
A generalized K that returns the first of its arguments:
proc K* {arg args} {set arg}
can also be used as identity operator with one argument, which comes in handy when using if like this (remember it returns the result of the applying branch):
set result [if $condition {K* yes} else {K* no}]
which is another way of writing
set result [expr {$condition? {yes} : {no}}]
DKF: However, this is not really K, as that is a two-argument combinator; indeed, there's no such thing as an n-arg combinator as every one has to have a specific number of arguments (which is enforced by the binding and evaluation rules of combinator theory - need reference on Head Normal Form here, but none handy.) It makes more sense though for Tcl though, which (normally) knows how many arguments there are without looking at the command name.
AMG: I renamed from K to K* to differentiate this proc from the K combinator DKF describes.
Lambda in Tcl can also be had as a one-liner with K:
proc lambda {argl body} { K [set n [info level 0]] [proc $n $argl $body] }
aspect: Here are some standard K-based combinators that make their way into many of my scripts. SK takes its name from combinatorial logic, as discussed near the top of this page.
# K generalised to multiple arguments -- because why not? proc K {a args} {set a} # (S x y z) = (x z (y z)) => (S K f x) = (K x (f x)) # naive version: proc SK {f x} {K $x [$f $x]} proc SK {f args} {K $args [uplevel 1 [list $f {*}$args]]} # Throwaway Debugging: puts the argument and return it proc D {a} {SK puts $a} # unset a variable, returning the value it had proc unset! {name} {upvar 1 $name x; K [set x] [unset x]} proc clear {name} {upvar 1 $name x; K [set x] [set x {}]} # Post-set: set a variable, returning its previous value proc set! {name val} {upvar 1 $name x; K [set x] [set x $val]}
In the Tcl chatroom, Miguel pointed out "inline K":
[K x y] ~ [lindex [list x y] 0]
That's slightly faster in 8.4+, as it is all bcc'ed.
RS, PYK: From 8.5, "anonymous K" will be more efficient. Assuming y returns the empty string, just it add it directly x. It depends on the fact that the low-level (bytecode) string concatenation operator knows not to do anything to an object when joining an empty string onto the end of it:
set someVar [someFunc $someVar[set someVar {}]]
sbron: According to Miguel in the Tcl chatroom this has been backported to 8.4.8
Lars H: It doesn't seem to be in 8.4.10. Exploiting a bug concerning internal representation of wide integers:
% set a [expr {wide(65536)}] 65536 % expr {$a*$a} 4294967296 % set b $a[set a {}] 65536 % expr {$b*$b} 0
IOW, $b, is not the same Tcl_Obj as the original $a. It is being reparsed from the string representation, and thus becomes an ordinary int. That's the kind of detour one cannot afford if at all bothering with tricks of this kind.
MS: does not have 8.4.10 handy, but just tested 8.4.11 and HEAD and observed the correct behaviour in both cases at the console. According to the Changelog, this was backported on 2004-09-10 and 8.4.8 was tagged for release on 2004-11-19. However, there is a bug in the implementation - it only works for byte-compiled code. See also Bug #1251791 .
Lars H: You're right! If instead of
% set b $a[set a {}]
I do
% if 1 then {set b $a[set a {}]}
then I get 4294967296 also from squaring $b. Good that have this verified.
LV: 2007-11-30: Are there changes that could be made in Tcl (perhaps Tcl 9) after which this optimization would no longer be needed?
This use of K is incidental to the design of Tcl, and can yield dramatic performance gains. The examples make use of the "inline" version of K, but the named version could also be used.
Many built-in commands can make more efficient use of values if they are unshared. A special case is if the value is the empty string, which is always considered "unshared". The pattern is:
set someVar [someFunc $someVar[set someVar {}]]
For example, to make lreplace work in-place.
set theList [lreplace $theList[set theList {}] 7 42]
This pattern can provide profound performance advantages that are not immediately apparent, changing common operations from quadratic in data length to constant scale. Helmut Giese writes about this in Enlightenment sought on use of K combinator , comp.lang.tcl, 2003-03-09, and Impressive results using K combinator Options , comp.lang.tcl, 2003-03-10. It also shows up in many performance discussions, such as how to shuffle a list.
See [L1 ] and Can some explain what is happening here?, comp.lang.tcl, 2010-03-31
PYK 2012-10: MS explained on the IRC channel that $theList[set theList {}] is more effective than $theList[unset theList] because (1) you do not have to update the varname hashtable (assuming the variable is actually in a hastable); (2) you don't have to worry about the variable being at either end of a link, and (3) because set is bytecompiled but unset is not (too complicated). In short, setting to the empty string is faster because it does a lot less.
DKF: It seems from digging around that it depended on a discovery of mine from no later than 1999, i.e., that several Tclers were writing posts to c.l.t at the time describing it as a well-known technique. I've not yet identified what it started, but it may have started from Reversing List, comp.lang.tcl, 1998-11-21 . It's quite difficult to search back in the 1998–99 period; you keep finding irrelevant stuff from later in the way.
KBK: Donal, this deserves to be made more widely known; it just came up again as I was benchmarking Bob Techentin's code to Shuffle a list. It's incredible what a difference it made in the performance.
DKF: On Jan 14, 8:40 am, tomas <[email protected]> wrote:
> someone care to enlighten us newbies on "K-tricks"?
They're named after K, which is one of the fundamental combinators and which is used to produce a sort of constant function. But that's not what we use it for. :-)
What's going on is that, at the C level, Tcl values are stored in reference counted structures called Tcl_Objs. These can be efficiently shared and passed around, but for Tcl's semantics to work right, can only be modified when they're not shared. If they're shared, they have to be copied first; the copy is unshared and can then be modified at will.
Now, applying this knowledge to the specific case where we're going to do a little list manipulating of a value in a variable and write it back:
set x [lreplace $x 2 3 foo bar grill]
Where are the references held? Well, there's going to be one reference in the variable (obviously!) and there's going to be another from the argument list; that second reference has to happen because otherwise we'd lose transient values such as the results of complicated substitutions. But this means that the list value is shared! The lreplace operation has to duplicate it, which is expensive (especially for long lists). What do we want to do to make it cheaper? Easy, we need to reduce the number of references held at the point when lreplace works on the value. Since there's a mandatory reference held as part of the argument list, the one we have to get rid of is due to the variable.
So what does the K trick do here? Well, it lets us reduce the reference count in the variable by changing it to something else or unsetting it, while still using that value. There are many such variations: here's one that's relatively clear:
set x [lreplace [lindex [list $x [set x {}]] 0] 2 3 foo bar grill]
What have we done? Well, firstly most of the command is unchanged. All we've done is replace $x with lindex [list $x [set x {}]]] 0. The only part that is perhaps not immediately obvious to you is that set x {} does indeed have a result: the empty string. However, at the reference level it's more interesting. Firstly, it adds another reference to the value in 'x' (as part of the arguments to list). Then it sets 'x' to the empty string, which reduces the reference count back to 1; the rest of the construction is just about extracting that value from the arguments to list and converting it into a naked argument to lreplace (with no net change to its reference count). That then means that lreplace has an unshared value to work with, and can hence do its edits in place; that's faster.
The original K trick was similar, but used a helper procedure:
proc K {x y} {return $x} set x [lreplace [K $x [unset x]] 2 3 foo bar grill]
The details are slightly different but the overall story is the same. Moreover, in 8.5 there's a cheaper version that has less intermediate state:
set x [lreplace $x[set x {}] 2 3 foo bar grill]
This is a bit trickier as it relies on the fact that concatenation of any value with the empty string is a specially optimized case. (In all the above things, it all works just as well if you use set x {} for unset x; any other constant would do too. Indeed, there's a real sense in which unsetting a variable is just writing a NULL for its contents instead of a value reference.)
Of course, if this is unclear then ignore it. Write your code in a simple way first. Then if you think there's a problem with speed, use time to check whether that's really so. Only use the tricks described in this post if you're sure that the lack of clarity is justified by the measured performance boost.
Prior to Tcl 8.5, this pattern was employed via K:
proc K {x y} {set x}
For example:
set theList [lreplace [K $theList [set theList {}]] 7 42]
NEM: There is also a language K . It's tailored towards high-performance relational database work -- data mining? As I understand it, it is somewhat related to APL, but jcw would know more about the specifics. I'm not aware of any direct connection to Tcl.