Why is TCL syntax so weird

Summary

Why is Tcl Syntax so weird?

Description

It's not. Everything else is weird.


Larry Smith 2013-04-10:

And this is literally true. But it can be very hard to see if you are used to languages with elaborate syntaxes, which includes C and all its derivatives.

Tcl is a macro language: it derives from macro-assemblers, the C preprocessor, and older, text-oriented languages like TRAC (Text Reckoning And Compiling) and SAM76. Essentially, all it does is scan a line of text and perform a series of substitutions according to Tcl's invariable substitution rules. Applications like expand actually use Tcl in just this way. Where most Tcl programs are more interested in the side effects (like setting variables), expand simply does the substitution and returns the resulting string, which makes it a simple macro processor. If you were a heavy abuser of the C preprocessor, if you had experience with macro-assemblers, or if you are a computer history nerd with a taste for obscure languages, Tcl's nature is immediately apparent, totally logical, and completely predictable. All its power comes from the fact that there are no exceptions to its syntax rules (with the arguable exception of {*}, but we won't go into that here). :) At any rate, many scripting languages today are built with this same paradigm.
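The "scan and substitute" view can be tried directly with the built-in subst command, which applies Tcl's substitutions to a string and returns the result without running it as a command. This is only a small illustration, not how the expand application itself is written; the variable names are made up for the example.

```tcl
# subst performs variable, command and backslash substitution on its
# argument and returns the resulting string.
set name World
set template {Hello, $name! 1 + 2 = [expr {1 + 2}]}
set expanded [subst $template]
puts $expanded   ;# Hello, World! 1 + 2 = 3
```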


Okay, you might not buy this. I don't either, completely. The point is, though, that Tcl may be the only language I've ever heard of where there are no special/built-in functions.

VL 2003-06-14: Hmmm, how about Prolog, LISP-dialects, etc. etc. etc.? Or am I missing your point here? I suppose that what is a language and what is the "standard library" of that language are a bit of a grey area.

pdm: Good point. I have mostly used procedural rather than functional languages. From my college days, I vaguely remember the syntax being simpler in Prolog or Lisp, which actually made it harder for me to think that way. Maybe I was poisoned by an early exposure to BASIC :)

There is no Pascal println that allows variable arguments when every other function has a fixed number of arguments.

LV: On the other hand, variable arguments are just fine in Tcl. It is just that the default output command, puts, requires one string as its output format. If someone wanted to write something like println, they could write a proc that would collect its arguments, hand them to format and then to puts.

In fact, you could have something like this:

proc println { args } {
    puts [join $args {}]
}

so that you coded, in Tcl,

set a "STring 1"
set b "string 2"
println $a $b [list This is a list]
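Note that join with an empty separator concatenates the arguments with nothing between them. A variant closer to the "hand them to format and then to puts" description might look like the sketch below. The name printfln is hypothetical, and the {*} expansion assumes Tcl 8.5 or later.

```tcl
# Hypothetical helper: the first argument is a format string, the
# remaining arguments are the values to fill in.
proc printfln {fmt args} {
    puts [format $fmt {*}$args]
}

set a "STring 1"
set b "string 2"
printfln "%s, %s, and a list of %d items" $a $b [llength [list This is a list]]
```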

On the third hand (if you are so endowed), in Tcl, you really don't NEED a println-type function, since you can just use the one argument of puts:

puts "This is some text, and here's $one, and here's $two, and here's $three ."

In Pascal, you had to have variable arguments so that you could put together strings and variables. Tcl allows you to just use them.

pdm: This shows one of the strengths of Tcl: you can make it do almost anything you want. Sometimes it isn't obvious, especially if you are stuck in Pascal-think. I find myself having to think harder because I'm trying to force C code into Tcl. If I just do it the Tcl way it can be much easier :)


There is no C for that has three arguments separated by semi-colons where each argument can have any number of operations separated by commas.

LV: There are, however, several looping constructs in Tcl. For instance, where in C you would write:

for (x = 0; x < 10; x++) {
    printf("x is %d\n", x);
}

in Tcl you would write:

for { set x 0 } {$x<10} { incr x } {
    puts "x is $x"
}

glennj: the Tcl for is just like the C for: the start and next parameters are arbitrary script bodies. Not that this is encouraged, but you can do:

# add up the numbers from 0 to 9
for {set sum 0; set counter 0} {$counter < 10} {incr sum $counter; incr counter} {}
puts $sum ;# ==> 45

pdm: You're right, of course. What I meant was that the rules for Pascal or C are not simple and consistent.

The rules for Tcl are. Since I have "baggage" of C, Perl, etc, it seems weird at first.

It is real simple: a command followed by zero or more arguments. That's it.

Unfortunately, that means a simple assignment statement has to be written funny. Trying to say

a = "12345"

would try to run the a command, which probably doesn't exist. If you try to use Perl-like syntax, you might try

$a = "12345"

which would also fail because the parser would try to get the value of a variable called a and treat that as the command name (see Substituting Command Names).
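Both failure modes are easy to observe. The sketch below assumes a plain tclsh in which no command named a exists; the variable names cmd, status and message are chosen for the example.

```tcl
# The first word of a command is the command name, after substitution.
set cmd puts
$cmd "hello"          ;# same as: puts "hello"

# A bare assignment-style line looks for a command named "a".
set status [catch {a = "12345"} message]
puts "status=$status message=$message"
```

Here catch returns 1 because the command lookup fails. With a variable in the first position, as in $cmd above, the variable's value is what gets looked up as the command.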

LV: Unless you play games with Salt and Sugar, which allows you to write assignments in more unusual ways.

So, Tcl has a set command that takes two arguments: the first is the name of a variable and the second is the value.

I just think it's weird because everything else has special cases.

LV: Other languages use set too, csh for instance, which apparently influenced Tcl in a number of ways. The original BASIC required a syntax like:

[let] a = 1

which is even worse!

Arjen Markus: I can add to this, that formally and semantically the use of an equal sign, as it appears in many programming languages, is claimed to be a bad move by some. The reason is, that in a mathematical sense, an equal sign simply means that something is equal to something else.

Larry Smith I believe this particular sin dates back to Fortran.

In descriptions of algorithms one frequently sees

variable <- value

to distinguish it from a check for equality.

Larry Smith And C does as well, but there it uses "=" for assignment, and therefore must use "==" for equality test, thus setting up the perfect trap for the unwary and a fecund source of bugs notorious for being nearly invisible, since even C coders will look at if (a = b) and read it as an equality test.

Still other languages, Eiffel for one, use this

variable := value

Smalltalk used the arrow, but encoded it as the "_" character, so to this day, "a _ 0" is interpreted as "a<-0" by Smalltalk. This was deemed much too opaque, however, so Smalltalk now also recognizes :=, a notation that dates back to Algol and is standard in all of Wirth's language designs. Using ":=" and "=" is much less error-prone than "=" and "==".

So, again, the fact that Tcl does not adhere to the commonly found constructs does not mean that it is weird.

As a personal note, the first time I tried to find man pages about Tcl I was very disappointed to find only a description of the basic syntax - nothing about if, for or whatever.

DKF: Note that in some BASIC systems (ZX80, ZX81 and ZXSpectrum in all its variants, though some came with variations that did not work this way IIRC) the LET keyword was inserted by pressing a single key, which made its use much less of a chore (though the platform was definitely fairly bizarre.)

Lars H: That comment makes me remember when I wrote (in BASIC) a combined assembler/editor for the ZX Spectrum. The entire assembler program was stored in a string (now that we know that everything is a string, this is obvious, but back then it wasn't). And "n" is still my "foo loop variable", because "NEXT n" was just a matter of pressing the "n" key twice.


LV: As a personal note in reply to the message about disappointment in the man pages - man pages were not, at least initially, intended as tutorials to the use of the subcommands, but simple reference items describing what a command's subcommands, etc. were.

It was, and for the most part still is, expected that sites like this one, books, etc. would provide tutorial-level information.

If there is reference material that you feel is seriously lacking, there are several alternatives.

  1. We have a variety of pages on this wiki, at least one per tcl command, on which questions, and subsequently answers to those questions, can be placed.
  2. http://tcl.sf.net/ and the tk sibling have a feature request option, where one can request changes. These may, or may not, be implemented, but at least they won't be lost.
  3. At that same site, there is a patch manager, where people can submit patches to broken behavior. If you feel the man pages are broken, consider fixing at least one of them, and submitting it as an example of proper documentation.

As another note, there is a really, really weird construct in C that is explicitly allowed, if my memory serves, it is called the Duffy construct and consists of an unexpected mixture of switch and case statements. Unfortunately, both my memory and my library do not serve at this moment. The reason it is allowed by the standard is merely that it optimizes certain operations.

pdm: For the compiler/language-designer purists, that brings up an interesting question: when is it okay to "break" or "abuse" the syntax of a language in the name of faster runtime? My guess? The Tcl and C designers answered this question "whenever a clever user/developer can figure out how" and "whenever needed", respectively :)

EE: Duff's Device is an example of loop unrolling, but the reason it is allowed by the standard is not "merely that it optimizes certain operations", but rather that it does not violate any syntax rules or constraints.

AMG: I did a bit of loop unrolling here on the wiki. See my contribution to Matrix Multiplication. The Tcl script uses nested [for] loops to generate a single, long line of Tcl script that accomplishes matrix multiplication with no run-time loops. Alas, I wasn't responsible enough to profile before I optimized, or else I'd be able to say whether or not this is actually worth anything.

In a sense, I embed a mini-compiler in my code. This ties in with the meta-language comment further down this page.
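As a much smaller illustration of the same idea (not the code from Matrix Multiplication), a script can use a loop at definition time to generate a loop-free procedure body. The names make_dot and dot3 are hypothetical.

```tcl
# Generate a procedure dot<n> that computes the dot product of two
# n-element lists using one long expression and no run-time loop.
proc make_dot {n} {
    set terms {}
    for {set i 0} {$i < $n} {incr i} {
        # Each term is literal script text such as: [lindex $a 0] * [lindex $b 0]
        lappend terms "\[lindex \$a $i\] * \[lindex \$b $i\]"
    }
    # list quotes the joined expression so it reaches expr as one braced word.
    proc dot$n {a b} [list expr [join $terms " + "]]
}

make_dot 3
puts [dot3 {1 2 3} {4 5 6}]   ;# 32
puts [info body dot3]         ;# shows the generated, unrolled expression
```

Whether this is faster than a plain loop would have to be measured, for example with the time command.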


RS: Call it weird or not, most "algebraic" languages like C have mathematical syntax (infix operators +-*/, parens etc.) mixed in with keywords, braces etc. C++ goes so far as to allow overloading operators, but you can't define new operators or new arities (for instance, a unary /). Math syntax is of course needed as computers often compute. In Tcl, most math is concentrated into the expr command, leaving the rest of the language as clear and pure as it is. But even in expr, using the general rules of string manipulation we can do more things than are possible in C or other algebraic languages:

foreach operator {+ - * /} {
    puts "3 $operator 3 = [expr 3 $operator 3]"
}

With a braced argument, the expr parser would not accept the operator substitution, because it would see $operator as an operand rather than an operator. So in cases like this, leave the argument to expr unbraced, so you can enjoy the full freedom of Tcl.

DKF: SML permits declaration of new operators. Overloading existing ones requires a language extension though.

AM: Fortran (90 and later) allows the definition of new unary and binary operators, it also allows extending the existing operators for new data types (but you can not redefine the meaning of summing two integers for instance - for obvious and wise reasons).


Earl Johnson:

There are some weird items in the parsing of Tcl like:

set x a"b
set y $x.b
set x.b c
puts $y ; # is a"b.b

but on the whole the language is simple. The parser is just a small part of the story, though: after the parser gets done with the source, each command has almost complete freedom to add more levels of substitution, evaluation, or whatever.

I don't think this "strangeness" is too bad because I picked it up so easily, but I can understand that others may be turned off by it.

;^)
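The behaviour follows from the rule for $name substitution: the name ends at the first character that is not a letter, digit, underscore or namespace separator, so $x.b is the value of x followed by the literal text .b. A short sketch of the difference, using the ${name} form to read the variable that is really called x.b:

```tcl
set x a"b
set x.b c
puts $x.b      ;# a"b.b  (value of x, then the literal ".b")
puts ${x.b}    ;# c      (value of the variable named x.b)
```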


This is a response to Earl's remarks above:

I think this "strangeness" is in the eye of the beholder. When I read this code it all looks right to me. Before I knew Tcl, it would have told me that Tcl is different. I'd expect it to be different; after all, it is a different language. The question I needed to resolve was "Is it worth it to learn Tcl?", since it's different.

Tk is what made it worthwhile for me, and I suspect it did for others. I didn't learn Perl because it was different, didn't have a GUI, and seemed to carry over a lot of the ills of other early shell languages. bob


TV: What a strange page title..

I guess people have a certain structure or language in mind when they program, or maybe certain concepts, which could be either like comparing English and German, or maybe like comparing Logo and C++. Tcl is of course simple in its fundamental form while it has the power to do most known programming things without too much effort.

But if your mindset is to be grounded in Pascal's formal definitions or your fingers just won't move without objects of your desire, you probably weren't raised on Tcl or scripting.


Tcl syntax *is* weird, at least when compared to other (popular) languages. It's weird in that it doesn't have much syntax. Tcl uses semantics where most languages would use syntax: variable assignment, control flow and command definition are all done through the same syntax. Whereas most languages have a list of keywords and a BNF grammar for these sorts of operations, Tcl has default commands, and even these can be redefined.

I think of Tcl more as a meta-language, and every program winds up being a custom and application-specific language. It's similar to Lisp and Forth in that regard, but (IMO) easier to read than either of those.

And that is weird to anybody coming from a C++/Perl/Java or even Python background. In my experience, lisp-niks get it right away (but dislike Tcl for the syntax it does have).
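To make the "commands instead of syntax" point concrete, here is a minimal sketch of a new control structure written as an ordinary procedure. The name repeat is hypothetical, and the sketch deliberately ignores break and continue inside the body.

```tcl
# A user-defined loop: run body a fixed number of times in the caller's scope.
proc repeat {count body} {
    for {set i 0} {$i < $count} {incr i} {
        uplevel 1 $body
    }
}

set total 0
repeat 3 { incr total 10 }
puts $total   ;# 30
```

To the caller, repeat looks exactly like for or while, because those are ordinary commands as well.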

pdm: and for the heretical under-use of parentheses :)

Seriously, though, this last comment puts it very well. Thank you, anonymous contributor!


alove: 2005-12-10: The reason this issue keeps coming up is that other languages, all the way back to BASIC and including Perl and Python, have very user-friendly and intuitive syntax for math expressions, for example:

Python: c = a + b
Perl:   $c = $a + $b;
BASIC:  c = a + b
PHP:    $c = $a + $b
Ruby:   c = a + b
TCL:    set c [expr {$a + $b}]]     - can you feel the difference?

Considering how basic and fundamental arithmetic expressions are, I'm amazed that no one has fixed this yet.

User friendliness takes precedence over abstract ideas like the Dekalogue. If you have to make an exception to make arithmetic expressions look pretty, as in every other interpreted language, then I think you should.


EMJ 2005-12-10: I was going to correct the Tcl line above, then I thought it was better to have a visible correction:

Tcl:    set c [expr $a + $b]

The original works (except for the extra right bracket), but the braces are unnecessary, so why put them in?

Or why not put something unnecessary in the other languages?

aricb: with apologies for interrupting, EMJ: put in those braces for all the reasons at Brace your expr-essions. Those few extra keystrokes will pay big dividends--and you'll sleep better at night :)
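One of those reasons, double substitution, can be seen in a few lines. This is only a sketch; the variable names are chosen for the example.

```tcl
set b 5
set a {$b}    ;# the value of a is the literal text $b

# Unbraced: Tcl substitutes $a first, then expr substitutes the
# resulting $b a second time.
puts [expr $a + 1]   ;# 6

# Braced: expr sees $a once, and its value is not a number.
set status [catch {expr {$a + 1}} message]
puts "status=$status message=$message"
```

With data that comes from outside the program, the second round of substitution in the unbraced form can also run embedded [commands], which is why the braced form is the safer default.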

Larry Smith Unfortunately, they also make it much harder to read the expression, and they add large numbers of obscuring characters to index expressions - which is why Tcl now has a brain-damaged expression evaluator for index expressions. In my opinion adding "$a+1" and "end-2" to indexes was a tacit admission of this. It would've been far cleaner if we acknowledged that numerical expressions just have different rules and used another mechanism to trigger them. Still, we might've avoided some of this ugliness if we hadn't decided to pretend the "$x" substitution was being done by the Tcl engine when it's really being done by expr, which didn't need to do it that way. Expr knows perfectly well that "x" is a reference to variable "x"; the $ is useless sugar to try to hide its intrinsically unTclish nature. 8.4 had a set of patches that fixed this, and if we had taken that forward, we could write exprs without braces, catenate all the arguments together, and get the benefits of byte-compiling without the need for the fol-de-rol we have now. Even now, I still think this wart could be completely fixed by using a different set of delimiters for expressions.

emj: Apology accepted, I'm prepared to believe all the reasons, but would adding bits improve the behaviour of any of the other languages? I don't know, but when making comparisons you need to be on the same level.

Everything is a string, and every line is a command - that's Tcl: it's clean, simple, and consistent. Of course, since it is flexible as well, you could just add a bit of logic to unknown to say that if the second argument is just an equals sign, convert the whole thing to set ... [expr ...]. Naive users may think it's special syntax, but it won't be. And I won't do it, because I don't want it.

Changing the foundations of Tcl to follow a fashion, or to be like some other language, is a very slippery slope.
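For the curious, the unknown trick described above might be sketched as follows. This is an illustration only, assuming Tcl 8.5 or later for {*}; the helper name _original_unknown is made up, and it only works when no command with the variable's name exists.

```tcl
# Keep the stock handler, if any, so other unknown commands behave as before.
if {[llength [info commands unknown]]} {
    rename unknown _original_unknown
}

proc unknown {args} {
    # Pattern handled here: name = expression words...
    if {[llength $args] >= 3 && [lindex $args 1] eq "="} {
        set name [lindex $args 0]
        set expression [join [lrange $args 2 end] " "]
        set value [uplevel 1 [list expr $expression]]
        return [uplevel 1 [list set $name $value]]
    }
    if {[llength [info commands _original_unknown]]} {
        return [uplevel 1 [list _original_unknown {*}$args]]
    }
    return -code error "invalid command name \"[lindex $args 0]\""
}

set a 2
set b 3
c = $a + $b
puts $c   ;# 5
```

The arguments have already been substituted by the time unknown sees them, so the expression is effectively unbraced, with all the caveats that implies.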

jcw: That last comment can be a two-edged sword. (emj I do not want to take a two edged sword down a slippery slope :) ) Tcl is also very adept at embedding domain-specific notations - the "expr" command itself is an example of that. There are ways to stay with the current rules, yet abbreviate:

$ {c = a + b}

This requires a command "$" which parses its arg, a bit like expr but treating names as variable or array references.
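A very rough sketch of such a command is shown below. It is not jcw's implementation: it handles only the plain "name = expression" form, treats every bare identifier as a scalar variable (so math functions and array references are not supported), and relies on regsub returning its result, which assumes a reasonably recent Tcl.

```tcl
# Define a command literally named "$".
proc \$ {assignment} {
    if {![regexp {^\s*(\w+)\s*=(.*)$} $assignment -> name rhs]} {
        return -code error "expected \"name = expression\""
    }
    # Prefix each bare identifier with $ so expr reads it as a variable.
    set rhs [regsub -all {\m[A-Za-z_]\w*\M} $rhs {$&}]
    set value [uplevel 1 [list expr $rhs]]
    uplevel 1 [list set $name $value]
}

set a 2
set b 3
$ {c = a + b}
puts $c   ;# 5
```

The call works because a $ that is not followed by a variable name is an ordinary character, so the first word of the command is simply "$".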

emj: Embed what you like, no argument there.

The other notational inconvenience is list indexing and such:

set x [lindex $data 12]
set y [lrange $data 1 end-1]

If array index notation were allowed for lists, it could be written as:

$ {x = data(12)}
$ {y = data(1,-1)}

All without changing anything in Tcl. As an extension which would work with any Tcl >= 8.2

emj: but what will happen if you do

parray data

to a list?


slebetman: While I respect alove's opinion, I have to disagree. Even in C, math expressions account for less than 10% of my code (I took C/C++ classes to avoid math ;-). Unless you're doing declarative programming, the majority of your code will be control flow - for, if, while etc. (Actually the majority of C++ code tends to be struct and class definitions in header files :-) Just look at your code: how many exprs are there, and what's the ratio of expr to the rest of your code? It doesn't make sense to complicate the clean syntax of Tcl for something that accounts for such a small minority of code.

AMG: [expr] isn't the only way to do math in Tcl: [if] and [while] have expression parameters.

Some commands implement math using their own expression syntax, for instance [lindex] accepts end-$x and [incr] groks positive and negative integers. Plus there's a TIP (I think) that proposes to extend index notation to accept $x+$y. I'd have to double-check that to be sure, but my point still stands: you can't simply look for the [expr]s to see how much math is in your code.
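A few lines show math happening outside of any explicit [expr]. This is a sketch; the M+N index form on the second lindex line assumes Tcl 8.5 or later.

```tcl
set data {a b c d e}
set x 1

puts [lindex $data end-$x]   ;# d  (index arithmetic relative to the end)
puts [lindex $data $x+1]     ;# c  (index arithmetic, Tcl 8.5+)

# The conditions of while and if are expressions too.
set n 0
while {$n * 2 < 6} { incr n }
puts $n                      ;# 3
```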

Regarding the above [expr] abbreviation discussion, I'd like to see support for $[$insert + $math * $here] notation. (Note the lack of braces inside the brackets.)

Also see Math Operators as Commands for an alternative to most uses of [expr].
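In Tcl 8.5 and later the operators are available as commands in the ::tcl::mathop namespace. A minimal sketch of using them in prefix form:

```tcl
# Make the operator commands reachable by their short names.
namespace path {::tcl::mathop}

puts [+ 1 2 3]        ;# 6
puts [* 2 [- 10 4]]   ;# 12
```

Because they are ordinary commands, they can also be passed around, for example as the command prefix given to a fold or map helper.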


Another way to think about this: in most other languages, syntax defines the development experience. With Tcl, though (as well as Lisp and Forth), syntax is available for extension at any time. Ed Watkeys writes about this from a Scheme perspective.


LV I never understood the drive to force Tcl into having all the features of another language - why not just use that other language if you like the features? I guess, at least in some cases, it is a matter of trying to make a good language even better, by adding the favorite part of the other language. That path leads to Perl (shudder).

Larry Smith For me, the issue is that I feel Tcl is the only language that has come so close to perfection. Compared to C, C++, or even the "fixed" Cs like C#, Java, and so on, Tcl is easier to write, debug and develop in. Part of its charm is its simplicity of syntax, which is a huge advantage. But that same simplicity has a couple of corner cases, and one of them is expr. The current need to brace expressions is a case in point, a bandaid over a wart (i.e., doesn't help, but still makes trouble).

The fundamental requirements of being able to write an expression in something resembling mathematical notation are very much at odds with Tcl's simple syntax. expr tries to deal with this within Tcl's syntax rules, and works provided you don't mind the clutter it introduces to otherwise simple Tcl syntax. The pain causes us to do weird things like partial expression evaluation in indexes, but then that itself becomes an exception. The real problem is, the problem domain (expressions) is just not easily expressible in Tcl's solution domain (endekalog).

I call the brace requirement a bandaid because it could have been avoided simply by allowing expr to parse variable names without emulating Tcl syntax - e.g., [ expr a+b ] should compile to produce the sum of a and b, as should [ expr a + b ].

Making expr pretend to be another substitution (with $) added the requirement for braces, two more characters of clutter overhead per expression and one per variable reference. There was actually a set of patches that fixed this for 8.4. It would still be a good idea to make that standard, even if we must maintain backward compatibility. But that can largely be handled just by ignoring $ in expressions.

The real solution is to acknowledge that, while Tcl syntax is ideal and simple for most problem domains, expressions are not among them. It needs some kind of escape to "real" expressions. What would be most concise would be an escape using different delimiters to invoke expr. If this were () (yes, I recall the objections) then you could drop in (a+b) anywhere you bloody liked and know you'd get bytecode for adding a and b in that spot -- and it would work everywhere, in indexes (sparing us the partial evaluation code) in if and while statements, and so on. I find this more elegant because acknowledging the problem and providing the escape just happens to make Tcl more like other programming languages, and has the nice side-effect of making transitions easier.

This would still leave us with set, but expressions should be able to do assignments anyway. (a := b+c) becomes a logical replacement for set a [ expr {$b+$c} ]. It also calls out intent to a reader - here I'm doing normal Tcl stuff but here I'm just doing math.

Of course, I'm old-school. I started with punch cards and moved on to teletypes, so the number of keystrokes it takes to explain something to a computer is still a big concern to me, as is readability. Both are enhanced this way.


arjen I reverted the change made by "gudfak", because it was inappropriate. I would not have minded the claim he/she made that Tcl's syntax is weird, but that would have to be a separate discussion point - with arguments.