'''Tcl warts''' presents aspects of [Tcl] that are considered frustrating, confusing, aggravating, enraging, etc.

** Mandatory "quoting" of strings in `[expr]` **

The following is not currently allowed:

======
set a hello
expr {$a eq hello}
======

There's no good reason. It's just a wart.

[AMG]: I'm inclined to disagree. There is good reason. The [[[expr]]] syntax is significantly different than the [Tcl] syntax, and in the [[expr]] language, strings must be quoted (using braces or double quotes, not backslashes on each character). That's how it's defined. This is done to avoid misinterpreting strings as operators or function names or other special characters such as whitespace and brackets.

Perhaps you're suggesting to treat unrecognized words as if they were quoted strings. Sure, this can be done, but it opens the door to surprises when a previously-unrecognized word becomes recognized, for instance when a new math function is added.

When it comes to language design, my preference is to be as picky as possible because it leaves room for future expansion. Every construct that is currently marked illegal is really marked as reserved for future use. This is how we were able to get [{*}].

[PYK] 2014-06-11: The "previously-unrecognized word" scenario could be avoided by having `[expr]` interpret anything that looks like a function call as a function call, and failing if it can't be found, but yes, I guess knowing whether this is really a wart would require learning the rationale behind the decision to require double quotes or braces around string literals.

[AMG]: [[[expr]]] already has this behavior. Everything that looks like a function call (i.e. is an unquoted string followed by optional whitespace then open parenthesis then zero or more comma-delimited arguments then close parenthesis), it tries to invoke as a function call (i.e. calls [[[tcl::mathfunc]::''name'' [{*}]$''arglist'']]), then fails if it can't be found.
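The behaviour discussed above can be sketched as follows, assuming Tcl 8.5 or later (where math functions are resolved through the `tcl::mathfunc` namespace). The function name `twice` is hypothetical and exists only for this illustration; it is not part of Tcl.

======
set a hello

# A bare word is rejected by expr, so catch reports an error (status 1).
set bareStatus [catch {expr {$a eq hello}} bareMsg]

# Quoting the literal with double quotes or braces is accepted.
set quoted [expr {$a eq "hello"}]
set braced [expr {$a eq {hello}}]

# An unquoted word followed by parentheses is treated as a function call;
# the call fails while no such function exists.
set callStatus [catch {expr {twice(21)}} callMsg]

# Defining a command in tcl::mathfunc makes the same expression valid.
# "twice" is a hypothetical name used only for this example.
proc ::tcl::mathfunc::twice {x} {expr {$x * 2}}
set result [expr {twice(21)}]

puts [list $bareStatus $quoted $braced $callStatus $result]
======

Under those assumptions the final `puts` should print `1 1 1 1 42`: both failures are caught, both quoted comparisons are true, and the newly defined function is found on the second attempt. The text of the error messages is deliberately not shown, since it differs between Tcl versions.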
** `[namespace]` **

See '''An Anonymous Critique''' [http://wiki.tcl.tk/1489#pagetoc5382e1bc] in `[namespace]`

** `[glob]` **

The need to somehow quote the `glob {}` combination so that the braces are passed on to glob intact. Use quotation marks (`"`) or braces around the arguments to get them past the parser.

** [Regular expressions] **

Like the problem with `[glob]`, the need to quote arguments appropriately so that various regular expression metacharacters make it through the Tcl parsing.

** [comment%|%Comments] **

[Why can I not place unmatched braces in Tcl comments]

** Hashes vs [array%|%Arrays] **

Sometimes people attempt to simulate arrays of 2 or more dimensions using Tcl's associative hashes (aka Tcl arrays). The gotcha here is that because the array index is a string, white space is significant.

======
$ set a1(1,2) abc
abc
$ puts $a1( 1,2)
can't read "a1( 1,2)": no such element in array
    while evaluating {puts $a1( 1,2)}
======

Another gotcha here is trying to set array elements whose index contains white space:

======
$ set a1( 1,2) abc
wrong # args: should be "set varName ?newValue?"
    while evaluating {set a1( 1,2) abc}
======

You need to quote the variable name if the index contains white space.

[AMG]: Use nested [dict]s. Or, generate the array index using `[list] $row $col` instead of `$row,$col`. In all cases, be aware that `$row` and `$col` are general strings, not numbers.

** `[exec]` Special Characters **

Easily Tcl's most undeniable blemish: [exec]'s brokenness in not directly allowing `<` or `>` as leading characters in arguments. `[open] |` has the same problem. See [exec ampersand problem] for discussion of a very similar problem involving `&`. Also, `|` doesn't work as the first character, and neither can the two-character sequence `2>` be used.

`[{*}]` addresses another blemish, which was previously handled using `[eval]` and precise `[list]` manipulation.

----

[LV]: Shoot, `[exec]` doesn't allow `|` as a leading character in an argument either.
And `-` as the leading character in an argument can cause problems, and so on... On the other hand, if you are talking about using `<` or `>` as the leading character '''in a file name''', you solve that problem in the same way you solve the others - make certain it is a relative path name:

======
% exec ls ./>stuff
./>stuff: No such file or directory
======

So it is doable.

[AMG]: `-` as leading character can cause two problems. If it's the leading character of the first word of the pipeline, it needs to be preceded by `--`, or else exec will think it's an option. If it's the leading character of other words, it may still require `--` but for a different reason. It won't be `[exec]` misinterpreting the `-`, but rather the `[exec]`'ed command. Of course, each command is different, so `--` isn't always appropriate.

"Quoting" (disabling special interpretation of) filenames with `./` works fine, but there's no way to do it for non-filenames, where the external command absolutely must receive `<`, `>`, `2>`, `|`, or `&` as the first character of an argument.

Since [sh] internally handles these characters, it also supports quoting them. Tcl doesn't handle these characters internally, so it falls to exec to do quoting, but exec doesn't. Adding quoting to exec would be highly unfortunate, considering how it would have to be used in practice.

If you must start an argument with these special characters, construct your command pipeline as a suitably-quoted argument to "sh -c".

======
exec sh -c {echo ""}
======

But watch out that the !'s don't get interpreted as history substitution markers!

Hmm, I bet there's a security issue here. If arbitrary untrusted user-provided data gets passed as the argument to an external program, what's to stop the user from passing `>/etc/passwd` and hosing the system?

Perhaps what's needed is an alternative to `[exec]` that doesn't use in-band signaling metacharacter sequences to configure command pipelines.
For example, alternate between commands, each of which would be a list-formatted argument, and options describing how the commands are tied together. This can get complicated, so maybe justify the complexity of the syntax by also making it more powerful. Consider allowing for more flexible pipelines that branch different file descriptors around into a tree structure. E.g. some program is run, its stdout gets piped to the [stdin] of `tee log`, its [stderr] gets piped to the [stdin] of `tee errors`, and the stdout of each tee gets combined again to be the stdout of the whole pipeline. Or perform tee-like functionality internally, so that the stdout/stderr can be separately piped through gzip then written to disk but also combined to be the pipeline stdout. Lots of craziness is possible.

[PYK] 2014-05-29: 4 years after [AMG]'s comments above comes [TIP] [http://www.tcl.tk/cgi-bin/tct/tip/424.html], which fits [AMG]'s description to a "tee".

----

[AMG]: Most of these quoting-related warts arise from conflicts between the commands' [little language]s and the core language of Tcl defined in the [dodekalogue]. The same is true for [sh], e.g. you need to quote wildcards in the `-name` option to the find program, or else the shell will expand them prematurely. Tcl has fewer metacharacters than sh, hence quoting isn't needed as much.

An example of a non-collision is [format]'s % specifiers. They're similar to `$` in that they substitute in the value of an argument (which is like a read-only variable), but `%` and `$` don't collide simply because the notation uses a different symbol.

An example of a collision is [regexp]'s `\` character, which works very much like, but not identically to, Tcl's `\` character. The two collide because the symbol is the same. `[regexp]` could change `\` to === ` === (which looks like the upper half of a backslash), but this breaks compatibility and makes Tcl look more like [Perl] in that too many different typographical symbols are used.
Also this isolated change would gain nothing without also changing `[regexp]`'s `[[`, `]]`, and `$`. All this, just to avoid having to quote! And quoting would still be necessary for patterns containing quote or whitespace characters.

Back to the format command: it's acceptable here to introduce a different symbol, since the notation is inherited from [C].

** Inconsistencies in names in Library **

[PYK] is inclined to delete this one as there is no example.

[AMG]: This might be a consequence of Tcl 8.0's addition of the "object" API which uses [Tcl_Obj] instead of character pointers. Many functions had Obj added to the name, but it's not totally consistent. One example that comes to mind is `Tcl_GetString()` versus `Tcl_GetStringFromObj()` [http://www.tcl.tk/man/tcl/TclLib/StringObj.htm].

** [[[error]]] versus [[[return] -code error]] **

[AMG]: Using [[return -code error]] gives a cleaner stack trace than [[error]]:

======
% proc err1 {} {error asdf}
% proc err2 {} {return -code error asdf}
% err1; set errorInfo
asdf
    while executing
"error asdf"
    (procedure "err1" line 1)
    invoked from within
"err1"
% err2; set errorInfo
asdf
    while executing
"err2"
======

What reason is there for [[error]] to inject itself into the stack trace?

** Backslash-newline-whitespace replacement in brace-quoted words **

[AMG]: I argue that this should not happen because brace-quoted words ought to be for verbatim text. See [http://wiki.tcl.tk/10259#pagetocb6ec067b] for details.

** Inconsistent error messages **

[AMG]: Witness:

======none
% puts
wrong # args: should be "puts ?-nonewline? ?channelId? string"
% binary asdf
unknown or ambiguous subcommand "asdf": must be decode, encode, format, or scan
% fileevent asdf fdsa
bad event name "fdsa": must be readable or writable
======

Sometimes we get "should be", sometimes "must be". Sometimes it's "unknown or ambiguous", sometimes just "bad". I'm sure there are many more examples of inconsistency, but these few serve to illustrate the issue.
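The messages in the transcript above can also be gathered from a script rather than typed interactively. This is a minimal sketch; `errmsg` is a hypothetical helper name introduced only for this example, and the exact wording printed may differ between Tcl versions.

======
# errmsg is a hypothetical helper: evaluate a script that is expected to
# fail, in the caller's scope, and return the resulting error message.
proc errmsg {script} {
    if {[catch {uplevel 1 $script} msg]} {
        return $msg
    }
    return -code error "script unexpectedly succeeded: $script"
}

# Print each failing command next to the message it produces.
foreach script {
    {puts}
    {binary asdf}
    {fileevent asdf fdsa}
} {
    puts [format {%-22s -> %s} $script [errmsg $script]]
}
======

Listing the messages side by side like this makes differences in phrasing easy to compare when hunting for further examples.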
** See Also **

   [Dangerous Constructs]:   

   [gotcha]:   

<<categories>> Advocacy | Design