Tcl Minimal Escaping Style

Difference between version 83 and 84 - Previous - Next
A '''minimal Tcl escaping style''' is a [code style] that eschews unnecessary
Tcl escaping syntax.



** See also **

   [A question of style]:   A discussion about when to brace [expr%|%expressions].  The Tcl minimal escaping style has nothing to say about that particular scenario,  and certainly does not discourage [Brace your expr-essions%|%proper escaping] of [expr%|%expressions] to avoid [double substitution]. 

   [http://web.archive.org/web/20081009203437/http://www.tclscripting.com/articles/jan06/article3.html%|%You only need curly braces...]:   Looking from a different angle.



** Description **

In some languages such as [C], [Javascript], [Lisp], and [Python], double
quotes indicate a certain type of value: a string.  In Tcl [EIAS%|%all values
are already strings], so double quotes do not have that effect.  Instead, they
[dodekalogue%|%provide an environment] in which whitespace characters are
ordinary characters rather than [word] delimiters. [dodekalogue%|%Braces] do
the same, but additionally inhibit the [dodekalogue%|%standard substitutions].
It's technically possible to use [dodekalogue%|%backslash substitution] to
accomplish the same thing that quotes or braces do for all values except the
[empty string], but in all non-trivial cases this results in unreadable
scripts. 

Some Tcl scripts use [Dodekalogue%|%double quotes] even when not strictly
necessary.  Sometimes this is done to make a text editor highlight those
values, to make Tcl code look a little more like [C] or a [Unix] [shell], or to
indicate that a value is data rather than code to be evaluated by the
interpreter.  Someone learning Tcl might get from this style the wrong idea
that double quotes indicate strings and that braces indicate lists.  The
interpreter may also have to do a little extra work to look for
[substitution%|%substitutions].

The minimal Tcl escaping style presented here makes a script more readable by
using only escape characters only as strictly necessary to achieve the desired
effect.  Here are the rules:

   '''braces''' :   Don't use braces where no backslash substitutions are otherwise needed.

   '''quotes''' :   Don't use quotes where braces could be used. 


That's it.  Code that follows this style is both more readable and more
performant.  It also works in a greater variety of scenarios such as [code
generation].  Beginners reading such scripts will get a better sense of Tcl,
understanding the real intent of braces and quotes, and perhaps avoid the
all-to-common misunderstanding that braces mean "[list]".

The format of a [list%|%canonical list], which does not use double quotes since
no variable or command substitutions are done, conforms to this
minimal escaping style. 

----

[MS] dons his evil hat and remarks: in that case, ''really'' minimal style could be without quotes or braces, just backslash-escape the whitespaces! Of course, it depends on how you define ''minimal'' ...

[PYK] 2015-04-30: Except that braces and quotes were presumably introduced into
the language specifically to alleviate backslash pain.  This minimal escaping
style merely advocates using them only for their respective intended purposes.
Note that the minimal escaping style says nothing about backslash escaping,
which is always fair game.

[PYK] 2022-04-06:  Also, there is one word that can't be expressed in a script
without either quotes or braces:  The [empty string].

----

In the minimal escaping style, quotes are used for a value that contains both
whitespace and
[http://wiki.tcl.tk/10259#pagetoc9713972a#pagetoc9713972a%|%substitutions],
since braces would over-escape the value, preventing the substitutions from
occurring.

Some routines such as `[expr]`, `[regexp]` interpret their arguments accoding
to their own [domain-specific language].  The domain-specific language of
`[expr]` happens to be so similar to Tcl proper that it could in some ways be
considered a superset of Tcl.  The Tcl minimal quoting style can be used in the
context of `[expr]` as well, as long as literal non-numeric values are quoted.
To conform to the minimal escaping syntax, the command,

======
expr {$mystring eq "hello, world"}
======

would instead be:

======
expr {$mystring eq {hello, world}}
======



---

[PYK] 2015-08-31:  Over at [C-like structs and file scope], [Brian Griffin] has this to say, and I think it also applies to some of the arguments critical of the minimal escaping style:


    :   ''IMHO, using C syntax, even a little bit, is a bad idea.  The more Tcl looks like C, the more you raise the expectation of C style parsing as well.  This is the one area that screws with programmers minds the most.  Because Tcl looks similar to C, the false conclusion is reached that it's parsed like C, which couldn't be farther from the truth. Tcl is just not C, so stop trying to turn it into C.  Any programmer worth their salt can be multilingual; you don't need to baby them with fake syntax.''



** Examples **

======
#instead of
puts "Hello!"

#use
puts Hello!
======


======
#instead of 
puts "Hello World."

#use
puts {Hello, World.}
======

======
#instead of 
proc {x} { ... }

#use
proc x { ... }
======



** Counterpoint **

A few reasons to use a less grammar-bound style of quoting:

   * Syntax highlighting and "type" identification.  Writing strings quoted with "" generally causes editors to highlight them differently, and conveys to the reader that the value is intended to be treated as a string.  The visual cues make code easier to read on the wiki, in an editor or printed on paper.
   * It's arguably more common to make a change like "Hello, world!" -> "Hello, $whom!".  Not having to change delimiters makes such changes easier and less bug-prone.
   * `proc foo {a} {...}` is another case where leaving out the first set of `{}` just makes later edits more cumbersome.

These points can be summarised in the famous Abelson and Sussman quote:  "Programs must be written for people to read, and only incidentally for machines to execute."  Optimise for the human reader, not the mechanical parser.

[aspect] can see the minimal escaping style as a useful pedagogical device for users who haven't yet internalised the [dodekalogue], but finds its use elsewhere rather bizarre.  Changing existing examples to this style without specific justification for that example seems gratuitous and potentially harmful.

This might be a good place to collect other examples of divergent coding styles .. [Heronian triangles] uses some short definitions to make equations work with minimal punctuation, at the cost of performance and safety.

[PYK] 2015-05-17:  Wouldn't the proper solution here be to fix your editor's
Tcl syntax highlighting algorithms, rather than to change your coding style
just for the sake of an editor that "doesn't get it?"  The problem with the
"convey to the reader that the value is intended to be treated as a string"
idea is that in Tcl '''[everything is a string%|%every value is intended to be
treated as a string]'''.  I don't think this "type identification" idea holds
any water at all.  Rather, it obfuscates the nature of Tcl.  If I ever looked
at a piece of Tcl code and thought, "oh, this value is enclosed in quotes --
now I know it's going to be used as a string", it was back before I really
grokked Tcl.  Instead, to understand what's happening in any given command, I
look at the command name, the value itself, and its position.  I'd love to see
some examples where double quotes actually add to a Tcl script the kind of
value you're talking about.  I suspect that any such examples that are brought
up would be just as easily distinguished as strings if they were enclosed in
braces, and that those who prefer quotes are just carrying habits over from
other languages they work in.  I agree that the minimal style isn't strictly
adhered to nearly as widely as it should be, and that as more Tclers become
more vigilant about it, there will be a corresponding increase in the speed at
which newcomers grasp the nuances of the [dodekalogue%|%rules] of Tcl.



** Discussion **

[RLE]: [PYK] your 'rules' are starting to stretch and crack under the strain here...  To gain a starting point, and so no one has to keep scrolling back, here are your rules as they stand today (2015-05-19):
----
Here are the rules of the minimal escaping style:

   '''braces''' :   Don't use braces where no backslash substitutions are otherwise needed.

   '''quotes''' :   Don't use quotes where braces could be used. 

That's it. ...

----

So, first, you are now starting to talk about "In the normal Tcl context" and "in the `[expr]` context".  But your rules, as you've set them out above, contain absolutely no mention of any 'context' sensitive aspect at all.  So, no language in any of your rules mentioning context means there is no contextual limitation to your rules as you've enumerated them so far.  So your now adding a hidden contextual limitation is a big red flag that your rules are straining to stand up under load.

So, lets look at the rules, number one, expanding the contraction, says "do not use braces where no backslash substitutions are otherwise needed".  Ignoring the double negative for the moment, this says that when there is "no need for backslash substitutions" one should "not use braces". 

Ok, so, for "expr {$mystring eq hello}, since there are zero backslash subs in this command, there is "no need for backslash substitutions" and rule one says "do not use braces".  Therefore writing "expr {$mystring eq {hello}} would "use braces" when rule one says "do not use braces", and thereby violate rule 1.  So to meet rule 1, we can not use braces.

So, on to rule two: "do not use quotes where braces could be used".  Since to follow rule one, one "can not use braces", then in the expr, "braces can not be used" as required by rule 1.  So since we ca not use braces in the [expr], rule two says that quotes should, in fact, be used.  I.e., since we can't use braces, due to rule one, then rule 2 does not apply, and so, [[expr {$mystring eq "hello"}]] is actually within what your rules proscribe (at least as you have written them, with the double negative assertion in rule 1).

So, what we have here is a nice contradiction.  You say your rules require "expr {$mystring {hello}}" and your rules themselves, when parsed, say that the proper, minimal, quoting is "expr {$mystring "hello"}".  Two different answers, from the same set of rules.

[PYK] 2015-05-20:  As we all know, the [dodekalogue%|%rules] of Tcl take a
minimal approach, and this minimalism yields much more than initially meets the
eye. Its prime offering is the ability to design [little language%|%little
languages] with ease.  When it came time to tack on `[expr]`, we ended up with
a little language very similar to Tcl itself, but with a few extra constraints.
Barewords were made off-limits in order to save some room for expansion of the
`[expr]` grammar at a later date.  `[expr]` happens to be the most visible of
little languages of Tcl, but it's not part of Tcl proper, and its syntax is not
covered in the [dodekalogue%|%Tcl rules].  The minimal escaping rules don't
aspire to cover all the Tcl-like little languages that might spring up.  On the
one hand, it's asking a lot to demand that a set of rules cover two different
grammars (not even the Tcl rules manage that), but on the other hand, I think
the minimal escaping style can easily extend to `[expr]` syntax.  All that's
needed is to exclude rule 1.  The rules are written such that they proscribe
rather than prescribe braces and quotes given certain conditions, so one
approach is simply to understand that because of the additional bareword
constraint in `[expr]`, rule 1 doesn't apply to its syntax.  I think that the
"no backslash substitutions are otherwise needed" is already perfectly
sufficient for that, as that part makes no sense for `[expr]` grammar.  [RLE]'s
interpretation can be summarized as, "Doc, it hurts when I appy this rule to a
grammar where the rule is nonsensical."  The obvious answer is, '''Don't do
that!''' If we wanted to be really explicit we could simply add, "... or they
are needed to avoid barewords in `[expr]` expressions," and we're golden again.
I really don't see the need for that, though.  After all, The Tcl rules also
leave many things implicit, such as the grammar of `[list]`.

[PYK] 2018-07-01:   Additionally, in the `[expr]` example above, backslash
rules would otherwise be needed in order to avoid [Brace your
expr-essions%|%double substitution], so it's actually an example of the
goodness of rule #1.


----

[EMJ] (2015-04-22) Re "Sometimes this is done..." near the beginning: it might be done to clarify the purpose of a string, or with an eye on a future change (and so is about program maintainability). And I just don't believe the "seem more like [C]" bit.

[PYK] 2015-04-30:  Anyone who thinks they can improve this page is free to dive
in and rewrite it wholesale as far as I'm concerned....

[EMJ] (2015-04-22) No they can't, you'll put their name on the rewrites and try to turn them into another bit of your own private but global edit war!

[PYK] 2015-04-30: I added your name to your comments because they were in discussion format, expressing a different opinion.  In that case, the speakers need to be labeled so the reader can understand that it's different people talking.  
If you don't want to attribute yourself,  you could always defer to ''anonymous coward'' or something.  
In any case, other readers already aren't going to understand what we're discussing here because the page has already taken on another form.  That's cool.  It'll just need some cleanup later.

[RS] 2015-05-13: Some exceptions:
   * semicolons must be quoted/braced if not used as statement separators: 

======
set x [split $y ";"]
======

   * small characters are better visible if redundantly quoted/braced: e.g. "."

[PYK] 2015-05-13: To conform to the minimal escaping style, that would either be

======
set x [split $y \;]
======

or

======
set x [split $y {;}]
======

The idea behind the minimal escaping style is specifically to not use braces
quotes or backslash for side purposes such as increasing visibility of
characters, appeasing syntax highlighters, or posing as another language.  
To be justified under the minimal escaping style, a brace, quote, or backslash
must have a programmatic effect.  Says me :)

[MiHa] 2015-05-18: So, what is the point of replacing `"xx"` with `{xx}` ?
onforming to some random style is no value per se.
`{xx}` doesn't get syntax-highlighted as a string by the wiki.
I count that as a disadvantage.

If some style makes it harder for the user to see / understand / work / modify the code,
is bad.

Is there some "official" / regular style ?

[RLE] 2015-05-18:  There is no "official" or regular style, per se., as in "imposed by others".  There are some suggestions that have the advantage of making code easier to read (i.e., indenting one's nested blocks), but nothing required/official, other than that required in order to get the Tcl parser to parse and execute the script.

For what it is worth, the "point" of replacing `"xx"` with `{xx}` is essentially arbitrary and decided by PYK.  It is simply yet another version of [pyk%|%the great space-comma rampage] of last year by PYK, only at least with this rampage PYK has a more logical and reasonable reason for the changes.  But in the end, it is simply PYK imposing his "one world view" on everyone else yet again (albeit with a more reasoned and reasonable excuse behind the imposition).

[EMJ] (2015-05-18) No sensible syntax-highlighter for Tcl is going to highlight `{xx}` as text because then most of the code would be highlighted as text.

[PYK] 2015-05-18:  It's not arbitrary, but informed by the
[dodekalogue%|%rules] of Tcl. In the case of `"xx"`, the double quotes have no
functional value since they aren't escaping any whitespace.  For that matter,
the same is true of `{xx}`.  To conform to the minimal escaping style, it would
simply be `xx`.  As an example, the first random page I came across just now that uses double quotes is [dictutils]. They only occur in a couple of spots, and here is one of them:

======
foreachLine l myfile.tcl { puts [format "%-4d | %s" [incr count] $l] }
======

The minimal escaping style for this line would be:

======
foreachLine l myfile.tcl { puts [format {%-4d | %s} [incr count] $l] }
======

More difficult to read?  Nah, but certainly more informative to newcomers who
can note that from a functional standpoint only braces are necessary to get the
job done.  The main point of the Tcl minimal escaping style is readability,
actually, since its wide adoption would help reduce [quoting hell] issues among
newcomers -- in my experience the primary source of the most unreadable and
contorted Tcl code out there.

When I started using the wiki a couple of years ago, the vast majority of its
code examples weren't highlighted at all.  I've put many hours into fixing that
([dkf] has as well), so I guess syntax-highlighted examples on the wiki are yet
another one of my rampages imposing my "one world view" on everyone :) 

[EMJ] 2015-05-18: No, it's an example of the perfectly reasonable wiki-gnome activity of converting a page to use a useful but previously-unavailable piece of wiki-formatting without any actual change in content (assuming you manage to not accidentally change the content).

BTW,
======
foreachLine l myfile.tcl { puts [format {%-${nwidth}d | %s} [incr count] $l] }
======

I can see a reason for doing the above, but I seem to have forgotten that now I '''have''' to use double quotes!

[PYK]: Eat more casein protein!

[EMJ] 2015-05-18: It would be nice (or perhaps not) if I could see any point in that remark.

[PYK]: The point is that quotes are not there to save the programmer from
forgetting to use the right escaping mechanism for their needs in a given
scenario.  That's more a function of size of dessert portions, caffeine levels,
domestic felicity, etc.  We all make typos, forget to link variables from other
scopes, and commit any number of other programming mistakes.  The purpose of
double quotes is not to save the programmer from braces, but to permit
substitutions '''when necessary'''.  If double quotes are used where braces
would have sufficed, another more important signal to the reader is sacrificed,
namely that '''"There's (almost) no substitution going on here, folks.  Move
along."''' 

[EMJ] 2015-05-18: You claim that your minimal quoting helps beginners. I don't
agree. Even if it did, for a short while, you are actually hampering their
ability to eventually understand how the language works, and to be able to
understand code not written in this style. My normal approach is to use double-
quotes unless I '''need''' to suppress substitution, so when I see later that I
have used braces other than as an '''apparent''' part of the language structure,
I know that there was a reason for it. Not so different from your argument
really, just the other way round. But I can still cope with other styles, it may
just take a little longer sometimes. As for all this diet etc. stuff, we know that how you
feel affects your mistake rate, but it's got nothing to do with a choice of
programming style.


[PYK]:  An assertion like "hampering their ability to understand code not
written in this style" deserves some evidence or explanation.  Have you got
any?  I believe quite the opposite, that it enhances their ability to
understand code written in any style.  This "apparent part of the language
structure" stuff is exactly what messes newcomers up.  It's so easy to start off
with that mistaken conception about braces, and hold it for a long, long time.
Just ask scoofy in the [Tcl Chatroom].



----
'''[JMS] - 2023-11-04 23:54:53'''

Based on some of the points in the discussion, I think "the rules" could be clarified a bit to use clearer wording and remove ambiguity.
I propose the following four-clause algorithm:

    1. If a string (value, command argument, etc) can be represented unambiguously as is, without any quoting or escaping, then it shall be represented as is.
    2. Otherwise, if a string can be escaped by simply surrounding it with braces, then it shall be surrounded by braces.
    3. Otherwise, if the string contains spaces ''(or semicolons?)'', it shall be surrounded by double-quotes, with all special characters backslash-escaped.  (This looks cleaner and is often shorter than escaping every individual space, like the `[list]` command does when it can't use braces.)
    4. Otherwise, the string shall be written without surrounding braces or quotes, with all special characters backslash-escaped.

''(Note: this happens to be ''exactly'' how I'd define `[list]`'s canonical representation if given the chance, rather than its current, kinda inconsistent and not quite formally defined behavior.)''

However, I wouldn't say this style should ''always'' be followed with no flexibility at all; in some cases it may be justified to bend the rules a bit for the sake of program readability (and leaving "hints" to future developers of your code), for example when it may be unclear whether something needs quoting, or to highlight the fact that a bareword is actually a list or a command/script (e.g., you may prefer to write «`proc foo args return`» as «`proc foo {args} {return}`»), or simply when a different choice of representation would be clearly more concise or more readable.
----

'''[dusthillresident] - Tue  7 Nov 11:37:29 GMT 2023'''

I agree very much with the Counterpoint, it says pretty much everything I've been thinking while reading about this style.

I think the examples of the style on this page only show how the style is more harm than help. As soon as I saw them, I instantly thought:

1. "But what if later on you want to add more to that message? Then you end up adding quotes or braces anyway..."

2. "But what if later on you want to make it say 'Good (morning/afternoon/evening) (username), the current time is (current time)", then you have to change the braces to quotes anyway..."

3. "But then later on you have to add braces there anyway if it later turns out that your procedure needs to take more arguments..."

It reminds me of stuff I've done in the past when I've been thinking like "let's be clever and write something as short and compact as possible!", only to find myself having to fix and remove my "cleverness" later on when expanding the code, either because I was adding new features or for other reasons. 

I understand the dodekalogue well, but I still use quotes or braces or write other potentially technically sub-optimal constructs deliberately, as a way to express intent, very much like how I might use phrasing carefully when using human language. Subtle details can sometimes express a lot and make a big difference to clarity.

Syntax highlighting in text editors is another valid reason not to adopt this style. The suggestion given in response to this, "fix your text editor", mentioned further up on this page, is really not practical.
 
I come away from this unsure of the value of this style, I see more reasons to avoid it than to adopt it. At least from the impression I'm getting from this page, it feels more like an obsessive purism kind of thing rather than something that has a sound logical or rational basis.

I can imagine though that this style could have some merit, in some particular situations I can imagine it improve code clarity as well as performance. I think better examples are needed though, the current examples as I said really don't show how the style could be of any benefit at all.

<<categories>> Code Style | Tcl Quoting