Is everything a list?

LES on April 27, 2004: I am a newbie so what do I know. But the more I use Tcl, the more I am convinced that everything is a list. And that's one of the best features in Tcl.

FW: Wrong ;)

  "{"

, for example, is not a list. Your real point is probably that, say, numeric or other simple values can be treated as one-item lists: [lindex 123 0] == 123. Which, when you think about it, is really because of everything being a string!

Lars H: Actually,

  "{"

is a list (whose only element is a left brace), but

  {

is not a list. Although when you write them in a command you need to quote them, and "{" is one way to quote {, so

   llength "{"

rightfully produces the error "unmatched open brace in list". The most transparent way to quote "{" is probably

   \"\{\"

FW: Right, I meant "{" as in [llength "{"], not that the quotes would be part of the value.
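The distinction drawn above can be checked from a script. What follows is only a minimal sketch, assuming Tcl 8.5 or later (for `string is list`); the variable names are made up for illustration.

```tcl
# A lone left brace is a valid string but not a valid list.
set brace \{
puts [string length $brace]          ;# 1
puts [string is list $brace]         ;# 0
set failed [catch {llength $brace} msg]
puts "$failed: $msg"                 ;# 1: unmatched open brace in list

# The three-character string quote, left brace, quote is a one-element list.
set quoted \"\{\"
puts [string length $quoted]         ;# 3
puts [llength $quoted]               ;# 1
puts [lindex $quoted 0]              ;# prints a single left brace
```

Using `catch` keeps the script running when the value turns out not to be a list, which is the usual way to probe a string of unknown origin.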

LES: I don't think it is "transparent"...

   % set x \"\{\"

"{"

   % llength $x

1

   % puts $x

"{"

Hmm... I was expecting to get { instead of "{"

Lars H: Time to reread the endekalogue? Of course the above sets x to the three character string quote, left brace, quote (set said so itself). And of course it doesn't matter that you've treated the string as a list when you later ask puts to print it; everything is a string, and merely using that string doesn't change it. You do however get

   % puts [lindex $x 0]
   {
   % puts [lrange $x 0 end]
   \{

LES: So right. But I still say that it is not "transparent". The brace is a special character in Tcl and will often boggle minds if it's part of that string. But now we're digressing.


Is everything a list (2)?

KJN 2004-11-04

The endekalogue does not define lists - it leaves it to commands to decide whether to interpret an argument as a list (and, implicitly, to define what is meant by a list). It appears that the list commands will automatically interpret any string as a list (unless the string has unmatched braces, see above).

I would like to express an arbitrary (nested) proper list in Tcl; but how do I distinguish

 "an atom"

from

 [list "an" "atom"]

It is true that I can write

 {"an atom"}

but it appears that list commands cannot distinguish this from

 [list [list "an" "atom"]]

In short, whenever you use list commands to look at the element "an atom", it appears that they will always regard this as a list of length 2, not length 1.

Is there any way to fix this, rather than to work around it by:

(a) substituting spaces in "atomic" strings;

(b) recording the type in the expression, either only in cases where list commands would get it "wrong" (cf the 4 examples above):

 [list ATOM "an atom"]
 [list "an" "atom"]
 [list [list ATOM "an atom"]]
 [list [list "an" "atom"]]

or always (which by warning me not to use list commands on the "atom", also saves me the job of escaping any braces inside it):

 [list ATOM "an atom"]
 [list [list ATOM "an"] [list ATOM "atom"]]
 [list [list ATOM "an atom"]]
 [list [list [list ATOM "an"] [list ATOM "atom"]]]

(c) since "ATOM" is starting to look like a tag, give up on "simple" lists and use XML instead;

(d) give up on Tcl, and use Lisp instead
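For what it is worth, workaround (b) in its "always" form can be sketched in a few lines. This is an illustration only, assuming Tcl 8.5 or later (for `{*}`); the helper names `atom`, `isAtom` and `leaves` are hypothetical and not part of any library.

```tcl
# Hypothetical helpers for workaround (b), "always" variant:
# an atom is the two-element list {ATOM text}; anything else is a list of nodes.
proc atom {text} {
    return [list ATOM $text]
}

proc isAtom {node} {
    return [expr {[llength $node] == 2 && [lindex $node 0] eq "ATOM"}]
}

# Collect the leaf strings of a tagged tree, left to right.
proc leaves {node} {
    if {[isAtom $node]} {
        return [list [lindex $node 1]]
    }
    set result {}
    foreach child $node {
        lappend result {*}[leaves $child]
    }
    return $result
}

set tree [list [atom "an atom"] [list [atom an] [atom atom]]]
puts [llength $tree]            ;# 2
puts [llength [leaves $tree]]   ;# 3
puts [lindex [leaves $tree] 0]  ;# an atom
```

The leaf text is only ever extracted with `lindex` and never itself parsed as a list, so it may contain spaces or unbalanced braces.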

RHS 04Nov2004

There is no difference between

 "an atom"

and

 list "an" "atom"

I think what you're looking for is more of:

 % lindex [list "an" "atom"] 0
 an
 % lindex [list "an atom"] 0
 an atom

KJN 2004-11-05

Tcl does:

 % lindex [lindex [list "an"] 0] 0
 an

So what I'd like is

 % lindex [lindex [list "an atom"] 0] 0
 an atom

but what I get is

 % lindex [lindex [list "an atom"] 0] 0
 an

As you say: there is no difference between

 "an atom"

and

 list "an" "atom"

Both are lists of length 2.
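A short script makes the equivalence concrete. It is a sketch using only core list commands; the variable names are illustrative.

```tcl
set a "an atom"
set b [list an atom]
puts [expr {$a eq $b}]        ;# 1: the two values are the same string
puts [llength $a]             ;# 2

# Wrapping with list adds one level; each lindex index removes one.
set wrapped [list $a]
puts $wrapped                 ;# {an atom}
puts [llength $wrapped]       ;# 1
puts [lindex $wrapped 0]      ;# an atom
puts [lindex $wrapped 0 0]    ;# an
```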

AD 06Jul2018

This seems to me equivalent to the lisp case when using lists:

Lisp does:

 % (car '((an)))
 (an)

and also

 % (car '((an atom)))
 (an atom)
 % (car (car '((an atom))))
 an

which is right, because the first item is a list and the first item of that list is an atom.

Why do you find Tcl behaving differently? To me it behaves correctly, given that in Tcl a list is represented as a string.

RHS 05Nov2004

Ok, let me go about this another way then... Why is it you want this behavior? I think the problem is that you want to be able to say "this is a list" and "this is not a list". In Tcl, there's no such concept. You can only say "I want to treat this like a list" and "I want to treat this like some-other-thing-that-isn't-a-list".

What are you trying to do that this isn't sufficient for?

KJN: I'd like to store tree-structured data, with the possibility that a leaf can be an arbitrary string. If I build up the tree using list, and then inspect it with llength, lindex and so on, I hit the problem that my intended leaf is itself treated as a list, and may have length not equal to 1. There are obvious workarounds, like the ones I've mentioned above. I wondered whether it is possible to instruct the list commands that a string such as "an atom" is to be treated as length 1, so that I could use the native Tcl commands with no workarounds. A fictitious way to do this might be

 list -norecurse -- "an atom"

RS: As lists can always go via string rep and back, this hidden feature would not be stable. But how about separating the domains of "text" (any string) and "child nodes" (a list)? E.g. to model a parent node with two children,

 set tree {"a parent" {{"a child" {}} {"another child" {}}}}
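A tree in this {text children} shape can be traversed without ever applying a list command to the text part. The following is a minimal sketch, assuming Tcl 8.5 or later (for `lassign` and `{*}`); the proc name `walk` is hypothetical.

```tcl
# Each node is a two-element list: {text children}.
set tree {"a parent" {{"a child" {}} {"another child" {}}}}

# Return one indented line per node, parents before children.
proc walk {node {depth 0}} {
    lassign $node text children
    set lines [list "[string repeat {  } $depth]$text"]
    foreach child $children {
        lappend lines {*}[walk $child [expr {$depth + 1}]]
    }
    return $lines
}

puts [join [walk $tree] \n]
```

Only the node itself and its children element are treated as lists; the text is taken out as an element and used as a plain string, so it can contain any characters.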

KJN: yes, this is the same in spirit as my

 [list [list [list ATOM "an"] [list ATOM "atom"]]]

workaround. This kind of solution is the best I can think of, because it makes maximum use of the native list commands and data structures, and avoids the need to process the leaf strings to make them both list-safe and length 1. The result

 % llength "an atom"
 2

surprised me, but I suppose it is a consequence of the encoding that Tcl chooses when it expresses a list as a string, plus the requirement (which you mention) of invariance when a value is converted from list to string and back again.

So, to summarise "Is everything a list?" (I hope I understand this now; please edit any errors):

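  • nearly every string is also a valid list - the exceptions are strings in which an element begins with a brace or quote that is never closed, or in which a closing brace or quote is followed directly by a non-whitespace character. If such a string is passed to a list-processing command when a list is expected, the command will throw an error.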
  • if a string that is not a valid list is an argument of the list command, the return value of the command is a valid list, that has escapes and braces inserted to ensure its validity. These escapes and braces are visible in the string representation of the list, but are removed if the list item is extracted (e.g. with lindex), so that the original item is restored.
  • a string is not always a list of length 1, but is parsed into list elements, according to its whitespace, quotes and braces. The string representation of a valid list will be parsed into the original list, so the parsing rules can be understood by inspecting lists printed out with puts.
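The three points of the summary can each be tried out directly. This is a small sketch, assuming Tcl 8.5 or later (for `string is list`); the variable names are illustrative.

```tcl
# 1. Most strings are valid lists; a few are not.
puts [string is list "a b c"]     ;# 1
puts [string is list "a \{b"]     ;# 0 (an element starts with an unclosed brace)

# 2. list quotes an awkward item; lindex gives the original back.
set item "\{ not a list"
set wrapped [list $item]
puts [string is list $wrapped]    ;# 1
puts [expr {[lindex $wrapped 0] eq $item}]   ;# 1

# 3. A string is split into elements by whitespace, quotes and braces.
puts [llength "an atom"]          ;# 2
puts [llength {{an atom}}]        ;# 1
puts [llength {"an atom" again}]  ;# 2
```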

aricb: One of the fundamental properties of Tcl lists is ambiguous depth. To know how to interpret a nested list, you need some other source of information than the list structure itself. There are probably several different ways to build that information into the contents of your list.

However, one very nice alternative is the tree code in the Tcllib struct package. It's very tclish, much more convenient than rolling your own tree infrastructure, and it includes (among other features) a very nice command for traversing a tree in the order of your choice.

KJN - thanks, I'll have a look at that.

Lars H: At the end of the parsetcl page, there is a beautiful example of how a "tagged tree" can be converted to something quite different (in that case a Tcl script), by making use of the fact that every list is a command. The trick is to make sure every node, i.e., not just the leaves, is tagged and then for every tag create a proc which converts that type of subtree to the desired format. Then to convert a tree, one only has to evaluate it!
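The idea can be illustrated on a much smaller scale than the parsetcl example. The sketch below is not taken from that page: the tags `ATOM` and `LIST` and the output format are assumptions chosen for illustration. Every node is a list whose first element is a tag, and each tag is a proc.

```tcl
# Hypothetical tag procs: evaluating a node renders it as Lisp-like text.
proc ATOM {text} {
    return "\"$text\""
}

proc LIST {args} {
    set parts {}
    foreach child $args {
        # Each child is itself a tagged list, hence a command.
        lappend parts [eval $child]
    }
    return "([join $parts { }])"
}

set tree [list LIST \
    [list ATOM "an atom"] \
    [list LIST [list ATOM an] [list ATOM atom]]]

puts [eval $tree]   ;# ("an atom" ("an" "atom"))
```

Because the tree is evaluated as code, this approach is only suitable for trees built by the program itself or otherwise trusted.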

LES on August 28, 2007: An interesting thought for anyone who has just discovered Tcl: I started the "Is everything a list" question more than 3 years ago (see above). And see what a beautiful discussion it brought up. :-)

Today I feel like adding that, back then, I was probably still under the influence of Perl and PHP, languages that I used for a while before adopting Tcl. In those languages and probably others, whenever some command or function returns some text, the language will almost certainly treat it as a string. If you want to print (echo, puts etc.) it, for example, you'll have a string. Turning that string into a list (known as "array" in Perl and PHP) so that it can be used as a list requires additional steps. In Tcl, the same value can be treated as a string or as a list almost willy-nilly. No additional steps. Whatever differences there are between a string and a list seem to be 100% irrelevant most of the time, and whether Tcl knows it or works like that just by coincidence, it still is a very practical and convenient approach.

The interesting thought is that I didn't like it at first. I actually frowned upon it because it was like "too much freedom". Some mysterious feeling inside me told me it wasn't supposed to be like that. It wasn't "proper". And that's exactly how an awful lot of people feel about data types. Take types away from them and they will immediately feel robbed and outraged. But Tcl doesn't use types, everything is a string and it works beautifully. Another language may have told you that you simply cannot code without data types, but that's just someone's particular approach, not some universal truth. "Never let school interfere with your education." - Mark Twain.