list

'''[http://www.tcl.tk/man/tcl/TclCmd/list.htm%|%list]''', a [Tcl
Commands%|%built-in] [Tcl] [command%|%routine], creates a [Tcl Rules
Redux%|%list].



** Synopsis **

    :   '''list''' ?''arg arg ...''? 



** Summary **

`list` creates a new list and appends each ''arg'', in order, as an element in the list.  If no ''args''
are given, the new list is empty.
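
As a minimal illustration (the variable names are arbitrary and not taken from the manual), each argument becomes exactly one element, regardless of the whitespace it contains:

======
set greeting {hello world}
set l [list $greeting 42 {}]
puts $l                 ;# -> {hello world} 42 {}
puts [llength $l]       ;# -> 3
puts [llength [list]]   ;# -> 0
======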




** Documentation **

   [http://www.tcl.tk/man/tcl/TclCmd/list.htm%|%man page]:   
   [http://core.tcl.tk/tcl/artifact?filename=generic/tclUtil.c&ci=trunk%|%generic/tclUtil.c]:   Documentation of the list format.
   
   [tip%|%Tip] [http://www.tcl.tk/cgi-bin/tct/tip/407%|%407], The String Representation of Tcl Lists: the Gory Details:   
   
   [tip%|%Tip] [http://www.tcl.tk/cgi-bin/tct/tip/148.html%|%148], Correct [[list]]-Quoting of the '#' Character:   



** See Also: **

   [Tcl Quoting]:   

   [deep list]:   A list in which each single [value] is encoded as a list containing only one item, and a list containing multiple items is itself a deep list.  This makes it possible to differentiate e.g., the single value `one two three` from a list containing the values `one`, `two`, and `three`.


   [Is everything a list?]:   

   [Additional list functions]:   

   [Chart of existing list functionality]:   

   [Chart of proposed list functionality]:   

   [Internal organization of the list extension]:   

   [pure list]:   A Tcl value for which no string representation has been generated, but for which an internal structured representation has. 

   [scripted list]:   Writes a list using [script substitution%|%script] and [variable substitution], without having to escape the newline between the words, and with the ability to make comments in between words of the list and comment out some words of the list.

   [cmdSplit%|%scriptSplit]:   Splits a command into its logical words, taking [script substitution%|%script-substitution] syntax into account.

   [string is list]:   

   [Tcl syntax]:   



** Standard List Operations **

   '''append''':   `[lappend]` and `[lset]`

   '''delete''':   `[lreplace]`

   '''extend''':   `[concat]`, `[lappend]`, and `[list]`

   '''insert''':   `[linsert]`, `[lreplace]`

   '''length''':   `[llength]` returns the length of the list, which is always `0` or greater.

   '''prepend''':   `[lreplace]`

   '''search''':   `[lsearch]`

   '''set''':   `[lset]`

   '''retrieve''':   `[lindex]` retrieves an element at a particular position.  The position of the first element is `0`.  `[lrange]` retrieves a list of elements within the range of two indexes.  `[lassign]` assigns the elements of a list to a sequence of variables.

   '''transform''':   `[join]`, `[lmap]`, `[lrepeat]`, `[lsort]`, `[lreverse]`, and `[split]`

   '''validate''':   To validate the format of a list, use `[lappend]` (with only one argument) or `[string is list]`.


`[lappend]` and `[lset]` are the only 2 list routines which use
the '''name of a variable containing a list''' rather than the list value itself. 
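
A short sketch of that distinction, using a hypothetical variable `colors`: `lappend` and `lset` modify the variable in place, while routines such as `linsert` take the list value and return a new list.

======
set colors {red green}
lappend colors blue                     ;# modifies the variable
lset colors 0 crimson                   ;# modifies the variable
set longer [linsert $colors 1 orange]   ;# returns a new list
puts $colors                            ;# -> crimson green blue
puts $longer                            ;# -> crimson orange green blue
======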




** Description **

In general, a '''list''' is an ''ordered [tuple] of values''.  In other
languages this is sometimes known as an ''array'' or ''vector''.  `[list]` and
related routines format a list such that it can be interpreted as the words of
a command, making Tcl [homoiconic] (see "lists vs commands" below).

In contrast with other languages, in Tcl a list is not a [data structure], but
a [string] value that conforms to a specific [data format%|%format] derived
from the [dodekalogue%|%rules] of Tcl.  As with all Tcl values, the
interpretation of the string value implies certain operations, and the various
list routines implement those operations.  For performance, those routines
[Tcl_Obj%|%cache] a list [data structure] for future use, but this is an
implementation detail.  This list format and the routines that process it
comprise an [abstract data types%|%abstract data type].

Because a list is a string, string operations can be applied to it, but the
result might no longer be a valid list.  See "lists vs strings" below.

In a script, it is customary in some places, e.g. [proc%|%procedure]
definitions, to write literal values that are lists, but it's often safer to
use list routines when building more complex lists.  These routines ensure
that characters like `{` or `"` are properly quoted:

======
list look, Ma! a \{ list with `" weird characters \} in it
======



** List Format **

A list is parsed exactly as a [script] is parsed, except that
`$`, `[[`, and semicolon have no special meaning, newline is just another
whitespace character, `{*}` is not allowed, and backslash-newline substitution
is not performed on words in braces.  A [script] that contains no [script
substitution%|%script substitutions] and does not use `{*}` is a valid list
composed of all the words of all the commands in the script.  Conversely, any
valid list is a valid [script].

When [Tcl rules] are stripped of those parts that refer to various aspects of
evaluation, they also serve to specify the format of a list.  Here are the
[dodekalogue%|%Tcl rules] that do not apply to the processing of a list:

    '''Evaluation''':   No routine is invoked.  A list is formed by breaking a command into [word%|%words], with each word becoming an element in the list.  The first word, normally the name of a routine to invoke, is simply the first element in the list.

    newline and semicolon:   A list is a sequence of words rather than a sequence of commands, so command delimiters are not necessary: newline is just another whitespace character, and semicolon has no special meaning.

    `#`:   Because there is no concept of multiple commands, `#` has no special meaning.

    '''[dodekalogue%|%variable substitution]''':   No variable substitution is performed, so `$` has no special meaning.

    '''[script substitution%|%script substitution]''':   No script substitution is performed, so `[[` has no special meaning.

    `{*}`:   Since no routines are invoked, `{*}` is not allowed to prefix a word in a list.


In all other respects, list evaluation is identical to script evaluation.
'''[Dodekalogue%|%double quotes]''', '''[Dodekalogue%|%braces]''', and
'''[Dodekalogue%|%backslash substitution]''' are all processed as usual.
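
To make these differences concrete, here is a small sketch with a made-up value.  The outer braces are only script-level quoting that keeps Tcl from substituting anything before `llength` and `lindex` see the string:

======
set s {set x $y; # [incr z]}
puts [llength $s]    ;# -> 6
puts [lindex $s 2]   ;# -> $y;
puts [lindex $s 3]   ;# -> #
puts [lindex $s 4]   ;# -> [incr
======

As a list, the semicolon, `#`, `$`, and the brackets are ordinary characters inside words.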

The [empty string] is a list that contains no words.
Enclosing a list in braces results in a list containing exactly one [word]
whose value is the original list, but enclosing an arbitrary string of
characters in braces does not necessarily result in a well-formed list.  One of
the keys to successfully working with lists in Tcl is to understand that
although braces (`{` and `}`) are often used to delimit individual words in
lists, braces do not mean "list".  They are simply an escaping mechanism for
some of Tcl's special characters, e.g., whitespace.  As [DGP] put it, "Don't
imagine that braces have magic list-ifying powers."

Except for the empty string, double quotes and braces are syntactic sugar for
backslash escaping.  For example, although braces are a very convenient way to
represent nested lists, it's possible to represent the same nested lists
without braces: 

======
puts [lindex {a {b c {d e f g}}} 1 2] ;# -> d e f g
puts [lindex a\ b\\\ c\\\ d\\\\\\\ e\\\\\\\ f\\\\\\\ g 1 2] ;# -> d e f g
======

Not that one would want to write code with all those backslashes.  It just
illustrates the point that when interpreting a string as a list, braces merely
escape the normal interpretation of whitespace as the delimiter between
elements in the list.  They are only indirectly involved, by virtue of their
effect on whitespace, in the interpretation of a value as a nested list.
Braces by themselves do not mean "list".

The definition of the list format is recursive: A list is a sequence of lists
separated by whitespace.  Therefore, `[list]` can be used to prepare a value
for insertion into a list:

======
set part {five six}
set list "{one two} {three four} [list $part] {seven eight}"
llength $list ;# -> 4
======

However, it's normally preferable to use procedures such as `[linsert]` and
`[lappend]` to manipulate lists.

This recursive definition only applies to the format.  The definition of the
abstract data type is slightly different:  A list is a sequence of values.
Procedures like `[lindex]` that implement operations on the abstract data type
perform the needed quoting and unquoting to produce their results.



** A Note on Terminology **
This page discusses both the string format of a list and operations on a
list as an [abstract data type].  In the [dodekalogue%|%rules] of Tcl, "word"
has a special technical definition, and that definition carries over to the
format of a list.  On this page, "word" is used in this technical sense to
refer specifically to a substring of a string that represents an item in a
list.  In other places it's used in the common non-technical sense.  The
applicable meaning should be apparent in each context.



** Lists vs Commands **
Because whitespace delimits words in both [Dodekalogue%|%commands] and lists,
every valid list is a valid script.  However, not every valid script is a valid
list, because the list format does not include [script substitution]:

======
% llength {set b [list $one "$two $three"]}
list element in quotes followed by "]" instead of space
======

The `]` at the end of the command is a violation of the
'''[Dodekalogue%|%double quotes]''' rule, which states that `"` terminates a
quoted word.  A space could be added to turn it into a well-formed list:

======
% llength {set b [list $one "$two $three" ]}
6
======
Although as a script this is a command composed of three words, as
a list there are 6 words because there is no [script substitution%|%script
substitution].  Furthermore, variable substitution is not performed, so the
fifth word in the list is `$two $three`.



** Strings that Are Lists **

An empty string is an empty list:

======
% set list [list]
% llength $list
0

% set list {}
% llength $list
0

% set list ""
% llength $list
0
======

A string containing nothing but whitespace is also an empty list:

======
% set list { }
% llength $list
0

% set list \n
% llength $list
0

set list \t\n\t
llength $list ;# -> 0
======

A string that contains no whitespace is often, but not always, a list (containing one word):

======
% set list hello
% llength $list
1

% set list he\{llo
% llength $list
1

% set notalist \{hello 
% llength $notalist
unmatched open brace in list
======

The following is a list that contains one word, which is an empty string:

======
% set list {{}}
% llength $list
1

% set list [list {}]
llength $list ;# -> 1
======

Whitespace separates words in a list:

======
% set list {1 2}
% llength $list
2

% set list "1 2"
% llength $list
2
======

Here is a list that contains two words, both of which are empty strings:

======
% set list {{} {}}
% llength $list
2
======


List routines parse strings into lists in much the same way that Tcl parses
strings into commands.  In the following example, `[llength]` parses the string
into a list, and the backslash-space sequence results in the first character of
the second word being a space character:

======
% set list {1 \ 2}
% llength $list
2
======

It is very common, and perfectly acceptable, to use braces instead of `list`
when writing a list:

======
% set list {one two three}
% llength $list
3
======

But in Tcl, braces are simply a means of escaping whitespace and other special
characters in strings.  They are a nice way to write out lists, but Tcl itself
doesn't equate braces with lists.  Any quoting can be used to make a
well-formed list, and there are an infinite number of ways to produce a string
that is the same well-formed list:

======
% set list "one two three"
% llength $list
3

% set list one\ two\ three
% llength $list
3

% set list one\x20two\x20three
% llength $list
3

% set list one\ttwo\tthree\t
% llength $list
3

#... and on and on ...
======


When formatting a list as a string, `list` will escape values where necessary:

======
% list \n\{
\n\{
======
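
One way to see that this quoting is lossless is to round-trip an awkward value through `list` and `lindex`.  This is a sketch with an arbitrary test value; other strings should behave the same way:

======
set value "unbalanced \{ brace, \"quote\", and a\nnewline"
set l [list $value]
puts [llength $l]                       ;# -> 1
puts [expr {[lindex $l 0] eq $value}]   ;# -> 1
======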



** Strings that Are Not Lists **

[EIAS%|%All lists are strings], but not all strings are well-formed lists.  The
various list routines expect their arguments to be well-formed lists.


To check whether a string is a list: 

======
string is list $somevariable
======

Alternatively:

======
catch {llength $somevariable}
======

When doing experiments to understand lists, it is a good idea to first assign
the values in question to variables before operating on them, since it is hard
to keep track of when quoting is interpreted as Tcl parses the command vs. when 
a routine is interpreting the arguments it has received.  This
makes it possible to first inspect the value of the string before passing it to
a routine:

======
% set var1 \{
{
======

Many simple strings are lists:

======
% llength hello
1
% llength hello\ world 
2
% llength {how I made a great mistake in quotation}
8
======

Some strings, however, are '''not''' lists:

======
% llength \"
unmatched open quote in list

% llength \{
unmatched open brace in list

% llength "ab{ {x y"
unmatched open brace in list

% llength \{}a
list element in braces followed by "a" instead of space

% llength {{a b} {c}]}
list element in braces followed by "]" instead of space

% llength {{*}exactitude}
list element in braces followed by "exactitude" instead of space
======



** Canonical List **

A single list may be represented by different combinations of double quotes, braces, and
backslashes.  For example, `{[[ value ]]}` is the same list as `{{[[} value \]]}`,
but the latter is the canonical representation of the list.

To produce a canonical list:

======
list {*}$somelist
======

One quick and dirty way to check for list equality is to compare their
canonical forms as strings:

======
expr {[list {*}$list1] eq [list {*}$list2]}
======

This may be undesirable if the lists are large because the string values of
both `$list1` and `$list2` are generated if they haven't been already.
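
A small sketch of the comparison, using two made-up spellings of the same three-element list.  The raw strings differ, but the canonical forms are equal:

======
set list1 {a {b c} d}
set list2 "a b\\ c \{d\}"
puts [expr {$list1 eq $list2}]                       ;# -> 0
puts [expr {[list {*}$list1] eq [list {*}$list2]}]   ;# -> 1
======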

`list {*}$list1` also armors special characters against
possible interpretation if evaluated as a Tcl script:

======
% list {*}{puts $hello; set b [list $one $two $three]}
puts {$hello;} set b {[list} {$one} {$two} {$three]}
======

In other words, a list is formed such that the following is true
([http://wiki.tcl.tk/440#pagetocad1fdaa5%|%1]):

======
expr {[eval list $list] eq [list {*}$list]}
======



** Canonical Streaming List **

The canonical representation generated by Tcl uses braces to enclose words
wherever possible.  In cases where the canonical representation is generated on
the fly as new data arrives this is problematic, since the choice to enclose a
word in braces depends on knowing the entire content of the word up-front.  A
'''canonical streaming list''' uses only backslash escapes, except for the
empty string, which is represented by a pair of adjacent braces `{}`.  A
streaming list is suitable for use in something like a [critbit] structure,
where uniform representation is required.



** Using Braces to Write Lists **

In source code, braces are often used to write lists.  One example is the
arguments to `[proc]`:

======
proc move {element speed args} ...
======

It would be a bit awkward, and a bit slower, to use `list` for that:

======
proc move [list element speed args] ...
======

Likewise, literal braced strings are used with `[switch]` (in the braced
patterns-and-bodies form), and `[string map]` (the map is a list). These
braced strings are usually not a problem, but you may need to think about list formatting when special characters (backslash, braces, whitespace, quotes) are
involved. Sometimes, an extra layer of braces are required around a word in a list, but if it is unbalanced with respect to braces then you may need to
backslash-escape all special characters in it instead.

Braces are also used to escape the body of a `[proc]`, but in that case, the body is not parsed as a list, but as a script (unescaped newlines take on special meaning):

======
proc myproc {} {
    this
    is not
    parsed as a list,
    but as a script
}
======



** Generating Code **

One useful feature of `[list]` is that it produces a value that is properly
quoted as a single word in a script.  This feature is fundamental to the
activity of generating scripts, and generating scripts for `[try]`, `[eval]`,
`[apply]`, and friends is fundamental to the activity of writing programs in
Tcl:

======
set value {lots of spaces}
set script [string map [list @value@ [list $value]] {
    set list {}
    lappend list @value@
    llength $list ;# -> 1
}]
try $script
======

In the previous example, if `list` had not been used to quote the whitespace in
`$value`, the length of the list would instead have been `3`.


----

[KBK] writes on comp.lang.tcl (with some modifications):

If a string has been built up using list routines like `[list]` and
`[lappend]`, then it is always well-formed as a command.  The first word of
the list (word number zero) is the name of the routine and the remaining
words are the arguments passed to that routine.  This method, in
fact, is one of only a very few ways to construct commands that have the
desired parameters when one or more of the parameters contains user-supplied
data, possibly including nasties like backslashes or unbalanced braces.

In particular, using double-quotes and the string routines such as `[append]`
and `[subst]` is NOT safe for constructing commands from untrusted values.

Moreover, `[eval]` and its friends give you special support for the technique
of using lists as commands.  If a command being evaluated is a "pure" list,
i.e. one that was constructed using the list routines, and has never
acquired a string representation, then the evaluator is able to short-circuit
the parsing process, knowing that all arguments have been substituted.  It
doesn't need to do the (fairly expensive) scan for `$`- `[]`- and `\`- substitution,
nor balance `""` and `{}`, but can go directly to looking up the routine by name and
invoking it.
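
The contrast can be sketched as follows.  The value of `userInput` is a made-up example of hostile data: with `list` it stays a single argument, whereas the double-quoted version lets the embedded semicolon start a second command.

======
set userInput {x; set hacked 1}
set hacked 0

# Built with list: the whole value is one argument to [string length].
set safe [list string length $userInput]
puts [eval $safe]   ;# -> 15
puts $hacked        ;# -> 0

# Built with double quotes: evaluates "string length x", then "set hacked 1".
set unsafe "string length $userInput"
eval $unsafe
puts $hacked        ;# -> 1
======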




** Using List to Concatenate Lists **

`list` can be used with [{*}] to the same effect as `[concat]`:

======
set a {a b c}; set b {d e f}
list $a $b                    ;# -> {a b c} {d e f}
concat $a $b                  ;# -> a b c d e f
list {*}$a {*}$b              ;# -> a b c d e f
======

See [Concatenating lists] for a timing comparison of the various methods.



** Concatenating Lists **

The following three methods for concatenating lists are roughly equivalent in
performance:

======
set list hello
concat $list $list
list {*}$list {*}$list
lappend list {*}$list
======

The difference is that `[concat]` does not make sure its arguments are valid
lists, and `[lappend]` modifies `$list`.

Before the advent of the [{*}] operator, the following syntax was used:

======
eval [list lappend baseList] $extraList
======



** Newline-delimited Lists **

When writing a list of lists to a file, it's useful to represent it using the
newline character to separate the words of the lists.  Here's how to do that:

======
foreach list $tosave {
    puts $chan \{
    foreach word $list {
        puts $chan [list $word]
    }
    puts $chan \}
    puts $chan {}
}
======

The result is a list of lists, having the same length as `$tosave`.
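
To check this without touching the filesystem, the same loop can append to a string instead of writing to a channel.  This is only a sketch; `tosave` and `out` are illustrative names and the data is made up:

======
set tosave {{a {b c}} {d e f}}
set out {}
foreach list $tosave {
    append out \{ \n
    foreach word $list {
        append out [list $word] \n
    }
    append out \} \n \n
}
puts [llength $out]      ;# -> 2
puts [lindex $out 0 1]   ;# -> b c
======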



** Internal Structured Representation **


Internally, Tcl tracks the structure of the list, and the various list routines
take advantage of this so that element access takes O(1) time (access time does
not depend on the total list length or position within the list).  A string
representation of a list is not made until it is needed.  Therefore,
a `[string]` operation on a large list may incur a dramatic performance/storage
penalty if it causes the string representation to be generated.  The rule of thumb
is to use list-aware routines for lists, and avoid string routines.  One
obvious exception to this rule is `[string is list]`, which is smart enough not
to generate the string representation of a pure list.

`[concat]` is a string operation, but is smart enough not to incur the
penalty if all its arguments are pure lists.

The internal representation of a list should be transparent at the script
level, but for the curious:

In the [C] implementation of Tcl 8.x, a list is a type of [Tcl_Obj].  The
elements of a list are stored as C-style vectors of pointers to the individual
[Tcl_Obj] element values, plus some extra data.  The consequence of this is
that `[llength]` and `[lindex]` are constant-time operations — they are
as fast for large lists as they are for small lists.
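
For those who want to watch the string representation appear, Tcl 8.6 ships an explicitly unsupported introspection command, `::tcl::unsupported::representation`.  The sketch below assumes that command is available.  Its output is version-dependent and it is not meant for production code, so no expected output is shown:

======
set x a
set l [list $x b c]
# No string operation has touched $l yet, so it should still be a pure list.
puts [::tcl::unsupported::representation $l]
string length $l   ;# a string operation forces the string representation
puts [::tcl::unsupported::representation $l]
======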



** `[concat]` **
   
`[concat]` also operates on lists, but does not require that its arguments
be valid lists. `[foreach]` operates on lists.  `[split]` creates lists,
and `[join]` consumes them.  Various other routines also make use of lists.

Since all values, including lists, [EIAS%|%are strings] (but not all strings
are lists!), it is possible to use `[string]` routines on lists, but
performance can suffer and there are usually better ways to accomplish the
task.  In some common cases, Tcl is smart enough to do the right thing.  For
example, doing a string comparison between a list and the empty string is
perfectly acceptable, and just as performant as the `[llength]` variant:

======
proc foo {bar args} {
    if {$args eq {}} then { #string comparison might force internal Tcl gyrations
        set args $::foo::default_for_args
    }
    # ...
}
======

The `[llength]` variant:

======
proc foo {bar args} {
    if {[llength $args] == 0} then { # $args is empty
        set args $::foo::default_for_args
    }
    # ...
}
======


** Layers of Interpretation **

In the following example, the value of `a_single_backslash` is a single backslash:

======
set a_single_backslash [lindex \\\\ 0]
======

Prior to invoking `lindex`, Tcl performs backslash substitution on the four
backslashes so that the first argument becomes two backslashes.  To convert the
first argument to a list, `[lindex]` then performs backslash substitution on
the first argument (two backslashes), resulting in one backslash, which is then
assigned to `a_single_backslash`.
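
Following the earlier advice to store values in variables first, the two layers can be observed separately.  A minimal sketch:

======
set arg \\\\                           ;# the script parser turns four backslashes into two
puts [string length $arg]              ;# -> 2
puts [string length [lindex $arg 0]]   ;# -> 1, the list parser turns two into one
======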



** Converting a String to a List **

`[split]` takes a string and returns a list, but the best choice depends on
the task at hand. `[regexp]` is often handy:

======
set wordList [regexp -all -inline {\S+} $myGnarlyString]
======


[DKF] proposed this pretty alias:

======
interp alias {} listify {} regexp -all -inline {\S+}
======
[Bill Paulson] notes that this alias changes all white space to a single space,
which might or might not be what you want.

Other than that, this '''listify''' is effectively a `[split]` that interprets adjacent delimiters as a single delimiter rather than interpreting them as delimiting an empty string, like `[split]` does.
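
The difference shows up as soon as the input contains adjacent whitespace characters.  A short sketch with a made-up string:

======
set s "a  b\tc"
puts [llength [split $s]]                       ;# -> 4, includes one empty string
puts [llength [regexp -all -inline {\S+} $s]]   ;# -> 3
======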


(2014-09-08): this can also be used to duplicate words in a list:

======
regexp -all -inline {\S+} {a b c}
# => a b c
regexp -all -inline {(\S+)} {a b c}
# => a a b b c c
regexp -all -inline {((\S+))} {a b c}
# => a a a b b b c c c
======

etc.


** Validating a List **

Various ways to check whether a string is a list:

======
string is list $some_value
catch {lindex $some_value 0}
catch {llength $some_value}
======

In later versions of Tcl, [string is list] is available.
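
These checks can be wrapped in one helper.  The sketch below assumes that `string is list` may be missing from very old interpreters and falls back to `llength` in that case; the name `islist` is hypothetical:

======
proc islist {value} {
    # Prefer [string is list]; fall back to [llength] if it is unavailable.
    if {![catch {string is list $value} result]} {
        return $result
    }
    return [expr {![catch {llength $value}]}]
}
puts [islist {a b c}]   ;# -> 1
puts [islist "a \{b"]   ;# -> 0
======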



** List Vs. List of Lists **

[escargo] 2003-03-16: How can you tell if a value is a string of words '''or'''
a list of strings of words?

The practical application that I had for this was an error-printing proc. It
could be passed a value that might be a single error message or a list of error
messages.  If it were a single error message, then I could print it on one
line; if it were multiple messages, then I wanted to print each on its own
line.

So, how could I distinguish between the cases?

I think I eventually made all sources of errors provide a list of errors, even
if it was always a list of 1 (instead of just the error message string).

But the question always stuck with me.  Was there a way I could have easily
distinguished between the two?  Could I look at the representation and see an
opening curly brace if it were a list?

[RS]: The (outer) curlies are not part of the list - they are added, or parsed
away, when needed.

Tcl lists are not fundamentally different from strings, rather, I'd say they
are a "view" on strings. Just as `42` can be viewed as string, or integer, it
can also be viewed as a one-word list. Except if you introduce your own
tagging convention, there is no way of telling whether a list is in reality a
string - in the other direction, only strings that cannot be parsed as lists
(unbalanced braces, quotes..) cannot be viewed as lists. But for your concrete
error-printing problem: if you simplify the interface to "a list of one or more
error messages", you can have the desired effect with

======
puts [join $errormessages \n]
======

Just make sure that the "one message" case is properly nested, e.g.

======
errorprint {{This is a one-liner}}
errorprint "{This too, with a $variable reference}" ;# braces in quoted strings allow substitution
errorprint [list "another $variable reference"]
errorprint {{Two messages:} {This is the second}}
======


[PYK] 2019-10-13:   A [deep list] addresses this issue by encoding a single
value as a list containing only one item.  Everything else is interpreted as a
sequence if it looks like one.  To add a value that might be interpreted as a
list, use `list $myvalue`.  Routines that understand the format automatically
unlist values as they are requested.



** Single-word Lists vs non-list value **

Many simple string values can also be interpreted as single-word lists.
Programs should use additional data or rely on program logic to decide whether
a value should be interpreted as a list or a string.

Here is another way of looking at the problem (`[lindex]` without an index
returns the list argument unchanged):

======
% lindex a
a
% lindex a 0
a
% lindex [lindex a 0] 0
a
% lindex [lindex [lindex a 0] 0] 0
a
% lindex {a}
a
% lindex {a} 0
a
% lindex [lindex {a} 0] 0
a
% lindex [lindex [lindex {a} 0] 0] 0
a
% lindex {{a}}
{a}
% lindex {{a}} 0
a
% lindex [lindex {{a}} 0] 0
a
% lindex [lindex [lindex {{a}} 0] 0] 0
a
% lindex {{{a}}}
{{a}}
% lindex {{{a}}} 0
{a}
% lindex [lindex {{{a}}} 0] 0
a
% lindex [lindex [lindex {{{a}}} 0] 0] 0
a
======

No program can tell the difference between the string "a" and the one-word
list "a", because the one-word list "a" ''is'' the string "a".

[Dossy] 2004-02-26: A co-worker yesterday who is new to Tcl discovered
something that surprised me -- [nested list]s in Tcl don't work as I expected
in a very specific case:

======
% list [list [list x]]
x
======

Um, when I ask for a list of a list of a list with the single word 'x', I
would expect '{{{x}}}' back.  However, you just get 'x' back.  Thinking about
it, I understand why, but it means that Tcl lists alone cannot be used to
represent ALL kinds of data structures, as Tcl lists magically collapse when
it's a series of nested lists with the terminal list having only a single
bare word that requires no escaping.

[Lars H]: It looks worse than it is. For one thing, it is only the string
representation that collapses, not the internal representation, so the above
nesting of `list` is not completely pointless. It is furthermore very
uncommon (and this is ''not'' specific to Tcl) that nesting depth alone has a
significance. Either you know the structure of the value, and thereby the
intended nesting depth, or the list is some generic "thing", and in that case
you anyway need a label specifying what kind of thing it is.

[DBaylor]: I think this is actually worse than it looks.  I see lots of people
trying to learn Tcl and the #1 point of confusion is Dossy's example.  But what
I really dislike about this behavior is that it hides bugs until specific input
is encountered.  If you ever mix up your data-types (string vs. list), your
code will work fine 99% of the time - until special characters are involved.
These bugs are inevitably found the hard way.  Oh how I wish `[list] x`
returned `{x}`.

[PYK] 2013-10-27: What would `[lindex] x 0` return, then?  An error that `x` is
not a list?  All commands are lists (after adjusting for [script
substitution%|%script substitution]), and `x` is potentially a valid command.
Therefore, `x` must be a list.  There is some subtlety to Tcl which can take a
little time for beginners to wrap their heads around, but this subtlety is
Tcl's strength, not its weakness.  The gripe about hiding bugs until specific
inputs are encountered is a gripe about dynamic languages in general, and not
particular at all to Tcl.



** Subsetting a List **

[LV] Question: in [Perl], [Python], and a number of other languages, one has
the ability to read and write to subsets of a list -- ''slices'' -- using an
almost array like notation.  Is this something that one could simulate without
much grief in Tcl?
 
[RS]: But of course - our old friends (with wordier notation)

======
set slice [lrange $list $from $to]
set newlist [eval lreplace [list $otherlist] $from $to $slice]
======

`[lset]` can only replace a single element, but possibly several layers deep
in the nesting. For reading access to a slice of a list, check [Salt and sugar]
for how the following is implemented:

======
set i [$ {a b c d e f g} 2 4] ;# ==> {c d e}
======
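
With `{*}` (Tcl 8.5 and later), the write side no longer needs `eval`.  A sketch with made-up data, replacing a slice with its own reversal:

======
set list {a b c d e f g}
set slice [lrange $list 2 4]
puts $slice                                             ;# -> c d e
set newlist [lreplace $list 2 4 {*}[lreverse $slice]]
puts $newlist                                           ;# -> a b e d c f g
======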



** Flattening a List **

To flatten a list:

======
concat {*}$nested
======

It can be applied multiple times:

======
proc flatten data {
    concat {*}$data
}
set a {{a {b c}} {d {e f}}}  ; # {a {b c}} {d {e f}}
flatten $a                   ; # a {b c} d {e f}
flatten [flatten $a]         ; # a b c d e f
======

Alternatively:

======
set flattened [join $mylist]
======

Another possibility:

======
foreach e $list {
    foreach ee $e {
        lappend flatList $ee
    }
}
======

`[eval]` is not a good option because it chokes on newlines:

======
% eval concat {a
b}
ambiguous command name "b": binary break
======

The newline in the argument marks the end of the arguments to `[concat]` and
the beginning of a new command. This is due neither to `[concat]` nor to the
[tclsh] prompt loop, but to the fact that `[eval]` itself concatenates its
arguments into a single script and then evaluates that script.

To get around that issue, use some list routine to convert the value into a
list containing no newline characters:

======
eval concat [lrange {a
b
} 0 end]
======
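
As a minimal sketch, assuming Tcl 8.5 or later: argument expansion sidesteps the problem, because `{*}` splits the value into words as a list and not as a script, so an embedded newline is only whitespace between elements.  The variable names are illustrative:

======
set value "a\nb"

# {*} parses $value as a list, so the newline does not start a new command.
set flat [concat {*}$value]   ;# a b
======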


[RS] 2004-02-26: If you really want to flatten a list of any depth, 
i.e. remove all grouping, I think this way is simplest (and robust):

======
proc flatten list {
    string map {\{ "" \} ""} $list
}
% flatten {a {b {c d {e f {g h}}}}}
a b c d e f g h
======

[Lars H]: No, that won't work. Consider

======
% flatten [list \{ \}]
\ \
======

I'd give you that it isn't exactly clear what should happen to lists with such
words, but the above doesn't get a single character right.

[RS] admits he was thinking of well-behaved lists (as built with `[list]`
and `[lappend]`, where braces are only generated mark-up, not content :^).
You're right to point out that ''flatten'' is not chimpanzee-proof enough, nor
robust enough.

[cyrilroux] 2010-10-28:  Maybe this solves the escape issue?  The idea is
always to replace every { and } but NOT \{ and \} (the escaped ones).
That is to say, six cases:

   1. "^{"
   2. " {"
   3. "{{+"
   4. "}$"
   5. "} "
   6. "}}+"

======
proc lflatten list {
    regsub -all {^\{| \{|\{\{+|\}$|\} |\}\}+} $list { } flatten
    return $flatten
}

% set foo {{a {b c}} {d\{d\{ {e f} g}}
% lflatten $foo
a b c  d{d{ e f g 
======


[CMP]: In general, string manipulations on lists all have the same problem: they
do not consider the list structure (unless they copy the complete implementation
of the existing list routines).  Hence, all list manipulations should be done
using existing list routines.
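
To make that concrete, here is a minimal sketch of a one-level flatten built only from list routines, assuming Tcl 8.5 or later for `{*}` and `string is list`.  The name `flatten1` is hypothetical, and keeping an element unchanged when it is not a well-formed list is just one possible policy:

======
proc flatten1 data {
    set result {}
    foreach item $data {
        if {[string is list $item]} {
            # A well-formed list: append its elements individually.
            lappend result {*}$item
        } else {
            # Not parseable as a list (e.g. a lone brace): keep it as one element.
            lappend result $item
        }
    }
    return $result
}

set nested {{a {b c}} {d {e f}}}
set once [flatten1 $nested]           ;# a {b c} d {e f}
set braces [flatten1 [list \{ \}]]    ;# still a two-element list: \{ \}
======

Because the elements never pass through string substitution, brace characters that are content survive intact.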



** Information about struct::list - extended list operations **

[Tcllib] now contains a [struct::list] module.  Its documentation can be found
at http://tcllib.sourceforge.net/doc/struct_list.html.

[dgp] offers this example of making use of it:


======
% package require struct 1.3
1.3
% namespace eval my {
    namespace import ::struct::list
    set l [::list 1 2 3]
    puts [list reverse $l]
}
3 2 1
======


In the previous example, `::list` refers to the standard `list` in the [global]
namespace, and `list` refers to the routine in the current namespace that was
imported from `::struct::list`.



** What Would a Tcl Version of `list` Look Like? **

[AMG]: Is this an acceptable implementation of `list`?

======
proc list args {
    return $args
}
======

Looks right to me...

[RS]: To me too. That's because ''args'' is already a list, as built by the Tcl
parser... I'd just write


======
proc list args {set args}
======

[AMG]: Chuckle.

======
proc list args [list set args]
======
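
A quick way to try this out without replacing the real `[list]` is to give the proc a different name and compare results.  This is only a sketch over a handful of sample inputs, not a proof; `mylist` and the sample values are made up for illustration, and `{*}` assumes Tcl 8.5 or later:

======
proc mylist args {
    return $args
}

# Each sample is itself a list of arguments to pass to both routines.
set samples {{} {a b c} {{a b} c} {{x y} {} z}}
set mismatches {}
foreach sample $samples {
    if {[mylist {*}$sample] ne [list {*}$sample]} {
        lappend mismatches $sample
    }
}
# mismatches stays empty when both routines agree on every sample
======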



** Transform a List Into a List of Fixed-Size Lists **

[LV]:  in response to a question, Jonathan Bromley, 2008-05-30,
[comp.lang.tcl], wrote the following Tcl proc for turning a long list into a
list of lists:

======
proc split_list {L {n 50}} { 
    incr n 0; # thanks to RS for this cool "is it an int" check! 
    set result {} 
    set limit [expr {[llength $L] - $n}] 
    for {set p 0} {$p <= $limit} {incr p $n} { 
        lappend result [lrange $L $p [expr {$p+$n-1}]] 
    } 
    return $result 
} 
======

[arg]:  Just the code I wanted (needed to split results from SQLite). Thanks.
But can I suggest a few tweaks:-

   1. name changed to "partitionlist", I think it's less ambiguous than "split_list". Any ideas for a better name?
   2. change default from 50 to 2, splitting into pairs looks a more useful default.
   3. change the setting and comparison of "limit" so that all the elements of the original list are copied. The original version acted as if the original list were truncated to a multiple of the requested sublist length; this version outputs any "extra" elements as another short sublist.

======
proc partitionlist {L {n 2}} { 
    incr n 0; # thanks to RS for this cool "is it an int" check! 
    set result {} 
    set limit [llength $L] 
    for {set p 0} {$p < $limit} {incr p $n} { 
        lappend result [lrange $L $p [expr {$p+$n-1}]] 
    } 
    return $result 
} 
======
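
One caveat: `incr n 0` only checks that ''n'' is an integer.  With a zero or negative ''n'' and a non-empty list, the loop above never terminates, because `p` never reaches `limit`.  A minimal sketch of a guarded variant follows; the name `partitionlistChecked` and the error message are hypothetical:

======
proc partitionlistChecked {L {n 2}} {
    # Reject non-integers as well as zero and negative lengths.
    if {![string is integer -strict $n] || $n < 1} {
        error "sublist length must be a positive integer, got \"$n\""
    }
    set result {}
    set limit [llength $L]
    for {set p 0} {$p < $limit} {incr p $n} {
        lappend result [lrange $L $p [expr {$p + $n - 1}]]
    }
    return $result
}

set parts [partitionlistChecked {a b c d e} 2]   ;# {a b} {c d} e
======

The leftover element `e` ends up in a short final sublist, as described in tweak 3.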

[Lars H]: If the partitioned list is then immediately going to be iterated
over, it may of course be easier to take advantage of the fact that the
variable arguments of [foreach] are really lists of variables. I.e., instead of

======
foreach row [partitionlist $data 3] {
    lassign $row first second third
    # Further processing...
}
======

one can just do

======
foreach {first second third} $data {
    # Further processing...
}
======

Conversely, this can be used for the following [braintwisters]' implementation of '''partitionlist''':

======
proc partitionlist {L {n 2}} {
    set varlist {}
    set body {lappend res [list}
    for {} {$n>0} {incr n -1} {
        lappend varlist $n
        append body { $} $n
    }
    set res {}
    foreach $varlist $L [append body \]]
    return $res
}
======


** Other Uses of `[list]` **

[AMG] [PYK]: When given no arguments, `[list]` returns the empty string.  I find this
useful when entering Tcl commands interactively, e.g. into [tkcon] or [tclsh].
When I know that a routine will produce a lot of output, such as
[read%|%reading] a whole file, and I don't want to have it all on my screen, I
tack `; list` onto the end of my command line.

======
$ tclsh
% set chan [open bigfile]
file12dcd00
% set data [read $chan]; list
% close $chan
======

If not for the `; list`, the `[read]` command would flood my terminal with lots
and lots of garbage.

[AMG]: Another use for list is to pass it a single argument which it will then
return.  For an example, see SCT & [RS]'s comments on the page for `[if]`.
However, this works due to the "problem" noted above by [Dossy] 2004-02-26.
Often the argument requires quoting to become a single-word list, in which
case list will ''not'' return its argument verbatim.  On the `[return]` page
I discuss a few other, safer approaches.  The simplest one is to instead use
single-argument `[lindex]`.
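
A small sketch of the difference, with illustrative variable names:

======
set plain hello
set spaced {hello world}

set viaList1 [list $plain]       ;# hello
set viaList2 [list $spaced]      ;# {hello world}  (quoted, not verbatim)
set viaLindex [lindex $spaced]   ;# hello world    (verbatim)
======

With a single argument and no index, `[lindex]` returns its argument unchanged, whereas `[list]` adds whatever quoting is needed to make the value a one-element list.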



** Some Tcl Core Routines Require Lists as Arguments, or [Return] Lists **

   `[after]`:   

   `[apply]`:   

   `[array]`:   

   `[binary]`:   

   `[chan]`:   

   `[dde]`:   

   `[dict]`:   

   `[oo::define]`:   

   `[oo::next]`:   

   `[encoding]`:   

   `[exec]`:   

   `[fconfigure]`:   

   `[file]`:   

   `[glob]`:   

   `[http]`:   

   `[info]`:   

   `[interp]`:   

   `[library]`:   

   `[msgcat]`:   

   `[namespace]`:   

   `[open]`:   

   `[package]`:   

   `[packagens]`:   

   `[pid]`:   

   `[pkgMkIndex]`:   

   `[platform]`:   

   `[proc]`:   

   `[read]`:   

   `[oo::refchan]`:   

   `[regexp]`:   

   `[registry]`:   

   `[return]`:   

   `[safe]`:   

   `[scan]`:   

   `[oo::self]`:   

   `[socket]`:   

   `[switch]`:   

   `[tcltest]`:   

   `[tm]`:   

   `[trace]`:   



** Page Authors **

   [DKF]:   
   [PYK]:   Articulated that in Tcl a list is not a data structure but a string, and that "item" refers to a value in an abstract list, while "word" refers to the literal representation of the item in the string that is a list.  Identified the parsing differences between [list] and [script].  Added the terms '''canonical list''' and  '''canonical streaming list'''.



** References **

   1:   [aspect], [Tcl Chatroom], 2014-09-13


<<categories>> Command | Data Structure | Arts and crafts of Tcl-Tk programming | Routine | Tcl syntax