What kinds of data can Tcl scripts use?

Everything is a string!

The simplicity of Tcl is simply amazing. Everything in Tcl is a string. That includes source code, numbers, "strings", values, lists, and arrays. They all have string representations. This gives you two huge advantages:

  1. Introspection: You can examine almost anything, like which variables are defined, the source code in procedures, and which extension packages are available.
  2. Dynamic Code Generation: You can create new data structures and even source code on the fly.
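Both advantages can be seen in a few lines. The following is only a sketch: the procedure names greet and shout are made up for illustration, while info args, info body, and proc are standard Tcl commands.

```tcl
# A small procedure to inspect (hypothetical name).
proc greet {name} {
    return "Hello, $name"
}

# Introspection: ask the interpreter about the procedure it holds.
puts [info args greet]                ;# name
puts [string trim [info body greet]]  ;# return "Hello, $name"

# Dynamic code generation: the name and body of a new procedure
# are ordinary strings, assembled at run time.
set procName shout
set body {string toupper [greet $name]}
proc $procName {name} $body

puts [shout world]                    ;# HELLO, WORLD
```

Because the body is an ordinary string, it could just as well be read from a file or pieced together from other values before being handed to proc.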

Actually, everything isn't really stored as a string in Tcl. As of version 8.0, values are stored internally as objects that can hold both a string and a typed internal representation (an integer, a list, byte-compiled code, and so on), and Tcl converts between the two only when necessary. But it is useful to think of things as strings. If you're interested in high performance, then take a look at the Tcl Performance page.

Tcl strings can be single words or they can be more complex. If you surround the string with quotes (either " " or { }) then you can have white space and even newlines in a string. These are all valid Tcl strings.

   simpleString

   "More complex string with quotation marks"

   {More complex string with braces}

   {A complex multi-line
    string with braces for quoting}
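The two quoting styles are not interchangeable, which the strings above do not show: Tcl performs variable, command, and backslash substitution inside double quotes but leaves text inside braces untouched. A minimal sketch, using a made-up variable named who:

```tcl
set who world

set quoted "hello $who"   ;# $who is substituted inside double quotes
set braced {hello $who}   ;# braces keep the characters exactly as written

puts $quoted                    ;# hello world
puts $braced                    ;# hello $who
puts [string length $braced]    ;# 10
```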

As of 8.1, a string can even hold binary data (usually taken to mean data that includes NUL characters, ASCII \000) and any character from the massive Unicode set. The chances are that most of the languages currently in use around the world can be represented in Tcl strings.

However, you need to be careful. Early versions of Tcl had problems with strings containing a \x00 character. Even after that was fixed, problems continued on some platforms with scripts containing a literal control-Z character, because on Windows that character marks end of file. Tcl 8.4 made the behavior consistent across platforms: \u001a is now an end-of-file character in scripts everywhere. If you need this character in a script, write it as a backslash escape such as \032 or \u001a rather than as a literal byte. See the source command's reference page for details, including its encoding options.
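A short sketch of both points, assuming Tcl 8.1 or later: a string that carries a NUL and a non-ASCII character, and a control-Z written as an escape so that the script file itself contains no literal ^Z byte. The variable names are arbitrary.

```tcl
# A string with an embedded NUL and a non-ASCII character (e with acute accent).
set s "a\u0000b\u00e9"
puts [string length $s]    ;# 4 characters, counting the NUL

# Control-Z written as an escape instead of a literal byte.
set ctrlZ "\u001a"
scan $ctrlZ %c code        ;# %c stores the character's integer code
puts $code                 ;# 26
```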

Talk about displaying of Unicode here


Numbers (using English representations)

Decimal Integers
0, 1, 2, 42, -347, 65536
Floats
0.0, 12345678123456878.0, 0.00000000001, 1.2345e-12
Hexadecimal Integers
0x4a, 0xabcd1357, 0xfee1900d
Octal Integers
01234567, 0135, 0040, 0177

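Watch out with the last one! A leading zero makes Tcl (at least through the 8.x releases) read the digits as octal, so 010 means 8 and 08 is an error. If you want to convert arbitrary strings of decimal digits into a decimal integer, you must use [scan] to do this or you will fall into the octal Pit-Trap of Doom. This also applies when you parse numbers out of dates, where zero-padded fields such as 08 and 09 are common.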
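A minimal sketch of the [scan] approach, assuming the input is a zero-padded day-of-month field pulled out of a date string; the variable names day and dayNumber are made up for the example.

```tcl
# Hypothetical input: a zero-padded field taken from a date.
set day "09"

# %d always reads the digits as decimal, so the leading zero is harmless.
scan $day %d dayNumber

puts $dayNumber                  ;# 9
puts [expr {$dayNumber + 1}]     ;# 10
```

Passing the original string "09" straight to expr is what triggers the octal problem; after scan, the variable holds a plain decimal integer.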


Lists

Tcl supports lists of strings very efficiently. A list is just a string that Tcl can handle specially because it conforms to a particular syntax. You can set up a list one element at a time, or as a single string of whitespace-separated strings and then treat it as a list of elements. Once you define the list, you can access list elements by number, search or modify the list, or use it in looping constructs like foreach.

     # If you get into the habit of using Tcl's list command you will
     # probably be happier in the long run...
     set namelist [list john sue george]
     set firstname [lindex $namelist 0]
     set lastname [lindex $namelist end]
     foreach name $namelist {
          puts "$name"
     }
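Searching and modifying a list can be sketched with the standard list commands lappend, lsearch, and lreplace. The extra names are invented for the example; "mary ann" is included to show that the list commands take care of quoting an element that contains a space.

```tcl
set namelist [list john sue george]

# Append an element that contains a space; Tcl quotes it as needed.
lappend namelist "mary ann"

# Search for an element; lsearch returns its index, or -1 if absent.
set pos [lsearch -exact $namelist sue]
puts $pos                          ;# 1

# Lists are values: lreplace returns a new list, so store the result.
set namelist [lreplace $namelist $pos $pos susan]

puts [llength $namelist]           ;# 4
puts [lindex $namelist end]        ;# mary ann
```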

Arrays

Tcl supports arrays. Actually, they are dynamic hashed associative tables (hashed dictionaries) rather than the indexed arrays of other languages, and they can do a lot of neat things. Arrays store values indexed by a name, where the name is the hash key. The name can be any string, including one that looks like a number.

     % set temperature(Duluth) -20
     % set temperature(Minneapolis) -12
     % set temperature(Rochester) 20
     % foreach city [array names temperature] {
        puts "$city $temperature($city)"
     }
     Duluth -20
     Rochester 20
     Minneapolis -12
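A few other array operations, sketched with a subset of the cities above. The key Chicago is a hypothetical one that is never stored; array set, array names, lsort, and info exists are standard commands.

```tcl
# array set fills an array from a list of key/value pairs.
array set temperature {Minneapolis -12 Rochester 20}

# Sort the keys when a predictable order matters.
foreach city [lsort [array names temperature]] {
    puts "$city $temperature($city)"
}

# Reading a missing element is an error, so test with info exists first.
set city Chicago
if {![info exists temperature($city)]} {
    puts "no reading for $city"
}
```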

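If you are used to thinking of arrays in other languages, you should keep some things in mind to help avoid frustration. First, white space in the element name is significant, and may not work the way you expect: Tcl splits a command into words at white space before it looks at array syntax, so an unquoted space breaks the name into two words. For instance: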

     % set ar(1) one
     one
     % set ar( 2) two
     wrong # args: should be "set varName ?newValue?"
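If an element name really must contain white space, the name has to reach set as a single word. Two ways to do that are sketched below; the array name ar follows the example above, and the key "New York" is made up.

```tcl
# Brace the whole variable name so "ar( 2)" stays one word.
set {ar( 2)} two

# Or keep the index in a variable; substitution does not re-split on spaces.
set key "New York"
set ar($key) three

puts [set {ar( 2)}]           ;# two
puts $ar($key)                ;# three

# " 2" and "2" are different keys.
puts [info exists ar(2)]      ;# 0
```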

Fortunately, you can emulate multi-dimensional arrays found in other languages and libraries by consistently building the array element name. Use commas to separate indices to avoid white space problems, and it looks almost like FORTRAN. :-)

     for {set i 1} {$i<=3} {incr i} {
          for {set j 1} {$j<=5} {incr j} {
               set matrix($i,$j) [expr {$i * $j}]
          }
     }

But if you use the array command to list the elements, you will not get them in any particular order, because the underlying hash table has no defined ordering.

     % array names matrix
     1,3 2,2 3,1 1,4 2,3 3,2 3,3 1,5 2,4 3,4 2,5 3,5 1,1 1,2 2,1
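When the order matters, one option is to sort the names before using them. The sketch below rebuilds the same matrix and sorts its keys with lsort -dictionary, which compares embedded numbers by value; the variable name sortedNames is arbitrary.

```tcl
# Rebuild the 3 x 5 matrix from the example above.
for {set i 1} {$i <= 3} {incr i} {
    for {set j 1} {$j <= 5} {incr j} {
        set matrix($i,$j) [expr {$i * $j}]
    }
}

# Sort the "row,column" keys into a predictable order.
set sortedNames [lsort -dictionary [array names matrix]]

puts [lindex $sortedNames 0]      ;# 1,1
puts [lindex $sortedNames end]    ;# 3,5
```

Another option is simply to loop over the indices again, as the code that filled the array does, and ignore array names altogether.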

Handles

Sometimes Tcl represents a value as a 'string' when things underneath are more complex. For instance, the Tcl open command returns a 'string' (a channel name) which you can then pass to read, gets, puts, close, etc. However, the string is just a simple name for the operating system's native open-file representation. Other complex data types are handled the same way. For instance, when you build a Tk widget like a canvas or scrollbar, the command returns a 'string', but inside Tcl that string is really just a 'pointer' to a complex data structure.
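The file case can be sketched as follows. This assumes Tcl 8.6 or later, because it uses file tempfile to obtain a scratch file; the variable names chan, path, and line are arbitrary.

```tcl
# file tempfile opens a new temporary file and returns its channel name;
# the file's path is stored in the variable named by the first argument.
set chan [file tempfile path]

# The handle is an ordinary string that can be printed or stored.
puts "the handle is the string: $chan"

# Pass that string to the channel commands.
puts $chan "hello from a handle"
close $chan

# Reopen the file for reading; open returns a new handle.
set chan [open $path r]
set line [gets $chan]
close $chan
file delete $path

puts $line    ;# hello from a handle
```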

The reason I mention all of this is that it is a mindset that Tcl uses frequently. Programmers who seek some method of moving underlying operating system structures around within Tcl should consider building similar 'handles'.
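The same idea can be sketched in pure Tcl, without any C code: generate a unique name, keep the real state somewhere private, and hand only the name to the caller. Everything named here (the counterdemo namespace and its create, increment, and release procedures) is hypothetical and exists only for illustration; it is not part of Tcl.

```tcl
namespace eval counterdemo {
    variable nextId 0    ;# used to generate unique handle names
    variable state       ;# array: handle name -> current count
}

# Create a counter and return its handle, a plain string such as "counter1".
proc counterdemo::create {{start 0}} {
    variable nextId
    variable state
    set handle "counter[incr nextId]"
    set state($handle) $start
    return $handle
}

# Look the handle up, reject unknown names, and update the hidden state.
proc counterdemo::increment {handle} {
    variable state
    if {![info exists state($handle)]} {
        error "no such counter \"$handle\""
    }
    return [incr state($handle)]
}

# Forget the handle; later use of the same string is an error.
proc counterdemo::release {handle} {
    variable state
    unset -nocomplain state($handle)
}

set c [counterdemo::create 10]
puts $c                              ;# counter1
puts [counterdemo::increment $c]     ;# 11
counterdemo::release $c
```

The caller only ever sees the string counter1, just as a script only ever sees the channel name that open returns.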

I am sure someone will come along and replace this sentence with the specific details as to what Tcl man page to read to learn what kinds of C API calls you need to do this


Other information about data types handled by Tcl can be added.