everything is a string

''Representation is the essence of programming...''
    :   Fred Brooks, [http://en.wikipedia.org/wiki/The_Mythical_Man-Month%|%The Mythical Man-Month]

The more precise meaning of '''Everything is a string''' is '''every value is a
string'''.  This is one of the central features of Tcl.

This page lays out the implications of the ''everything is a string'' design,
as well as the details of working with this design when using the [Tcl C API],
where the string value observed at the script level is part of
[Tcl_Obj%|%a structure] that contains both the string representation and a
typed interpretation of that string.  If this typed interpretation isn't of the
appropriate type for the current context, it is discarded and replaced with a
new interpretation.  A script can avoid this [shimmering] by consistently using
a value such that its interpretation remains the same.



** See Also **

   [Everything is a Symbol]:   The other side of the coin.

   [shimmering]:   The discarding and regeneration of the internal typed representation of a value.

   [Tcl_Obj]:   

   [homoiconic]:   

   [Is everything a list?]:   

   [How Tcl is special]:   

   [http://www.infoq.com/presentations/dynamic-static-typing/%|%The unreasonable effectiveness of dynamic typing%|%]:   Uses the term, "stringly-typed language", not directly in reference to Tcl, but in a way very relevant to Tcl.

   [http://blog.metaobject.com/2014/06/the-safyness-of-static-typing.html%|%The Safyness of Static Typing%|%]:   




** Further Reading **

   [https://core.tcl-lang.org/tcl/file?name=doc/dev/value-history.md%|%History of Tcl String Values], [dgp] et al:   A speculative and likely revisionist look back at Tcl strings.

   [https://www.ianbicking.org/blog/2008/01/documents-vs-objects.html%|%Documents vs. Objects], Ian Bicking, 2008:   



** Description **

A '''string''' is a '''sequence''' of '''[character%|%characters]'''.  In
versions of Tcl prior to 8.7, a character is one of the code points in the
'''[Unicode] basic multilingual plane'''. In Tcl 8.7 a '''character''' is one
of the '''[Unicode] scalar values''', which are all the Unicode code points
except the surrogate code points.  In Tcl version 9 it is expected that a
character will be one of the '''[Unicode] code points'''.

In a Tcl script, '''everything is a string''', and Tcl assigns no meaning to
any string, making it a '''typeless language''':  Since a string has '''no
particular value''', it also has '''no particular type'''.  The
[dodekalogue%|%rules for Tcl] state that a [script] is composed of
[command%|%commands], and a command is composed of [word%|%words], the first of
which is the name which is used to locate the [procedure] used to execute the
command, and the remainder of which are the arguments that are passed to that
procedure.

A script is composed entirely of commands.  Before evaluating a command, Tcl
forms the words of the command by performing any needed
[dodekalogue%|%substitutions]: variable substitution, command substitution, and
backslash substitution.  Since variable values, command evaluation results, and
interpreted backslash sequences can be substituted into words, it follows that
the result of each substitution is a string.  The phrases '''everything is a
string''', '''everything is a script''', '''everything is a command''', and
'''everything is a word''' allude to this design, which is fundamental to Tcl.
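
The three kinds of substitution can be seen side by side in a minimal sketch.
The variable names are chosen only for illustration:

```tcl
# Each substitution produces a string that becomes (part of) a word.
set name World                       ;# the value is the string "World"
set greeting "Hello, $name!"         ;# variable substitution
set len [string length $greeting]    ;# command substitution
set tabbed "a\tb"                    ;# backslash substitution

# Even the command name can be the result of a substitution.
set cmd string
set upper [$cmd toupper $greeting]

puts $greeting
puts $len
puts $upper
```

In every case the result of the substitution is simply spliced into the word
being formed, and the command that receives the word sees only a string.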

Since no type syntax is necessary in Tcl, characters like double quotes or
braces may be put to other uses.  Sometimes double quotes are used such that
they appear to denote strings, but they actually don't.  Braces can be
used in such a way that they appear to denote a block of code, but they
actually don't.  In Tcl, double quotes and braces merely change the context for
determining which characters should be interpreted as substitutions or word
delimiters.
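
As a small illustration of this point (a sketch, with variable names invented
for the example), quotes and braces only control grouping and substitution.
They do not create a different kind of value:

```tcl
set n 3

# Double quotes group a word but still allow substitution.
set a "n is $n"
# Braces group a word and suppress substitution.
set b {n is $n}

# Neither quoting style creates a "string type": all three are the same value.
set w1 abc
set w2 "abc"
set w3 {abc}
set same [expr {$w1 eq $w2 && $w2 eq $w3}]

# A command name may be quoted, and a "code block" may be written in quotes,
# because both are ordinary words.
"set" result {not a block, just a word}
if {$n > 2} "set branch taken"
```

After this runs, `a` is `n is 3`, `b` is the literal text `n is $n`, and
`same` is `1`.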

A string has no intrinsic meaning.  The only fundamental feature of a
'''string''' is that it is unique among the set of possible strings.  This
feature is sufficient to differentiate procedure names.  The strings that Tcl
passes to a procedure are not in themselves identifiers, numbers, functions,
variables, lists, dictionaries, blocks of code, or special values like [null],
although a procedure may choose to assign one of those meanings to a string
that is passed to it.  That is to say, it is the individual procedures in Tcl
that assign meaning to a string, not Tcl itself.

A string is not an object, so it has no configurable properties.  The
[https://en.wikipedia.org/wiki/Identity_of_indiscernibles%|%identity of
indiscernibles] applies to strings:  Each identical string has the same
meaning.  Any extension that looks at some internal data Tcl keeps about the
value to determine the meaning of a string is ill-behaved.  Any such internal
data is merely an implementation detail.  On the other hand, it is perfectly
fair for each separate routine to interpret the same string representation in
its own way, or even for one routine to interpret a single string in multiple
ways, but it makes its decisions based on the context of its operation and upon
the meaning it can derive from the string itself.

This design makes Tcl rather unique.  In other languages each value has a
particular type, for example, '''`"8"`''' may be a string, while '''`8`''' is a
number.  In Tcl there is just '''`8`''', which one procedure may use as a
string, and another may use as a number.  In other dynamic languages where each
value is an object, and may therefore come with various behaviours, '''duck
typing''' means that if the object "walks like a duck, and quacks like a duck",
it's a duck.  A simple string has no such behaviours attached to it, so duck
typing in Tcl is even simpler:  If it looks like a duck, then it's a duck,
although it may also be a witch.
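
A short sketch of the `8` example, using only standard commands.  The variable
names are illustrative:

```tcl
set v 8

# One command uses the value as a string ...
set len   [string length $v]      ;# 1
set twice $v$v                    ;# 88

# ... and another uses the very same value as a number.
set sum   [expr {$v + 1}]         ;# 9

# Different spellings are different strings, although expr may
# interpret them as the same number.
set sameString [expr {"8" eq "8.0"}]   ;# 0
set sameNumber [expr {"8" == "8.0"}]   ;# 1
```

The value never changes.  Only the interpretation chosen by each command
differs.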

Because everything is a string, there are no operators, no keywords, and no
code blocks.  If there were then values would have different types:  strings,
operators, and keywords.  This unification of all things into strings opens the
door for some peculiar modes of operation.  In other languages, the behaviour
of an operator may be overloaded based on the types of its operands.  In Tcl a
procedure may be overloaded by moving it out of the way and replacing it with
another in its place.  Since everything is a string, including things that look
like code blocks, and things like loops are implemented as procedures rather
than keywords, even fundamental things like conditional execution and loops may
be customized by replacing built-in procedures.  Tcl's '''everything is a
string''' design makes such [radical language modification] low-friction.
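
The following sketch shows both ideas under simple assumptions.  The names
`repeat`, `original_puts` and `putsLog` are hypothetical, and the `repeat`
procedure makes no attempt to handle `break` or `continue`:

```tcl
# A new control structure is just a procedure that evaluates one of its
# string arguments as a script in the caller's scope.
proc repeat {n body} {
    for {set i 0} {$i < $n} {incr i} {
        uplevel 1 $body
    }
}

set total 0
repeat 3 {incr total 10}

# "Overloading" a built-in: move it out of the way and put a wrapper
# in its place.
set putsLog {}
rename puts original_puts
proc puts {args} {
    lappend ::putsLog $args      ;# record every call
    original_puts {*}$args       ;# then delegate to the real command
}

puts "hello"
puts -nonewline "world\n"

# Restore the original command.
rename puts {}
rename original_puts puts
```

The `{*}` expansion syntax requires Tcl 8.5 or later.  After the script runs,
`total` is `30` and `putsLog` holds the argument lists of the two intercepted
calls.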

Tcl's typeless approach puts it in a somewhat exclusive category of
[programming%|%programming languages].  The other languages in this category
are [forth], BCPL, and [assembly language].  Almost all other languages
practice a form of tight syntactic coupling of data and metadata where the
pattern of characters that compose a value is juxtaposed with other patterns of
characters that communicate some additional information about the value. In the
most common case this additional information provides a '''type''' for the
value.  To Tcl, each pattern is just a message to pass along.  Tcl doesn't read
the message.  It simply passes it to a routine.  It is the routine that
ascribes some meaning to the pattern and that returns another pattern that, by
contract, also has a certain meaning.  The programmer must understand the
contract and rely on it to weave together a sequence of routines that make some
sense and accomplish some objective.

A string may be endowed with a rich and structured meaning.  Like a [command],
a [list%|%list] is represented as a string that is a sequence of words
separated by whitespace.  Each of the words in a list may itself be another
list, in which case the string represents a [recursion%|%recursive] [data
structure]. A procedure that understands the format and syntax of a list can
perform operations on the list.  For example, it might return one of the words
in the list, or one of the words contained in that word.  A word in a list
might represent some other structure as well, like a [dict%|%dictionary].  This
concept of using strings to represent more complex data structures is one of the
distinguishing concepts of Tcl.  In Tcl, every operation on a data structure is
also, strictly speaking, a '''string operation'''.  This design, however, only
represents the script interface of Tcl.  Internally, and also in the [C] API,
Tcl features mechanisms for using more performant in-memory data structures to
carry out these operations.  The more performant implementations must make
sure, however, to maintain the string semantics that the script interface
provides.
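
A brief sketch of a nested structure held in a single string.  The record
layout is invented for the example:

```tcl
# A list is a string; a nested list is a word that is itself a list.
set record {alice {age 30 langs {tcl c}} admin}

set first [lindex $record 0]         ;# alice
set inner [lindex $record 1]         ;# age 30 langs {tcl c}
set lang  [lindex $record 1 3 0]     ;# tcl

# The middle word also happens to be a valid dictionary.
set age   [dict get $inner age]      ;# 30

# Structural operations and plain string operations apply to the same value.
set words [llength $record]          ;# 3
set chars [string length $record]    ;# 34
```

Whether `record` is "a list", "a list containing a dictionary" or "34
characters" depends entirely on which command is asked.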

In [assembly language%|%assembly language], bits/bytes are paired with
instructions, which manipulate those bits/bytes.  The assembler doesn't concern
itself with any notion of type for those bits/bytes.  Instructions are issued,
and each instruction knows how to treat the bits/bytes handed to it.  Tcl
operates in the same way, except that the fundamental transactional unit
between instructions is the string rather than bits/bytes.  This string veneer
makes it easy to stretch Tcl over existing projects, providing a connecting
layer between components.  Each command is an entry point into a system of
arbitrary complexity, and the string is the universal currency.  Each new
routine can [extension%|%extend] Tcl into a new domain, and allow operation
between that domain and the others that have a window into Tcl.


Here are some examples of the interpretations routines give to the values passed to them:

   `[expr]`:   Concatenates all words and interprets them as an expression having a syntax of its own.  This syntax is similar to that of Tcl itself.

   `[lassign]`, `[lindex]`, `[linsert]`, `[lmap]`, `[llength]`, `[lrange]`, `[lreplace]`, `[lsort]`:   Interpret some or all of their arguments as lists.

   `[lappend]`, `[lset]`:   Interpret some of their arguments as the ''names'' of variables whose values are lists.

   `[lassign]`:   Interprets some of its arguments as the names of variables to create and assign values to.

   `[dict]`:   Interprets some of its arguments as a dictionary, or as the name of a dictionary.

   `[eval]`:   Interprets its argument(s) as a [script].
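
The interpretations listed above can be tried on a few sample values.  This is
only a sketch, and all variable names are illustrative:

```tcl
set s {a 1 b 2}

set asString [string length $s]      ;# 7 characters
set asList   [llength $s]            ;# 4 elements
set asDict   [dict get $s b]         ;# 2

# lassign reads $s as a list and its other arguments as variable names.
lassign $s k1 v1 k2 v2

# expr and eval each apply a syntax of their own to a string.
set e {1 + 2 * 3}
set asExpr     [expr $e]             ;# 7 (unbraced only to pass the string in)
set asFiveWord [llength $e]          ;# 5

set script {set answer 42}
eval $script                         ;# creates the variable "answer"
```

Passing an unbraced argument to `expr` is normally avoided.  It is done here
only to show that `expr` receives a string and parses it itself.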

If one desires more types, numerous systems exist that provide them and that
allow the user to define their own.  See [object orientation] for a list of
such systems.  The stringiness of Tcl makes it as flexible as a language can
be, and the implementation has focused on providing the primitives on top of
which virtually any programming paradigm can be implemented.



** What's in a String **

[list%|%Lists] and [dict%|%dictionaries] are strings that conform to a
particular format.  Other values are used as [handle%|%handles] for [data structure%|%data
structures] or resources that are not directly accessible at the script level.
The following resources, accessed by name, are examples of handles:

   * variables
   
   * [namespace%|%namespaces]

   * [array%|%arrays]
   
   * [proc%|%procedures]
   
   * [chan%|%channels]

   * Tk widgets
   
   * encodings
   
   * interpreters
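
A few of the handles listed above, sketched with invented names (`double`,
`bump`, `::demo`).  In each case the script holds only a string, and the named
resource lives elsewhere:

```tcl
# A procedure name is a string that can be stored, passed and invoked.
proc double {x} {expr {$x * 2}}
set handler double
set eight [$handler 4]

# A variable name is a string too; upvar resolves it in the caller's scope.
proc bump {varName} {
    upvar 1 $varName v
    incr v
}
set counter 0
bump counter

# A channel handle is a string naming a resource managed by Tcl.
set ch stdout
puts $ch "written via the handle \"$ch\""

# A namespace is reached through its name.
namespace eval ::demo {variable colour blue}
set ns ::demo
set colour [namespace eval $ns {set colour}]
```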




** Types in the World of Strings **

Typically it is sufficient for a caller to understand how a routine interprets
each argument, and then provide arguments that have suitable interpretations.
When a routine needs structured information, it can specify that an argument is
a [dict%|%dictionary], a [list], or even a [command].  A command passed to a
routine can be used by that routine.  If the command represents an [object
orientation%|%object], it can provide a set of subcommands that implement
some [duck typing%|%duck type].

In `[expr]` each operator provides a context, and `[expr]` tries to interpret each
term as a type that best fits that context.  Where an operator is flexible, `[expr]`
prefers a numeric interpretation.

If a value can have multiple interpretations, a routine that uses it must
implement some strategy for determining which interpretation to use.
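
The way `expr` picks an interpretation can be observed with a few comparisons.
This sketch uses only operators available since Tcl 8.5:

```tcl
# "<" accepts numbers or strings; expr prefers the numeric reading.
set numeric [expr {"10" < "9"}]      ;# 0: as numbers, 10 is not less than 9

# If an operand is not a number, the same operator compares strings.
set textual [expr {"10" < "9a"}]     ;# 1: as strings, "10" sorts before "9a"

# A command that only knows about strings never tries the numeric reading.
set order [string compare 10 9]      ;# -1

# "==" compares numerically where possible; "eq" always compares strings.
set hexNum [expr {"0x10" == 16}]     ;# 1
set hexStr [expr {"0x10" eq "16"}]   ;# 0
```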

----

[tcl chatroom] 2013-04-30:

[DGP]: I find it useful to think of "types" in Tcl as being subsets of the
value universe.  So it doesn't make sense to ask what type a value is.
Instead, you can identify those types where a value is a member, and where the
value is not a member.

[CMcC]:  Right, subsets, not partitions

[DGP]:  "Everything is a String" is just the trivial observation that all
values are in the same value universe.
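
The "subsets, not partitions" view can be sketched with `string is`, which
answers membership questions rather than reporting a type.  The helper
`memberships` is hypothetical and checks only four of the available classes:

```tcl
# Return the names of the (checked) types that a value is a member of.
proc memberships {value} {
    set result {}
    foreach type {integer double boolean list} {
        if {[string is $type -strict $value]} {
            lappend result $type
        }
    }
    return $result
}

set m1 [memberships 1]        ;# integer double boolean list
set m2 [memberships 1.5]      ;# double list
set m3 [memberships yes]      ;# boolean list
set m4 [memberships "a b"]    ;# list
set m5 [memberships "\{"]     ;# (empty: not even a list)
```

The value `1` belongs to four of these sets at once, which is why asking for
''the'' type of a value has no single answer.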



** Implementation **

In the implementation of Tcl the structure that holds the string representation
for a value may also hold one [Tcl_Obj%|%typed interpretation] of the value.
Modifying the string representation clears the typed interpretation, and vice
versa.  For example, a list can be modified either by changing its string
representation or by using a command like [lappend], which works directly with
the typed interpretation if it is of list type.

After either the string representation or the typed interpretation is cleared
it is only generated again when needed.  This means for example that a value
that has a typed interpretation but no string representation can be passed
between routines that use the same typed interpretation without incurring the
expense of generating a string representation.  Needless to say, modifying a
list by modifying its typed interpretation is much faster than modifying the
string representation.

The typed interpretation of a value is just an implementation detail.  It is
not exposed at the script level, and does not have any semantic impact on the
language.  The dual-representation format is used only at the implementation
level, and only as an optimization.  Two objects with the same string
representation are the same value, whether or not the string representation has
been generated.  At the implementation level, there may well be two [Tcl_Obj]
structures with the same string representation, but with different typed
interpretations, and any function that accepts a [Tcl_Obj] as an argument must
interpret them as having the same value.  A user of Tcl's C API will gain an
appreciation for the way Tcl values are handled at the C level, working with
either the string representation or the typed interpretation as is expedient.
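
At the script level the consequence is that the way a value was produced
cannot be observed.  A minimal sketch, with illustrative variable names:

```tcl
set built [list a b c]       ;# produced by a list command
set typed "a b c"            ;# written out as plain text

# Both are the same value, whatever Tcl holds internally.
set equal [expr {$built eq $typed}]      ;# 1
set count [llength $typed]               ;# 3

set computed [expr {6 * 7}]  ;# produced by arithmetic
set written  "42"            ;# written out as plain text
set equalNum [string equal $computed $written]   ;# 1
set digits   [string length $computed]           ;# 2
```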



** The Magic of EIAS **

EIAS is one of the grand unifying concepts of Tcl.  As [Edsger Dijkstra] noted
in [http://www.cs.utexas.edu/~EWD/transcriptions/EWD10xx/EWD1036.html%|%On the
cruelty of teaching computer science], a program can be viewed as a formula
that must be derived by the programmer, and the only known reliable way of
doing that is by symbol manipulation.  Hence, we construct mechanical symbol
manipulators by means of human symbol manipulation.  '''EIAS''' facilitates
such a mathematical style of programming by merging the concepts of code and
data more completely than even [Lisp], as a Tcl script itself morphs to become
its own result. 

When everything is a string, every kind of data is readily accessible:   When
some new data type is introduced in a language like [C] or [Java], it usually
has to come with its own library for printing values, doing I/O, initialising
variables, and often even for copying values. In Tcl all that is immediately
available, since it can be done with strings and the new data type is
represented using strings.  This common ground eases the burden of the programmer.

Strings are general.  The standard computing models are all readily expressible
in terms of strings. The tape of a [Turing machine] contains a
finite string of symbols. [Lambda] calculus is the manipulation of terms
written as strings of symbols, and
[http://www-groups.dcs.st-and.ac.uk/~history/Mathematicians/Post.html%|%Post]
production systems model computability by replacing parts of strings with other
strings.

[NEM] 2010-12-15: One aspect of [EIAS] that is worth consideration is how it
has kept Tcl "pure" in some sense. Part of EIAS that is little mentioned is
that Tcl's strings are ''immutable''. This means that Tcl's value space is
purely [functional programming%|%functional], in the [Haskell] sense. All side-effects are confined to the
shadow world of commands and variables and other second-class entities. What
this means is that Tcl now possesses some very powerful purely functional
data-structures that are somewhat better than those available in other
languages. For instance, I cannot think of another popular language that
supplies O(1) purely functional dictionaries and lists (arrays) out of the box
(or even in the library). Not to mention efficient [Unicode] and binary strings.
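
The immutability of values can be seen in a short sketch.  The procedure
`addItem` is hypothetical:

```tcl
set a {1 2 3}
set b $a             ;# b now holds the same value as a
lappend b 4          ;# changes the variable b, not the value held by a

# Commands that appear to modify a value return a new value instead.
set original hello
set shouted [string toupper $original]

# An argument is a value: changing it inside a procedure affects
# only the local variable.
proc addItem {lst item} {
    lappend lst $item
    return $lst
}
set c [addItem $a 99]
```

Afterwards `a` is still `1 2 3`, `b` is `1 2 3 4`, `c` is `1 2 3 99`, and
`original` is still `hello`.  Only variables change.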



** Peeking Behind the Curtain **

[DKF]: See also [representation%|%tcl::unsupported::representation], which can
peek behind this veil. If you use this, feel dirty!

[AMG]: In testing and debugging high-performance applications I use this to
confirm that I'm avoiding [shimmering].
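
A sketch of that kind of check.  It assumes a Tcl version that provides
`tcl::unsupported::representation` (8.6 or later).  The exact text the command
returns is unsupported and may differ between releases, so it should be used
for debugging only:

```tcl
set rep1 ""
set rep2 ""
set rep3 ""
if {[llength [info commands ::tcl::unsupported::representation]]} {
    set x a
    set v [list $x b c]
    # Typically reports something like "value is a list with a refcount ..."
    set rep1 [::tcl::unsupported::representation $v]

    # A string operation may replace the list interpretation ...
    string length $v
    set rep2 [::tcl::unsupported::representation $v]

    # ... and a list operation brings it back.
    llength $v
    set rep3 [::tcl::unsupported::representation $v]

    puts $rep1
    puts $rep2
    puts $rep3
}
```

Whatever the three reports say, `$v` is the same value throughout.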



** EIAS the Misunderstood **

Programmers more familiar with other languages sometimes criticize Tcl's
EIAS design, usually because they assume that complex algorithms requiring data
structures are not possible in Tcl.  What they might be missing is
that although they can't directly translate some of their idioms into Tcl,
equally powerful Tcl idioms exist and are waiting to be discovered.  By
sticking to EIAS, Tcl elegantly disposes of problematic "features" of other
languages, such as the [C] features that make
[http://en.wikipedia.org/wiki/Aliasing_(computing)%|%aliasing%|%] a
possibility.  The data structures that others may think Tcl is missing are
simply expressed in another way, but that is difficult to see at the outset.

However [LV] would like to point out that the true philosophy of Tcl says
''Do all that you can in Tcl - but then, do the rest in C/assembly/whatever
and create glue and handles to it for Tcl.''



** Misc **

[Donald Porter] remarked in the [Tcl chatroom]: ''More precisely, every value
has a string representation. Tcl arrays are not values; they are special types
of variables.''

[lvirden]: I guess there are other things that fit into the same category as
arrays - created items like procs, and in tk all sorts of widgets, etc. 

[aku]: But most have a way to serialize them into a value, and back (array
set|get, proc|info body|arg|default) 

[kennykb]: And the ones that don't have natural serialization generally are
managing external resources ([channel] handles are the most obvious example) 

[Shin The Gin]: If everything was a string, then one could easily [save the
whole runtime environment to a file and restore it later].
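
The serialization that [aku] mentions can be sketched as follows.  The helper
`serializeProc` is hypothetical and ignores namespaces:

```tcl
# An array is a variable, not a value, but "array get" yields a value
# from which "array set" can rebuild it.
array set colours {sky blue grass green}
set saved [array get colours]
array unset colours
array set colours $saved

# A procedure can be rebuilt from the strings that "info" reports.
proc add {a {b 1}} {expr {$a + $b}}

proc serializeProc {name} {
    set params {}
    foreach p [info args $name] {
        if {[info default $name $p def]} {
            lappend params [list $p $def]
        } else {
            lappend params [list $p]
        }
    }
    return [list proc $name $params [info body $name]]
}

set script [serializeProc add]   ;# a string that recreates the procedure
rename add {}                    ;# delete the procedure ...
eval $script                     ;# ... and restore it from the string
```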

----

[https://news.ycombinator.com/item?id=9448336%|%"At the end of the day you have to choose a primitive and history has shown that text is the right one."%|%]

----

[RS]: likes the ditty '''"I'm not afraid of anything, if everything is a
string"'''. In fact, the Tcl mantra often relieves fears of complexity:
anything that can be brought to the prototype "string in, string out", can be
nicely done in Tcl. [A simple Arabic renderer%|%Arabic], [A little Korean editor%|%Korean]? Of course, everything is a [Unicode] string!
[A little rain forecaster%|%Geographic mapping]? Just give me a string with the latitudes, longitudes, and
whatever other data, and presto - [Tclworld]. Images can in many ways be
rendered as strings (XBM, PNM...); one pretty intuitive way is in [strimj -
string image routines].

----

[Todd Coram]:  Data typing is an illusion. Everything is a sequence of bytes.
Call 'em ints, floats, symbols, strings, whatever. Tcl exposes both code and
data to the ''user'' as sequences of bytes (called strings). This is Tcl's
choice of abstraction. And it's quite a powerful choice IMHO.

[BR]: Hm, isn't it actually the case that a string is a sequence of characters, and
bytes (in Tcl) are just characters with the values 0 - 255?  I think that's the
model of binary data in Tcl.  IOW bytes are not fundamental in Tcl, but
characters and strings are.


''Except that characters could be [Unicode] instead of [ASCII].''

----

2003-05-13:  Recently, '''Bruce Eckel''' in
[http://web.archive.org/web/20100209072034/http://mindview.net/WebLog/log-0025%|%Strong
Typing vs. Strong Testing], and '''Robert C. Martin''' in
[http://www.artima.com/weblogs/viewpost.jsp?thread=4639%|%Are Dynamic Languages
Going to Replace Static Languages?] talk about weak typing and dynamic
languages.

[CL] thinks these two make mistakes, but hasn't time now to explain more.  In
any case, yes, these are good noteworthy references.

----

[escargo] 2003-05-13: Another way that ''everything is a string'' can be an
issue is where a string representation can only be an approximation of what is
being represented.  The main instances of this that come to mind are floating
point numbers (for which there are already some existing wiki pages).  There
may be other examples as well. 

What? There is no reason a string can't fully represent a floating point
number. And Kevin Kenny has a TIP in the works to ensure that Tcl does always
indeed achieve exactness in this case - Roy Terry, But really, it seems a waste
of time to make fine points about "everything is a string" which is merely a
programmer's cliche and doesn't begin to express the power of Tcl.

[escargo] 2003-05-14:  Sorry; there was a slip of the finger there.  I said
"floating point" and what I meant was "real".

[Lars H] 2003-05-15:  Real numbers are ''beyond'' what is computable. The
number of possible outputs from a [Turing machine] (and thus the set of real
numbers which one can specify in any way whatsoever) is merely countable,
whereas the set of all real numbers is uncountable.  But this view does provide
an answer to why ''"Everything is a string"'' is such a powerful idea. Many
languages (most notably C) take the approach that ''"Everything is a number
(with native machine representation) or some fixed aggregation of such
numbers"'', but all such representations are limited. In order to support
general strings, it is necessary to venture into some scheme of dynamic memory
allocation and pointers to allocated objects. The ''string'', on the other
hand, achieves the maximal generality of a Turing machine (the tape always has
an obvious representation as a string) and thus if something wouldn't be
representable as a string, it wouldn't be computable either.

----

[escargo] 2003-05-16: What would it take to make [Tk] [widget]s serializable?
I was thinking about [xml2gui] and wondering what it would take to make a
widget produce an [XML] description of itself.  Further, what would it take to
have widgets that contain other widgets produce XML of themselves?  This would
seem to me to be one useful goal.

Another goal would be the converse, what XML would need to be used to create
all the Tk widgets (and pack them the right way, etc.)?  (This would be a
suitable storage format for [GUI Building Tools].)

[jcw] 2003-05-17:  There already is a serialized form of Tk, able to cope with
any complexity of widget hierarchies: the Tcl script that creates them.

[jmn]: Yes, but is there a canonical form for it?

[escargo]:  I am reminded of one-way hashes.  You can have a function that
given an input can produce a hash value that cannot be used to derive the
original input.  Just because I have a widget does not make it clear to me that
I can derive in an algorithmic way Tk and Tcl code to recreate the widget.
Perhaps this is something for the [Tk 9.0 WishList], but I would certainly like
to see whatever changes would be necessary to allow this (if it's practical at
all).

----

[jcw] 2003-05-17: While [EIAS] is indeed a wonderfully powerful and flexible
abstraction, I'd like to point out that [LISP]'ers and [Scheme]'rs have a very
similar set of self-contained mechanisms at their disposal, based on
"everything is made up of cons cells" (it's more of a mouthful, though...).
IMO, "strings" as convention to represent data in a certain way is not
inherently different from other representation choices - one could even use
neurons and synapses if that were practical.  What EIAS does imply is "code is
data" and "data can be used as code", which is why one can play so many tricks
in Tcl (and in [LISP]).

[NEM] 2005-07-25:  replying to this a couple of years too late... The
difference with [Lisp] is that cons cells aren't universal; as I understand it,
some basic data types like numbers are not represented as cons cells. You could
build up everything from cons cells, in a similar way to building everything
from set theory, but [Lisp] doesn't, and so you can't treat an integer as a
list. In Tcl, though, the string is the universal medium of representation, so
I can treat an integer as a list (of one element).
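
A sketch of that last remark, with illustrative variable names:

```tcl
set n 42

set count [llength $n]                 ;# 1: a list of one element
set first [lindex $n 0]                ;# 42
set next  [expr {[lindex $n 0] + 1}]   ;# 43

# Appending to it as a list yields a value that is no longer an integer.
lappend n 7
set stillInteger [string is integer -strict $n]   ;# 0: the value is now "42 7"
```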

----

[FW]: Come to think of it, what are some other typeless languages in the
"everything is a string" sense - RS has already submitted and documented
thoroughly the antique TRAC in [Playing TRAC].

[JE]:  [http://en.wikipedia.org/wiki/MUMPS%|%MUMPS], or '''M''',  is another
[EIAS] language.  [Forth] and BCPL are also typeless, but there the fundamental
type is a "cell" or native machine word instead of strings.  (BCPL seems to be
extinct, but [Forth] and MUMPS are still around.)

----

'''If "everything is a string," then how can you tell what's an object?'''

[escargo] 2005-07-23:  That's what I woke up to this morning. I was thinking
that Tcl lacks what I have seen called a "meta-object protocol," something that
allows some object-oriented languages (like [Smalltalk]) to do some useful
operations on objects and classes. I like [Snit] because of what it allows me
to do to compose objects using '''delegation'''. However, if I'm operating in
Tcl (or in Tk) and I have an identifier, how can I tell if its value represents
an object from an object system like Snit (or any of the other object systems
added onto Tcl)? And if it is an object, how can I tell which object system it
is an object in, so that I can guess what behavior it has (which functions it
understands or implements)?

The only way I can see something like this working is if there were some
agreed-upon standard for names (or [reference]s) such that a classifier (say
'''[[string is object ...]]''') could return a yes or no answer.

Even better would be one that could tell which object system implemented the
object (say '''[[string is objectsystem ...]]''').

This might be possible in a system like [Jim] if the [Jim References] encoded
the object system and whether something was an object.

Even without add-on object systems, it would be nice to be able to determine if
there could be '''[[string is command ...]]''', but that in some respects
defeats the purpose of [unknown].  (I'm still fuzzy from sleep, so maybe there
is something that does this already, otherwise how would [unknown] get called?)

[Lars H] 2005-07-24: I think the best way of pointing out how your analysis
here is wrong is to point out that

   * Tcl has no '''whattype''' command;

indeed, it follows from ''everything is a string'' that there cannot be such a
command (at least not one that doesn't just return "string" or whatever
regardless of input), since nothing but the string may define the value. If you
want to type-tag values, then you must include that tag in the value itself.

Put another way: Values in Tcl are what you make of them.  Values don't
"know"[[*]] that they're command names, integers, or variable names--they
become whatever you decide to treat them as (or cause an error to be thrown
when they cannot be interpreted the way you claimed).

A consequence of this is that one shouldn't write programs that just throw all
data in a huge bowl and let [unknown] (or whatever) sort it out later, one
should write programs so that at every point in the program it is clear what
type of data is going to be passed around. It is sometimes useful to let the
type of some argument be "either an A or a B", but then one must also have
sorted out whether it may happen, and if so what it means, when data comes
along that is both A and B. 

So what has this to do with objects, then? Everything, since whatever one uses
to identify an object is just another string (even though few eyebrows are
raised these days when people request magical behaviour from objects--for some
reason it seems politically correct to regard objects as nobler than ordinary
data).  You ''can'' ask an object system whether it recognises a particular
string as identifying one of its objects (but this assumes the system is
implemented in such a way that this is possible), and you could start an object
system registry that goes around asking all known object systems whether they
recognise a particular object as theirs, but that's about it, and I doubt it
would be of much use outside debugging.

In a sense, the proper response to "how can you tell what's an object?" should
be:

   * What design error did you make that made you ask that question in the first place? Where did you (or someone else) throw away the information that you now find you need?

[[*]] Technically, on the C level, most Tcl values ''kind of'' know from their
[Tcl_Obj] internal representation whether they are command names, integers,
variable names, etc., but it is more accurate to describe this information as
''if I'm a command name/integer/whatever, then I'm the name of '''that'''
command/integer/whatever'' since this type information is [shimmering%|%shimmered] away
whenever the [Tcl_Obj] is used in a different sense.
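
One way to read the advice about including the tag in the value itself is
sketched below.  The procedures `tagged` and `describe` and the tag names are
hypothetical:

```tcl
# A tagged value is a two-element list: the tag, then the payload.
proc tagged {type value} {
    return [list $type $value]
}

# The consumer decides what each tag means.
proc describe {taggedValue} {
    lassign $taggedValue type value
    switch -- $type {
        celsius    {return "$value degrees C"}
        fahrenheit {return "$value degrees F"}
        default    {error "unknown tag \"$type\""}
    }
}

set t    [tagged celsius 21]     ;# the string "celsius 21"
set text [describe $t]           ;# 21 degrees C

# A bare 21 carries no tag, so describe has nothing to go on.
set failed [catch {describe 21}] ;# 1
```

The tag is part of the string, so it survives any copying, storing, or
transmission of the value.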

[escargo] 2005-07-25:  This is closer to the problem that I felt I was dealing
with.  In a unified object system (or alternatively, where only one object
system is possible), you don't have to speculate about what kind of behavior an
arbitrary object might exhibit.

In Tcl (especially if you are using [Snit] to delegate to some arbitrary
objects), you don't know (and as you pointed out, perhaps ''cannot know'') what
object behavior a particular object (for which you have a string to use as a
[reference]) might exhibit.

If I have a string, I can use '''winfo exists''' to see if it's a [Tk] window.

I can use '''info procs''' to see if it is a proc.

If it were an ''object'', then I expect that it has some behavior (otherwise
what's the point of it being an object).  But without knowing more about it,
it's not safe to try different probes into its behavior to see what it can do.
The irony is that at least some object systems for Tcl provide some kind of
[introspection], but I doubt that they provide it the same way, so you can't
just use it to find out more about the object.
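
The probes mentioned here can be sketched without Tk.  The helper names are
hypothetical, and the TclOO check assumes Tcl 8.6 or a loadable TclOO package.
It says nothing about objects from other systems such as Snit:

```tcl
# Does the string name any command visible from the global namespace?
proc isCommand {name} {
    expr {[namespace which -command $name] ne ""}
}

# Does the string name a procedure defined with proc?
proc isProc {name} {
    expr {[info procs $name] ne ""}
}

# Does the string name a TclOO object?
set haveOO [expr {![catch {package require TclOO}]}]
proc isTclOOObject {name} {
    expr {$::haveOO && [info object isa object $name]}
}

if {$haveOO} {
    oo::object create demoObj
}
```

Each probe asks one specific subsystem whether it recognises the string.  None
of them extracts a type from the string itself.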

''Why does this matter?'' - The reason I feel that it matters is that there has
to be somewhere where knowledge of the type of object has to be carried around
so that you can write your programs correctly. (The ''type'' in this sense
being the add-on object system that implements the object.) If you can't
determine the type of object from the object itself, then you have to code that
information into comments or else invent some other means of doing it.

It's not that this can't be done, but it's a wish I have that the answer were
within the language itself, either by implementation (e.g., you could
deconstruct a reference to determine the object system) or convention (all
object systems implemented an '''info''' command that all objects responded to
and that could, as one of the items that might be returned, respond with the
name, and maybe revision level, of the implementing object system).

I realize that's not going to happen, but if enough people agreed with the
need, then progress could be made in that direction.

[NEM] 2005-07-25:  This all boils down to fundamental philosophical beliefs
about the nature of values and types. What really marks Tcl out from most other
languages, and what is at the heart of this debate is not that strings are such
wonderful things that they should be used for everything, but rather a
recognition that the notion of a "type" of a value is ''extrinsic'' to the
value itself. In other words, a type is an indication of some
''interpretation'' of a value. Any representation of a value can have multiple
different interpretations, and so to talk of ''the'' type of a value without
reference to the particular system doing the interpretation is difficult.
Conversely, any abstract type can have multiple possible representations (the
key idea of abstraction/encapsulation). So, the connection between values and
types is a many-to-many connection. Most languages assume a 1-to-many
connection, so each value has a single type which is associated with it by the
''language'', with less categorisation left up to individual commands/functions
(although it is not true to say that no choice is left; every function performs
an interpretation of its arguments to some degree). Tcl, however, is different
in that it performs almost no interpretation of the values it is passed. It
does basic tokenization and grouping, but leaves values as they are found: as
strings. The only further bit of interpretation that Tcl does is to treat the
first word of each line (talking loosely) as the name of a command. (Well,
there is also variable substitution and other items in Tcl.n, but we'll ignore
those for now). It is then the individual commands which take care of any
further interpretation. You can think of this as a form of extreme lazy
evaluation: even parsing is left to the last possible moment.
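
A small illustration of commands doing the interpreting: the same string is handed to three commands, and each one reads it in its own way. The variable name `v` and the value are arbitrary.

======
set v 0x10

puts [string length $v]   ;# read as a string: counts the characters
puts [expr {$v + 1}]      ;# read as a number: hexadecimal 16, plus one
puts [llength $v]         ;# read as a list: counts the elements
======

Nothing about `v` itself says which of these readings is the right one. Each command decides for itself.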

So, what are the trade-offs? On the negative side, the fact that Tcl does less
interpretation for you means that it makes fewer guarantees (e.g., it's hard to
do garbage collection of references if you can't guarantee that X is a
reference and Y isn't). Another difficulty is that it is possible to break
abstractions in Tcl: you can always drop down to the level of strings and
manipulate the representation of a value, rather than use any higher-level
interface. I actually think this is one of Tcl's strengths, but it is a longer
argument. You can also get around this by using opaque [handle]s, which hide
the representation behind a layer of indirection, that may or may not be
introspectable. On the plus side, the fact that Tcl has an ultimate fear of
commitment means that commands have more free rein in deciding how ''they''
will interpret the values. This, I suggest, is the heart of what makes Tcl a
good glue language: by not committing to a single interpretation of a value it
allows multiple components to make their own, possibly conflicting,
interpretations. (As an aside, an interesting parallel can be drawn here with
Daniel Dennett's ''Multiple Drafts'' theory of cognition/consciousness).
Another way to look at this is to say that by providing a common representation
medium you reduce the number of explicit conversions that have to be done.  If
you have N distinct types, then in order to convert between them you
potentially need N!/(N-2)! = N(N-1) different conversion functions (i.e. one
for each ordered pair of types, e.g. int2double, double2int, int2string,
string2int, etc). If you
have a common representational medium, then you can use that as an
intermediate, thus reducing the number of conversion functions needed to just
2(N-1), and just two functions are needed for each type: toString and
fromString (the string type itself obviously doesn't need these).
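
The arithmetic can be checked with a short script. The proc names `direct` and `viaString` are made up for this sketch. The first counts one converter per ordered pair of types, the second counts a toString and a fromString for every type except the string type itself.

======
# Converters needed when every type converts directly to every other type.
proc direct {n} {
    expr {$n * ($n - 1)}
}

# Converters needed when strings serve as the common intermediate.
proc viaString {n} {
    expr {2 * ($n - 1)}
}

foreach n {2 3 5 10} {
    puts "N=$n: direct [direct $n], via strings [viaString $n]"
}
======

The two counts agree for N=2 and diverge quickly after that, since one grows quadratically and the other linearly.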

Can we combine the benefits of both approaches? I think we can. [TOOT] was
about doing just this, and [Interpreting TOOT] has my earlier thoughts on the
subject. I've been thinking about this some more since, and will hopefully soon
have time to write some more code and an essay detailing my further thoughts.
For now, I will point at [Monadic TOOT], which contains some clues to a
possible way forward. Those who know about monads will know that they are
useful for confining effects and enforcing abstraction boundaries. I think we
can use the same techniques in Tcl to create packets in which abstractions can
be enforced and guarantees can be made, if needed. The other side of this
process is [partial evaluation], a TOOT bundle of

======none
type: value
======

can be partially evaluated (or partially applied), to yield a new function
specialised for that interpretation of that value. This can be optimised and
can enforce a type abstraction.
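
A rough sketch of the idea, not [TOOT]'s actual implementation: the `Int:` command, its `get` and `add` methods, and the `five` alias are all invented for illustration, and Tcl 8.5 or later is assumed for `{*}`. The value stays an ordinary two-element list, and `interp alias` performs the partial application.

======
# Hypothetical type command: interprets its first argument as an integer.
proc Int: {value method args} {
    switch -- $method {
        get { return $value }
        add {
            # The other operand is itself a {type: value} bundle.
            set other [lindex $args 0 1]
            return [list Int: [expr {$value + $other}]]
        }
        default { error "unknown method \"$method\"" }
    }
}

set five {Int: 5}                  ;# still just a string
puts [{*}$five get]                ;# the bundle invoked as a command
puts [{*}$five add {Int: 3}]       ;# yields a new bundle

# Partial application: a command specialised for this interpretation
# of this value.
interp alias {} five {} Int: 5
puts [five get]
======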

[Lars H]: Well put. The part about late "commitment" puts a name on something I
think is very important in understanding the strengths of Tcl. I'll see if I
can find a good place to put this idea for easy access.

[DKF]: Actually, in 8.6 there is `[tcl::unsupported]::[representation]`, which
includes type cache information in its result. Don't use it for anything other
than debugging. Or if you do, feel very naughty. It is ''very'' bad style to write
code that depends on types (albeit inevitable for solving certain types of
problem in the support of [Java] and [JSON] correctly, alas).
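
For debugging only, a minimal sketch of what this looks like, assuming Tcl 8.6 or later. The exact text of the result (reference counts, pointers, type names) is unsupported and varies between builds, so none is shown here.

======
set v 42
puts [tcl::unsupported::representation $v]   ;# before any numeric use

expr {$v + 1}                                ;# use the value as a number
puts [tcl::unsupported::representation $v]   ;# typically reports a cached numeric type

llength $v                                   ;# use the same value as a list
puts [tcl::unsupported::representation $v]   ;# the cached type may have changed again
======

The string value of `v` is the same throughout. Only the cached interpretation changes.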

----

[SYStems] 2005-07-23:  These are not very complete thoughts, but I think that
to really answer and understand the idiom ''everything is a string'', we need
to identify the context, or perspective.

A Tcl script is a sequence of statements. Each statement receives input:

   1. A string.
   1. An event.

performs an action on this input, and then does one of the following:

   1. Produces output.
   1. Causes a side effect.
   1. Produces output and causes a side effect.
   1. Raises an error.

Every statement's input and output is a string. Only side effects (and maybe
input events) can be NOT A STRING, but all input and output must have a string
representation.

Each input and output can have a different in-memory or on-disk
representation. But inside a script it must have a string, or as I prefer to
say, textual representation.

Since everything written in a Tcl script is textual, a script can itself be the
input of a Tcl command, for example of the control structure commands.
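
A minimal illustration: the loop body below is an ordinary string held in a variable, and `foreach` receives it like any other argument. The variable names are arbitrary.

======
set body {incr count}    ;# a script, stored as a plain string
set count 0

# foreach is handed the script as its last argument and evaluates it
# once per element.
foreach item {a b c} $body

puts $count
======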

Every input must have an inline textual representation. This is why a command
must have a textual output, so that when it's substituted it produces a string,
the only thing that is good as an inline input.

All input and output must have an inline textual representation (this is the
part I am hesitant about; I am not really sure this is correct. I am using the
word inline loosely here, and I probably mean in-file!)

`[[[set]]` is the command used to manage a Tcl script's memory. All `[[[set]]`
variables must have a string value.

This may sound weird, but I write this hoping that Tcl doesn't lose its primary
principles. For example, I see many people talking about Tcl variables, but Tcl
doesn't have variables. `[[[set]]` is a command that has a Tcl interface and
gives a Tcl script the notion of variables by associating a name with a string
value.

Depending on the string representation of the value, [set] will store it
differently in memory. [set], not Tcl.

`[[[set]]`, for example, doesn't store the variables on disk. A good Tcl'er
might create a command that gives a Tcl script the notion of persistent data,
data stored on disk!

Tcl helps guide thinking by recognizing the syntax `$name`, and treating that
as `[[[set] name]]`.
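
A two-line check of that equivalence, with an arbitrary variable name and value:

======
set name "Tcl"

puts $name          ;# variable substitution
puts [set name]     ;# command substitution of set with one argument
======

Both lines print the same string, since `set` called with only a name returns the value currently associated with it.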

Anyway, back to the fact that set can only associate a name with a string. set
is used to store another Tcl command's output, and pass it later as an input.

    :   [LV]:  Uh - maybe that is how you _want_ it to work. But since I can say `set abc 123` then set doesn't just store another tcl command's output...

So we can say that everything inline (a Tcl script, anything that can be passed
around, a Tcl script's memory, a Tcl script's internal environment) must be a
string. Or, in other words, we can say that Tcl introduces a new in-Tcl
context, where everything must have a textual representation.

Anything outside a Tcl script, outside the in-Tcl context, for example a
command's side effect or an external environment, need not be a string.



<<categories>> Concept | Discussion | Tcl Syntax