Version 16 of Tool Protocol Language

Updated 2008-10-29 10:23:06 by lars_h

TPL - Tool Protocol Language.

CMcC 28Oct08 17:11 AEST

NB: this is not a description of a new language, but an essay on thinking about tcl in a different way.

The world has many protocols for communicating between processes (IPC). Some are binary, some are ASCII. I will call the ASCII protocols used for IPC "protocol languages".

Examples of protocol languages and binary protocols

  • XML is used in XML-RPC type IPC.
  • FCGI, SCGI, CGI have aspects of protocol languages, and some of binary protocols.
  • HTTP is used in Representational State Transfer (REST)
  • JSON is used in AJAX
  • several data representation languages: YAML, ...
  • XDR is a binary protocol used in Sun RPC
  • command-line protocol - used to invoke commands from a process, per the unix system() command, e.g.

Tcl (arguably) has many characteristics desirable in a protocol language:

The purpose of this page is to argue this assertion, to explore counterarguments, to deepen understanding of Tcl as a protocol language, and to explore the prerequisites for its use in this role.

It seems to me (CMcC) that a cut-down Tcl syntax, without special meaning attributed to {*}, $ and [], would serve well in this role. Such a restricted syntax should be called TPL or perhaps TDL (Tcl Data Language).
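As a rough illustration (a sketch, not part of the proposal itself): Tcl's existing list commands already read a string this way, treating $ and [] inside a list as ordinary characters. The message content below is made up for the example.

```tcl
# A made-up message; the outer braces keep the script parser from substituting.
set msg {setprice $5 [today] {a b}}

# List commands parse the string as a list, not as a script:
# nothing is substituted and nothing is evaluated.
set count  [llength $msg]    ;# 4 elements
set price  [lindex $msg 1]   ;# the two characters $5
set when   [lindex $msg 2]   ;# the literal text [today]
set nested [lindex $msg 3]   ;# the nested list: a b
```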

Many applications and systems provide a plugin API. Such an API will necessarily have an expression of the form command+args->result, and will therefore be well suited to a TPL representation.

Examples of plugin APIs:

In many cases, wrappers for library APIs have a similar form, command+args->result, and in fact many critcl and other such wrappers translate C APIs into this form.

The advantage and virtue of TPL is that it formalises the useful Tcl syntax subset in a way which might be of interest to people outside the Tcl community.

A precedent for TPL may be found in JSON. All command+arg->result forms can be represented as tcl-syntax lists, and these are clearly and obviously trivial to interpret in tcl. Providing a C library which translates a meaningful set of C data types into and out of TPL would enable conversion of any plugin API into wire-ready TPL protocol language.
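To make the command+args->result shape concrete, here is a minimal sketch of the receiving side in Tcl, assuming Tcl 8.5 or later for lassign. The proc name tpl_dispatch, the commands add and upper, and the {status value} reply shape are all invented for this example. The request is only ever treated as a list and never passed to eval.

```tcl
# Hypothetical service: the command names "add" and "upper" are invented here.
proc tpl_dispatch {request} {
    # The first list element is the command, the rest are its arguments.
    set arglist [lassign $request cmd]
    switch -- $cmd {
        add {
            lassign $arglist a b
            return [list ok [expr {$a + $b}]]
        }
        upper {
            return [list ok [string toupper [lindex $arglist 0]]]
        }
        default {
            return [list error "unknown command: $cmd"]
        }
    }
}

set reply [tpl_dispatch {add 2 3}]   ;# ok 5
```

The reply is again a Tcl list, so the same parsing applies in both directions.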

One thing that would be useful in this endeavour is a subset of the Tcl syntax *dekalog sufficient to define TPL completely; perhaps the full Tcl syntax could then be expressed in terms of TPL.

Dynamics of protocol interaction:

There are several styles of protocol interaction:

  • simple request/confirmation - RPC
  • pipelined request/confirmation - HTTP
    • in-order confirmation pipelining
    • out-of-order confirmation
  • multiplexed streams of the above - FCGI

Consideration must be given to each of these styles.
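One possible way to cover out-of-order confirmation with plain lists is to tag each request with an identifier and echo it in the confirmation. This is only a sketch under that assumption: the {id command arg ...} and {id status value} framing is invented here and is not something the page prescribes. It needs Tcl 8.5 or later for dict and lassign.

```tcl
# Hypothetical framing: request = {id command arg ...}, reply = {id status value}
set requests {
    {1 add 2 3}
    {2 upper hello}
}
# Confirmations arriving in a different order than the requests were sent.
set confirmations {
    {2 ok HELLO}
    {1 ok 5}
}

# Remember each outstanding request under its id.
set pending [dict create]
foreach req $requests {
    dict set pending [lindex $req 0] [lrange $req 1 end]
}

# Match every confirmation to its request, whatever the arrival order.
set matched {}
foreach conf $confirmations {
    lassign $conf id status value
    lappend matched [list [dict get $pending $id] $status $value]
    dict unset pending $id
}
```

With in-order pipelining the id could be dropped and a simple queue used instead; multiplexing would add a stream identifier in the same fashion.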


RS 2008-10-28 - One Tcl-based "data language" is the one that tDOM produces with the asList method:

 % dom parse "<foo><bar><grill>42</grill><grill>345</grill><qux a='b' /></bar></foo>"
 domDoc01301840
 % [domDoc01301840 documentElement] asList
 foo {} {{bar {} {{grill {} {{#text 42}}} {grill {} {{#text 345}}} {qux {a b} {}}}}}

It may not be the prettiest sight, but it can be parsed with Tcl very easily, and it can represent the same complexity as XML, with nested structures and attributes.

  • Each element is a triplet of {name attributes children}, where
  • attributes is {key val key val ...} and
  • children is {element ...}
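As a sketch of how easily this structure can be walked in plain Tcl (8.5 or later, for lassign and {*}), the hypothetical proc collect_text below gathers all text in document order. It assumes, as in the output shown above, that a text node is the two-element list {#text value}.

```tcl
# Hypothetical helper: collect the text of all #text nodes, in document order.
proc collect_text {node} {
    # Element nodes are {name attributes children};
    # text nodes are {#text value}, so "second" then holds the text itself.
    lassign $node name second children
    if {$name eq "#text"} {
        return [list $second]
    }
    set out {}
    foreach child $children {
        lappend out {*}[collect_text $child]
    }
    return $out
}

set tree {foo {} {{bar {} {{grill {} {{#text 42}}} {grill {} {{#text 345}}} {qux {a b} {}}}}}}
set texts [collect_text $tree]   ;# 42 345
```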

Lars H: Indeed a very useful format, that natively supports data is code, and for which a pure-Tcl translator from XML is available: A little XML parser (whose only failing is that it doesn't translate XML entities to ordinary characters in e.g. attribute values and #text data). I'm presently using it for a project, and have encountered no problems with it (even though it sometimes feels a bit prolix when one has to go [list [list foo {} {}]] in order to make a list with the only child of a node). I suspect this is a slightly higher level format than the TPL suggested here, though; while pretty much anything can be encoded in XML, it might be that some applications of TPL would benefit from not adhering to the strict type–attributes–children structure of the XML-asList format. In XML-specification-speak, I think what I'm getting at is that XML-asList would be an important application of TPL, but not necessarily the only one.

In terms of Tcl syntax subsets, there are at least three useful levels:

  • Full Tcl syntax (*dekalog).
  • Tcl list syntax (as briefly explained on the lindex manpage): $, brackets, #, semicolon, and newline have no special powers, but backslash, braces, quote, and whitespace have their normal meaning.
  • Tcl list syntax, plus command separators and comments. I think this is roughly what CMcC is proposing for TPL, but no direct support exists for it (that wouldn't also support full Tcl syntax).
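Since no direct support exists for the third level, the following is only a rough approximation of it, with the proc name tpl_commands invented for the example. It handles newline separators and whole-line # comments, ignores semicolons, and leans on info complete to find command boundaries. Because info complete also tracks brackets, an unmatched [ would be treated as an unfinished command, which pure list syntax would not do.

```tcl
# Hypothetical sketch: split a text into commands (one per complete line group),
# drop comment lines, and parse each command as a plain Tcl list.
proc tpl_commands {script} {
    set cmds {}
    set buf ""
    foreach line [split $script \n] {
        append buf $line
        if {![info complete $buf]} {
            # An open brace or quote spans lines: keep accumulating.
            append buf \n
            continue
        }
        set trimmed [string trim $buf]
        set buf ""
        if {$trimmed eq "" || [string index $trimmed 0] eq "#"} continue
        # lrange raises an error if the command is not a well-formed list.
        lappend cmds [lrange $trimmed 0 end]
    }
    if {[string trim $buf] ne ""} {
        error "incomplete command at end of input"
    }
    return $cmds
}

set script {
# configure the widget
size 10 20
label {hello
world}
}
set cmds [tpl_commands $script]   ;# two commands: size ... and label ...
```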

The extended list syntax is superior to list syntax in that it supports comments and is better at catching syntax errors — when forgetting an argument of some command it doesn't grab the name of the next command as that argument — while still keeping the nesting of braces at a tolerable level. A downside for internal processing is that it (presently) cannot preserve the internal representation of data, since everything shimmers to a string when you join separate commands into a script.
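The point about forgotten arguments can be seen in a small example. The command names size and label, and the rule that size takes two arguments, are invented for the illustration.

```tcl
# Flat list: "size" should take two arguments, but one was forgotten.
set flat {size 10 label hello}
lassign $flat cmd w h          ;# h silently becomes "label"

# One command per line: the missing argument shows up as a wrong length.
set lines [split "size 10\nlabel hello" \n]
set first [lindex $lines 0]
set arity_ok [expr {[llength $first] == 3}]   ;# 0, so the error is detectable
```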

CMcC: I hadn't thought about command separators and comments. I think they might unnecessarily complicate the TPL usage of Tcl syntax, although the use of command separators to represent pipelining is an intriguing possibility (it seems to me that comments are completely useless in the TPL context). I think TPL needs backslash, braces, quote and whitespace. So it looks like Tcl list syntax is equivalent to TPL. This is roughly analogous to JSON, which is a useful parallel to keep in mind, I think.

In general, APIs and RPC are functional applications, so are directly representable by tcl's command syntax + lists (and since command syntax is list syntax, this reduces to being lists.) Clearly anything can be represented as a string (that's a Tcl mantra) and it's useful to interpret a subset of strings as lists, and those lists can be used to represent anything a protocol language can be expected to represent.
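A small sketch of that round trip, assuming Tcl 8.5 or later (for string is list and lassign): the sender builds the message with list, which takes care of all quoting, and the receiver checks that the string is a well-formed list before taking it apart. The command name store and the key note are made up.

```tcl
# A payload with characters that are awkward to quote by hand.
set value "an \{unbalanced brace, a \$dollar and a \[bracket"

# Sender: [list] produces a string that is guaranteed to parse back as a list.
set wire [list store note $value]

# Receiver: verify the string is a list, then split it into its parts.
if {![string is list $wire]} {
    error "malformed message"
}
lassign $wire cmd key payload
set intact [expr {$payload eq $value}]   ;# 1: the payload survived unchanged
```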

The virtues of formalising this approach, and giving the Tcl syntax subset a distinct name, are:

(a) marketing - we have a new way to think about tcl, we have a new way to provide utility to the wider world, and to give people a reason to think about tcl for applications or retrofits,

(b) support - we can produce a series of C/JS/etc language functions which will interconvert between the host language's data types and tcl's. Given such a library for a given language, the process of interfacing applications and systems written in that language to tcl is significantly simplified, and of course the processing of the API/RPC protocol language in tcl is *vastly* simplified. This allows Tcl to better fulfill its function as a Tool Control Language, by supporting Tool Protocol Language as a protocol language.

Lars H: I got the impression that you wanted TPL to be a natural language for config files; in that setting command separators and comments are highly recommended, but I can imagine use-cases also for command substitution (e.g. binary decode of some blob might be more convenient than the \x counterpart) and variable substitution there, so it is probably better served by full Tcl syntax. For communication exclusively between two pieces of software, command separators and comments are pretty useless and a significant complication, so if that is the niche for TPL then the Tcl list syntax is indeed the natural fit.