String Processing

Purpose: discuss various topics regarding Tcl processing of strings.

One of the advanced topics relating to strings that I would like to see someone discuss is dealing with arbitrary strings from the user or from databases and data files. People writing extensions and applications frequently get this wrong, so that when a user types, for instance, a file name with a space (blank) in it, or an input string containing a quotation mark or a brace, the Tcl code fails. What tips do you have on this topic? I suspect this will touch upon the age-old topic of strings versus lists.


Arjen Markus: I could not resist this invitation, but for the moment I will limit myself to some short phrases rather than my usual verbose prose (I get comments from colleagues about it :-):

  • Always use list commands like split to turn a string into a properly formatted list.
  • Do not trust the user to enter a properly formed list themselves.
  • If you are paranoid (justly or unjustly), avoid evaluating strings from users, as they may contain substrings like "[exec rm *]".
  • You may want to replace all special characters with their escaped equivalents (or simply remove them!) before passing the strings on to the rest of the program. You can use string map for that.
  • If security is an issue, you need to keep track of trusted and untrusted input. Using safe interpreters is one thing that will help, but this is perhaps a topic in its own right.
  • A list of special characters: left square bracket, right square bracket, ", }, {, backslash, dollar sign, tab, newline, null character
  • What about trailing blanks? [string trimleft] removes whitespace (or other chars by optional parameter) from the beginning of a string; [string trimright] does the same for the end. [string trim] trims both ends.
  • In many of my applications, I give the input data the same form as Tcl code, so that sourcing the input files is enough to get the input. (Tip from Brent Welch!) However, if any special characters need to be present, try read to read the whole file, use string map to escape the special ones, and evaluate the resulting string. That ought to work.
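
A few of the tips above can be sketched in a short script. This is only an illustration, not a complete recipe: the input strings, the show procedure and the escapeMap variable are made up for the example, and the map covers only the characters that Tcl substitutes or groups on (backslash, brackets, dollar sign, double quote, braces).

```tcl
# Hypothetical user input: a file name with a space and a stray quote.
set userInput "my \"draft file.txt"

# The raw string is not a valid list (unmatched quote), so list
# commands applied to it directly raise an error.
set rawIsList [expr {![catch {llength $userInput}]}]

# split always returns a well-formed list, whatever the input holds.
set words [split $userInput " "]

# Build commands with list rather than by pasting strings together:
# list quotes the argument so that it stays a single word.
proc show {name} { return "got <$name>" }
set cmd [list show $userInput]
set shown [eval $cmd]

# Escape the special characters with string map (illustrative map).
set escapeMap [list \\ \\\\ \[ \\\[ \] \\\] \$ \\\$ \" \\\" \{ \\\{ \} \\\}]
set risky {[string repeat x 3] costs $5}
set escaped [string map $escapeMap $risky]
# subst now only undoes the escapes; nothing is executed or expanded.
set roundTrip [subst $escaped]

# string trim strips whitespace (newlines included) from both ends.
set trimmed [string trim "  report.txt \n"]

puts [list $rawIsList [llength $words] $shown $escaped $trimmed]
```

The point of the list and split calls is that the user's text is always treated as data: it becomes one element (or several well-formed elements) no matter which characters it contains. The string map step is a weaker defence, since it depends on the map being complete for the way the string is later evaluated.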

See also Additional string functions | Arts and crafts of Tcl-Tk programming