Version 3 of Entities

Updated 2005-02-09 12:00:35 by suchenwi

Richard Suchenwirth 2002-11-22 - In real life, "entity" is a rather vague term for things or beings. In XML, it stands for strings marked up between "&" and ";", e.g.

 &gt;

is equivalent to the "greater than" sign, >. In Tcl, a variable name prefixed with $ is replaced by the variable's value, even inside another string (unless that string is braced). Consider this example, where two "entities" are defined first; the trailing list returns an empty string and hence leaves no trace of the embedded commands. Further down the text, the variables are indeed replaced with their assigned values:

 % set t "[set h Humpty-Dumpty;
        set k king; list]
       $h sat on a wall,
       $h did a great fall,
       all the $k's horses and all the $k's men
       couldn't put $h together again."
 % puts $t
       Humpty-Dumpty sat on a wall,
       Humpty-Dumpty did a great fall,
       all the king's horses and all the king's men
       couldn't put Humpty-Dumpty together again.

One of the rare cases where a sequence of statements inside the [] brackets makes (some) sense... The practical use is to make substrings that occur more than once configurable in one place.
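
To see why the trailing list matters, here is a minimal sketch (the variable names a and b are only for illustration). The result of a bracketed script is the result of its last command, so without list the value returned by set ends up in the string as well:

```tcl
# With the trailing "list", the bracketed script contributes an empty string
set a "[set h Humpty-Dumpty; list]$h sat on a wall"
puts $a   ;# Humpty-Dumpty sat on a wall

# Without it, the result of the last command (set h ...) is inserted too
set b "[set h Humpty-Dumpty]$h sat on a wall"
puts $b   ;# Humpty-DumptyHumpty-Dumpty sat on a wall
```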


Shorter example:

 subst {[set ht "[set h {Happy Birthday}] to you"],$ht,$h dear XX,$ht}
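
For illustration, the result of that subst call can be captured and printed one line per comma-separated part (the variable name song is just an assumption for this sketch). Here the first bracketed command has no trailing list, so its result deliberately becomes the first line:

```tcl
# Run the one-liner and keep its result
set song [subst {[set ht "[set h {Happy Birthday}] to you"],$ht,$h dear XX,$ht}]

# The commas separate the four lines of the song
foreach line [split $song ,] {
    puts $line
}
# Happy Birthday to you
# Happy Birthday to you
# Happy Birthday dear XX
# Happy Birthday to you
```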

In RS's RSS, I render XML/HTML onto a text widget and wanted a general way to resolve numeric entities, where

 &#xxx;

stands for the Unicode character with decimal code point xxx. This code seems to work fine, with and without leading zeros:

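 proc entity'resolve string {
   # Escape backslashes and brackets so subst runs only the [format ...] calls added below
   set tmp [string map {\\ \\\\ [ \\[ ] \\]} $string]
   # %c turns the scanned decimal number into the corresponding character
   set tmp [regsub -all {&#(\d+);} $tmp {[format %c [scan \1 %d]]}]
   subst -novariables $tmp
 }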
 % entity'resolve {cholesterol checked for $25 and blood pressure for $10 for 16 readings}
 cholesterol checked for $25 and blood pressure for $10 for 16 readings
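
A possible alternative, sketched here under the assumption that only decimal entities of the form &#xxx; need resolving: collect the entities with regexp and replace them with string map, so that nothing in the input is ever evaluated by subst. The proc name entity'decode is hypothetical and not part of RS's RSS.

```tcl
# Hypothetical helper: resolve decimal numeric entities without using subst
proc entity'decode {string} {
    set map {}
    # regexp -all -inline returns pairs: the full entity, then its digits
    foreach {entity digits} [regexp -all -inline {&#(\d+);} $string] {
        # scan with %d reads the digits as decimal, so leading zeros do no harm
        lappend map $entity [format %c [scan $digits %d]]
    }
    # Replace each entity by its character; everything else passes through unchanged
    string map $map $string
}

puts [entity'decode {&#65;&#066;C costs $5 [really]}]   ;# ABC costs $5 [really]
```

Dollar signs, brackets and backslashes in the input need no special treatment with this approach, since the text is never handed to the Tcl substitution machinery.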
