Entities

Richard Suchenwirth 2002-11-22 - In real life, "entity" is a rather vague term for things or beings. In XML, it stands for strings marked up between "&" and ";", e.g.

 &gt;

is equivalent to the "greater-than" sign, >. In Tcl, variable names prefixed with $ are replaced by their values, even inside another string (if not braced). Consider this example, where two "entities" are defined at the start. The trailing list command returns an empty string and hence removes all traces of the embedded commands; further down the text, the variables are indeed replaced with their assigned values:

 % set t "[set h Humpty-Dumpty;
        set k king; list]
       $h sat on a wall,
       $h did a great fall,
       all the $k's horses and all the $k's men
       couldn't put $h together again."
 % puts $t
       Humpty-Dumpty sat on a wall,
       Humpty-Dumpty did a great fall,
       all the king's horses and all the king's men
       couldn't put Humpty-Dumpty together again.

One of the rare cases where a sequence of statements inside the [] brackets makes (some) sense... The practical use is to make substrings that occur more than once configurable in one place.
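For comparison, the same "define once, use many times" idea can be sketched with string map and XML-style placeholders. This is only an illustration: the placeholder names &h; and &k; and the variables entities, template and rhyme are made up here and are not part of the example above.

```tcl
# Hypothetical entity table: placeholder -> replacement text
set entities {
    &h; Humpty-Dumpty
    &k; king
}
# The text uses the placeholders wherever the repeated substrings occur
set template "&h; sat on a wall,\n&h; did a great fall,\nall the &k;'s horses and all the &k;'s men\ncouldn't put &h; together again."
# string map replaces every placeholder in a single pass
set rhyme [string map $entities $template]
puts $rhyme
```

Unlike the $ substitution above, this works on a string that is only known at runtime, and it leaves $ and [] in the text untouched.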


Shorter example:

 subst {[set ht "[set h {Happy Birthday}] to you"],$ht,$h dear XX,$ht}
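To see what the shorter example produces, its result can be split at the commas and printed line by line. A small sketch; the variable name song is an assumption made for illustration.

```tcl
# The nested [set ...] calls define h and ht while the string is being built
set song [subst {[set ht "[set h {Happy Birthday}] to you"],$ht,$h dear XX,$ht}]
# Each comma-separated piece is one line of the song
foreach line [split $song ,] {
    puts $line
}
```

The first piece is the return value of the outer set command, so the "to you" line appears without a separate $ht reference.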

In RS's RSS, I render XML/HTML onto a text and wanted to generally resolve numeric entities, where

 &#xxx;

stands for the Unicode character with the decimal code point xxx. This code with a bit of regexp/subst magic seems to work fine, with and without a leading zero:

 proc entity'resolve string {
    set map {}
    foreach {entity number} [regexp -all -inline {&#(\d+);} $string] {
        lappend map $entity [format \\u%04x [scan $number %d]]
    }
    string map [subst -nocomm -novar $map] $string
 }
 % entity'resolve "cholesterol checked for &#036;25 and blood pressure for &#36;10 for 16 readings"
 cholesterol checked for $25 and blood pressure for $10 for 16 readings

Argh... at first the Wiki proactively resolved the entities for me :} so the input string showed plain $ signs. In their place it really has &#036; resp. &#36; as shown above.

I put the entity for & in place; that way the wiki shows the string as desired.
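A possible variant, offered only as a sketch: building the replacement character directly with format %c avoids the round trip through subst, which might trip over characters that are special in list syntax (such as braces) when the substituted map is parsed as a list again. The name entity'resolve2 is made up here to keep it apart from the proc above.

```tcl
# Sketch of an alternative resolver (hypothetical name entity'resolve2)
proc entity'resolve2 string {
    set map {}
    foreach {entity number} [regexp -all -inline {&#(\d+);} $string] {
        # scan with %d reads the digits as decimal, so a leading zero is harmless;
        # format %c turns the code point into the character itself
        lappend map $entity [format %c [scan $number %d]]
    }
    # lappend has quoted every element properly, so no subst is needed
    string map $map $string
}
puts [entity'resolve2 "cholesterol checked for &#036;25 and blood pressure for &#36;10"]
```

Named entities such as the one for the "greater-than" sign are not handled by either version; those would need a lookup table.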


See also: HTML character entity references, char2ent