Why AndroWish switched to TCL_UTF_MAX=6

   * Supporting [Emoji] requires covering the full range of Unicode code points (0x000000...0x10ffff).
   * Unicode 8.0 support in Tcl was introduced by Jan Nijtmans during EuroTcl 2015.
   * Emoji are quite popular on mobile devices, and Android has fully supported them since the 4.4 release.
   * There's a nice TrueType font named Symbola, which covers most Emoji.
   * Redefining TCL_UTF_MAX (default value 3) changes both the range of supported code points and the in-core memory requirements.
   * Convention: TCL_UTF_MAX=3 and TCL_UTF_MAX=4 fit in a 16 bit memory representation (sizeof (Tcl_UniChar)).
   * Convention: TCL_UTF_MAX>4 needs a 32 bit memory representation (sizeof (Tcl_UniChar)).
   * TCL_UTF_MAX>=4 is able to cover the full range of code points (0x000000...0x10ffff).
   * TCL_UTF_MAX=4 requires surrogate pairs internally: a code point larger than 0x00ffff is represented by two consecutive 16 bit values, as in UTF-16.
   * Using surrogate pairs is expensive and error prone when counting code points, and it repeats a range of strange problems observed in the two most popular programming languages at the time of this writing (whose names both start with the letter 'J').
   * TCL_UTF_MAX=6 forces a 32 bit internal Tcl_UniChar representation and eliminates the counting issues as well as the escaping issues.
   * 32 bit representation fits font rendering with freetype.
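
The counting problem in the list above can be sketched in plain C. This is a minimal illustration, not Tcl's implementation: the function names utf8_length, encode_utf16 and count_code_points_utf16 are hypothetical and follow the standard UTF-8 and UTF-16 rules rather than any Tcl API. It shows that a code point above 0x00ffff needs four UTF-8 bytes and two 16 bit units, so the number of 16 bit units no longer equals the number of code points.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes needed for cp in standard UTF-8.
 * Returns 0 for surrogates and for values above 0x10ffff. */
int utf8_length(uint32_t cp)
{
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    if (cp < 0x10000) return 3;
    if (cp <= 0x10FFFF) return 4;
    return 0;
}

/* Hypothetical helper: encode cp as one or two 16 bit units.
 * Returns the number of units written to out. */
size_t encode_utf16(uint32_t cp, uint16_t out[2])
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}

/* Hypothetical helper: count code points in a buffer of n 16 bit units.
 * A high surrogate followed by a low surrogate counts as one code point;
 * every other unit counts as one. This needs a full scan of the buffer. */
size_t count_code_points_utf16(const uint16_t *s, size_t n)
{
    size_t i = 0, count = 0;
    while (i < n) {
        int is_pair = s[i] >= 0xD800 && s[i] <= 0xDBFF
            && i + 1 < n
            && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF;
        i += is_pair ? 2 : 1;
        count++;
    }
    return count;
}
```

For example, encoding the Emoji code point 0x1f600 with encode_utf16 writes two units, and a buffer holding that pair followed by the letter A has three 16 bit units but only two code points. With a 32 bit representation each element is one code point, so the length is simply the element count and no scan is needed.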

And last but not least:

   * We are the Borg! AndroWish need not be compatible with nothing. You will be assimilated.


But there's hope:

[https://web.archive.org/web/20150105000208if_/http://www.ch-werner.de/AndroWish/aw-no-work.jpg]
Another article explaining the problem is [https://unascribed.com/b/2019-02-08-the-tragedy-of-ucs2.html%|%The Tragedy of UCS-2%|%].

<<categories>> Android | Application | Dev. Tools | Unicode