Version 7 of Mar 22 Tcl Meetup notes

Updated 2022-03-09 12:16:22 by pooryorick

Tcl Meetup, 2022-03-09

In attendance: Brad Harder, Brian Griffin, dgp, Harald Oehlmann, Hypnotoad, Jan Nijtmans, Poor Yorick, Richard Hipp, Schelte Bron, Steve Huntley, Andreas Kupries, Rolf Ade, and others.

Unicode

Summary of the surrogate pair issue: 8.6 has always had problems handling surrogate pairs. That handling changed around 8.6.10 (?) to be like 8.7. string length of certain Unicode scalar values (those outside the Basic Multilingual Plane, which need a surrogate pair) is 2, as documented in TIP 542. In 9.0, string length of any single Unicode code point will be 1. Jan Nijtmans suggested releasing 8.7 and 9.0 at the same time.
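The length of 2 follows from the UTF-16 arithmetic: a code point above U+FFFF is stored as two 16-bit units. Below is a minimal sketch of that arithmetic in C. The helper names split_surrogates and utf16_units are hypothetical and are not part of the Tcl C API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a Tcl function): split a code point above
 * U+FFFF into its UTF-16 high and low surrogates. */
static void split_surrogates(uint32_t cp, uint16_t *high, uint16_t *low)
{
    assert(cp > 0xFFFF && cp <= 0x10FFFF);
    uint32_t v = cp - 0x10000;                 /* 20-bit offset */
    *high = (uint16_t)(0xD800 | (v >> 10));    /* top 10 bits */
    *low  = (uint16_t)(0xDC00 | (v & 0x3FF));  /* bottom 10 bits */
}

/* Hypothetical helper: number of 16-bit units one code point needs,
 * which is what a UTF-16 based length count would report for it. */
static int utf16_units(uint32_t cp)
{
    return cp > 0xFFFF ? 2 : 1;
}
```

For example, U+1F600 splits into the pair D83D DE00, so a build that counts 16-bit units sees two where a build that counts code points sees one.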

Jan and Don discussed Tcl_UniCharToUtf. Although it takes only one character as an argument per call, it keeps some state information about the string in the output buffer, so that when it encounters the first half of a surrogate pair it can wait to output a character until it is called with the second half.
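The general idea of keeping state in the output buffer can be sketched in C as follows. This is a simplified illustration and not Tcl's actual implementation: the OutBuf type and put_unit function are hypothetical, and the provisional 3-byte form used for a lone high surrogate is an assumption made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical output buffer for this sketch only. */
typedef struct {
    unsigned char bytes[64];
    size_t len;
} OutBuf;

/* Append one 16-bit unit as UTF-8. A high surrogate is written as a
 * provisional 3-byte sequence; a following low surrogate recognizes
 * those bytes in the buffer and rewrites them as one 4-byte sequence. */
static void put_unit(OutBuf *out, uint16_t u)
{
    assert(out->len + 4 <= sizeof out->bytes);
    unsigned char *p = out->bytes + out->len;

    if (u >= 0xDC00 && u <= 0xDFFF && out->len >= 3
            && p[-3] == 0xED && (p[-2] & 0xF0) == 0xA0) {
        /* The previous 3 bytes hold a high surrogate: recover it. */
        uint32_t high = 0xD000u | ((uint32_t)(p[-2] & 0x3F) << 6)
                                | (uint32_t)(p[-1] & 0x3F);
        uint32_t cp = 0x10000u + ((high - 0xD800u) << 10) + (u - 0xDC00u);
        p -= 3;                                /* overwrite in place */
        p[0] = (unsigned char)(0xF0 | (cp >> 18));
        p[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        p[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        p[3] = (unsigned char)(0x80 | (cp & 0x3F));
        out->len += 1;                         /* 3 bytes became 4 */
        return;
    }
    if (u < 0x80) {
        p[0] = (unsigned char)u;
        out->len += 1;
    } else if (u < 0x800) {
        p[0] = (unsigned char)(0xC0 | (u >> 6));
        p[1] = (unsigned char)(0x80 | (u & 0x3F));
        out->len += 2;
    } else {
        /* Also reached for surrogates that cannot be combined. */
        p[0] = (unsigned char)(0xE0 | (u >> 12));
        p[1] = (unsigned char)(0x80 | ((u >> 6) & 0x3F));
        p[2] = (unsigned char)(0x80 | (u & 0x3F));
        out->len += 3;
    }
}
```

Calling put_unit with 0xD83D and then 0xDE00 leaves the four bytes F0 9F 98 80 in the buffer. The caller never passes the pair together; the only memory of the first half is what was already written.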

Tcl_UniCharToUtf is used throughout the C sources, so teasing the ad-hoc UTF-16 encoder/decoder out of Tcl will be a chore.

Brian mentioned that 8.7 crashed when he pasted an emoji into a text widget.