Version 4 of GSM 03.38 encoding

Updated 2005-07-04 16:32:15

This page describes how to convert a string to or from the GSM 03.38 character set.

Requirements

The non-null 0 character "@"

In the GSM 03.38 character set, the character "@" is encoded as 0x00. encoding convertto treats a character whose encoded value is 0x00 as "not convertible" and substitutes the fallback character "?" for it.

Example:

    set gsmString [encoding convertto gsm0338 "me@domain.com"]
    # returns me?domain.com instead of me\x00domain.com

Workarounds

 # The string is split at each "@", and the pieces are converted
 # separately.
 proc toGsm {aString} {
  set partsWithoutAt [split $aString "@"]

  set convertedPartsWithoutAt {}
  foreach part $partsWithoutAt {
   lappend convertedPartsWithoutAt [encoding convertto gsm0338 $part]
  }

  return [join $convertedPartsWithoutAt "\x00"]
 }
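
A quick way to check the result is to look at the converted bytes in hex. The following is only a minimal sketch: it repeats the split-and-join idea from toGsm so that it runs on its own, and hexDump is a hypothetical helper name introduced here for illustration.

```tcl
# Same split-and-join approach as toGsm above, repeated so that this
# snippet is self-contained.
proc toGsm {aString} {
    set converted {}
    foreach part [split $aString "@"] {
        lappend converted [encoding convertto gsm0338 $part]
    }
    return [join $converted "\x00"]
}

# hexDump (hypothetical helper): return the bytes as lowercase hex digits.
proc hexDump {bytes} {
    binary scan $bytes H* hex
    return $hex
}

puts [hexDump [toGsm "a@b"]]
# prints 610062, i.e. "a", the byte 0x00 standing for "@", then "b"
```

The input deliberately avoids any character other than "@" that gsm0338 might not be able to convert, so the output does not depend on the fallback behaviour.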

An alternative workaround, which is shorter but relies on an implementation detail of encoding convertto:

 proc toGsm {aString} {
  return [encoding convertto gsm0338 [string map {@ \x00} $aString]]
 }

Lars H, 4 July 2005: I wouldn't label that as an implementation detail of encoding convertto; aren't you simply using the fact that gsm0338 maps NUL to itself? The string map maps \x40 to \x00.

As for the encoding error, is this a shortcoming in the encoding mechanism (impossible to map non-NUL characters to NUL) or an error in the particular encoding definition file? I wouldn't be entirely surprised if other languages have trouble with NULs in strings, but Tcl handles it correctly AFAIK.
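
For the opposite direction, from GSM 03.38 bytes back to a Tcl string, the split-and-join idea can be mirrored. This is a sketch under assumptions, not a tested part of this page: fromGsm is a name made up here, and it is quite possible that a plain encoding convertfrom gsm0338 already decodes 0x00 to "@" correctly, in which case the splitting is unnecessary. The sketch simply does not rely on that.

```tcl
# fromGsm (hypothetical name): decode GSM 03.38 bytes to a Tcl string.
# The byte string is split at each 0x00, the pieces are decoded
# separately, and the results are joined with "@".
proc fromGsm {gsmBytes} {
    set decoded {}
    foreach part [split $gsmBytes "\x00"] {
        lappend decoded [encoding convertfrom gsm0338 $part]
    }
    return [join $decoded "@"]
}

puts [fromGsm "me\x00domain.com"]
# prints me@domain.com
```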

