Version 32 of Datatype impedance mismatch between Tcl and SQLite

Updated 2016-03-04 17:07:36 by pooryorick

One convenient feature of the Tcl interface to SQLite is that it will use Tcl values directly in SQL statements via a $variable syntax nearly identical to Tcl variable substitution syntax. For example, to insert the value of a Tcl variable named $bigstring:

db eval {INSERT INTO t1 VALUES($bigstring)}

Because the entire INSERT statement is enclosed in braces, Tcl doesn't do the variable substitution. SQLite sees the literal string $bigstring in the INSERT statement where an expression is expected and retrieves for itself the value of $bigstring. An inferior alternative would have been to have Tcl substitute the string value into the SQL statement, leaving SQLite to to parse it back out again:

db eval "INSERT INTO t1 VALUES([string map {' ''} $bigstring])"

The first approach is faster because it doesn't make unnecessary copies of the content of $bigstring, and also avoids the possiblity of an injection attack.

When it retrieves the value of a Tcl variable, SQLite makes an attempt to get some hint as to how to interpret the value by inspecting any cached internal interpretation that may have been left in the Tcl_Obj that holds the value. If SQLite finds a cached numeric or bytearray interpretation, it takes that as the intended interpretation of the value. This is a violation of the principle that every value is a string, and sometimes it bites.

Here is a description of how SQLite utilizes the information it finds by inspecting the internal cached interpretation of a Tcl value:

  1. If the internal cached interpretation is boolean, int or wideInt, it is interpreted by SQLite as a 64-bit integer.
  2. If the internal cached interpretation is double, it is interpreted by SQLite as a double.
  3. If the internal cached interpretation interpretation is bytearray and there is no string representation, or if an @ character is used instead of $ to introduce the variable name, then the value is interpreted as a BLOB.
  4. If the variable does not exist, a NULL used as the value.
  5. Otherwise, the string representation of the value is used as a string.

NEM 2007-09-25: I wasn't aware of this behaviour, thanks for pointing it out. Surely SQLite knows what types are expected for each attribute of a relation? It could then use this information and attempt to coerce the Tcl value to an appropriate type that way, rather than attempting to guess the type based on what is passed. I.e., if a relation is declared as accepting an "id" attribute of type "integer" then it could call Tcl_GetIntFromObj on the corresponding value passed in.


DRH Replies: Your assumption that SQLite knows what types to expect for bound parameters is incorrect. SQLite uses dynamic typing. Just because a relation is declared "int" does not prevent it from storing a string. SQLite will try to convert from a string to an int if it can, but will store a string rather than lose information. This is a feature of SQLite, not a bug.