[SQLite] is an embedded, serverless [SQL] database engine that has a long and intimate relationship with [Tcl]. In fact, the author of SQLite says that SQLite is really a Tcl extension that broke loose and drifted out into the larger world of open-source software.

Of particular interest in the Tcl to SQLite interface is the way that Tcl variables can be inserted into SQL statements using Tcl-like syntax. Suppose you have a Tcl variable named $bigstring that you want to insert into an SQL table. You can say this:

   db eval {INSERT INTO t1 VALUES($bigstring)}

SQLite sees the $bigstring in the middle of the INSERT statement (notice that the entire INSERT statement is enclosed in {...} so that Tcl does no variable substitution itself), then asks Tcl for the value of the $bigstring variable and inserts that value directly. This is much easier and faster than trying to insert the text of $bigstring directly into the INSERT statement:

   db eval "INSERT INTO t1 VALUES('[string map {' ''} $bigstring]')"

Is there any doubt that the first method is easier to read and less prone to programming errors? And because it does not make unnecessary copies of the content of $bigstring, the first method is also much faster.

The problem is that in order to pull off this feat of magic, the Tcl interface to SQLite has to do something that is considered bad style in the Tcl world: SQLite checks to see whether the current representation of the value in $bigstring is a string, integer, wide-int, floating point value, or "bytearray", and it binds different data types to SQLite depending on what it sees.

The problem boils down to this: Tcl has only a single data type, which is "string". The dual-representation mechanism in Tcl is a behind-the-scenes cache that the programmer is never supposed to know about and whose only purpose is to make things run faster. SQLite, on the other hand, does have a programmer-visible concept of data types. SQLite understands strings, BLOBs, integers, floating-point values, and NULL. And so there is a bit of an "impedance mismatch" in the type systems of SQLite and Tcl. The interface mostly covers up this mismatch, so that often the programmer is never aware of it. But sometimes the impedance mismatch does surface and cause problems. And so a programmer who uses SQLite and Tcl frequently, and for large projects, should probably be aware of the issue.

When transferring values from Tcl into SQLite using the $-substitution mechanism shown in the first example above, SQLite looks at the alternative (non-string) representation of the Tcl variable (if it exists) and uses that alternative representation to help decide what data type to insert into SQLite:

   1. If the alternative representation is "boolean" or "int" or "wideInt", then the value is inserted into SQLite as a 64-bit integer.
   2. If the alternative representation is "double", then the value is inserted into SQLite as a double.
   3. If the alternative representation is "bytearray" and there is no string representation, then the value is inserted into SQLite as a BLOB.
   4. If the variable does not exist, a NULL is inserted.
   5. Otherwise, the string representation is inserted as a string.

In practice, this algorithm usually works quite well. But purists (a category which includes most of the core maintainers) object to this on the grounds that it exposes the alternative representation to the programmer. Suddenly it is possible to discover approximately what the alternative representation of a variable is.
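As a quick illustration of these rules, here is a minimal sketch (the variable names and the in-memory database are just for demonstration) that uses SQLite's typeof() function to show which type each binding produced:

   package require sqlite3
   sqlite3 db :memory:
   set i [expr {6*7}]           ;# carries an "int" representation    -> integer
   set d [expr {1.0/3.0}]       ;# carries a "double" representation  -> real
   set s [string trim " 42 "]   ;# has only a string representation   -> text
   unset -nocomplain missing    ;# variable does not exist            -> null
   puts [db eval {SELECT typeof($i), typeof($d), typeof($s), typeof($missing)}]
   ;# expected output: integer real text null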
If you want to know if your variable $x has a "double" alternative representation, you can do this:

   db eval {SELECT typeof($x)}

And if the answer is "real" then you know that the alternative representation of variable $x is a double.

I say that the algorithm above ''usually'' works quite well. There are some rare occasions where it can come back to bite you, though. So the programmer should exercise a modicum of caution.

Recall that SQLite, unlike most other SQL database engines, allows you to store values of different types in different rows of the same column. SQLite will attempt to coerce a value into the declared type of a column if it can do so without loss of information. But if the value cannot be converted in a reversible way, no conversion occurs. In SQLite parlance, this is known as "type affinity". For more information see http://www.sqlite.org/datatype3.html

Suppose you declare a column in your SQLite database to have type "INTEGER". This gives that column integer affinity. When you go to insert values into that column from Tcl, the variables you insert might have an "int" or "wideInt" representation, or they might have only a string representation. If they already have an "int" or "wideInt" representation, then they are inserted into SQLite as integers and no unnecessary conversions occur. If they only have a string representation, then the values are inserted into SQLite as strings, but the column affinity immediately causes those values to be converted into integers. So either way, you end up with integers in the database, as you would expect. It does not matter what the dual representation of the Tcl variable is. If the alternative representation happens to be an integer type then things run faster, but the same result appears in the end. Life is good.

But SQLite also allows columns to have an affinity of "none", meaning that no type conversions are ever attempted and whatever data type is inserted is the data type that gets stored. Any column with a declared type of BLOB has this property. If a column has no type affinity, then the data type of the values inserted into your database will depend on what happens to be in the alternative representation of your Tcl variables. Consider this example:

   db eval {CREATE TABLE t2(x BLOB);}   ;# No affinity
   set v 123
   db eval {INSERT INTO t2 VALUES($v)}

In the code above, you never know if you inserted an integer 123 or a string "123". It depends on how Tcl chose to represent the content of variable $v at the point that the insertion occurred. The best way to avoid this problem with integer, floating-point, or string values is to use an explicit cast in your SQL:

   db eval {INSERT INTO t2 VALUES(CAST($v AS INTEGER))}

In this example, it does not matter if the $v Tcl variable is originally a string or an integer; it ends up getting inserted as an integer.

The situation with BLOB values is a bit more complicated. By rule 3 above, a value is only inserted as a BLOB if it has a "bytearray" alternative representation and no string representation. This works out well for the usual case where you are reading some binary data from a stream and writing it into a database:

   set f [open somefile.jpg r]
   fconfigure $f -translation binary
   set v [read $f]
   close $f
   db eval {INSERT INTO images VALUES($v)}

In the code above, the $v variable almost certainly has a bytearray representation and no string representation, so it gets inserted as a BLOB. But if $v does have a string representation it will be inserted as a string.
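This can happen more easily than you might expect. In the following sketch (the scratch table, its column name, and the interpolation that triggers the conversion are all hypothetical choices for illustration), merely substituting $v into a larger string is typically enough to make Tcl generate and cache a string representation for it, after which the very same INSERT stores text instead of a BLOB:

   db eval {CREATE TABLE IF NOT EXISTS blobdemo(x BLOB)}   ;# scratch table, no affinity
   set f [open somefile.jpg r]
   fconfigure $f -translation binary
   set v [read $f]
   close $f
   db eval {INSERT INTO blobdemo VALUES($v)}   ;# pure bytearray, no string rep: stored as a BLOB

   set msg "loaded $v"                         ;# interpolation gives $v a string representation
   db eval {INSERT INTO blobdemo VALUES($v)}   ;# rule 3 no longer applies: stored as text

   puts [db eval {SELECT typeof(x) FROM blobdemo}]   ;# expected output: blob text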
And casting to BLOB on the SQLite side will not fix the problem, because Tcl and SQLite cast between strings and blobs differently. The solution in this case is to use "@" instead of "$" to name the variable:

   db eval {INSERT INTO images VALUES(@v)}

When SQLite sees @v instead of $v, it always uses the bytearray representation of the variable and inserts that bytearray as a BLOB. Even if the variable did not originally have a bytearray representation, SQLite has Tcl convert it (by calling Tcl_GetByteArrayFromObj()) and inserts the result as a BLOB. Note, however, that the @v trick only works for SQLite versions 3.4.1 and later. If you are using SQLite version 3.4.0 or earlier, you just have to be careful that the variable you want to insert as a BLOB does not contain a string representation.

So, in summary, I say that the Tcl interface to SQLite has proven over years of use to be remarkably powerful and easy to use. Yes, there is a goofy impedance mismatch between the type systems of SQLite and Tcl, but in practice it rarely comes up, and when it does come up it is easily fixed.

----

2007/06/21 [Alexandre Ferrieux] May I ask why, instead of introducing a new construct @v, they didn't instead weaken the rule "bytearray and no string" to "bytearray, possibly with string"?

[drh] answers: "@v" is not a new construct. SQL Server uses the "@v" construct for host parameter names, and so SQLite already supported it for compatibility. It was there, and so we just appropriated it for a new use in the Tcl interface. It does not work to always insert a variable as a BLOB if it has a bytearray representation, because things which you want to be strings very often have bytearray representations, and you do not want them being inserted as BLOBs.

[Lars H]: Would it be possible to use something like Metakit's colon-letter suffixes to express a type intent? I'm thinking that since $ seems to be an extension specific to the Tcl interface, one could perhaps modify it a bit so that what follows the $ is not just the name of a Tcl variable, but also a hint of what type the value should have in SQL. Then instead of

   db eval {INSERT INTO images VALUES(@v)}

one could have e.g.

   db eval {INSERT INTO images VALUES($v:B)}

To force a number to string type one could do

   set four [expr {2*2}]
   db eval {INSERT INTO numbers VALUES($four:S)}

I have to admit I know very little about SQL syntax though...

----

''[escargo] 22 Jun 2007'' - Is there a recommended way for representing dates or timestamps in SQLite?

''[drh] answers:'' Julian day number. You can also do seconds since 1970 or ISO-8601 text date strings. See http://www.sqlite.org/cvstrac/wiki?p=DateAndTimeFunctions for more information. Since the Julian day is represented as a floating point value, and because Tcl also represents dates as either seconds since 1970, Julian day number, or ISO-8601 date strings (among other formats), there is no impedance mismatch with regard to dates.
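For a concrete round trip (a hypothetical sketch; the table and column names are invented), one can let SQLite's built-in date functions do the conversions:

   db eval {CREATE TABLE events(what TEXT, t REAL)}
   set now [clock format [clock seconds] -format {%Y-%m-%d %H:%M:%S} -gmt 1]
   db eval {INSERT INTO events VALUES('demo', julianday($now))}   ;# stored as a Julian day (REAL)
   db eval {SELECT datetime(t) AS iso, strftime('%s',t) AS secs FROM events} {
       puts "reads back as $iso, or $secs seconds since 1970"
   }

----
[Category Database]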