tdbf

Description of the software

 What: tdbf
 Where: http://sqlitestudio.pl/tdbf/tdbf-0.5.tar.gz
 Description: [DBF] file reader/writer package.
 Dependencies: IncrTcl
 Licence: Tcl licence (BSD-like)
 Current version: 0.5
 Updated: 09/2012
 Author: Paweł Salawa (aka Googie).
 Contact: http://sqlitestudio.pl/index.rvt?act=contact

It depends on Itcl, but this is just for quick namespace creation bounded with handler command. It's easy to get rid of Itcl from the code, but I just prefer to use it.


Features

  • Written in pure-Tcl (with Itcl), so it's cross-platform,
  • Simple API (see below),
  • Reads and writes dbf memo (M, G, B, P types) fields,
  • Re-uses deleted records area for new inserted records,
  • Supports some unique features of FoxPro and Clipper variants.
  • Supports various code pages of character encoding.
  • Doesn't suppport dBASE 7, but it's in TODO.

Demos

Read all records and print them

package require tdbf 0.1
tdbf::dbf myDbf
myDbf open test1.dbf

puts "Columns: [myDbf getColumnNames]"
while {[set values [myDbf gets]] != ""} {
    puts $values
}

myDbf close

Read all records and print them - 2nd method

package require tdbf 0.1
tdbf::dbf myDbf
myDbf open test1.dbf

set columns [myDbf getColumnNames]
puts "Columns: $columns"
myDbf for value {
    foreach c $columns {
        puts -nonewline "$value($c) "
    }
    puts ""
}

myDbf close

Create dbf and put some data into it

This script will create dbf file, add 2 columns, then add 3 new records and then replace second record with new value.

package require tdbf 0.1
tdbf::dbf myDbf

file delete -force test2.dbf
myDbf open test2.dbf

myDbf addColumn "col1" "N" 3 ;# (numeric type, length = 3 digits)
myDbf addColumn "col2" "C" 20 ;# (character/text type, 20 characters)

for {set i 0} {$i < 3} {incr i} {
    myDbf insert [list $i "value $i"]
}

# Modify 2nd record
myDbf update 1 [list 5 "xyz"]

myDbf close

API

  • constructor {{errorHandler ""}}

The errorHandler code will be evaluated with literal error code appended (and optionally some other arguments). These errors are more like warnings. Dbf will still work but some limitations might be applied.

Possible values for reading file are:

DBT_DOESNT_EXISTWhen the .dbt file (memo table) doesn't exist or it's not readable, but the database type requires it to exists. It can be ignored, but in this case the referenced memo values will be returned as empty strings.

Possible values for writing file are:

DBT_READ_ONLYWhen .dbt file has read-only permissions or you cannot create the new .dbt file because of permissions.
COLUMN_EXISTS columnNameWhen trying to add column that already exist. Also column name is appended to error handler arguments. The addColumn will just skip this column.
RECORDS_EXIST columnNameWhen trying to add any column while there are already some records in DBF. Also column name is appended to error handler arguments. The addColumn will skip this column.
COLUMN_NAME_TOO_LONG columnNameWhen adding column with name longer than 10 characters (this is limited by DBF format). Column will still be added, but its name will be truncated to 10 characters.
NO_RECORDS_WHILE_UPDATINGTried to update (with [dbf update]) while there's not a single record in DBF.
  • Static methods:
dbf::julianDateToUnixTime valueConverts value read from "T" or "@" type into unixtime format (+ milliseconds), but only if it's possible (i.e. the time is after start of a year 1970). If the conversion is not possible, then "0" is returned (which is actually equal to the beginning of 1970.).
dbf::unixTimeToJulianDate valueConverts unixtime (+ milliseconds) value to Julian Day date format, so it's applicable for field of type "T" or "@".
dbf::shortDateToUnixTime valueConverts "short date" format to unixtime (no milliseconds) format, but only if it's possible, otherwise returns 0. The "short date" is a date in format: "YYYYMMDD".
dbf::unixTimeToShortDate valueConverts unixtime (no milliseconds, just like from clock seconds) to the "short date" format.
dbf::getRecognizedEncodingsReturns list of available character encodings that are supported by both Tcl and DBF.
  • Object methods:
open fileOpens DBF file. If will be closed automatically at object destruction. It also opens DBT file (memo table) if it exists. If file doesn't exists, it creates new DBF file. If necessary (during writing data) might create also a DBT file.
read fd ?memoFd?Reads DBF file from open channel. You can consider this method as a variant of [dbf open] accepting channels instead of file. Channel has to be readable (in any case) and writable (if you want to modify anything in file) and also switched to binary translation and non-blocking mode. Same requirements apply to memoFd. Channel for DBT (memo table) is optional.
closeCloses file that was open with [dbf open] or passed to [dbf read].
addColumn name type ?length? ?precision?Adds new column with given type (N, C, ...). Length is optional in most cases, except for types: N and C. Precision is always optional, but may be provided. Dbf cannot contain any records in order to add column.
insert valuesInserts new data record into dbf. Number of elements in 'values' has to be the same as number of columns, otherwise error will be raised. Values have to be in same format as returned from [gets]. It inserts record in place of first record marked as deleted or if there's no deleted record, then it appends record to the end of DBF file.|&
delete indexMarks record as deleted. Note that DBF records are not physically removed from file, they are just marked as deleted so they can be reused by [insert]. Use [vacuum] to force remove deleted records from file. Returns true on success or false on failure (index out of range).
update index values ?columnName?Updates all values of record with given index, or single column of that row if columnName is provided. See [seek] for index details. Returns true on success or false on failure (index out of range).
getAllDataReturns all records in format {{row1field1Value row1field2Value ...} {row2field1Value row2field2Value} ...}. Excludes deleted records.
for arrName bodyIterates through all records and puts each record values into array named arrName with column names as keys. For each record the body is executed with arrName prepared.
seek indexMoves reading/writing pointer at given position. Index is record order number in dbf (excluding deleted records), which is in range from 0 to [getDataCount]-1. Can be end-N. Returns true on success, false on failure (index out of range).
tellReturns record index at which the handler is currently. Returns -1 if it's not at any valid index currently (before opening file or after read last record).
getsReturns record at current index and increments index. Remember to [seek] to proper record if you called any other method from this class before the [gets]. May return empty list if no data was available. You can use [tell] to find out if the handler is positioned correctly.
vacuumRemoves deleted records from the file to reduce file size. Note, that this method depends on Tcl 8.5 or newer.
getVersionReturns database file version in 2 hex characters format.
getVersionNameReturns human readable version description.
getLastModificationDateReturns last modification date in [clock seconds] format.
getColumnsReturns list of columns, where each column is a Tcl dict with keys: name, type, length, precision, indexed.
getColumnNamesReturns list of column names.
getDataCountReturns total number of records, excluding deleted records.
getRecordSizeReturns size of single record in bytes
encodingReturns currently used encoding for reading/writing the data. The initial value of the encoding is set to local Tcl encoding, then - while reading dbf header - tdbf tries to find out the best encoding that would fit codepage declared in the header and uses it. If no code page fits the one declared in the header, then the Tcl's default encoding is used.
setEncoding encodingTries to set given encoding for reading/writing the data. This will success only if given encoding is on the list of encodings supported by dbf format and recognized by tdbf itself. To see list of valid encodings use dbf::getRecognizedEncodings.
isFlagShipReturns boolean value indicating if the dbf file is of "FlagShip" type. It's useful for finding out expected data type (see data type mapping below).

Datatypes translation map

Use this datatype translation map to learn what value formats can you expect to be returned when reading DBF and use same formats to pass the data to DBF while writing DBF.

Notes:

  • The flagship attribute is a boolean flag depended on DBF version. Some DBF versions assume this flag to be set to 1. By default it's 0.
  • In case of "V" and "X" types, the "Max size" column means actually the exact size, not less, not greater. It's because these field types depend on the size being defined for it. For different size (and different flagship) they behave differently.
DBF typeDBF meaningMax sizeTcl value formatNotes
Ccharacters254 charactersstring
Nnumber/numeric18 digitsnumberIt's an integer, float, double - a number in any means. Includes a minus sign (if necessary).
LlogicalbooleanCan be also an empty string (DBF allows boolean to be undefined)
I, +integer4 bytesintegerThe '+' type identifies "autoincrement" field.
Ddate8 charactersstringA date in format: "YYYYMMDD". Use dbf::shortDateToUnixTime and dbf::unixTimeToShortDate for easy conversion.
M, Gtext memo4 GBstringThe actual values are stored in separate file named with .dbt extension.
Ffloat20 digitsfloat
B, Pbinary memo4 GBbinary dataThe actual values are stored in separate file named .fpt extension
Odouble8 bytesdouble
Ycurrency8 bytes,
4 bits precision
stringValue in range from -922,337,203,685,477.5807 to +922,337,203,685,477.5807. The value can be so big, it doesn't fit in regular Tcl double type, so it's represented as a string.
V, XvarifieldvariesvariesDepending on the exact field length and database type (the flagship attribute):
V, Xinteger2 bytesinteger(flagship=1)
V, Xdate3 bytesstring(flagship=0) The value is returned and expected in the same format as for type "D".
V, Xinteger4 bytesinteger(flagship=0)
V, Xdouble8 bytesdouble(flagship=1)
V, Xmemoinfinitestring(flagship=0) This is special kind of DBF memo, so should be a string, but this is not supported yet.
T, @timestamp01/01/4713BC to infinitelist of {date time}, which are integers.The date is the number of days since 01/01/4713 BC. Time is: hours * 3600000 + minutes * 60000 + Seconds * 1000. If the date is at (or after) year 1970 you can use dbf::julianDateToUnixTime to convert the value into format usable for clock command. There's also dbf::unixTimeToJulianDate method which makes the opposite conversion.

ChangeLog

14.09.2012 - Version 0.5

  • Fixed reading Visual FoxPro (hex version \x32) of "N" fields.
  • Fixed [getAllData] method.

15.08.2012 - Version 0.4

  • Added "isFlagShip" method.
  • Added character encoding support. This comes with methods: setEncoding, encoding and getRecognizedEncodings.
  • Fixed reading 3-length "V" columns (dates) so they are actually read as "D" column.
  • Implemented [vacuum] method.
  • Implemented [tell] method.
  • Introduced new datetime conversion methods: unixTimeToJulianDate, shortDateToUnixTime, unixTimeToShortDate.
  • Fixed reading column length from header (for big lengths).
  • Fixed internal methods: convertShortDate and convertShortDateToBin to support dates before 1970.
  • Fixed reading "@" columns.

09.08.2012 - Version 0.3

  • Field type "T" is now read correctly.
  • Added julianDateToUnixTime function to convert Julian dates after 1970 to unixtime, which is more Tcl-like.
  • Fixed flushInitialHeader when creating new dbf file in any month except of 4th quarter of the year - didn't work at all.

01.08.2012 - Version 0.2

  • Fixed reading of Visual FoxPro files.
  • "D" type is now read as string, not as unixtime, cause unixtime doesn't deal with dates before 1970.
  • "C" type is now trimmed from left side, so there are no extra white spaces before the actual value.
  • Added pkgIndex.tcl, making a complete Tcl package.

01.11.2011 - Version 0.1


TODO

  • some better validation - currently if you feed it with invalid file, it will most likely crash.
  • fix reading B and P fields - they're not stored in dbt file, but in fpt or possibly other one - it's tbd.
  • read "V"/"X" types fully
  • handle dBASE 7
  • indexes support
  • method to recover records marked as deleted

Discussions