tdbf

Description of the software

 What: tdbf
 Where: http://sqlitestudio.pl/tdbf/tdbf-0.5.tar.gz
 Description: [DBF] file reader/writer package.
 Dependencies: IncrTcl
 Licence: Tcl licence (BSD-like)
 Current version: 0.5
 Updated: 09/2012
 Author: Paweł Salawa (aka Googie).
 Contact: http://sqlitestudio.pl/index.rvt?act=contact

It depends on Itcl, but only as a quick way to create a namespace bound to a handler command. Itcl could easily be removed from the code, but I prefer to use it.

Features

Written in pure Tcl (with Itcl), so it's cross-platform,
Simple API (see below),
Reads and writes dbf memo (M, G, B, P types) fields,
Re-uses the area of deleted records for newly inserted records,
Supports some unique features of FoxPro and Clipper variants,
Supports various code pages for character encoding,
Doesn't support dBASE 7, but it's on the TODO list.

Demos

Read all records and print them

package require tdbf 0.1
tdbf::dbf myDbf
myDbf open test1.dbf

puts "Columns: [myDbf getColumnNames]"
while {[set values [myDbf gets]] != ""} {
    puts $values
}

myDbf close

Read all records and print them - 2nd method

package require tdbf 0.1
tdbf::dbf myDbf
myDbf open test1.dbf

set columns [myDbf getColumnNames]
puts "Columns: $columns"
myDbf for value {
    foreach c $columns {
        puts -nonewline "$value($c) "
    }
    puts ""
}

myDbf close

Create dbf and put some data into it

This script creates a dbf file, adds 2 columns, inserts 3 new records, and then replaces the second record with new values.

package require tdbf 0.1
tdbf::dbf myDbf

file delete -force test2.dbf
myDbf open test2.dbf

myDbf addColumn "col1" "N" 3 ;# (numeric type, length = 3 digits)
myDbf addColumn "col2" "C" 20 ;# (character/text type, 20 characters)

for {set i 0} {$i < 3} {incr i} {
    myDbf insert [list $i "value $i"]
}

# Modify 2nd record
myDbf update 1 [list 5 "xyz"]

myDbf close

API

constructor {{errorHandler ""}}

The errorHandler code will be evaluated with the literal error code appended (and optionally some other arguments). These errors are more like warnings: the dbf will still work, but some limitations may apply.

Possible values when reading a file are:

DBT_DOESNT_EXIST

Raised when the .dbt file (memo table) doesn't exist or isn't readable, but the database type requires it to exist. It can be ignored, but in that case the referenced memo values will be returned as empty strings.

Possible values when writing a file are:

DBT_READ_ONLY	When the .dbt file has read-only permissions or a new .dbt file cannot be created because of permissions.
COLUMN_EXISTS columnName	When trying to add a column that already exists. The column name is also appended to the error handler arguments. The addColumn will just skip this column.
RECORDS_EXIST columnName	When trying to add any column while there are already some records in the DBF. The column name is also appended to the error handler arguments. The addColumn will skip this column.
COLUMN_NAME_TOO_LONG columnName	When adding a column with a name longer than 10 characters (this is limited by the DBF format). The column will still be added, but its name will be truncated to 10 characters.
NO_RECORDS_WHILE_UPDATING	When trying to update (with [dbf update]) while there's not a single record in the DBF.
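
As a rough illustration of the handler contract described above, here is a minimal sketch of an error handler. The proc name onDbfWarning and the message texts are hypothetical. The handler is invoked by hand here to simulate how tdbf is described as calling it. With tdbf loaded, the handler would presumably be passed as the constructor argument, as noted in the comment.

```tcl
# Hypothetical handler (not part of tdbf). It receives the error code first,
# followed by any extra arguments such as the column name.
proc onDbfWarning {code args} {
    switch -- $code {
        DBT_DOESNT_EXIST     { set msg "memo file missing, memo values will be empty" }
        COLUMN_NAME_TOO_LONG { set msg "column name truncated: [lindex $args 0]" }
        default              { set msg "$code $args" }
    }
    puts stderr "tdbf warning: $msg"
    return $msg
}

# Assumed usage with tdbf loaded (constructor takes the handler):
#   tdbf::dbf myDbf onDbfWarning

# Simulated invocation, for illustration only: the code and its arguments
# are appended to the handler command and the result is evaluated.
set handler onDbfWarning
set result [{*}$handler COLUMN_NAME_TOO_LONG averyLongColumnName]
```

Since these errors behave like warnings, the handler only reports them and lets processing continue.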

Static methods:

dbf::julianDateToUnixTime value	Converts a value read from a "T" or "@" type field into unixtime format (+ milliseconds), but only if that is possible (i.e. the time is not earlier than the start of 1970). If the conversion is not possible, "0" is returned (which is equal to the beginning of 1970).
dbf::unixTimeToJulianDate value	Converts a unixtime (+ milliseconds) value to the Julian Day date format, so it can be used for fields of type "T" or "@".
dbf::shortDateToUnixTime value	Converts "short date" format to unixtime (no milliseconds) format, but only if it's possible, otherwise returns 0. The "short date" is a date in format: "YYYYMMDD".
dbf::unixTimeToShortDate value	Converts unixtime (no milliseconds, just like from clock seconds) to the "short date" format.
dbf::getRecognizedEncodings	Returns list of available character encodings that are supported by both Tcl and DBF.
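
To make the "short date" format concrete, the following is a sketch of equivalent conversions written with the plain Tcl clock command. The proc names shortDateToUnix and unixToShortDate are hypothetical and are not the tdbf implementation. Treating the date as UTC midnight is an assumption, since the time zone tdbf uses is not stated here.

```tcl
# Hypothetical stand-ins for the documented conversions, using only [clock].
# Assumption: dates are interpreted as midnight UTC.
proc shortDateToUnix {value} {
    # Accept exactly 8 digits (YYYYMMDD); anything else yields 0.
    if {![regexp {^\d{8}$} $value]} {
        return 0
    }
    if {[catch {clock scan $value -format %Y%m%d -gmt 1} seconds]} {
        return 0
    }
    return $seconds
}

proc unixToShortDate {seconds} {
    return [clock format $seconds -format %Y%m%d -gmt 1]
}

set seconds [shortDateToUnix 20120914]
set shortDate [unixToShortDate $seconds]
```

The round trip gives back the original "YYYYMMDD" string, which is the format expected by "D" type fields.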

Object methods:

open file	Opens a DBF file. It will be closed automatically at object destruction. It also opens the DBT file (memo table) if it exists. If the file doesn't exist, a new DBF file is created. If necessary (while writing data), a DBT file might be created as well.
read fd ?memoFd?	Reads a DBF file from an open channel. You can consider this method a variant of [dbf open] that accepts channels instead of a file name. The channel has to be readable (always) and writable (if you want to modify anything in the file), and also switched to binary translation and non-blocking mode. The same requirements apply to memoFd. The channel for the DBT (memo table) is optional.
close	Closes the file that was opened with [dbf open] or passed to [dbf read].
addColumn name type ?length? ?precision?	Adds a new column with the given type (N, C, ...). Length is optional in most cases, except for types N and C. Precision is always optional. The dbf must not contain any records when a column is added.
insert values	Inserts a new data record into the dbf. The number of elements in 'values' has to be the same as the number of columns, otherwise an error will be raised. Values have to be in the same format as returned from [gets]. The record is inserted in place of the first record marked as deleted or, if there is no deleted record, appended to the end of the DBF file.
delete index	Marks record as deleted. Note that DBF records are not physically removed from file, they are just marked as deleted so they can be reused by [insert]. Use [vacuum] to force remove deleted records from file. Returns true on success or false on failure (index out of range).
update index values ?columnName?	Updates all values of record with given index, or single column of that row if columnName is provided. See [seek] for index details. Returns true on success or false on failure (index out of range).
getAllData	Returns all records in format {{row1field1Value row1field2Value ...} {row2field1Value row2field2Value} ...}. Excludes deleted records.
for arrName body	Iterates through all records and puts each record values into array named arrName with column names as keys. For each record the body is executed with arrName prepared.
seek index	Moves the reading/writing pointer to the given position. Index is the record order number in the dbf (excluding deleted records), which is in the range from 0 to [getDataCount]-1. Can be end-N. Returns true on success, false on failure (index out of range).
tell	Returns the record index at which the handler is currently positioned. Returns -1 if it's not at any valid index (before opening a file or after reading the last record).
gets	Returns record at current index and increments index. Remember to [seek] to proper record if you called any other method from this class before the [gets]. May return empty list if no data was available. You can use [tell] to find out if the handler is positioned correctly.
vacuum	Removes deleted records from the file to reduce file size. Note, that this method depends on Tcl 8.5 or newer.
getVersion	Returns database file version in 2 hex characters format.
getVersionName	Returns human readable version description.
getLastModificationDate	Returns last modification date in [clock seconds] format.
getColumns	Returns list of columns, where each column is a Tcl dict with keys: name, type, length, precision, indexed.
getColumnNames	Returns list of column names.
getDataCount	Returns total number of records, excluding deleted records.
getRecordSize	Returns the size of a single record in bytes.
encoding	Returns the encoding currently used for reading/writing the data. The initial value is the local Tcl encoding; then, while reading the dbf header, tdbf tries to find the encoding that best fits the code page declared in the header and uses it. If no encoding fits the declared code page, Tcl's default encoding is used.
setEncoding encoding	Tries to set the given encoding for reading/writing the data. This will succeed only if the given encoding is on the list of encodings supported by the dbf format and recognized by tdbf itself. To see the list of valid encodings use dbf::getRecognizedEncodings.
isFlagShip	Returns boolean value indicating if the dbf file is of "FlagShip" type. It's useful for finding out expected data type (see data type mapping below).
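
The channel requirements listed for [read] (binary translation, non-blocking mode, writable when modifying) can be prepared with standard Tcl commands. Below is a small sketch. The helper name openDbfChannel is hypothetical and not part of tdbf.

```tcl
# Hypothetical helper (not part of tdbf): opens a file and configures the
# channel the way [read] is described as requiring.
proc openDbfChannel {path {writable 1}} {
    # "r+" opens an existing file for reading and writing, "r" for reading only.
    set mode [expr {$writable ? "r+" : "r"}]
    set fd [open $path $mode]
    fconfigure $fd -translation binary -blocking 0
    return $fd
}

# Assumed usage with tdbf loaded:
#   set fd [openDbfChannel test1.dbf]
#   myDbf read $fd
```

The same helper could be used for the optional memoFd channel, since the same requirements apply to it.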

Datatypes translation map

Use this datatype translation map to learn what value formats you can expect when reading a DBF, and use the same formats when passing data to a DBF for writing.

Notes:

The flagship attribute is a boolean flag that depends on the DBF version. Some DBF versions assume this flag to be set to 1. By default it's 0.
In the case of "V" and "X" types, the "Max size" column actually means the exact size, no less and no more. That's because the meaning of these field types depends on the size defined for them. For a different size (and a different flagship value) they behave differently.
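
The dependence of "V" and "X" fields on size and flagship can be summarised as a lookup. The sketch below only restates the V, X rows of the table in this section. The proc name varifieldKind is hypothetical, and reporting every other combination as "unsupported" is an assumption.

```tcl
# Hypothetical lookup (not part of tdbf): maps the exact field length and the
# flagship flag to the interpretation listed in the V, X rows of the table.
proc varifieldKind {length flagship} {
    switch -- "$length,$flagship" {
        2,1     { return integer }
        3,0     { return date }
        4,0     { return integer }
        8,1     { return double }
        default { return unsupported }
    }
}

set kind [varifieldKind 3 0]
```

With tdbf loaded, the flagship argument would presumably come from [isFlagShip] and the length from the column description returned by [getColumns].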

DBF type	DBF meaning	Max size	Tcl value format	Notes
C	characters	254 characters	string
N	number/numeric	18 digits	number	It's an integer, float or double: a number of any kind. Includes a minus sign (if necessary).
L	logical		boolean	Can be also an empty string (DBF allows boolean to be undefined)
I, +	integer	4 bytes	integer	The '+' type identifies "autoincrement" field.
D	date	8 characters	string	A date in format: "YYYYMMDD". Use dbf::shortDateToUnixTime and dbf::unixTimeToShortDate for easy conversion.
M, G	text memo	4 GB	string	The actual values are stored in separate file named with .dbt extension.
F	float	20 digits	float
B, P	binary memo	4 GB	binary data	The actual values are stored in a separate file named with the .fpt extension.
O	double	8 bytes	double
Y	currency	8 bytes, 4 decimal digits of precision	string	Value in range from -922,337,203,685,477.5807 to +922,337,203,685,477.5807. The value can be so big that it doesn't fit in the regular Tcl double type, so it's represented as a string.
V, X	varifield	varies	varies	Depending on the exact field length and database type (the flagship attribute):
V, X	integer	2 bytes	integer	(flagship=1)
V, X	date	3 bytes	string	(flagship=0) The value is returned and expected in the same format as for type "D".
V, X	integer	4 bytes	integer	(flagship=0)
V, X	double	8 bytes	double	(flagship=1)
V, X	memo	infinite	string	(flagship=0) This is special kind of DBF memo, so should be a string, but this is not supported yet.
T, @	timestamp	01/01/4713 BC to infinite	list of {date time}, which are integers.	The date is the number of days since 01/01/4713 BC. Time is: hours * 3600000 + minutes * 60000 + seconds * 1000. If the date is in (or after) the year 1970 you can use dbf::julianDateToUnixTime to convert the value into a format usable by the clock command. There's also the dbf::unixTimeToJulianDate method, which makes the opposite conversion.
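
The {date time} arithmetic for "T" and "@" fields can be worked through in plain Tcl. The sketch below is not the tdbf implementation. The proc names are hypothetical, and returning a single count of milliseconds since 1970 (UTC) is an assumption about what "unixtime (+ milliseconds)" means. It relies on the fact that 1 January 1970 is Julian Day Number 2440588, which Tcl reports through the %J clock format.

```tcl
# Julian Day Number of 1970-01-01, as reported by Tcl itself (2440588).
set epochJdn [clock format 0 -format %J -gmt 1]

# Hypothetical converters (not part of tdbf).
# value is a list {days ms}: Julian Day Number and milliseconds since midnight.
proc julianToUnixMs {value} {
    global epochJdn
    lassign $value days ms
    if {$days < $epochJdn} {
        return 0   ;# before 1970: not representable, mirror the documented "0"
    }
    return [expr {($days - $epochJdn) * 86400000 + $ms}]
}

proc unixMsToJulian {unixMs} {
    global epochJdn
    set days [expr {$unixMs / 86400000 + $epochJdn}]
    set ms   [expr {$unixMs % 86400000}]
    return [list $days $ms]
}

# 12:30:00 is 12 * 3600000 + 30 * 60000 = 45000000 ms after midnight.
set unixMs [julianToUnixMs {2456185 45000000}]
set stamp  [clock format [expr {$unixMs / 1000}] -format "%Y-%m-%d %H:%M:%S" -gmt 1]
```

Dividing the result by 1000 gives a value that the clock command accepts directly, as the last line shows.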

ChangeLog

14.09.2012 - Version 0.5

Fixed reading Visual FoxPro (hex version \x32) of "N" fields.
Fixed [getAllData] method.

15.08.2012 - Version 0.4

Added "isFlagShip" method.
Added character encoding support. This comes with methods: setEncoding, encoding and getRecognizedEncodings.
Fixed reading 3-length "V" columns (dates) so they are actually read as "D" column.
Implemented [vacuum] method.
Implemented [tell] method.
Introduced new datetime conversion methods: unixTimeToJulianDate, shortDateToUnixTime, unixTimeToShortDate.
Fixed reading column length from header (for big lengths).
Fixed internal methods: convertShortDate and convertShortDateToBin to support dates before 1970.
Fixed reading "@" columns.

09.08.2012 - Version 0.3

Field type "T" is now read correctly.
Added julianDateToUnixTime function to convert Julian dates after 1970 to unixtime, which is more Tcl-like.
Fixed flushInitialHeader when creating a new dbf file in any month outside the 4th quarter of the year (it didn't work at all).

01.08.2012 - Version 0.2

Fixed reading of Visual FoxPro files.
"D" type is now read as a string, not as unixtime, because unixtime doesn't deal with dates before 1970.
"C" type is now trimmed from left side, so there are no extra white spaces before the actual value.
Added pkgIndex.tcl, making a complete Tcl package.

01.11.2011 - Version 0.1

TODO

some better validation - currently if you feed it an invalid file, it will most likely crash.
fix reading B and P fields - they're not stored in dbt file, but in fpt or possibly other one - it's tbd.
read "V"/"X" types fully
handle dBASE 7
indexes support
method to recover records marked as deleted
