Version 18 of Techniques for reading and writing application configuration files

Updated 2004-03-17 20:21:32

Purpose: to collect pointers and observations regarding techniques for reading and writing configuration files by an application


  • What is a configuration file?
  • What are the issues one needs to keep in mind?
  • What internet resources are available for exploring Tcl techniques in writing and reading configuration files?

What is a configuration file?

A configuration file is a (usually small) file that contains information about a number of options that control the appearance or behavior of an application program. The end user can change the configuration by either editing the file or selecting options in the application program. See also the Tk option database.

Unix configuration files are often placed in the user's HOME directory with a leading "." in the name (making them hidden), like ".mozilla". The X Window System also has app-defaults files, which can define the appearance of a GUI. Windows configuration files are sometimes stored with the application program, and sometimes thrown into the system directory as initialization (.INI) files. MS Windows applications often store configuration information in the registry, which is a whole 'nother ball of wax.

Configuration File Issues

Configuration files need to be readable and writable by the user. They also need a relatively simple format if they are to be edited by hand, which often introduces errors. Parsing code should be able to handle errors and be general enough to accommodate feature growth of the application.


Arrays

bach: lvirden: what I like to do is use arrays. Writing: puts [array get]. Reading: read, then eval array set.

lvirden: That's a pretty common technique. That's why I was surprised not to find a page describing that as well as some of the other techniques (like option, etc.)
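The array technique mentioned above can be sketched roughly as follows. The proc names saveConfig and loadConfig are hypothetical, and the sketch calls array set directly on the file contents rather than going through eval. It assumes the file holds nothing but the output of array get.

```tcl
# Sketch only: saveConfig and loadConfig are hypothetical names.

# Write the named array to fname as one flat key/value list.
proc saveConfig {fname arrName} {
    upvar 1 $arrName arr
    set fd [open $fname w]
    puts $fd [array get arr]
    close $fd
}

# Read fname back into the named array.
# array set raises an error if the data is not a valid,
# even-length list, so callers may want to wrap this in catch.
proc loadConfig {fname arrName} {
    upvar 1 $arrName arr
    set fd [open $fname r]
    set data [read $fd]
    close $fd
    array set arr $data
}
```

A call such as saveConfig settings.cfg opts followed later by loadConfig settings.cfg opts would round-trip the array; the file name is only an example.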

Mike Tuxford posted an example parser to comp.lang.tcl in Dec. 2002 which reads a configuration file and sets array elements. This code allows comments and blank lines, but malformed lists or other data errors could cause problems. This should be OK if the end user is never allowed to edit the configuration file.

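    proc parseConfig {fname} {
        # file readable covers both a missing file and bad permissions
        if {[file readable $fname]} {
            set fd [open $fname r]
        } else {
            puts "Can't find $fname or perms are bad"
            exit
        }
        # gets returns -1 once the end of the file is reached
        while {[gets $fd data] >= 0} {
            # Skip comments, lines starting with a space, and blank lines
            if {[string index $data 0] == "#" || \
                    [string index $data 0] == " " || \
                    [string index $data 0] == ""} {
                continue
            }
            # Each line is read as a Tcl list; its first word names
            # the global variable to set
            switch -- [lindex $data 0] {
                foo {
                    # foo key value  ->  foo(key) = value
                    global [lindex $data 0]
                    set [lindex $data 0]([lindex $data 1]) [lindex $data 2]
                }
                bar {
                    # bar key v1 v2 ...  ->  bar(key) = list of values
                    global [lindex $data 0]
                    set [lindex $data 0]([lindex $data 1]) [lrange $data 2 end]
                }
                default {
                    # name v1 v2 ...  ->  scalar name = list of values
                    global [lindex $data 0]
                    set [lindex $data 0] [lrange $data 1 end]
                }
            }
        }
        catch {close $fd}
        return
    }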
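For illustration, a configuration file accepted by the parser above might look like the fragment below. The variable names foo and bar come from the parser itself; the keys, values, and the title entry are invented for this example.

```text
# Lines starting with "#" are skipped, as are blank lines

foo width 80
bar path /usr/bin /usr/local/bin
title My Application
```

With this input the parser would set foo(width) to 80, bar(path) to the two-element list of directories, and the global variable title to the list "My Application".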

RS: But, as I found out, source also works on pure data, so you can write:

 array set x [source x.dmp] 

Only you cannot have comments in the data then - or you can, if you accept an array element by the name of #; then you can dump

 # {Saved by ... on ...}

rmax: Its downside is that this kind of configuration file isn't very readable or editable.

suchenwi: Oh, you could format it like this:

 foreach i [lsort [array names a]] {
        puts [list $i $a($i)]
 } 

That's actually what I do, so the file still is nice to browse.
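A minimal sketch of the complete round trip for this one-pair-per-line format follows. The proc names writePairs and readPairs are hypothetical. Because each line is itself a two-element list, the file as a whole is still a flat key/value list, so reading it back needs no line-by-line parsing.

```tcl
# Sketch only: writePairs and readPairs are hypothetical names.

# Write one "key value" pair per line, sorted by key.
proc writePairs {fname arrName} {
    upvar 1 $arrName a
    set fd [open $fname w]
    foreach i [lsort [array names a]] {
        puts $fd [list $i $a($i)]
    }
    close $fd
}

# Newlines count as list whitespace, so the whole file
# can be handed to array set in one go.
proc readPairs {fname arrName} {
    upvar 1 $arrName a
    set fd [open $fname r]
    set data [read $fd]
    close $fd
    array set a $data
}
```

The trade-off is the same as for plain array get output: a hand edit that unbalances a brace makes the whole file unreadable as a list.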

aku AK: config files - There was a page about tcl patterns somewhere. I believe this contained something. also look for the pages by Koen Van Damme on the wiki.

aku: His homepage refers to a paper by him about parsing data files

rmax: Hmm, reading and writing configuration files of various flavours seems to me worth a tcllib module. I could add a package I once wrote that parses Windows-style .ini files.


Source

Another option for Tcl application configuration files is just to write the configuration as a Tcl script, and [source] it into the interpreter. The application code is very simple, but malformed lists, misplaced [brackets], or malevolent code can wreak havoc on an application. (Or the user!) Bob Techentin posted code for evaluating a configuration file in a safe interpreter. This has the advantage of being very general, editable by an end user, and any errors will be caught by the standard Tcl error mechanism. The down side is that an error will stop the evaluation of the configuration file.

    proc parseConfig {fname} {
        if {[file readable $fname]} {
            set fd [open $fname r]
        } else {
            puts "Can't find $fname or perms are bad"
            exit
        }
        set configScript [read $fd]
        close $fd

        #  Create a safe interp to evaluate the
        #  code in the configuration script
        set si [interp create -safe]
        if {[catch {$si eval $configScript} err]} {
            puts "Error in $fname: $err"
        }

        #  The only thing we expect from this
        #  configuration is setting some
        #  global variables, so we extract
        #  them thus
        global foo
        array set foo [$si eval array get foo]

        interp delete $si
    }

Andreas Kupries posted that he uses this procedure for Tcl DevKit, which is even more robust.

  • create an interpreter
  • delete _all_ commands in it, including _unknown_.
  • alias the commands I want to be able to process into it.
  • alias a replacement for 'unknown' into it. This replacement returns an empty string and records an error.
  • Eval the configuration script in the interpreter.

The special commands give him the data in the config file, and the replaced 'unknown' gives him the errors which occurred during the evaluation.
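The steps listed above might be sketched as follows. This is not the Tcl DevKit code. The names cfgSet, cfgUnknown, parseRestricted, and the configuration command set_option are all hypothetical, and starting from a safe interpreter is an extra precaution assumed by this sketch rather than something the description requires.

```tcl
# Sketch only: every proc and command name here is hypothetical.

# Handler for the one command the configuration may use.
proc cfgSet {key value} {
    global cfgValues
    set cfgValues($key) $value
    return
}

# Replacement for unknown: record the problem, return an empty string.
proc cfgUnknown {args} {
    global cfgErrors
    lappend cfgErrors "unknown command: [lindex $args 0]"
    return ""
}

# Evaluate script in a stripped interpreter; returns the error count.
proc parseRestricted {script} {
    global cfgValues cfgErrors
    array unset cfgValues
    array set cfgValues {}
    set cfgErrors {}

    set i [interp create -safe]
    # Delete every command in the global namespace; rename goes
    # last because it is the tool doing the deleting.
    foreach cmd [$i eval {info commands}] {
        if {$cmd ne "rename"} {
            $i eval [list rename $cmd {}]
        }
    }
    $i eval {rename rename {}}

    # Alias in only what the configuration file may call.
    $i alias set_option cfgSet
    $i alias unknown cfgUnknown

    if {[catch {$i eval $script} msg]} {
        lappend cfgErrors $msg
    }
    interp delete $i
    return [llength $cfgErrors]
}
```

After a call, the global array cfgValues holds the settings and the global list cfgErrors holds one entry per problem. An unrecognised command is recorded and skipped instead of stopping the evaluation, although a syntax error in the script would still end it early.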

And here's another solution, very similar to the above mentioned, but developed on my own: Matthias Hoffmann - Tcl-Code-Snippets.


See Metakit


US: I always use the following code snippet as a configuration preamble [to locate the configuration directory relative to the running script or executable]

 if {[string length [info script]]} {
     # It's just a script
     set here [file dirname [info script]]
 } else {
     # Wrapped into an executable
     set here [file dirname [info nameofexecutable]]
 }
 set owd [pwd]
 cd $here
 cd ..
 set CFGDIR [file join [pwd] cfg]
 cd $owd
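The same directory can probably be computed without changing the working directory at all. The following is a sketch of that variation, assuming Tcl 8.4 or later for file normalize. It keeps the variable names here and CFGDIR from the preamble above.

```tcl
# Sketch only: a variation on the preamble above that avoids cd.
if {[string length [info script]]} {
    # It's just a script
    set here [file dirname [info script]]
} else {
    # Wrapped into an executable
    set here [file dirname [info nameofexecutable]]
}
# file normalize makes the path absolute and resolves the ".." step
set CFGDIR [file normalize [file join $here .. cfg]]
```

Avoiding cd means the preamble cannot leave the process in the wrong directory if an error occurs between the two cd calls.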

See Config File Parser too.


AM If you want to use the source statement to source configuration files (and why not? it is a very flexible way!) but are worried about safety (a malevolent or simply careless user might upset your application or your whole system), have a look at this script to safely source a file.


NEM It is worth mentioning that XML is being used more and more these days for configuration files. Tcl has two excellent XML extensions (TclXML and tDOM), both of which make parsing a configuration file into an in-memory representation very easy. The advantage of this is that the parsing is taken care of by the extension, and there are plenty of third-party applications for editing XML files, making it easier if the user ever wants to edit something.

US Do you really think XML files are easier to edit than a simple Tcl script, from a user's point of view? If you have only a hammer, everything looks like a nail.

NEM In many ways, yes. If you want to keep simple configuration of name/value pairs, then you can do a simple [array get]/[array set] combo for the config file. However, if you want to store more complicated configuration data, with sections and different types of parameters, then this becomes more difficult. The simple array implementation is not going to work very well here. Moving to a full Tcl script which is sourced is a) dangerous, and b) offers a lot of scope for confusing the user, as they now may have to learn a full language to do a simple task (unless you limit the command set, and remove proc/rename/trace and probably others). The alternative middle-ground is to dream up your own configuration language, and parser to read it. You can manipulate Tcl's parser to do most of the hard work, but why bother when XML is ready designed with just these sorts of tasks in mind? I'd say XML is at least as easy to read as Tcl (bad tag structure ruins this, but then bad proc structure ruins Tcl code), plus if you think it's too much for the end user, you can just ship one of the many free or commercial XML editors with your application.

In my opinion, use Tcl arrays when you are storing a small amount of simple config data. Move to XML for larger more complicated configuration data. Don't use Tcl scripts for configuration, unless you need to do something really complicated (like allowing the user to define macros and such).

You could use Metakit instead of XML, and that has plenty of advantages (especially if you're already packaged as a starkit), although the file becomes binary, and there aren't so many editors for MK datafiles.

XML has the same problems as a Tcl script, in that a single error in the file will halt the parser with an error. However, if you are using a DOM parser (and not SAX) then there shouldn't have been any side-effects by then, IIRC.

WJR Here's how I use XML config files:

 <?xml version="1.0" encoding="UTF-8"?>
 <config>
     <start_time>20</start_time>
     <end_time>7</end_time>
     <zip_exe>d:/zip2.3/zip.exe</zip_exe>
     <zip_path>{some value}</zip_path>
     <parts_path>{some value}</parts_path>
 </config>

Then I use tDOM to parse and assign variables, like this:

 package require tdom

 # Open config file and read into a variable
 set config_file [open zip-service.xml r]
 set config [read $config_file]
 close $config_file

 # Parse the config and get all config elements
 set config_parse [dom parse $config]
 set params [$config_parse getElementsByTagName config]

 # Set the various config params
 set start_time [[$params selectNodes start_time/text()] data]
 set end_time [[$params selectNodes end_time/text()] data]
 set zip_exe [[$params selectNodes zip_exe/text()] data]
 set zip_path [[$params selectNodes zip_path/text()] data]
 set parts_path [[$params selectNodes parts_path/text()] data]

I find this technique works pretty well.


Category Deployment