Serializing things via file locks

  • NOTE: Dropped vwait to avoid creating nested event loops - could be block now for the given timeout...
  • The env(temp)-issue is NOT fixed yet, as I overlooked/forgot the discussion at the bottom of this page...
 ###############################################################################
 # Package: lock.tcl
 # Author:  M.Hoffmann 2005-2015
 # Purpose: Guarantee atomic operations using a LOCKFILE, wrapper functions
 # Remarks: For intERprocess coordination, each process must specify THE SAME
 #          LOCKFILE. It's up to the programer to choose a name.-  We can let
 #          the module determine a name only if used within one program. Take
 #          care of the timeout: it depends on the type of the locked code and
 #          number of processes competing for the lock. The default is to wait
 #          only 1/4s for a lock to become available. Specifying a timeout of 0
 #          means to wait forever for a lock, effectively blocking the calling
 #          app which makes not much sense.
 # History:
 #  v0.1 2005/12/26
 #   - First release.
 #  v0.2 2006/02/25
 #   - New procs timeout and lockfile superseed the corresponding vars which
 #     are no longer availabe - they are used to query (and optionaly set new)
 #     defaults.
 #   - With v0.1 it wasn't possible to permanently overwrite the built-in
 #     defaults. [withLock] and [incrCounter] ALWAYS used these defaults because
 #     they where arg-less.
 #   - Additional proc defaults which restores timeout and lockfile to the
 #     original state.
 #   - Reworked withLock: errorState and results are given back to the caller.
 #   - withLock and incrCounter accepts optional args to pass timeout and lock-
 #     file on to acquireLock.
 #   - Added username to lockfile (perhaps usefull in search for bugs).
 #   - Addes namespace export *.
 #   - New and redesigned testproc(s) - some of them never worked right...
 #  v0.3 2015/03/02
 #   - Don't vwait in acquire! Using lock from elog, this caused hard to
 #     diagnose issues if called elog from within eventhandlers, because of
 #     nested event loops! So, variable wait is no longer used.
 ###############################################################################

 package provide lock 0.3

 namespace eval lock {

    variable defto 250; # 250ms default time to acquire/use a lock; 0 = infinite
    variable curto $defto
    variable deflf [file join $::env(temp) [pid].lock]; # default lockfile
    variable curlf $deflf
  # variable wait

    namespace export *

    # Query (and optionally set) the default timeout. Returns (new) value.
    #  Attention: no arg-checking!
    proc timeout args {
       variable curto
       return [expr {[llength $args]==1?[set curto $args]:$curto}]
    }

    # Query (and optionally set) the default lockfile. Returns (new) value.
    #  Attention: no arg-checking!
    proc lockfile args {
       variable curlf
       return [expr {[llength $args]==1?[set curlf $args]:$curlf}]
    }

    # Restore the original defaults
    proc defaults {} {
       variable defto
       variable deflf
       variable curto $defto
       variable curlf $deflf
    }

    # Problem: defaults through arglists are static. But user can change the
    # defaults through procs. So special code is required.
    proc acquireLock args {
       variable curto
       variable curlf
       # these (private) values are for now not accessible from outside
       set timeout  [expr {[lindex $args 0]==""?$curto:[lindex $args 0]}]
       set lockfile [expr {[lindex $args 1]==""?$curlf:[lindex $args 1]}]
       set trystart [clock clicks -milliseconds]
       while {([clock clicks -milliseconds] - $trystart) < $timeout || $timeout == 0} {
          if {![catch {open $lockfile {WRONLY CREAT EXCL}} rc]} {
             puts  $rc "locked by pid [pid] from $::tcl_platform(user)\
                        at [clock format [clock seconds]]"
             flush $rc
             return [list $rc $lockfile]
          }
          after 10; # set lock::wait 1; vwait lock::wait
       }
       return -code error "Error: timeout after $timeout ms"
    }

    proc releaseLock {info} {
       if {![catch {fconfigure [lindex $info 0]}]} {
          if {![catch {
             close [lindex $info 0]
             file delete -force [lindex $info 1]
             # another OPEN ... EXCL only will succeed if the file is gone
          } rc]} {
             return ""
          } else {
             return -code error "Error releasing lockfile: '$rc'"
          }
       } else {
          return "invalid lock, ignored"
       }
    }

    # Abstraction layer upon acquire/release to perform code which is protected
    # from running in parallel through a lockfile. Usefull e.g. to update a file
    # which is in use by many systems. Optionally specify timeout and lockfile
    # which are passed through to acquireLock (eval is therefore required).
    # To handle errors in this routine, use catch {withLock {...}}
    proc withLock {code args} {
       if {![catch {eval acquireLock $args} rc]} {
          set urc [catch {uplevel 1 $code} ret]
          releaseLock $rc; # report errors the the caller, no CATCHing here
          return -code $urc $ret; # give original results back, including errors
       } else {
          return -code error $rc; # cannot get lock
       }
    }

    # Abstraction layer upon withLock, to maintain a file with a counter in it.
    # The counter is incremented with each access, to maintain a unique id.
    # Optionally specify timeout and lockfile which are passed over
    # wihtLock to acquireLock.
    # Let errors raise up to the main proc.
    proc incrCounter {ctrFile args} {
       set cmd {
          if {[file exists $ctrFile]} {
             set hCntr [open $ctrFile r]; # a+ not possible in VFS::MK4...
             set counter [gets $hCntr]
             incr counter
             close $hCntr
          } else {
             set counter 0
          }
          set hCntr [open $ctrFile w]; # a+ not possible in VFS::MK4...
          puts $hCntr $counter
          close $hCntr
       }
       eval withLock [list $cmd] $args; # notice eval and list are required!
       return $counter
    }
 }
 #------------------------------------------------------------------------------

A few quick'n'dirty Tests to prove the routines work right

 # lock_test0.tcl, 25.02.2006
 # basic test(suite)
 lappend auto_path [pwd]
 puts "result of \[package require lock 0.2\] should be 0.2:\t[package require lock 0.2]"
 puts "\[lock::timeout\] should give the default timeout right after loading:\t[lock::timeout]"
 puts "\[lock::lockfile\] should give the default lockfile right after loading:\n\t[lock::lockfile]"
 puts "calling \[lock::timeout 1000\] returnes:\t[lock::timeout 1000]"
 puts "calling \[lock::lockfile hugo.dat\] returns:\n\t[lock::lockfile hugo.dat]"
 puts "\[lock::timeout\] without args should now reflect the new value:\t[lock::timeout]"
 puts "\[lock::lockfile\] without args should now reflect the new value:\n\t[lock::lockfile]"
 puts "\[lock::defaults\] should restore the original defaults (gives nothing):\t[lock::defaults]"
 puts "\[lock::timeout\] now gives back:\t[lock::timeout]"
 puts "\[lock::lockfile\] now gives back:\n\t[lock::lockfile]"
 puts "Performing \[namespace import lock::*\]:\t[namespace import lock::*]"
 puts "1st \[acquireLock]\ returns:\n\t[set lockHandle [acquireLock]]"
 puts "2nd \[acquireLock]\ withOut releasing the lock (correctly) returns an error:"
 catch {set lockHandle [acquireLock]} rc
 puts $rc
 puts "1st \[releaseLock $lockHandle]\ returns:\t[releaseLock $lockHandle]"
 puts "2nd \[releaseLock $lockHandle]\ returns:\t[releaseLock $lockHandle]"
 puts "\[acquireLock 2000]\ returns:\n\t[set lockHandle [acquireLock 2000]]"
 puts "\[releaseLock $lockHandle]\ returns:\t[releaseLock $lockHandle]"
 puts "\[acquireLock 2000 lock.file]\ returns:\n\t[set lockHandle [acquireLock 2000 lock.file]]"
 puts "\[releaseLock $lockHandle]\ returns:\t[releaseLock $lockHandle]"
 puts -nonewline "\[withLock {puts \[lock::timeout\]}\] returns:\t"
 puts -nonewline "[withLock {puts [lock::timeout]}]"
 puts "calling \[lock::timeout 500\] returnes:\t[lock::timeout 500]"
 puts -nonewline "Now \[withLock {puts \[lock::timeout\]}\] returns:\t"
 puts -nonewline "[withLock {puts [lock::timeout]}]"
 puts -nonewline "\[withLock {puts \[lock::timeout\]} 2000\] returns:\t"
 puts -nonewline "[withLock {puts [lock::timeout]}]"
 puts "(global Default is not affected, 2000 is just an overwrite for this call an cannot be queried)"
 puts "Performing \[incrCounter ctr1.dat\] three times:"
 puts "[incrCounter ctr1.dat] [incrCounter ctr1.dat] [incrCounter ctr1.dat]"
 puts "Performing \[incrCounter ctr1.dat 5000\] returns:\t[incrCounter ctr1.dat 5000]"
 puts "Performing \[incrCounter ctr1.dat 5000 lock.file\] returns:\t[incrCounter ctr1.dat 5000 lock.file]"
 file delete ctr1.dat

 # lock_test1.tcl, 25.02.2006
 # starting 10 independet process in the background
 catch {file delete ./counter1.dat}
 catch {file delete ./lockfile.shr}
 set start [expr {[clock seconds]+10}]
 for {set i 1} {$i < 11} {incr i} {
     puts $i
     exec -- [auto_execok tclsh] lock_test1sub.tcl $start &
 }
 puts "soon starting processes, please stand by..."
 puts "after all processes have finished, press RETURN to see the cmd prompt (MS-Win)"
 puts "The counter should be: 99"

 # lock_test1sub.tcl, 25.02.2006
 lappend auto_path [pwd]
 package require lock 0.2
 # make shure all process are starting in the same moment
 while {[clock seconds] < "$argv"} {}
 for {set i 1} {$i < 11} {incr i} {
     # incrCounter uses withLock internally
     puts "$i [pid]:=[lock::incrCounter ./counter1.dat 500 ./lockfile.shr]"
 }

 # lock_test3.tcl, 25.02.2006

 lappend auto_path [pwd]
 package require lock 0.2

 # use a different lockfile and timeout-value
 set t [lock::acquireLock 5000 ./test.lock]
 puts "t:=$t"
 after 8000 {lock::releaseLock $t}
 after 8500 {set ready 1}
 set u [lock::acquireLock 10000 ./test.lock]
 puts "u:$u"
 lock::releaseLock $u

 vwait ready

 # lock_test4.tcl, 25.02.2006

 lappend auto_path [pwd]
 package require lock 0.2

 set h [lock::acquireLock]
 puts ---$h
 after 600 [list lock::releaseLock "$h"]
 while 1 {
    if {![catch {lock::acquireLock 200} h]} {
       set x $h
       after 2000 {lock::releaseLock "$x"}
    }
    puts -+-$h
    after 50
    update
 }

And now, forget about all the complicated test stuff above. All you ever need is almost always the following:

 package require lock
 lock::withLock {
    #
    # put the code for which you want to assure that it only runs at one single time here
    #
 } msec lockfile

Where

  • msec is the period that a second instance that want to perform the same code waits for the lock to be released.
  • lockfile is the file used to be as a flag (you need write access to the directory, of course).

Use catch if required around the lock::withLock, as it delivers back what your code returns.

That's it!


ulis, 2005-12-23. As I said in Semaphores, 2003-05-31, this is THE portable manner to lock. Almost all OSes ensure that open $lockfile {WRONLY CREAT EXCL} is an atomic operation based on a Test and Set operation.

See How do I manage lock files in a cross platform manner in Tcl for the NFS exception.

When available, system semaphores are more efficient.


JMN, 2007-10-24. On FreeBSD systems at least, there is by default no TEMP, TEMPDIR or TMP environment variable. Presumably in this case the code should choose the default /tmp

As it stands the 0.2 package fails with 'can't read "::env(temp)": no such variable'.

tcllib's fileutil package seems to 'do the right thing' with fileutil::TempDir - but I don't know if this package should depend on tcllib or not so I'll leave it for the author to adjust. (or perhaps the lock package would be better part of fileutil anyway..)

MHo: I will add your suggestions to the code as soon as possible and make a 0.3 version out of it. What I also should do is a "official" test-suite. MHo 2008-03-01: Just took a look at fileutil::TempDir, and what I noticed is the same mistake I saw in some other modules: On windows, c:/temp etc. is used. But in practice, nearly every real-word environment is a multi user environment. So each user has her or his own temp dir, which typically resides somewhere in the profile (.../local settings/temp). On the other hand, if a process is running without a logged on user or as a system service (daemon), it uses the global temp folder, which typically is located in c:/windows/temp... Anyway, each program sees a environment var temp on windows, which always contains the right entry - so why use someting else on windows?

LV Remember after coming to a conclusion in the discussion to drop a feature enhancement or bug report at http://tcllib.sf.net/ so that the maintainers can consider your recommendations.