Version 5 of tailf

Updated 2004-02-17 12:21:08

## ********************************************************

 ##
 ## tail.tcl version 1.0
 ##
 ## provides facilities like the UNIX tail function, in Tcl.
 ##
 ## ********************************************************

 ;#barecode
 package provide tail 1.0
 ;#end

 ## ******************************************************** 
 ##
 ## Name: tail
 ##
 ## Description:
 ## A combination of tail -f and egrep.
 ## Loops forever on a file NAME.
 ## If the file doesn't exist it will wait for it to appear
 ## and then work on it.
 ##
 ## Parameters:
 ## file - a filename
 ## rx - a regular expression pattern to filter the lines on
 ##
 ## Usage:
 ##       tail filename [ regex pattern ] [ delay in ms ]
 ##
 ## If the file disappears or is replaced by a new file with the
 ## same name it is handled transparently.
 ##
 ## The behaviour of this function is based upon tail in the gnu
 ## fileutils alpha release 4.0 package.
 ##
 ## Comments:
 ## Only the filename argument is required.  If a regex pattern is
 ## given as the second argument, it will be used to filter the
 ## lines.
 ## auto-cancel if $var is deleted

 proc tail { file { rx .+ } { delay 2000 } { var "" } { stats "" } { fid "" } } {

      set inode {}
      set size  {}
      set mtime {}
      foreach { inode size mtime } $stats { break }

      if { [ string length $var ] > 0 } {
         if  { ! [ info exist $var ] } {
             catch { unset ${var}_tailId } err
             return
         }
     } else {
         ;## default var name 
         set var ::tail_$file
     }
     if  { ! [ regexp {^::} $var ] } {
             set var "::$var"
     }
      ;## if the file exists at this iteration, tail it
      if { [ file exists $file ] } {
         file stat $file fstat
         set _inode $fstat(ino)
         set _size  $fstat(size)
         set _mtime $fstat(mtime)

         ;## if the inode has changed since the last iteration,
         ;## reopen the file.  this is from tail v4.0 in the
         ;## gnu fileutils package.
         if { $_inode != $inode } {
            catch { close $fid; set fid {} }
         } else {
            if { $_size < $size } {
               catch { seek $fid 0 }
            }
            if { $_size == $size && $_mtime != $mtime } {
               catch { seek $fid 0 }
            }
         }

         ;## if the file is not open, open it!
         if { ! [ string length $fid ] } {
            set fid [ open $file r ]
            fconfigure $fid -blocking off
            fconfigure $fid -buffering line
         }

         set inode $_inode
         set size  $_size
         set mtime $_mtime

         ;## set a variable with the content of the
         ;## regex filtered line.
         ;## use a temp var to store variable
         ;## until all lines are read
         ;## then set the global var
         ;## so trace function is not called for every line
         set temp {}

         while { [ gets $fid line ] >= 0 } {
            if { [ regexp -- $rx $line match ] } {
               ;## put a trace on variable to
               ;## read the tail output.
               append temp "$line\n"
            }
         }

         ;## setting this will invoke trace function
         if { [ string length $temp ] } {
            set $var $temp
         }

      ;## if the file doesn't exist, make sure we aren't
      ;## creating an NFS orphan.
      } else {
         ;## maybe the file got nuked? Handle it!
         if { [ string length $fid ] } {
            catch { close $fid; set fid {} }
         }
      }

      ;## lather, rinse, repeat.  This is NOT recursion!
      set stats \{[ list $inode $size $mtime ]\}
      set ${var}_tailId [ after $delay \
        [ subst { tail ${file} $rx $delay ${var} $stats $fid } ] ]
 }
 ## ******************************************************** 

 ## ******************************************************** 
 ##
 ## Name: cancelTail
 ##
 ## Description:
 ## Removes the trce associated with am invocation of "tail"
 ## Parameters:
 ## file - filename being tailed
 ## cmd - command handling the trace
 ##
 ## Usage:
 ##  cancelTail filename [ command ]
 ##
 ## Comments:
 ## see the man page for "trace"

 proc cancelTail { file { var "" } { cmd "" } } {

     if  { [ catch {
         if  { ! [ string length $var ] } {
             set var ::tail_$file
             set aftervar ${var}_tailId 
         } else {
         set aftervar ${var}_tailId
         }
         catch { after cancel [ set $aftervar ] }
         if { ! [ string length $cmd ] } { 
            catch { set cmd [ eval lindex [ trace vinfo $var ] 1 ] } 
         }
         catch { trace vdelete $var w $cmd }
     } err ] } {
         return -code error $err
     }
 }
 ## ******************************************************** 

 ## ******************************************************** 
 ##
 ## Name: htmlFilter
 ##
 ## Description:
 ## Convenience function for cleaning up html lines
 ## Have your trace handler function call this to
 ## pretty-print html.
 ##
 ## Parameters:
 ##
 ## Usage:
 ##
 ## Comments:
 ## Very simpleminded.  Fairly quick.

 proc htmlFilter { text } {

      ;## table of escape characters
      array set esc {
      lt     <    gt     >    quot   \"   ob     \x7b  cb    \x7d
      nbsp   \xa0 iexcl  \xa1 cent   \xa2 pound  \xa3 curren \xa4
      yen    \xa5 brvbar \xa6 sect   \xa7 uml    \xa8 copy   \xa9
      ordf   \xaa laquo  \xab not    \xac shy    \xad reg    \xae
      hibar  \xaf deg    \xb0 plusmn \xb1 sup2   \xb2 sup3   \xb3
      acute  \xb4 micro  \xb5 para   \xb6 middot \xb7 cedil  \xb8
      sup1   \xb9 ordm   \xba raquo  \xbb frac14 \xbc frac12 \xbd
      frac34 \xbe iquest \xbf Agrave \xc0 Aacute \xc1 Acirc  \xc2
      Atilde \xc3 Auml   \xc4 Aring  \xc5 AElig  \xc6 Ccedil \xc7
      Egrave \xc8 Eacute \xc9 Ecirc  \xca Euml   \xcb Igrave \xcc
      Iacute \xcd Icirc  \xce Iuml   \xcf ETH    \xd0 Ntilde \xd1
      Ograve \xd2 Oacute \xd3 Ocirc  \xd4 Otilde \xd5 Ouml   \xd6
      times  \xd7 Oslash \xd8 Ugrave \xd9 Uacute \xda Ucirc  \xdb
      Uuml   \xdc Yacute \xdd THORN  \xde szlig  \xdf agrave \xe0
      aacute \xe1 acirc  \xe2 atilde \xe3 auml   \xe4 aring  \xe5
      aelig  \xe6 ccedil \xe7 egrave \xe8 eacute \xe9 ecirc  \xea
      euml   \xeb igrave \xec iacute \xed icirc  \xee iuml   \xef
      eth    \xf0 ntilde \xf1 ograve \xf2 oacute \xf3 ocirc  \xf4
      otilde \xf5 ouml   \xf6 divide \xf7 oslash \xf8 ugrave \xf9
      uacute \xfa ucirc  \xfb uuml   \xfc yacute \xfd thorn  \xfe
      yuml   \xff amp    &         #013        \n          
      }

      ;## special handler for list items
      regsub -all {<[Ll][Ii]>}          $text {  * } text
      ;## and for images
      regsub -all {<[Ii][Mm][Gg][^>]+>} $text {* }   text
      ;## all other tags just GO AWAY
      regsub -all {<[^>]+>}             $text {}     text
      ;## escape curlies properly
      regsub -all {\\\}}                $text \}     text
      regsub -all {\\\{}                $text \{     text
      ;## maybe we are rendering something with embedded
      ;## tcl code -- declaw it!
      regsub -all {\$}                  $text {\\$}  text
      regsub -all {\[}                  $text {\\[}  text
      regsub -all {\]}                  $text {\\]}  text
      ;## replace html escape sequences with literals
      regsub -all -nocase {&([0-9a-z#]*);} $text {$esc(\1)} text
      ;## this line causes the $esc() to be evaluated
      set text [ subst $text ]

      ;## and ship it back!
      return $text
 }
 ## ******************************************************** 

 ;#barecode
 ## ******************************************************** 
 ;## USAGE EXAMPLE ##
 ;## (filename is foo, callback "handle" provided):

 ;## uncomment the next four lines and just source this file
 ;## for a live demo.
 ;## proc handle { args } {
 ;##      puts -nonewline [ set [ lindex $args 0 ] ]
 ;##  }
 ;## set ::tail_foo {}
 ;## trace variable ::tail_foo w handle
 ;## tail foo
 ;## vwait enter-mainloop ;## or run in "wish".

 ;## now go ahead and start echo'ing stuff into a file
 ;## named "foo"; delete it, reopen it, truncate it...
 ;## tail will follow!
 ## ********************************************************

JMN Keeping the filehandle open while monitoring a logfile, as done above, can cause logfile-rotation problems on at least some systems. On windows, if a logging process tries to rename the monitored file to the next file in the rotation, it can fail. Rename on windows is apparently implemented as a copy and delete, and deletes won't occur if a file is open for reading. (e.g Windows 'ren' command fails to do even the copy, gnu cygwin 'mv' utility does the copy only.)

The logging process may then start writing to the original logfilename from the beginning - thus you lose an entire logfile worth of data. According to the comments in this perl example of a tail utility: http://www.thbz.org/ppt/tail ; closing & reopening files many times a second is a viable option. This would certainly reduce the chance of a tail utility disrupting log-rotation - but I assume there is still a small possibility of it happening.

The only solution I could think of was to create a copy of the monitored file when you start tailing, and store the new chunks as they arrive - then if you detect a file shrinkage, but the subsequent filename in the log-rotation is missing you write your received data to the file you put aside, and rename it to the next logfilename. (but assuming what rotation naming scheme? logfilename1.log ? logfilename.log1 ? time-based?) This still leaves the possibility of losing a few lines at the end of the log if the logging process does a rotation straight after writing some lines and your tail polling interval is large enough to miss them.

Assuming no control over the logging process, what is the standard way of dealing with this in a tail-like program?

Presumably it's really the responsibility of the logger to do better checking during its rotation - but I don't trust all loggers to behave correctly - and I've already found one that doesn't. I hope I'm missing something obvious.. or am just plain mistaken in my understanding of what's going on.. is there a truly 'safe' way to monitor logfiles?