Version 25 of incrfilter

Updated 2008-08-04 09:00:13 by lars_h

Purpose

Show a tcl prototype that provides full text searching of the contents of one or more files. Files can be specified on the command line or on stdin.

The program requires either one or more filenames, or the word "stdin" (which causes it to read the stdin file descriptor), reads the files specifies, takes the text entered in the small text widget at the top of the IncrFilter window, and searches for the lines that contain the text. The gimmick is that the search is done as the user types each character. The second, context, window, is used to display the context of the match. Click on a specific match line in the IncrFilter display and the surrounding text will be displayed.

Usage

The program accepts several flags.

Usage: incrfilter -v -l +l {-p list} {-f filter} file1 [file2 ...] [-]

-fallows one to specify a filter program as a second argument, which is then executed with the current file on its stdin and incrfilter reads the resulting stdout.
-pindicates that searches should return a paragraph and not just a line
-lfile mode - return the name of the file that contains the hit
+lline mode - return the line that contains the hit (default)
-read data from stdin
-vverbose mode - currently only displays the file being loaded

History of the program

The following code was written back in 1998 by JCW and Alexandre Ferrieux based on comments and feedback by Larry Virden. Further optimizations would be appreciated.

The idea behind this was this - in the olden days (back when I was a boy...) the people who had released Visicalc had a product (whose name escaped me) where searches took place in real time. The more data you added to the database, the faster you got to the data, because as you typed, the app would read more and more data as it spun under the extremely slow disk head, and by noting sectors which matched the keystrokes entered to date, could find the hit faster than waiting until the entire search term was entered and then searching all the sectors.

Features needed include:

  • more error handling - like if a file doesn't exist, if no data files are supplied and stdin is a keyboard, etc.
  • more docs on what the program does.
  • make use of Ttk
  • make the context and search box part of one frame, but allow one or the other to be torn off if the user wants to make use of two monitors, etc.
  • buttons to set and unset the verbose mode (and the other arguments that can be set on the command line)

Bugs include:

  • the program filters out too much at times. An example is this - I browse a specific file. I know the file contains 61 lines containing the string 'en, '. The program is only displaying 9 hits.

Code


 #! /bin/sh
 #untcl\
 exec tclsh $0 -- "$@"

 # --------------------- History --------------------------------
 # Authors: Jean-Claude Wippler, Alexandre Ferrieux, Larry Virden
 # Purpose: return one or more lines that match string typed into entry widget
 # default returns the line where hit occurs
 # application expects you to provide either one or more files on the 
 # command line or data via stdin.

 # Usage: incrfilter -v -l +l {-p <list>} {-f filter} file1 [file2 ...] -
 # -f allows one to specify a filter program as a second argument
 # -p indicates that searches should return a paragraph and not just a line
 # -l file mode - return the name of the file that contains the hit
 # +l line mode - return the line that contains the hit (default)
 # -  read data from stdin
 # -v verbose mode

 # v0.00         --- (Flat multi-grep.)

 #       Added interruptible search

 # v0.05         ---

 #       Added context window, highlighting, and +/-l,-f modes

 # v0.10         ---

 #       Replaced  'string  match' by 'string  first'. Faster  !  (bad
 #      perf with regexp; this delays JC's \<)
 #       Moved -f's filter to  visualization only.  (So we  match  raw
 #      data only).Faster ! (esp. useful with "nroff -man | col -b")

 # v0.15         ---

 #       Added bottom status label. Improved visual feedback.
 #       Added key bindings to both windows to help navigation

 # v0.16         ---

 #       Added color highlighting
 #       Added paragraph (-p {list of strings}) mode

 # v0.165        ---

 #       Added internal expansion of wildcards (for Windows)
 #       Added  <Control-x><Control-f> to interactively open new files
 #      on platforms with no command line arguments (MacOS).

 # v0.17  

 # v0.18         --- 
 #      Prepared for dropping into Tcler's Wiki

 # v0.19         --- (current)
 #      Added default switch settings and worked on stdin support

 # v0.20
 # added Tk dependency and defined control-u to clear the text in the search term entry widget


 # -------------------- init ---------------------------------------------

 set lmode line
 set master {}
 set idx 0

 set running 0
 set ent {}

 # -------------------- GUI setup ----------------------------------------

 set lab "Loading"

 package require Tk
 wm title . IncrFilter

 entry .e -textvariable ent
 pack  .e -side top -fill x

 label .l -textvariable lab -justify left -font [.e cget -font]
 pack  .l -side bottom -anchor w

 frame .f
 pack  .f -side bottom -fill both -expand yes

 text .f.t -yscrollcommand ".f.s set" -setgrid 1 -width 80 -state disabled
 scrollbar .f.s -command ".f.t yview"

 pack .f.s -side right -fill y
 pack .f.t -side left -expand 1 -fill both

 focus .e
 bind  .e <Control-x><Control-c> exit
 bind  .e <Up> {moveline -1}
 bind  .e <Down> {moveline 1}
 bind  .e <Control-u> {set ent [list ]}

 bind  .e <Control-x><Control-f> addfile

 .f.t tag configure hilite -background black -foreground white

 bind .f.t <Button-1> {clickline [lindex [split [.f.t index "@%x,%y linestart"] .] 0];break}
 bind .f.t <B1-Motion> break

 # --- early util ---

 proc ssplitN {s tt} {
        set l {}
        while {[string length $s]} {
                set z $s
                set n 0
                foreach t $tt {
                        set ii [string first $t $z]
                        if {$ii<0} continue
                        set z [string range $z 0 [expr {$ii-1}]]
                        set n [string length $t]
                }
                lappend l $z
                set s [string range $s [expr {[string length $z]+$n}] end]
        }
        return $l
 }

 proc ssplit1 {s t} {
        set n [string length $t]
        set l {}
        while {1} {
                set ii [string first $t $s]
                if {$ii<0} {
                        lappend l $s
                        break
                }
                lappend l [string range $s 0 [expr {$ii-1}]]
                set s [string range $s [expr {$ii+$n}] end]
        }
        return $l
 }

 proc ssplit {s tt} {
        if {[llength $tt]==1} {ssplit1 $s [lindex $tt 0]} else {ssplitN $s $tt}
 }

 proc scopesplit {s mo} {
        switch -glob -- $mo {
         line {split $s \n}
         file {list $s}
         para:* {ssplit $s [string range $mo 5 end]}
         default { ; }
        }
 }

 proc swallowfile {f} {
        global idx verbose lab master lmode para filter tab


        if {[regexp -- {[\[*?]} $f]} {
                foreach f2 [glob $f] {
                        swallowfile $f2
                }
                return
        }

        incr idx

        if {$verbose} {puts stderr " (loading $f)"}
        if {![expr {($idx-1) % 10}]} {
                set lab "Loading $f"
                update
        }

        if { $f != "stdin" } {
                set ff [open $f "r"]
        } else {
                set ff "stdin"
        }

        lappend master $idx
        if {$lmode=="para"} {append lmode :[subst -nocommands -novariables $para]}
        set tab($idx,lmode) $lmode
        set tab($idx,file) $f
        set tab($idx,filter) $filter

        set tab($idx,data) [scopesplit [read $ff] $lmode]
        if { $f != "stdin" } {
                close $ff
        }
 }

 proc modalentry {window title prompt destvar} {
        global entryvar
        global isok
        upvar $destvar dest

        catch {destroy $window}
        set oldf [focus]
        set isok 2

        toplevel $window
        raise $window
        wm title $window $title
        label $window.l -text $prompt -anchor w

        append dest ""
        set entryvar $dest
        set okcmd "set isok 1"
        set cancelcmd "set isok 0"
        entry $window.e -width 40 -relief sunken -bd 2 -textvariable entryvar
        $window.e selection range 0 end
        bind $window.e <Return>    "$okcmd;break"
        bind $window.e <Escape>    "$cancelcmd;break"
        bind $window.e <Control-g> "$cancelcmd;break"
        button $window.ok -text OK -command $okcmd
        button $window.cancel -text Cancel -command $cancelcmd
        pack $window.l $window.e $window.ok $window.cancel -side left -pady 5 -padx 5
        update idletasks
        focus $window.e
        grab set $window
        tkwait variable isok
        grab release $window
        focus $oldf

        destroy $window

        if {$isok} {set dest $entryvar}

        return $isok
 }

 proc alert s {
        tk_dialog .a Arrgh $s "" 0 Okay!
 }

 proc modedialog f {
        global lmode para

        set e "+l"
        while {1} {
                if {![modalentry .d "Mode?" "Mode for \"[file tail $f]\":" e]} {
                        return 0
                }
                switch -glob -- $e {
                 +l {set lmode line}
                 -l {set lmode file}
                 -p* {set lmode para;set para [lindex $e 1]}
                 default {alert "Bad mode: $e";continue}
                }
                return 1
        }
 }

 proc addfile {} {
        set f [tk_getOpenFile -title {Open File:}]
        if {$f==""} return
        if {[modedialog $f]} {
                swallowfile $f
                recalc
        }
 }


 # --------------------- opts and args --------------------------------

 proc usage {} {puts stderr {Usage: incrfilter [-v] [+l|-l|-p <list of strings>] [-f <filter>] [<file>] [-]};exit 1}

 set para ""
 set filter ""
 set verbose 0

 set want ""
 set f "stdin"
 foreach a $argv {
        if {$want!=""} {
                set [lindex $want 0] $a
                set want [lrange $want 1 end]
                continue
        }
        switch -glob -- $a {
         -f {set want filter;continue}
         -p {set want para;set lmode para;continue}
         -v {set verbose 1;continue}
         -l {set lmode file;continue}
         +l {set lmode line;continue}
         -  {set f stdin}
         -* {usage}
         default  {set f $a}
        }
 }
 swallowfile $f

 switch -- [llength $master] {
        0 {set tit IncrFilter}
        1 {set tit $tab(1,file)}
        default {set tit "$tab(1,file) ..."}
 }
 wm title . "IncrFilter $tit"


 # --------------------- interaction ---------------------------------------

 proc recalc {} {
        global ent result lab

        set lab "Filtering..."
        update
        if {![applyfilter $ent]} return
        .f.t configure -state normal
        .f.t delete 1.0 end
        .f.t insert end [join $result \n]
        .f.t configure -state disabled
 }

 proc tracer args {
        global running interrupt

        if {$running} {set interrupt 1} else {after idle recalc}
 }

 trace variable ent w tracer

 proc allmatches {s pat} {
        set len [string length $pat]
        set l {}
        set off 0
        while {1} {
                set res [string first $pat $s]
                if {$res<0} break
                lappend l [expr {$off+$res}] [expr {$off+$res+$len}]
                set delta [expr {$res+$len}]
                set s [string range $s $delta end]
                incr off $delta
        }
        return $l
 }

 set oldidx nope

 proc showctxwin {idx l pat} {
        global tab oldidx

        if {![winfo exists .ctx]} {
                toplevel .ctx
                if {!$idx} {lower .ctx .}

                text .ctx.t -yscrollcommand ".ctx.s set" -setgrid 1 \
                                -width 80 -state disabled
                scrollbar .ctx.s -command ".ctx.t yview"
                pack .ctx.s -side right -fill y
                pack .ctx.t -side left -expand 1 -fill both

                foreach i {1 2 3 4 5} c {red blue green brown black} {
                        .ctx.t tag configure hilite$i -background $c -foreground white
                }

                bind .ctx <Control-x><Control-c> exit
                bind .ctx <Up>         {.ctx.t yview scroll -1 units}
                bind .ctx <Down>       {.ctx.t yview scroll  1 units}
                bind .ctx <Prior>      {.ctx.t yview scroll -1 pages}
                bind .ctx <Meta-v>     {.ctx.t yview scroll -1 pages}
                bind .ctx <Next>       {.ctx.t yview scroll  1 pages}
                bind .ctx <Control-v>  {.ctx.t yview scroll  1 pages}
        }

        .ctx.t configure -state normal
        if {$oldidx!=$idx} {
                set oldidx $idx
                .ctx.t delete 1.0 end
                set reload 1
        } else {
                foreach t [.ctx.t tag names] {
                        .ctx.t tag remove $t 1.0 end
                }
                set reload 0
        }
        .ctx.t configure -state disabled
        wm title .ctx Context
        if {!$idx} return
        wm title .ctx "Context in $tab($idx,file)"
        set lmode $tab($idx,lmode)
        .ctx.t configure -state normal

        if {$tab($idx,filter)==""} {
                set data $tab($idx,data)
        } else {
                update
                set raw [exec sh -c \
                        "$tab($idx,filter);exit 0" \
                         < $tab($idx,file) 2>@ stderr]
                set data [scopesplit $raw $lmode]
        }

        if {$reload} {
                .ctx.t insert end [join $data \n]
        }

        switch -glob -- $lmode {
         line {
                set ii $l.0
                set xx [lindex $data [expr {$l-1}]]
         }
         file {
                set ii 1.0
                set xx [lindex $data 0]
         }
         para:* {
                set ss [string length [join [lrange $data 0 [expr {$l-2}]] \n]]
                if {$l>1} {incr ss}
                set ii [.ctx.t index "1.0 +${ss}c"]
                set xx [lindex $data [expr {$l-1}]]
         }
         default { ; }
        }

        set i1 ""
        set ci 0
        foreach pp $pat {
                if {$ci<5} {incr ci}
                foreach {i1 i2} [allmatches $xx $pp] {
                        .ctx.t tag add hilite "$ii + ${i1}c" "$ii + ${i2}c"
                        .ctx.t tag add hilite$ci "$ii + ${i1}c" "$ii + ${i2}c"
                }
        }
        .ctx.t configure -state disabled
        if {[catch {
                .ctx.t see [lindex [.ctx.t tag nextrange hilite $ii] 0]
        }]} {
                .ctx.t tag add hilite5 $ii "$ii + 1l"
                .ctx.t see $ii
        }
 }

 proc clickline l {
        global result2 ent clickedline

        set clickedline $l
        .f.t tag remove hilite 1.0 end
        if {!$l} {showctxwin 0 0 {};return}
        set ab [lindex $result2 [expr {$l-1}]]
        .f.t tag add hilite $l.0 "$l.0 + 1l"
        update
        showctxwin [lindex $ab 0] [lindex $ab 1] $ent
        .f.t see $clickedline.0
 }

 proc moveline off {
        global result clickedline

        incr clickedline $off
        if {$clickedline<1} {set clickedline 1}
        set nn [llength $result]
        if {$clickedline>$nn} {set clickedline $nn}
        clickline $clickedline
 }

 # ----------------------- real meat --------------------------------------

 proc applyfilter {f} {

        global running interrupt result result2 master tab lab

        set lmas [llength $master]
        set result {}
        set result2 {}
        set tot 0
        set cnt 0
        set t [lindex [time {
           foreach idx $master {
                set lmode $tab($idx,lmode)
                if {$lmas>1} {set pref "$tab($idx,file):\t"} else {set pref ""}
                set lnum 0
                foreach x $tab($idx,data) {
                        incr tot
                        incr lnum
                        if {![expr {$tot % 200}]} {
                                set running 1
                                set interrupt 0
                                update
                                set running 0
                                if {$interrupt} {
                                        after idle recalc
                                        return 0
                                }
                        }
                        set ok 1
                        #set x [string tolower $x]
                        foreach pat $f {
                                if {0>[string first $pat $x]} {set ok 0;break}
                        }
                        if {$ok} {
                                incr cnt
                                switch -glob -- $lmode {
                                 para:* {
                                        lappend result "${pref}($lnum)"
                                        lappend result2 "$idx $lnum"
                                 }
                                 line {
                                        lappend result "$pref$x"
                                        lappend result2 "$idx $lnum"
                                 }
                                 file {
                                        lappend result "$tab($idx,file)"
                                        lappend result2 "$idx 0"
                                 }
                                 default { ; }
                                }
                        }
                }
           }
        }] 0]

        set lab "$cnt items. ([expr {$t/1000}] ms)"

        clickline 0
        return 1
 }

 # ------------------------ Go ! ------------------------------------------

 recalc