Version 10 of Finding a sublist

Updated 2004-06-03 11:52:21 by jcw

if 0 {Richard Suchenwirth - 2004-06-04 - In the Tcl chatroom I saw a request for finding the position of a sublist in a list, so that

 lsubl {a b c a d e} {a d}

would return 3. A number of famous algorithms (only not to me) were mentioned: Knuth-Morris-Pratt, Rabin-Karp, Boyer-Moore-Gosper. I'm more the practical kind of guy, so I decided just to give it a try, like this:

  • Find all possible starting positions of the sublist
  • Check whether they comply; return the first to match, or -1 if none

Here's my code: }

 proc lsubl {list sublist} {
    set sl [llength $sublist]
    if {!$sl} {return 0}
    foreach i [lsearch -all $list [lindex $sublist 0]] {
        if {[lrange $list $i [expr {$i+$sl-1}]] eq $sublist} {
            return $i
        }
    }
    return -1
 }

#Tests:

 proc ? {c e} {catch $c r; if {$r ne $e} {puts "$c->$r, expected $e"}}

 ? {lsubl {a b c a d e} {a d}}  3
 ? {lsubl {a b c a d e} {a b}}  0
 ? {lsubl {a b c a d e} {a f}} -1
 ? {lsubl {a b c a d e} {}}     0 ;# empty list is everybody's sublist

if 0 {Could some algorithm-knowers comment, please? :)

FW: Hmm, algorithms for this task are probably so many many because it's pretty much the same idea as finding the index of a substring, which of course is very common. For your one-char items proc lsubs {list sublist} {string first [join $sublist ""] [join $list ""]} works. - RSever indeed - thanks! (But the question was of course for general list elements...)

03jun04 jcw - Here's a Boyer-Moore version:

  proc lsame {a b {o 0}} {
    set n [llength $a]
    while {[incr n -1] >= 0} {
      if {[lindex $a $n] ne [lindex $b [expr {$n+$o}]]} { return 0 }
    }
    return 1
  }

  proc lsubl {l s} {
    set n [llength $s]
    if {$n == 0} { return 0 }
    for {set i 0} {$i < $n} {incr i} {
      set o([lindex $s $i]) [expr {$n-$i-1}]
    }
    set i 0
    set m [llength $l]
    while {$i < $m} {
      set c [lindex $l $i]
      if {![info exists o($c)]} {
        incr i $n
      } elseif {$o($c) > 0} {
        incr i $o($c)
      } else {
        set k [expr {$i-$n+1}]
        if {[lsame $s $l $k]} { return $k }
        incr i
      }
    }
    return -1
  }

Arts and crafts of Tcl-Tk programming }