Version 17 of Simple Text Widget Sort

Updated 2006-02-06 23:30:39 by escargo

WJG (05/Feb/2006] I've been working on wordlists recently and needed a simple sorter. Here's what I came up with.

 #---------------
 # sort.tcl
 #---------------
 # Created by William J Giddings, 2006.
 #
 # Sort list of words contained within a text widget.
 #
 # Description:
 # -----------
 # Copy the words to a tcl list, sort them and the
 # re-insert into original text.
 #
 # Usage:
 # -----
 # See demo proc for example 
 #---------------

 # set some test switches
 set DEMO(sort) yes

 #----------------
 # sort text widget wordlist
 #----------------
 proc sort {{w .txt}} {
  set str [$w get 1.0 end]
  if {$str==""} { return}

  # get the size of the list
  foreach {last tmp} [split [$w index end-1c] .] {}

  # build a tcl list of the lines of text
  set lst {} 

  for {set i 1} {$i<=$last} {incr i} {
    set str [$w get $i.0 "$i.0 lineend"]
    # trim away unecessary whitespace
    set str [string trim $str]
    if {$str==""} {continue}
    set lst [lappend lst $str]
  }

  # sort it
  set lst [lsort -dictionary $lst]

  # insert it back into the widget
  $w delete 1.0 end
  set i 0
  for {set j 0} {$j<[llength $lst]} {incr j} {
    $w insert end [lindex $lst $j]\n
  }
  # trim away the last \n
  $w delete "$j.0 lineend" end

 }

 #----------------
 # demo 
 #----------------
 proc demo {} { 
  pack [text .txt] -fill both -expand 1
  .txt insert end "Zebra\nWildebeast\nParrot\nHamster\nBear\nAardvark"
  sort
 }

 if {$DEMO(sort)} {demo}

WJP (2006-02-05) Here's a version that is a bit simpler and more compact:

 proc sort {{w .txt}} {
    set str [$w get 1.0 end]
    if {$str==""} { return}
    $w delete 1.0 end
    foreach l [lsort -dictionary [split $str "\n"]] {
        if {$l != ""} {
            $w insert end [string trim $l]\n
        }
    }
 }

and here's a version that generates a sorted set (in which the entries are unique):

 proc UniqueSort {{w .txt}} {
    set str [$w get 1.0 end]
    if {$str==""} { return}
    $w delete 1.0 end
    set Previous "";
    foreach l [lsort -dictionary [split $str "\n"]] {
        set l [string trim $l]
        if {($l != "") && ($l != $Previous)} {
            $w insert end $l\n
        }
        set Previous $l;
    }
 }

Feb 6 2006 - MG offers an even simpler version of WJP's, which doesn't use the foreach (and runs, on my computer, a whole 16 microseconds faster! ;)

 proc sort2 {{w .txt}} {
    set str [$w get 1.0 end]
    if {$str==""} { return}
    $w delete 1.0 end
    $w insert end [string trim [join [lsort -dictionary [split $str \n]] \n] \n]
 }

2006-02-06 - WJP Nice, but the functionality isn't identical. My version trims each line, whereas MG's trims extra whitespace only from the beginning of the first line and end of the last. I would think that much of the time trimming each entry would be unnecessary, but it seems to be part of WJG's requirements.

MG Actually, because of the way it sorts, all the empty lines are at the beginning of the string (so the [string trim] removes them all; you could actually use [string trimleft] instead, and probably make it quicker still) - or at least that's the case when using WJG's example data with some extra empty lines thrown in. There may be some cases where something will sort before empty lines, though I don't believe so. So it should still have the effect that all empty lines are omitted.

RS: Just for simplicity, the test if {$str == ""} return can also be dropped - if the text is empty, neither deleting nor sorting will do it any harm... and lsort has a -unique switch that could be used if wanted.

WJP: Ah, I hadn't noticed "-unique". But I still think that MG's version doesn't do exactly what mine does. He's right about the blank lines, but what if there is extra whitespace at the beginning or end of non-blank lines? In that case, you've got to trim them individually. You can see this if you try it on input like this:

 set data [join [list pig dog cow {  cow  } {  dog  } {dog }] "\n"]

Of course, if you've got that kind of input you need to trim each line BEFORE the sort (as WJG does) so that the whitespace won't affect the sort, e.g.:

 proc sort3 {{w .txt}} {
    set str [$w get 1.0 end]
    $w delete 1.0 end
    foreach l [split $str "\n"] {
      lappend clean [string trim $l]
    }
     $w insert end [join [lsort -dictionary $clean] "\n"]
 }

Category Example