Version 6 of AutoCorrect

Updated 2003-06-01 22:46:19

A MS Word-like AutoCorrect feature implementation.

Here is a story you don't hear every day. You know that AutoCorrect feature in MS Word? You type 'tihs' and Word replaces it with 'this'. I really love that feature. Not so much to correct my typos, but as a shorthand tool. I type 'ill go,w or wo u' and it automatically expands to 'I'll go, with or without you'. The part of the story that you don't hear every day is that I am so addicted to it and have used it for so long that I've built up a 10,000-entry list. Writing in Word, I shorthand all the time.

So I tried to implement it in a Tk app. I load the entire list from a SQLite database into a Tcl array at startup. Every time I hit space or a punctuation key, the app looks up the last "word" just typed in the array and, if it is found, deletes that word and inserts its counterpart.

I still need to implement automatic capitalization at the beginning of sentences and some method to undo the autocorrection. Ctrl+Z will not yield the expected(?) result. Apart from that, it is a perfect AutoCorrect feature, ready to be implemented in any Tcl/Tk-based text editor.
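For the first of those missing pieces, here is one possible approach, offered only as a rough sketch and not as part of the script below. The proc name 'capitalizeIfSentenceStart' and its two parameters are hypothetical: 'myBefore' is assumed to be the text that precedes the word on the line, and 'myWord' the replacement about to be inserted. It treats an empty line, or a '.', '!' or '?' followed only by whitespace, as the start of a sentence, so abbreviations such as 'e.g.' will fool it.

```tcl
# Hypothetical helper, not part of the original script.
# myBefore: text preceding the word; myWord: the word about to be inserted.
proc capitalizeIfSentenceStart {myBefore myWord} {
    # Sentence start: nothing but whitespace before the word, or a
    # sentence-ending mark followed only by whitespace.
    if {[regexp {(^|[.!?])\s*$} $myBefore]} {
        # Upper-case the first character only; leave the rest untouched.
        return [string toupper $myWord 0 0]
    }
    return $myWord
}
```

In the widget, 'myBefore' could be taken from the text between "insert linestart" and the beginning of the word being replaced.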

Thanks to Michael A. Cleverly for very useful hints.

# First, you must have this text widget: $w.textframe.texto1

# Now, the binding. We want to launch 'autocorrect' every time we hit the space
# and/or punctuation keys. These four lines will launch the proc whenever
# the last key pressed (%K) is found in the 'myACkeys' list:

 set myACkeys        {space period comma colon semicolon question exclam slash backslash less greater equal asterisk plus minus parenleft parenright bracketleft bracketright braceleft braceright quotedbl quoteright}
 foreach key $myACkeys        {
         bind $w.textframe.texto1 <$key> { autocorrect }
 }

# Note: these key names were obtained on Windows. They may vary on other platforms.

# I have a database with two columns: 'type' and 'replace'. Let's load them into
# an array called 'myAClist':

         set myQuery {select type,replace from autocorrect}
         sq eval $myQuery {}        { array set myAClist [ list $type $replace ] }
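To try the rest of the script without a database, the array can also be filled by hand. This is only a minimal sketch: the variable 'mySamplePairs' is hypothetical, the single-word entries are taken from the examples in the introduction, and the two-word entry 'wo u' is made up to show that a 'type' string may contain spaces.

```tcl
# Hypothetical sample data; the real list comes from the database.
set mySamplePairs {
    tihs   this
    w      with
    wo     without
    {wo u} {without you}
}

# array set takes a flat list of key/value pairs.
array set myAClist $mySamplePairs
```

Run at the top level, this creates the same global 'myAClist' array that the 'autocorrect' proc reads.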

# Now the autocorrect proc

 proc        autocorrect        {}        {
 global w myAClist

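# get the last 40 characters every time you type, like a "trail"
# 40 is, of course, an arbitrary limit
# only the part on the current line is kept, because the -line switch used
# below would otherwise let the RE match at the end of an earlier line

         set myTrail [ $w.textframe.texto1 get "insert -40c" insert ]
         set myTrail [ lindex [ split $myTrail \n ] end ]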

# myTypeString is a regular expression to get the last word
# its value may change along the script

         set myTypeString {[^,.;: ]+}

# The loop. Here is what this loop does:
# Get the last word in the "trail". If the last word ('type' string) is found,
# replace it with the 'replace' counterpart. If it is not found, get the TWO
# last words and search for the two-word string in the array. If it is not found,
# get the THREE last words and search again. It can go on forever, but I
# set the limit to 10 words. More than that is very unlikely to be used and
# might make everything run too slow. My actual application uses only 7.

         for        { set myIteration 1 } { $myIteration <= 10 } { incr myIteration }        {
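# If the trail does not end in that many words, stop here: myLastWord
# would otherwise be unset (first iteration) or left over from the last one

                 if        { ! [ regexp -line "($myTypeString)\$" $myTrail => myLastWord ] }        { break }
                 set myLastWordWipeSize [ string length $myLastWord ]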

# Note that at this time, the RE is [^,.;: ]+

                 if        { [ info exists myAClist($myLastWord) ] }                {
                         $w.textframe.texto1 delete "insert -$myLastWordWipeSize c" insert
                         $w.textframe.texto1 insert insert "$myAClist($myLastWord)" 

# If the 'type' string is found, that's enough, so break the loop

                         break
                 }

# What if what I just typed is a 'type' string, but I typed it in CAPITALS? It won't
# be found in the array. Not unless we repeat the previous operation, but slightly
# different:

                 if        { [ info exists myAClist([ string tolower $myLastWord ]) ] }                {
                         $w.textframe.texto1 delete "insert -$myLastWordWipeSize c" insert
                         $w.textframe.texto1 insert insert [ string toupper $myAClist([ string tolower $myLastWord ]) ] 
                         break
                 }

# Good. Now, if we type 'TIHS', it will be replaced with 'THIS'.

# Our single-word 'type' string is not found in the array. Now what? The loop
# will be run again, of course. This time, let's look for [^,.;: ]+ [^,.;: ]+
# i.e. the last two words. If it's not found either, the next iteration will look for
# [^,.;: ]+ [^,.;: ]+ [^,.;: ]+

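# Add ONE more word to the RE. Joining myTypeString to itself would double it
# on every pass (1, 2, 4, 8... words) instead of growing it by one word

                 append myTypeString { [^,.;: ]+}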
         }
 }
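The lookup loop can also be tried outside a text widget. The sketch below is not the proc above, only a Tk-free variation on it under a few assumptions: the proc name 'expandTrail' and its parameters are hypothetical, the trail is passed in as a single-line string, and the array is passed by name. It returns the trail with its last one to 'myMaxWords' words replaced, or the trail unchanged when nothing matches.

```tcl
# Hypothetical helper for testing the lookup without Tk.
proc expandTrail {myTrail myArrayName {myMaxWords 10}} {
    upvar 1 $myArrayName myAClist
    set myTypeString {[^,.;: ]+}
    for {set myIteration 1} {$myIteration <= $myMaxWords} {incr myIteration} {
        # Stop as soon as the trail does not end in that many words.
        if {![regexp -line "($myTypeString)\$" $myTrail => myLastWord]} {
            break
        }
        # Everything in the trail that comes before the matched words.
        set myKeep [string range $myTrail 0 end-[string length $myLastWord]]
        # Exact match first.
        if {[info exists myAClist($myLastWord)]} {
            return $myKeep$myAClist($myLastWord)
        }
        # Then the CAPITALS case: look up in lower case, answer in upper case.
        set myLower [string tolower $myLastWord]
        if {[info exists myAClist($myLower)]} {
            return $myKeep[string toupper $myAClist($myLower)]
        }
        # Try again with one more word.
        append myTypeString { [^,.;: ]+}
    }
    return $myTrail
}
```

For example, with myAClist(tihs) set to 'this', the call expandTrail "see tihs" myAClist returns 'see this', and expandTrail "I said TIHS" myAClist returns 'I said THIS'. Note that shorter matches win: a two-word entry is only reached when its last word is not itself an entry.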

One final comment: isn't it great to read a script whose variable names make it dead clear what they represent, instead of foreach i $a { set c "[ lindex z end ] is not $i, but could be $b" }?

In case you're wondering, the 'my' prefix gives the variables the right color in my syntax highlighting scheme, even if they do not have $.

Luciano Espirito Santo

Santos - SP - Brasil

rot13 the string below to get my mail address

yhpgpy ng OenmvyvnaGenafyngvba.arg