Version 2 of Word Reaper

Updated 2003-06-03 18:01:50

if 0 {Richard Suchenwirth 2003-06-03 - The following proc was written to extract the text content from a (Microsoft) Word document, via DDE and the clipboard. It is still neither pretty nor very stable, but after hours of wrestling with fuzzy documentation, I now wikify it as a first shot.

The procedure is called with the filename of a Word document and returns the reaped text after a while (I could not think of any way to synchronize other than waiting a lot...). A known issue is that some characters (apostrophe, "...") get specially encoded, and the output contains a question mark in their place. Tk is needed so we can use the clipboard.

Hints on how to do this better are greatly appreciated! }

 proc doc2txt fn {
    package require dde
    package require Tk; # because we need [selection]
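    # The empty first argument is the window title for "start"; without it,
    # a file name containing spaces would be taken as the title
    eval exec [auto_execok start] [list {} $fn] &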

    #Loop to wait until Word is really there, and ready to talk
    set word ""
    while {$word==""} {
        set word [dde services Winword System]
        after 200
    }
    after 1000 ;# wait for the window to load...
    # Select the whole document and copy it to the clipboard
    dde execute Winword System {[EditSelectAll]}
    dde execute Winword System {[EditCopy]}
    set res [selection get -selection CLIPBOARD]
    # Quit Word without saving changes
    dde execute Winword System {[FileExit 2]}
    return $res
 }
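
The wait loop in doc2txt polls forever if Word never registers its DDE service. As a rough sketch of one way to bound that wait, the helper below polls a script until it returns a non-empty result or a deadline passes. The name waitFor and the default timeout and interval values are assumptions made for illustration; they are not part of the original proc.

```tcl
# Hypothetical helper (not from the original page): evaluate $script in the
# caller's scope every $interval ms until it returns a non-empty result,
# or raise an error once $timeout ms have elapsed.
proc waitFor {script {timeout 30000} {interval 200}} {
    set deadline [expr {[clock clicks -milliseconds] + $timeout}]
    while 1 {
        set result [uplevel 1 $script]
        if {$result ne ""} {
            return $result
        }
        if {[clock clicks -milliseconds] >= $deadline} {
            error "timed out after $timeout ms waiting for: $script"
        }
        after $interval
    }
}
```

With such a helper, the polling loop could presumably be reduced to `set word [waitFor {dde services Winword System}]`, which raises an error instead of hanging when Word does not come up in time.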

Arts and crafts of Tcl-Tk programming