Version 0 of Word Reaper

Updated 2003-06-03 17:48:41

if 0 {Richard Suchenwirth 2003-06-03 - The following proc was written to extract the text content from a (Microsoft) Word document. It's still not pretty or very stable, but after hours of wrestling with fuzzy documentation, I now wikify it as a first shot.

The procedure is called with the filename of a Word document and returns the reaped text after a while (I could not think of any synchronization other than waiting generously...). A known issue is that some characters (apostrophe, "...") get specially encoded, and the output contains a question mark in these positions. }

 proc doc2txt fn {
    package require dde
    package require Tk; # because we need [selection]
    # Open the document with its associated application (normally Word)
    eval exec [auto_execok start] [list $fn] &

    #Loop to wait until Word is really there, and ready to talk
    set word ""
    while {$word==""} {
        set word [dde services Winword System]
        after 200
    }
    after 1000 ;# wait for the window to load...
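    # Select the whole document and copy it to the clipboard (WordBasic commands)
    dde execute Winword System {[EditSelectAll]}
    dde execute Winword System {[EditCopy]}
    # Fetch the copied text from the clipboard
    set res [selection get -selection CLIPBOARD]
    # Quit Word; the argument 2 asks it not to save changes
    dde execute Winword System {[FileExit 2]}
    set res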
 }
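
if 0 {The waiting loop above never gives up if Word fails to start. As a minimal sketch (not part of the original proc), the polling could be factored into a helper with a timeout. The name waitUntil and its timeoutMs and pollMs parameters are made up for this illustration; the condition is an [expr] expression evaluated in the caller's scope. }

```tcl
# Hypothetical helper: poll a condition until it holds or a timeout expires.
# Returns 1 if the condition became true, 0 if the timeout was reached first.
proc waitUntil {condition {timeoutMs 30000} {pollMs 200}} {
    set deadline [expr {[clock clicks -milliseconds] + $timeoutMs}]
    # Evaluate the condition as an expression in the caller's scope
    while {![uplevel 1 [list expr $condition]]} {
        if {[clock clicks -milliseconds] >= $deadline} {
            return 0
        }
        after $pollMs
    }
    return 1
}

# Possible use inside doc2txt, in place of the while loop (assumption:
# 30 seconds is long enough for Word to start on the machine at hand):
#
#   if {![waitUntil {[llength [dde services Winword System]] > 0}]} {
#       error "Word did not answer on DDE in time"
#   }
```

if 0 {This only bounds the wait for the DDE server to appear; the fixed one-second pause for the window to load is left as it is. }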

Arts and crafts of Tcl-Tk programming