if 0 {Richard Suchenwirth 2003-06-03 - The following proc was written to extract the text content from a (Microsoft) Word document, via dde and the clipboard. It's still not pretty or very stable, but after hours of wrestling with fuzzy documentation, I now wikify it as a first shot.
The procedure is called with the filename of a Word document and returns the reaped text after a while (I could not think of any synchronization other than simply waiting...). A known issue is that some characters (apostrophe, "...") are specially encoded, and the output contains a question mark in these positions. Tk is needed so we can use the clipboard.
Hints on how to do this better are greatly appreciated! }
 proc doc2txt fn {
    package require dde
    package require Tk; # because we need [selection]
    eval exec [auto_execok start] [list $fn] &
    # Loop to wait until Word is really there, and ready to talk
    set word ""
    while {$word==""} {
        set word [dde services Winword System]
        after 200
    }
    after 1000 ;# wait for the window to load...
    dde execute Winword System {[EditSelectAll]}
    dde execute Winword System {[EditCopy]}
    set res [selection get -selection CLIPBOARD]
    dde execute Winword System {[FileExit 2]}
    set res
 }
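The polling loop in doc2txt never gives up if Word fails to start. One possible way to bound the wait is a small helper like the one below. This is only a sketch: the name pollUntil and its timeout and interval parameters are made up for illustration, and it assumes Tcl 8.5 or later for [clock milliseconds].

```tcl
# Hypothetical helper, not part of the original proc.
# Evaluates $script in the caller's scope every $interval ms until it
# returns a non-empty result, and raises an error once $timeout ms
# have elapsed without one.
proc pollUntil {script {timeout 30000} {interval 200}} {
    set deadline [expr {[clock milliseconds] + $timeout}]
    while {1} {
        set result [uplevel 1 $script]
        if {$result ne ""} {
            return $result
        }
        if {[clock milliseconds] >= $deadline} {
            error "timed out after $timeout ms waiting for: $script"
        }
        after $interval
    }
}
```

Inside doc2txt the loop could then shrink to something like `set word [pollUntil {dde services Winword System}]`. This only replaces the wait for the DDE service to appear. The fixed `after 1000` that waits for the window to load would still be needed.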
LES: I can't do it with DDE. DDE is so crude. But COM and optcl can produce a very good result. First, create and save this Word macro in your Normal.dot:
 Sub wordreaper()
     Dim myRange As Range
     With Word.Application
         If .Windows.Count > 0 Then
             Set myRange = ActiveDocument.Content
         End If
     End With
     Open "c:\windows\desktop\extract.txt" For Append As #1
     Print #1, myRange
     Close #1
 End Sub
Then, run this Tcl code:
 set _docpath {c:\windows\desktop\some.doc}
 package require optcl
 set ::hWORD [ optcl::new word.application ]
 set ::hDOC [ $::hWORD -with documents open $_docpath ]
 $::hWORD run wordreaper
 $::hDOC Close
 $::hWORD Quit
The macro in Normal.dot is often necessary because "translating" the entire macro into some way that optcl can send entirely on its own is very difficult, if possible at all. I also have no idea of how one could pass the path to "extract.txt" to the macro instead of hard-coding it.
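On passing the path: Word's Application.Run accepts further arguments after the macro name and hands them to the macro, so one thing worth trying is to give the macro a parameter and supply the path from Tcl. The sketch below is untested against Word. It assumes the macro has been changed to `Sub wordreaper(outPath As String)` and opens outPath instead of the hard-coded file name, and that optcl forwards extra arguments of `run` the same way it forwards the macro name. The proc name reapTo is made up for illustration.

```tcl
# Sketch only; reapTo is a hypothetical name.
# Assumes the macro in Normal.dot now reads
#   Sub wordreaper(outPath As String)
# and writes to outPath rather than a fixed file.
# $app is an optcl handle to word.application, created by the caller.
proc reapTo {app docPath outPath} {
    set doc [$app -with documents open $docPath]
    # Arguments after the macro name go to Application.Run, which is
    # expected to pass them on to the macro as its parameters.
    $app run wordreaper $outPath
    $doc Close
}
```

With the handle from the code above, the call would be `reapTo $::hWORD $_docpath {c:\windows\desktop\extract.txt}`, followed by `$::hWORD Quit` as before. Taking the handle as an argument keeps the proc independent of how Word was started.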