Random Hangs on high cpu load

From time to time I run into problems with several long-running Tcl programs on one of our Windows production systems, which I'm unfortunately not able to track down.

The source code is too complicated to post here...

All programs make heavy use of [after] events and [fileevent]s, e.g. to periodically update a flag file on disk or to read stdout from called programs, and they [exec] many external executables, either in an endless loop or triggered by [twapi] filesystem monitoring. When everything works, the programs run for many days (or weeks) without problems, calling tens of thousands of other programs.
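
For readers who want something concrete, the pattern described above looks roughly like the following. This is only a minimal sketch, not the real source: the flag file name, the 5 second interval, the proc names and the child command are all placeholders made up for illustration.

```tcl
# Sketch only: file name, interval and child command are placeholders.
set flagFile [file join [pwd] heartbeat.flag]

# Rewrite the flag file with the current time, then reschedule itself.
proc updateFlag {path intervalMs} {
    set fd [open $path w]
    puts $fd [clock seconds]
    close $fd
    after $intervalMs [list updateFlag $path $intervalMs]
}

# Called by the event loop whenever the child has output or closes its stdout.
proc onChildOutput {chan} {
    if {[gets $chan line] >= 0} {
        puts "child: $line"
    } elseif {[eof $chan]} {
        fileevent $chan readable {}
        # Switch back to blocking so that close waits for the child
        # and reports a non-zero exit status as an error.
        fconfigure $chan -blocking 1
        if {[catch {close $chan} err]} {
            puts stderr "child failed: $err"
        }
        set ::done 1
    }
}

updateFlag $flagFile 5000

# Placeholder child: the same interpreter, printing one line.
set chan [open |[list [info nameofexecutable] << {puts hello}] r]
fconfigure $chan -blocking 0
fileevent $chan readable [list onChildOutput $chan]

# Enter the event loop until the child has finished.
vwait ::done
```

Both the timer and the pipe reader depend on the event loop being serviced, so if the notifier stalls, neither of them runs and no error is raised.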

What I saw yesterday is this:

   * We started another program on the machine, which put the CPU under heavy load and constantly allocated more and more memory, so the machine slowed down
   * We killed that program
   * Although the machine then returned to a normal load, my 3 Tcl programs on that machine did not respond anymore; that is, the after events no longer fired. No errors were generated; it simply looks like the programs hang.
   * Other programs on that machine continued to run normally, so there was no general Windows error condition etc.

As I'm only experienced at the Tcl/Tk script level, I don't know how to track such errors down any further. They aren't reproducible and happen only from time to time. The machines are Windows VMs.

My questions are:

   * Under what circumstances is it theoretically possible that the Tcl event loop stops working?
   * Would it help to save the (little) information from Sysinternals Process Explorer about active threads, thread state etc.? I'm not able to interpret such information, I fear...
   * Is it possible that a call to [exec] ... & or [open] |proc never returns, blocking the whole program?
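
Regarding the last question, one way to at least rule out a child that never finishes is to read the pipe through the event loop with a timeout. The following is a minimal sketch under that assumption; `runWithTimeout`, `rwtRead` and the `::rwt` array are hypothetical names, and the child command in the usage lines is a placeholder.

```tcl
# Append whatever the child has written so far; flag EOF when it is done.
proc rwtRead {chan} {
    append ::rwt($chan,out) [read $chan]
    if {[eof $chan]} {
        set ::rwt($chan,state) eof
    }
}

# Run cmd (a list) and collect its stdout, giving up after timeoutMs.
# Returns a two-element list: state ("eof" or "timeout") and the output read.
proc runWithTimeout {cmd timeoutMs} {
    set chan [open |$cmd r]
    fconfigure $chan -blocking 0
    set ::rwt($chan,out) ""
    set ::rwt($chan,state) running
    set timer [after $timeoutMs [list set ::rwt($chan,state) timeout]]
    fileevent $chan readable [list rwtRead $chan]

    # Other after/fileevent handlers keep running while we wait here.
    vwait ::rwt($chan,state)

    after cancel $timer
    fileevent $chan readable {}
    set state $::rwt($chan,state)
    set out $::rwt($chan,out)
    # The channel is non-blocking, so close does not wait for the child.
    # Note that this does not kill a child that is still running.
    catch {close $chan}
    array unset ::rwt $chan,*
    return [list $state $out]
}

# Usage with a placeholder child: the same interpreter, printing one line.
lassign [runWithTimeout [list [info nameofexecutable] << {puts hi}] 10000] state out
puts "state=$state output=[string trim $out]"
```

This only protects against a child that never reaches EOF. If the event loop itself stops delivering events, the timeout would never fire either, so it cannot detect that kind of hang.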

[jdc] Might be related to http://core.tcl.tk/tcl/tktview?name=8bd13f07bde6fb06

[MHo] Many thanks! Yes, this could be it... I had already searched the bugs, but did not find this entry. So I have to wait for 8.6.7 (I didn't mention above that I'm using 8.6.6)...

[jdc] Maybe try applying the patch to your 8.6.6 and see if it helps?

[MHo] Hm, I'm using the tclkits from https://sourceforge.net/projects/twapi/files/Tcl%20binaries/Tclkits%20with%20TWAPI/ ; no source code there....

[MHo] 2018-05-17: Late, late addition: The 8.6.7 update solved the long-standing problem. Never saw the freezes again! Thank you all!!!