[AMG]: Recent Tcl doesn't attempt to clean up all its memory on exit because that takes time and serves no purpose. However, this interferes with valgrind's leak detection. Setting the TCL_FINALIZE_ON_EXIT environment variable is supposed to restore full cleanup, though I couldn't get it to work for me. Compiling with -DPURIFY works much better, but it seems to have the side effect of making Tcl unload all dynamic libraries on exit, which prevents valgrind from showing the names of functions in loaded libraries. If you're debugging a library loaded by Tcl and not Tcl itself, you can just use this suppression file: ======none { Tcl allocation 1 Memcheck:Leak match-leak-kinds: possible fun:malloc fun:GetBlocks fun:TclpAlloc } { Tcl allocation 2 Memcheck:Leak match-leak-kinds: possible fun:malloc fun:TclThreadAllocObj } { Tcl allocation 3 Memcheck:Leak match-leak-kinds: possible fun:malloc fun:TclpAlloc } { linker uninitialized branch at startup Memcheck:Cond fun:index fun:expand_dynamic_string_token fun:_dl_map_object fun:map_doit fun:_dl_catch_error fun:do_preload fun:dl_main fun:_dl_sysdep_start fun:_dl_start obj:/lib64/ld-2.19.so } ====== That last entry isn't for Tcl, but it's useful to me on my [Slackware]64 system, so I'm leaving it in for now. ---- (author unknown, dated '''2002-08-02''') ([MS] pleads guilty) The recent report on the '''valgrind''' memory debugger [http://developer.kde.org/%7Esewardj/index.html] prompted me to try it on the tcl sources - help out the poor Jeff, the only '''purify''' licensee among us. This is what I did after downloading and installing '''valgrind''' on my linux/i386 laptop: * compile tcl/tcltest with -DPURIFY * modify tests/all.tcl to incorporate the ''exit trick'' described in [How to debug memory faults in Tcl and extensions], ie, inserted before the return the line proc exit args {} * ran the testsuite under valgrind: /opt/valgrind/bin/valgrind -v --leak-check=yes --num-callers=10 \ --logfile-fd=9 --leak-resolution=high --show-reachable=no \ ./tcltest ../tests/all.pfy 9>valgrind.out The output is ==19434== valgrind-1.0.0, a memory error detector for x86 GNU/Linux. ==19434== Copyright (C) 2000-2002, and GNU GPL'd, by Julian Seward. ==19434== Startup, with flags: ==19434== --suppressions=/opt/valgrind/lib/valgrind/default.supp ==19434== -v ==19434== --leak-check=yes ==19434== --num-callers=10 ==19434== --logfile-fd=9 ==19434== --leak-resolution=high ==19434== --show-reachable=no ==19434== Reading suppressions file: /opt/valgrind/lib/valgrind/default.supp ==19434== Reading syms from /CVS/tcl_SF_clean/unix/tcltest ==19434== Reading syms from /lib/ld-2.2.4.so ==19434== Reading syms from /opt/valgrind/lib/valgrind/valgrind.so ==19434== Reading syms from /lib/libdl-2.2.4.so ==19434== Reading syms from /lib/i686/libm-2.2.4.so ==19434== Reading syms from /lib/i686/libc-2.2.4.so ==19434== Estimated CPU clock rate is 598 MHz ==19434== ==19434== ==19434== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) ==19434== malloc/free: in use at exit: 3930 bytes in 22 blocks. ==19434== malloc/free: 136637 allocs, 136615 frees, 56516658 bytes allocated. ==19434== ==19434== searching for pointers to 22 not-freed blocks. ==19434== checked 4470668 bytes. ==19434== ==19434== definitely lost: 544 bytes in 17 blocks. ==19434== possibly lost: 0 bytes in 0 blocks. ==19434== still reachable: 3386 bytes in 5 blocks. ==19434== ==19434== 544 bytes in 17 blocks are definitely lost in loss record 4 of 6 ==19434== at 0x400467C4: malloc (vg_clientfuncs.c:100) ==19434== by 0x80CF7AE: TclpAlloc (./../generic/tclAlloc.c:680) ==19434== by 0x8066041: Tcl_Alloc (./../generic/tclCkalloc.c:1002) ==19434== by 0x80BC2E9: NewVar (./../generic/tclVar.c:4266) ==19434== by 0x80B9419: TclLookupArrayElement (./../generic/tclVar.c:929) ==19434== by 0x80840B4: TclExecuteByteCode (./../generic/tclExecute.c:1775) ==19434== by 0x8082CB9: TclCompEvalObj (./../generic/tclExecute.c:1007) ==19434== by 0x80AF574: TclObjInterpProc (./../generic/tclProc.c:1074) ==19434== by 0x8061BAD: TclEvalObjvInternal (./../generic/tclBasic.c:3033) ==19434== by 0x808381C: TclExecuteByteCode (./../generic/tclExecute.c:1430) ==19434== ==19434== LEAK SUMMARY: ==19434== definitely lost: 544 bytes in 17 blocks. ==19434== possibly lost: 0 bytes in 0 blocks. ==19434== still reachable: 3386 bytes in 5 blocks. ==19434== Reachable blocks (those to which a pointer was found) are not shown. ==19434== To see them, rerun with: --show-reachable=yes ==19434== --19434-- lru: 798 epochs, 0 clearings. --19434-- translate: new 14353 (218894 -> 3060822), discard 0 (0 -> 0). --19434-- dispatch: 39900000 basic blocks, 14078/524777 sched events, 248895 tt_fast misses. --19434-- reg-alloc: 5203 t-req-spill, 573378+33302 orig+spill uis, 77003 total-reg-r. --19434-- sanity: 1014 cheap, 41 expensive checks. '''Notes''' * '''valgrind''' is amazingly fast: 'time' reports 1m18 user time for the testsuite, vs 0m45 without using it. * the option '--show-reachable=yes' produces also a stack trace of the creation of the ''reachable unfreed memory at exit''. In our case, the 5 blocks totalling 3386 bytes are due to tcl not freeing encoding-related memory properly ( See Bug #543549 [http://sourceforge.net/tracker/index.php?func=detail&aid=543549&group_id=10894&atid=110894] ) * this shows a real leak: 17 times in the testsuite, a new element is added to a local array and the corresponding variable is never freed. ''The hunt is on!'' '''UPDATE:''' ''"if it is too good to be true ... it is probably not true"'' When run as above, '''valgrind''' will not monitor children processes. But tests are normally run in separate processes, so that I was ''not'' monitoring the memory usage of the actual tests. With the option --trace-children=yes, or running the tests in the same process, the slowdown is enormous - append.test takes 75x longer under valgrind. <> Porting