This page collects tips and tricks that are useful while hunting memory leaks at the C level, in the Tcl core or in extensions. To track script-level leaks (like lingering keys in global arrays, objects, channels, etc.), see the sibling page "Leak Hunt (Tcl level)".
Once Valgrind has given you a detailed C-level stack trace of the point of allocation and a leak has been spotted, it's time to switch to gdb. Different techniques are needed for different kinds of allocations, but a rather frequent class is that of refcounted things (like Tcl_Obj's). Here gdb's watchpoint tool is very useful:
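As an illustration, here is a minimal sketch, assuming a hypothetical refcounted struct `Thing` standing in for a real Tcl_Obj (none of these names are Tcl core API). Every change of ownership goes through one field, so a watchpoint on that field stops gdb at each increment and decrement, and the increment with no matching decrement is the leak. The gdb commands in the comment (`break`, `finish`, `watch -l`, `continue`) are standard gdb; with a real Tcl_Obj the field to watch would be `refCount`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object such as Tcl_Obj. */
typedef struct Thing {
    int refCount;   /* the field to put the watchpoint on */
    int payload;
} Thing;

static int liveThings = 0;  /* number of Things not yet freed */

Thing *thing_new(int payload) {
    Thing *t = malloc(sizeof *t);
    if (t == NULL) abort();
    t->refCount = 0;
    t->payload = payload;
    liveThings++;
    return t;
}

void thing_incr(Thing *t) { t->refCount++; }

void thing_decr(Thing *t) {
    assert(t->refCount > 0);    /* decrement without a matching increment */
    if (--t->refCount == 0) {
        free(t);
        liveThings--;
    }
}

/*
 * Deliberately leaky: two increments, one decrement.
 * Returns the number of Things still alive afterwards.
 *
 * One possible gdb session (build with -g -O0):
 *   (gdb) break thing_new
 *   (gdb) run
 *   (gdb) finish                  # prints "Value returned is $1 = (Thing *) 0x..."
 *   (gdb) watch -l $1->refCount   # use the history number gdb printed
 *   (gdb) continue                # stops at every change of the refcount
 */
int leaky_demo(void) {
    Thing *t = thing_new(42);
    thing_incr(t);   /* owner A takes a reference */
    thing_incr(t);   /* owner B takes a reference */
    thing_decr(t);   /* owner A lets go; owner B never does */
    return liveThings;
}
```

Calling `leaky_demo()` from a `main` and stepping through the watchpoint hits shows two increments and a single decrement; the backtrace at the unmatched increment points at the forgetful owner.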
One thing that should never occur in Tcl is a cycle in the graph of references among Tcl_Objs. Indeed, since we depend on refcounting for memory management, a cycle is an absolute show stopper. Fortunately, by design, the language is strictly unable to produce such "reference loops": the copy-on-write principle prevents all in-place operations which would "close the loop". Any exception to this rule is a bug, not in the script, but in the core.
But here we're discussing core debugging, right? So these things may happen. It happened to me in issue 3386417, where the newly introduced info errorstack (TIP #348) was somehow plugged back into a compiled scriptlet that it referred to. The interesting generalization is as follows:
If:
Then it is very likely that you have a Reference Loop. Of course you're on your own to actually track it down, but knowing the mere existence of their kind may save you hours (it would have, in my case :/ ).
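To make the failure mode concrete, here is a minimal sketch of a reference loop, assuming a hypothetical refcounted `Node` type that can hold one counted reference to another node (illustrative names only, not the Tcl core's data structures). With a plain chain a -> b, dropping the external references frees everything; once b also refers back to a, each node keeps the other's count at 1 and neither is ever freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted node holding at most one counted reference. */
typedef struct Node {
    int refCount;
    struct Node *ref;   /* counted reference to another node, or NULL */
} Node;

static int liveNodes = 0;   /* nodes allocated and not yet freed */

Node *node_new(void) {
    Node *n = calloc(1, sizeof *n);
    if (n == NULL) abort();
    liveNodes++;
    return n;
}

void node_incr(Node *n) { n->refCount++; }

void node_decr(Node *n) {
    assert(n->refCount > 0);
    if (--n->refCount == 0) {
        if (n->ref != NULL) node_decr(n->ref);  /* release what we hold */
        free(n);
        liveNodes--;
    }
}

/* Store a counted reference to target inside holder. */
void node_set_ref(Node *holder, Node *target) {
    node_incr(target);
    if (holder->ref != NULL) node_decr(holder->ref);
    holder->ref = target;
}

/*
 * Build a -> b, optionally close the loop with b -> a, then drop our
 * own references. Returns how many of the two nodes are still alive.
 */
int build_and_release(int closeLoop) {
    int before = liveNodes;
    Node *a = node_new();
    Node *b = node_new();
    node_incr(a);                       /* our reference to a */
    node_incr(b);                       /* our reference to b */
    node_set_ref(a, b);                 /* a -> b */
    if (closeLoop) node_set_ref(b, a);  /* b -> a closes the loop */
    node_decr(b);
    node_decr(a);
    return liveNodes - before;
}
```

`build_and_release(0)` leaves no node alive, while `build_and_release(1)` leaves both: every individual increment has its matching decrement scheduled, yet none of the final decrements can ever run.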
Sometimes, the above techniques are insufficient. For example, Valgrind may pinpoint the full stack trace of the offending allocation... but if the same stack occurs many times in the program's lifespan, it is hard to identify the interesting one. In that case, switching back to Tcl's own memdebug mode is helpful: reverse the above advice and do
./configure --enable-symbols=mem
The memory command is then the tool of choice to get to individual leaked blocks. Particularly interesting is memory onexit, since it takes a snapshot at roughly the same time as Valgrind does. However, the result is slightly noisier than with Valgrind, because the dump occurs slightly before the end of the universe (some guts remain to allow the dump to occur).
PYK 2018-05-20: It can also be helpful to call memory active right before a suspected trouble spot, once again immediately after, and then inspect a diff of the two results.
AMG: Actually, Valgrind can show this class of problem.
#include <stdio.h>

int *global;

void f(void) {int local = 42; global = &local;}
void g(void) {printf("%d\n", *global);}

int main(void) {f(); g(); return 0;}
Build and run this, and "42" is probably displayed. Maybe. Call something else between f() and g(), and who knows what you'll get.
Run using valgrind --tool=memcheck, and this is the result:
==22929== Conditional jump or move depends on uninitialised value(s)
==22929==    at 0x4E834E0: vfprintf (in /lib64/libc-2.21.so)
==22929==    by 0x4E8A8E8: printf (in /lib64/libc-2.21.so)
==22929==    by 0x400675: g (in /home/andy/test)
==22929==    by 0x400685: main (in /home/andy/test)
==22929==
==22929== Use of uninitialised value of size 8
==22929==    at 0x4E7F4BB: _itoa_word (in /lib64/libc-2.21.so)
==22929==    by 0x4E837C8: vfprintf (in /lib64/libc-2.21.so)
==22929==    by 0x4E8A8E8: printf (in /lib64/libc-2.21.so)
==22929==    by 0x400675: g (in /home/andy/test)
==22929==    by 0x400685: main (in /home/andy/test)
==22929==
==22929== Conditional jump or move depends on uninitialised value(s)
==22929==    at 0x4E7F4C5: _itoa_word (in /lib64/libc-2.21.so)
==22929==    by 0x4E837C8: vfprintf (in /lib64/libc-2.21.so)
==22929==    by 0x4E8A8E8: printf (in /lib64/libc-2.21.so)
==22929==    by 0x400675: g (in /home/andy/test)
==22929==    by 0x400685: main (in /home/andy/test)
==22929==
==22929== Conditional jump or move depends on uninitialised value(s)
==22929==    at 0x4E83839: vfprintf (in /lib64/libc-2.21.so)
==22929==    by 0x4E8A8E8: printf (in /lib64/libc-2.21.so)
==22929==    by 0x400675: g (in /home/andy/test)
==22929==    by 0x400685: main (in /home/andy/test)
==22929==
==22929== Conditional jump or move depends on uninitialised value(s)
==22929==    at 0x4E835BA: vfprintf (in /lib64/libc-2.21.so)
==22929==    by 0x4E8A8E8: printf (in /lib64/libc-2.21.so)
==22929==    by 0x400675: g (in /home/andy/test)
==22929==    by 0x400685: main (in /home/andy/test)
==22929==
==22929== Conditional jump or move depends on uninitialised value(s)
==22929==    at 0x4E8364A: vfprintf (in /lib64/libc-2.21.so)
==22929==    by 0x4E8A8E8: printf (in /lib64/libc-2.21.so)
==22929==    by 0x400675: g (in /home/andy/test)
==22929==    by 0x400685: main (in /home/andy/test)
Lots of badness. But change the variable local to be static, and all the above warnings go away.
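For comparison, a sketch of the static variant, with two small departures from the program above that are my own assumptions for illustration: g() returns the value it prints, and it asserts that f() has run first. A static local has static storage duration, so the address stored in global stays valid after f() returns and the read in g() is well defined.

```c
#include <assert.h>
#include <stdio.h>

int *global;

void f(void) {
    static int local = 42;  /* static storage: outlives the call to f() */
    global = &local;
}

int g(void) {
    assert(global != NULL); /* f() must have run first */
    printf("%d\n", *global);
    return *global;
}
```

Calling f() and then g() from main reads an object that is still alive, regardless of what else is called in between.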