|Areas||C, extensions, CPU counters, portability|
|Good if student knows||C (required)|
|Benefits to the student||Getting familiar with CPU performance counters, learning to code a Tcl extension, learning about portability issues|
|Benefits to Tcl||Improved tools for performance estimation, both in applications and core development|
|See also||GSoC Idea: Core Performance Analysis (larger context)|
On modern computers, wall-clock time depends on so many external variables (CPU load, cache effects of other processes or threads, etc.) that it does not provide a reliable basis for performance optimization, except for order-of-magnitude effects.
Modern CPUs have hardware counters that may provide more reliable performance estimates, especially with respect to caching problems, today's main performance bottleneck (slide 64). Tools like Linux's perf provide access to these counters and report measurements taken between process start and process end. Another tool that allows an estimation of cache effects is Valgrind's Cachegrind, but it is extremely slow, it measures simulated cache effects, and it is not easy to use except for full-process measurements.
In order to assist performance estimation for both the Tcl core and scripts, it would be desirable to control access to the hardware counters from Tcl scripts.
The goal of this project is to design and implement a Tcl extension with commands to interact with the CPU's hardware counters. Initially the goal is to code an extension that works under Linux using .
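As a rough illustration of what the C side of such an extension might do on Linux, the sketch below counts one hardware event while a function runs, rather than over a whole process. It assumes the perf_event_open(2) system call as the kernel interface; the page does not say which interface is intended, so that choice is an assumption. The names counter_open, counter_measure and demo_work are hypothetical, and the Tcl command layer is deliberately left out.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open one hardware counter for the calling thread.
 * Returns a file descriptor, or -1 if the counter is unavailable
 * (no permission, no PMU in a virtual machine, and so on). */
int counter_open(uint64_t config)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = config;       /* e.g. PERF_COUNT_HW_INSTRUCTIONS */
    attr.disabled = 1;          /* start stopped; enabled explicitly below */
    attr.exclude_kernel = 1;    /* count user-space events only */
    attr.exclude_hv = 1;

    /* glibc has no wrapper for this call, so use syscall(2) directly.
     * pid = 0 and cpu = -1: this thread, on whatever CPU it runs. */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Hypothetical helper: count events while fn(arg) runs.
 * Returns 0 and stores the count in *value, or -1 on any failure. */
int counter_measure(uint64_t config, void (*fn)(void *), void *arg,
                    uint64_t *value)
{
    int fd, rc = -1;

    assert(fn != NULL && value != NULL);
    fd = counter_open(config);
    if (fd == -1) {
        return -1;
    }
    if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) != -1 &&
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) != -1) {
        fn(arg);                /* the region being measured */
        /* With the default read_format, read() yields a single 64-bit count. */
        if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) != -1 &&
            read(fd, value, sizeof(*value)) == (ssize_t)sizeof(*value)) {
            rc = 0;
        }
    }
    close(fd);
    return rc;
}

/* Hypothetical workload: sum the integers below *arg. */
void demo_work(void *arg)
{
    unsigned long n = *(unsigned long *)arg;
    volatile unsigned long sum = 0;
    unsigned long i;

    for (i = 0; i < n; i++) {
        sum += i;
    }
}
```

A caller would pass an event such as PERF_COUNT_HW_INSTRUCTIONS or PERF_COUNT_HW_CACHE_MISSES together with the function to measure. A return value of -1 is expected on systems where the counters are not accessible, so an extension built this way would need to report that case to the script rather than treat it as fatal.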
If time permits, the student will research the possibility of porting the extension to Windows and/or OS X. This will entail finding out about interfaces analogous to , (possibly) redesigning parts of the extension's C code so that it can be configured to work with the three different APIs, and coding a portable extension.
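One possible shape for such a redesign, offered only as a sketch, is to hide each platform API behind a small table of function pointers and pick the table at compile time. Everything named here is hypothetical: the CounterBackend struct, the HYPO_BACKEND_* configuration macros, and the hypo*Backend symbols, which would live in per-platform source files that are not shown. The fallback uses the standard C clock() function, which measures processor time rather than a hardware event; it only stands in so that the sketch builds everywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical interface that every platform backend would implement. */
typedef struct CounterBackend {
    const char *name;
    int (*start)(void);             /* begin counting; 0 on success */
    int (*stop)(uint64_t *value);   /* end counting and store the result */
} CounterBackend;

/* Fallback backend: processor time in clock() ticks, not a hardware counter. */
static clock_t fallbackStart;

static int fallback_start(void)
{
    fallbackStart = clock();
    return fallbackStart == (clock_t)-1 ? -1 : 0;
}

static int fallback_stop(uint64_t *value)
{
    clock_t now = clock();

    if (now == (clock_t)-1) {
        return -1;
    }
    *value = (uint64_t)(now - fallbackStart);
    return 0;
}

static const CounterBackend fallbackBackend = {
    "fallback-clock", fallback_start, fallback_stop
};

/* Compile-time selection. The macros are assumed to be set by the build
 * configuration; when none is set, the fallback is used. */
#if defined(HYPO_BACKEND_LINUX)
extern const CounterBackend hypoLinuxBackend;
#define SELECTED_BACKEND hypoLinuxBackend
#elif defined(HYPO_BACKEND_WINDOWS)
extern const CounterBackend hypoWindowsBackend;
#define SELECTED_BACKEND hypoWindowsBackend
#elif defined(HYPO_BACKEND_OSX)
extern const CounterBackend hypoOsxBackend;
#define SELECTED_BACKEND hypoOsxBackend
#else
#define SELECTED_BACKEND fallbackBackend
#endif

/* The rest of the extension would call only this function. */
const CounterBackend *counter_backend(void)
{
    return &SELECTED_BACKEND;
}
```

With this layout the command implementations would call counter_backend()->start() and counter_backend()->stop(&value) without knowing which platform API sits underneath, so adding a platform would mean adding one source file and one configuration macro.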
Some comments here, and discussion of the idea