[MS] I ran the tclbench suite on tclsh compiled with three different compilers and several combinations of optimisation flags. This page summarizes the results. The tests were run on a PIII/600MHz/192MB laptop running Red Hat Linux 7.2.

The compilers were:

   * gcc2.96
   * gcc3.1 [http://gcc.gnu.org]
   * icc6.0, the Intel C++ compiler for Linux [http://www.intel.com/software/products/global/eval.htm]

[Notes on compiling tcl with icc]

'''Results'''

%| SPEED |SIZE |COMPILER| OPTIONS |%
&| 1.00 | 1.00 |gcc2.96 |-O -march=pentiumpro|&
&| 1.05 | 1.00 |gcc2.96 |-O -march=pentiumpro -fomit-frame-pointer|&
&| 1.01 | 1.01 |gcc2.96 |-O2 -march=pentiumpro|&
&| 1.07 | 1.02 |gcc2.96 |-O2 -march=pentiumpro -fomit-frame-pointer|&
&| 1.01 | 0.97 |gcc2.96 |-Os -march=pentiumpro|&
&| 1.05 | 0.98 |gcc2.96 |-Os -march=pentiumpro -fomit-frame-pointer|&
&| 0.99 | 1.07 |gcc3.1 |-O -march=pentium3|&
&| 1.03 | 1.08 |gcc3.1 |-O -march=pentium3 -fomit-frame-pointer|&
&| 1.02 | 1.12 |gcc3.1 |-O2 -march=pentium3|&
&| 1.06 | 1.13 |gcc3.1 |-O2 -march=pentium3 -fomit-frame-pointer|&
&| 1.03 | 1.14 |gcc3.1 |-O3 -march=pentium3|&
&| 1.08 | 1.15 |gcc3.1 |-O3 -march=pentium3 -fomit-frame-pointer|&
&| 1.04 | 0.97 |gcc3.1 |-Os -march=pentium3|&
&| 1.06 | 0.97 |gcc3.1 |-Os -march=pentium3 -fomit-frame-pointer|&
&| 1.06 | 1.24 |icc6.0 |-O3 -xK|&
&| 1.11 | 1.47 |icc6.0 |-O3 -xK -ip|&

'''Conclusions (?)'''

   * The "-fomit-frame-pointer" flag produces faster code with gcc. Whether this is worth the loss of a traceable core file is open to question - Tcl shouldn't dump core.
   * The default optimisation flag for gcc, "-O", seems suboptimal; both GNU compilers produce faster and smaller code with "-Os".
   * The new gcc3.1 is not a big improvement on 2.96 for our purposes.
   * Intel's compiler with "-ip" produces slightly faster code than gcc (as measured by tclbench), but a much larger image. Without "-ip", icc produces larger but not faster code.
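
The first conclusion can be checked directly against the gcc rows of the table. The following is only a sketch: the numbers are copied from the table, but the function name `frame_pointer_gains` and the layout of `ROWS` are made up for illustration, and it assumes that a higher SPEED value means a faster tclsh (which is how the conclusions read the table).

```python
# gcc rows copied from the results table: (speed, size, compiler, options).
# The icc rows are left out because the flag under study is gcc-specific here.
ROWS = [
    (1.00, 1.00, "gcc2.96", "-O -march=pentiumpro"),
    (1.05, 1.00, "gcc2.96", "-O -march=pentiumpro -fomit-frame-pointer"),
    (1.01, 1.01, "gcc2.96", "-O2 -march=pentiumpro"),
    (1.07, 1.02, "gcc2.96", "-O2 -march=pentiumpro -fomit-frame-pointer"),
    (1.01, 0.97, "gcc2.96", "-Os -march=pentiumpro"),
    (1.05, 0.98, "gcc2.96", "-Os -march=pentiumpro -fomit-frame-pointer"),
    (0.99, 1.07, "gcc3.1", "-O -march=pentium3"),
    (1.03, 1.08, "gcc3.1", "-O -march=pentium3 -fomit-frame-pointer"),
    (1.02, 1.12, "gcc3.1", "-O2 -march=pentium3"),
    (1.06, 1.13, "gcc3.1", "-O2 -march=pentium3 -fomit-frame-pointer"),
    (1.03, 1.14, "gcc3.1", "-O3 -march=pentium3"),
    (1.08, 1.15, "gcc3.1", "-O3 -march=pentium3 -fomit-frame-pointer"),
    (1.04, 0.97, "gcc3.1", "-Os -march=pentium3"),
    (1.06, 0.97, "gcc3.1", "-Os -march=pentium3 -fomit-frame-pointer"),
]

FLAG = "-fomit-frame-pointer"


def frame_pointer_gains(rows):
    """Map (compiler, base options) to the SPEED gain from adding FLAG.

    Hypothetical helper, not part of tclbench. Assumes higher SPEED is faster.
    """
    speed = {(compiler, options): s for s, _size, compiler, options in rows}
    gains = {}
    for (compiler, options), s in speed.items():
        if options.endswith(FLAG):
            # Strip the flag to find the matching build without it.
            base = options[: -len(FLAG)].strip()
            gains[(compiler, base)] = round(s - speed[(compiler, base)], 2)
    return gains


gains = frame_pointer_gains(ROWS)
for (compiler, base), gain in gains.items():
    print(f"{compiler} {base}: +{gain:.2f}")
print(f"mean gain: {sum(gains.values()) / len(gains):.3f}")
```

Every one of the seven gcc pairs comes out positive, with a mean gain of roughly 0.04 in relative speed. The same dictionary approach could be used to compare "-Os" against "-O" for the second conclusion.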
'''Notes'''

   * These were all static builds of tclsh from the current (01-22-02) HEAD.
   * The data presented are size and speed relative to the reference build "gcc2.96 -O", which produced a 702kB tclsh that ran the tclbench suite in 00:04:35.
   * All compilers were set to produce binaries exploiting the processor's features ("-march" and "-x" flags).
   * I have not checked for an Intel equivalent of gcc's "-Os" flag.
   * The "-fomit-frame-pointer" flag to gcc produces code that is not debuggable - the stack trace in core files is not usable. This behaviour is also present (I think) in the optimised code produced by icc.

[Brett Schwarz] These links may be of interest as well:

http://www.coyotegulch.com/acovea/index.html

http://freshmeat.net/articles/view/730/

<> Performance