MJ - 20071105, Sparked by an inquiry by Larry McVoy on the Tcl chatroom, the IO performance of Tcl and specifically compared to Perl has been benchmarked quite extensively and was shown to be slower than Perl. In order to get closer to the speed of Perl the IO implementation in Tcl is currently undergoing scrutiny to see where performance gains can be made.
The goal of the page is to summarize the discussion as it has been held up to this point, solutions that have been suggested and patches that have been tried. This should prevent circular discussions on the chat, because people can be directed to this page to get an update of the current state of affairs.
The benchmarks used to time the effect of changes can be downloaded from [L1 ]
JE - 20071105. In particular, it's important to note that the point of the exercise is not to make a "fast cat" in Tcl or to make a "fast grep" in Tcl: these are both easily doable. The point of the exercise is to find where the real bottlenecks are in real text processing filters, and to make those hotspots cooler.
KBK - 20071105:
Hypothesis: The variable access in the three-argument form of gets is a significant contributor to the observed poor performance.
Experiment: A disc file was created containing 1 million lines, each comprising the single letter 'x'. A modification was made to Tcl_GetsObjCmd as shown:
RCS file: /cvsroot/tcl/tcl/generic/tclIOCmd.c,v retrieving revision 1.44 diff -u -r1.44 tclIOCmd.c --- generic/tclIOCmd.c 15 Oct 2007 21:27:48 -0000 1.44 +++ generic/tclIOCmd.c 5 Nov 2007 21:02:36 -0000 @@ -287,9 +287,13 @@ lineLen = -1; } if (objc == 3) { - if (Tcl_ObjSetVar2(interp, objv[2], NULL, linePtr, - TCL_LEAVE_ERR_MSG) == NULL) { - return TCL_ERROR; + if (*Tcl_GetString(objv[2]) != '\0') { + if (Tcl_ObjSetVar2(interp, objv[2], NULL, linePtr, + TCL_LEAVE_ERR_MSG) == NULL) { + return TCL_ERROR; + } + } else { + Tcl_DecrRefCount(linePtr); } Tcl_SetObjResult(interp, Tcl_NewIntObj(lineLen)); return TCL_OK;
The following script, roughly equivalent to 'wc -l' was used as benchmark.
proc read1 {} { set lc 0 set f [open ~/tmp/1mlines r] while {[gets $f line] >= 0} { incr lc } close $f return $lc } proc read2 {} { set lc 0 set f [open ~/tmp/1mlines r] while {[gets $f {}] >= 0} { incr lc } close $f return $lc } set methods {read1 read2} foreach method $methods { set min($method) Inf } for {set i 0} {$i < 5} {incr i} { foreach method $methods { set us [lindex [time $method] 0] puts [list $method: $us] if {$us < $min($method)} { set min($method) $us } } } foreach method $methods { puts [list $method $min($method)] }
Results:
The minimum observed run time for the script where the line was returned into a variable was 2.46 seconds; the minimum observed time where the line was discarded was 2.08 seconds. (The maxima were 2.52 and 2.17 respectively, so system load was presumed not to be an issue).
Conclusion:
Storing the return string in a variable appears to consume roughly 0.28 microseconds per line, or about 12% of the total time spent in this micro-benchmark.
Further experiments by kbk as reported on the chat:
proc fakegets {chan v} { global ct if {![incr ct -1]} {return -1} return 1 } } proc readm {} { set ::ct 1000000 set f [open ~/tmp/DATA r] while {[fakegets $f {}] >= 0} { incr lc } close $f return $lc } proc readm2 {} { set x incr set ct 1000000 set f [open ~/tmp/DATA r] while {[$x ct -1] > 0} { incr lc } close $f return $lc }
Interim conclusion The time spent in the original benchmark seems to be decomposable in
Note that the file is assumed to be in the system's encoding; it might be interesting to test the timings on a binary file.
MS: Moving parts of the discussion to Tcl invoke performance
LV 2007 Nov 06 Platform: SPARC Solaris 9, using Sun's C compiler, built with 64 bit support and threads, and tcl 8.5 CVS head.
I found that the sort.tcl ran about 6 times slower than Perl, 4 times slower than Python, and almost twice as slow as Ruby. The grep.tcl ran about 3 times slower than Ruby's, almost 7 times slower than Python's and about 29 times slower than Perl's... The hash.tcl ran about 40% or so slower than Ruby, about 9 times slower than Python and about 10 times slower than Perl.