## Comparing Tcl and Python

DKF: Sometimes you come across something that makes you think “wow!” Here's one such thing: comparing the flat-out single-threaded performance of Tcl and Python. The problem I was looking at was to compute the sum of all prime numbers less than ten million (there are quite a few of them!), and the limiting factor is an efficient method for generating all the primes in the range. I present here two implementations for doing this in the languages under consideration, based on code originally from http://code.activestate.com/recipes/117119/ by way of StackOverflow.

### Tcl

```
proc sum_primes_to {n {i 1}} {
    set total 0
    incr n 0
    for {set q [expr {$i + $i}]} {$q < $n} {incr q} {
        if {![info exists d($q)]} {
            incr total $q
            lappend d([expr {$q*$q}]) $q
        } else {
            foreach p $d($q) {
                lappend d([expr {$p + $q}]) $p
            }
            unset -nocomplain d($q)
        }
    }
    return $total
}

puts [sum_primes_to 10000000]
```

### Python

```
def sum_primes_to(n):
    total = 0
    d = {}
    q = 2
    while q < n:
        if q not in d:
            total += q
            d[q * q] = [q]
        else:
            for p in d[q]:
                d.setdefault(p + q, []).append(p)
            del d[q]
        q += 1
    return total

print(sum_primes_to(10000000))
```

So… comparing the performance (overall for the script, measured with time) on a single system with production-grade builds of both languages, I get this:

Tcl 8.6: 13.250s
Python 2.7: 20.369s
Python 3.5: 22.204s
Python 3.6: 13.874s

These are all production builds that I've built locally to be as fast as possible on my hardware. (Also, they all produce the correct result, 3203324994356.)
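One way to cross-check a result like this independently is to sum the primes with a conventional fixed-size sieve of Eratosthenes instead of the incremental dictionary-based sieve used above. The following is only a sketch under that assumption; the function name `sum_primes_below` is made up for illustration and is not part of the benchmark.

```python
def sum_primes_below(n):
    """Sum all primes p < n using a plain sieve of Eratosthenes."""
    if n < 3:
        return 0
    # One flag byte per integer in [0, n); 1 means "still assumed prime".
    is_prime = bytearray([1]) * n
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p < n:
        if is_prime[p]:
            # Cross off multiples of p, starting from p*p.
            is_prime[p * p::p] = bytearray(len(range(p * p, n, p)))
        p += 1
    return sum(i for i, flag in enumerate(is_prime) if flag)

print(sum_primes_below(100))  # 1060
```

Calling `sum_primes_below(10000000)` and comparing against the figure quoted above would confirm the total. The sieve trades memory (one byte per candidate) for simplicity, so it is a check on correctness rather than a like-for-like benchmark.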

gonwalf 2018.01.31: I tried the same code on Python 3.5 and on Tcl 8.6.6 and 8.6.8, with different results:

Tcl 8.6.6: 10.07s
Python 3.5: 10.80s
on Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz

and

Tcl 8.6.8: 8.53s
Python 3.5: 8.08s
on Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz

Which compiler flags did you use for the Tcl build?

DKF: 04-Feb-2018: I used the builds of both Tcl and Python built by MacPorts, all running on my laptop (on mains power). All should be in release as-much-optimisation-as-usually-reasonable mode, and with chunks of computation as large as this, the CPU should be scaled up pretty equally. (NB: The system Tcl build on OS X is actually very slow; they enable an option that adds close tracking of low-level metrics but at great performance overhead.) I've got additional experimental builds where I make the Tcl code quite a lot faster, but they're definitely not used by anyone else yet (and aren't yet semantically correct).

I recently made a post on StackExchange about the classic "find the N most frequent words ([A-Za-z]+) in a text" problem and found that Tcl was quite slow. Any diagnosis?

### Tcl

```
#!/usr/bin/env tclsh

# Proc header reconstructed from the call below; the body uses $path and $head.
proc wordcount {path head} {
    set data [string tolower [read [open $path]]]
    foreach word [regexp -all -inline {[a-z]+} $data] {
        dict incr wordcount $word
    }
    set sorted [lsort -stride 2 -index 1 -int -decr $wordcount]
    lrange $sorted 0 [expr {$head * 2 - 1}]
}

foreach {count word} [wordcount {*}$argv] {
    puts "$word\t$count"
}
```

### Python

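```
#!/usr/bin/env python3.9
import collections, re, sys

filename = sys.argv[1]
k = int(sys.argv[2])
reg = re.compile('[a-z]+')

counts = collections.Counter()
# The counting loop is missing from this copy; the two lines below are a
# plausible reconstruction, not necessarily the original code.
for line in open(filename):
    counts.update(reg.findall(line.lower()))
for i, w in counts.most_common(k):
    print(i, w)
```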

Those are the times I get:

```
$ time ./wordcount.tcl /tmp/ulysses64.txt 10
...
./wordcount.tcl /tmp/ulysses64.txt 10  24.27s user 1.02s system 99% cpu 25.287 total
$ time ./wordcount.py /tmp/ulysses64.txt 10
...
./wordcount.py /tmp/ulysses64.txt 10  10.42s user 0.90s system 99% cpu 11.329 total
```

Timing the various parts of the Tcl code gets me:

• A massive slowdown of the whole run (25 -> 40 s)
• file read: 1.2 s, regexp: 15.9 s, dict incr loop: 21.6 s, lsort: 10 ms, lrange: 40 µs
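For comparison, a similar per-stage breakdown could be collected on the Python side. This is a minimal sketch, not the instrumentation used for the Tcl numbers above: the function name `timed_wordcount` and the stage labels are hypothetical, and it works on an in-memory string rather than a file so that it stays self-contained.

```python
import collections
import re
import time

def timed_wordcount(text, k):
    """Return (top_k, timings), where timings maps stage name to seconds."""
    timings = {}

    # Stage 1: lowercase the text and extract all words.
    start = time.perf_counter()
    words = re.findall(r"[a-z]+", text.lower())
    timings["regexp"] = time.perf_counter() - start

    # Stage 2: tally the words (roughly the role of the dict incr loop).
    start = time.perf_counter()
    counts = collections.Counter(words)
    timings["count"] = time.perf_counter() - start

    # Stage 3: pick the k most frequent words (roughly lsort plus lrange).
    start = time.perf_counter()
    top = counts.most_common(k)
    timings["sort"] = time.perf_counter() - start

    return top, timings

top, timings = timed_wordcount("The cat and the hat and the bat", 2)
print(top)
for stage, seconds in timings.items():
    print(f"{stage}: {seconds:.6f} s")
```

Timing each stage inside one process keeps the stages comparable with each other, although the instrumentation itself can change the total, as the slowdown noted above suggests.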
