Version 43 of Wikit's performance

Updated 2003-01-11 16:18:35

January 11, 2003 - The BiggestWiki page was updated, but how can one get an updated count of the number of pages in this wiki?

Not easily... I've looked at it with Metakit database commands: 6253 page entries, of which 2595 are empty (never filled in) -jcw
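For anyone with shell access to the database, the counts can be read directly with the same Metakit calls the statistics script further down this page uses. A minimal sketch, assuming the database file is named wikit.tkd and tclkit is on the path:

 mk::file open db wikit.tkd -readonly
 # Total number of page entries in the pages view.
 puts "[mk::view size db.pages] page entries"
 # Never-filled pages are stored with a zero date.
 puts "[llength [mk::select db.pages date 0]] of them are empty"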

June 20, 2002 - obsolete and confusing info from this page has been removed -jcw

Performance of wiki markup rendering used to be pretty bad (see comment further down). The kiwi project includes a new renderer that is roughly 10x faster (but HTML only so far, no Tk version yet).

It's all largely irrelevant for CGI use, now that Wikit uses a static page cache. Things are more than fast enough.

Having just converted to the new hyperlink tracking code in Wikit, I decided to collect a few statistics. First the output (updated Nov 15, 2002):


 4605 pages
 1476 pages have never been filled in
 9794 archive entries
 16487 hyperlinks
 2782 pages have hyperlinks to others
 2838 pages are being hyperlinked from others
 260 leaves (pages not referencing any other wiki pages):

12 33 45 80 94 98 110 112 114 115 117 119 120 124 126 130 133 134 135 142 143 157 158 160 169 186 195 205 287 288 289 292 293 306 307 310 311 315 316 318 319 326 330 331 352 368 370 372 374 381 408 416 422 424 426 429 431 433 474 475 476 558 568 570 574 575 589 616 629 636 651 670 673 675 678 679 680 681 710 716 730 762 767 785 790 811 816 832 841 842 858 865 896 905 928 939 1087 1095 1098 1135 1142 1144 1172 1173 1174 1190 1192 1201 1210 1219 1240 1244 1265 1285 1289 1342 1350 1357 1359 1521 1528 1541 1610 1629 1634 1641 1649 1654 1747 1804 1826 1828 1833 1885 1891 2040 2059 2063 2111 2125 2127 2128 2132 2134 2137 2152 2156 2168 2194 2247 2250 2346 2377 2386 2387 2407 2416 2458 2497 2515 2535 2539 2558 2606 2661 2701 2754 2815 2861 2939 2940 2975 3001 3019 3033 3050 3064 3066 3103 3115 3122 3182 3288 3289 3294 3317 3405 3406 3435 3438 3442 3449 3472 3473 3476 3491 3536 3543 3545 3585 3664 3671 3686 3703 3762 3763 3828 3843 3872 3877 3880 3881 3898 3965 3975 3983 4018 4031 4034 4040 4041 4045 4065 4069 4076 4096 4134 4140 4148 4192 4208 4238 4277 4301 4310 4311 4314 4315 4316 4318 4320 4322 4341 4342 4347 4368 4371 4382 4392 4431 4450 4469 4480 4481 4482 4499 4500 4501 4502 4540

 281 orphans (pages not referenced from any others):

23 65 151 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 328 329 384 385 386 387 388 389 390 394 404 412 415 419 432 438 439 449 452 455 456 464 465 466 467 520 521 522 536 538 545 554 572 577 578 579 580 581 582 597 598 599 601 605 641 655 658 664 665 695 700 705 725 726 742 745 746 753 754 789 798 802 806 807 810 817 818 845 854 862 867 868 892 893 899 914 925 944 965 1005 1068 1076 1131 1136 1138 1139 1160 1169 1209 1216 1221 1226 1229 1231 1238 1250 1257 1271 1282 1284 1288 1311 1319 1320 1338 1341 1349 1352 1370 1371 1376 1377 1382 1523 1524 1676 1706 1768 1774 1795 1839 1843 1858 1876 1907 1920 1929 1930 2121 2182 2184 2209 2213 2225 2388 2410 2412 2427 2438 2454 2474 2481 2505 2511 2521 2522 2526 2534 2568 2646 2654 2669 2700 2733 2742 2764 2800 2840 2847 2873 2880 2934 2938 2972 2998 3013 3017 3029 3045 3046 3049 3059 3072 3073 3088 3101 3136 3147 3160 3165 3168 3185 3196 3198 3207 3213 3214 3230 3231 3246 3247 3262 3273 3285 3297 3310 3314 3328 3329 3335 3339 3357 3391 3398 3400 3402 3434 3436 3437 3469 3470 3481 3492 3504 3587 3590 3642 3658 3693 3695 3696 3718 3719 3725 3727 3775 3799 3806 3816 3820 3995 3999 4008 4033 4092 4160 4598


The code that generated the above:


 #! /usr/bin/env tclkit

 # Open the wiki database read-only; all statistics below are
 # computed from the pages, archive, and refs views.
 mk::file open db wikit.tkd -readonly

 puts " [mk::view size db.pages] pages"
 # A zero date means a page was created but never filled in.
 puts " [llength [mk::select db.pages date 0]] pages have never been filled in"
 puts " [mk::view size db.archive] archive entries"
 puts " [mk::view size db.refs] hyperlinks"

 # Tally outgoing (froms) and incoming (tos) links per page,
 # skipping the special pages 0-9 and links to never-filled pages.
 mk::loop c db.refs {
   lassign [mk::get $c from to] f t
   if {$f < 10 || $t < 10 || [mk::get db.pages!$t date] == 0} continue
   if {![info exists froms($f)]} {
     set froms($f) 0
   }
   incr froms($f)
   if {![info exists tos($t)]} {
     set tos($t) 0
   }
   incr tos($t)
 }

 puts " [array size froms] pages have hyperlinks to others"
 puts " [array size tos] pages are being hyperlinked from others"

 # Leaves: pages that are linked to but have no outgoing links.
 set leaves {}
 foreach {k v} [array get tos] {
   if {![info exists froms($k)]} {
     lappend leaves $k
   }
 }

 puts " [llength $leaves] leaves (pages not referencing any other wiki pages):"
 puts [join [lsort -integer $leaves] "\t "]

 # Orphans: filled-in pages (beyond the special first 10) that
 # no other page links to.
 set orphans {}
 mk::loop c db.pages 10 {
   set i [mk::cursor pos c]
   if {[mk::get $c date] != 0 && ![info exists tos($i)]} {
     lappend orphans $i
   }
 }

 puts " [llength $orphans] orphans (pages not referenced from any others):"
 puts [join [lsort -integer $orphans] "\t "]

06oct02 jcw - Wikit has been modified to enhance its performance when rendering pages with many embedded page references, such as Recent Changes. The change yields a more-than-order-of-magnitude speedup, most noticeable in local mode, though CGI use also benefits.


escargo - 19 Nov 2002: Are there any built-in functions for trimming the unfilled pages? It seems to me that many of them must have been created by accident, probably by new people who created pages with typographical errors in their names. I would think that a mark/sweep or generational approach (marking pages as dead in one phase, then reaping them in a later phase) would be both straightforward and useful.
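There is no built-in trim function, but the "mark" half of such a scheme is easy to sketch with the Metakit calls already used on this page: never-filled pages carry a zero date, so selecting on that field yields the reap candidates. This is a read-only sketch only; actually deleting rows is another matter, since pages are referred to by row number, and reaping would presumably renumber everything after the deleted rows and break existing hyperlinks.

 mk::file open db wikit.tkd -readonly
 # "Mark" phase: collect pages that were created but never filled in.
 set dead [mk::select db.pages date 0]
 puts "[llength $dead] candidates for reaping:"
 puts [join [lsort -integer $dead] "\t "]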


A function to generate a list of all [links] that point to completely blank pages might be nice as well.

Sure ... <insert sound of flexing fingers> ... here ya go!

 mk::file open db wikit.tkd -readonly
 # List the ids of all pages whose text is completely empty.
 puts [mk::select db.pages page ""]

Category Wikit - Category Tcler's Wiki - Category Performance