Version 48 of Wikit's performance

Updated 2008-04-28 08:41:49 by jdc

January 11, 2003 - The BiggestWiki page [L1 ] was updated, but how can one get an updated count of the number of pages in this wiki?

Not easily... I looked at it with Metakit (MK) database commands: 6253 page entries, of which 2595 are empty (never filled in) -jcw

June 20, 2002 - obsolete and confusing info from this page has been removed -jcw

Performance of wiki markup rendering used to be pretty bad (see comment further down). The kiwi project includes a new renderer which is probably 10x faster (but only HTML so far, no Tk version).

It's all largely irrelevant for CGI use, now that wiki uses a static page cache. Things are more than fast enough.

Having just converted Wikit to new hyperlink-tracking code, I decided to collect some stats. First the output (updated Nov 15, 2002):


 4605 pages
 1476 pages have never been filled in
 9794 archive entries
 16487 hyperlinks
 2782 pages have hyperlinks to others
 2838 pages are being hyperlinked from others
 260 leaves (pages not referencing any other wiki pages):

12 33 45 80 94 98 110 112 114 115 117 119 120 124 126 130 133 134 135 142 143 157 158 160 169 186 195 205 287 288 289 292 293 306 307 310 311 315 316 318 319 326 330 331 352 368 370 372 374 381 408 416 422 424 426 429 431 433 474 475 476 558 568 570 574 575 589 616 629 636 651 670 673 675 678 679 680 681 710 716 730 762 767 785 790 811 816 832 841 842 858 865 896 905 928 939 1087 1095 1098 1135 1142 1144 1172 1173 1174 1190 1192 1201 1210 1219 1240 1244 1265 1285 1289 1342 1350 1357 1359 1521 1528 1541 1610 1629 1634 1641 1649 1654 1747 1804 1826 1828 1833 1885 1891 2040 2059 2063 2111 2125 2127 2128 2132 2134 2137 2152 2156 2168 2194 2247 2250 2346 2377 2386 2387 2407 2416 2458 2497 2515 2535 2539 2558 2606 2661 2701 2754 2815 2861 2939 2940 2975 3001 3019 3033 3050 3064 3066 3103 3115 3122 3182 3288 3289 3294 3317 3405 3406 3435 3438 3442 3449 3472 3473 3476 3491 3536 3543 3545 3585 3664 3671 3686 3703 3762 3763 3828 3843 3872 3877 3880 3881 3898 3965 3975 3983 4018 4031 4034 4040 4041 4045 4065 4069 4076 4096 4134 4140 4148 4192 4208 4238 4277 4301 4310 4311 4314 4315 4316 4318 4320 4322 4341 4342 4347 4368 4371 4382 4392 4431 4450 4469 4480 4481 4482 4499 4500 4501 4502 4540

 281 orphans (pages not referenced from any others):

23 65 151 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 328 329 384 385 386 387 388 389 390 394 404 412 415 419 432 438 439 449 452 455 456 464 465 466 467 520 521 522 536 538 545 554 572 577 578 579 580 581 582 597 598 599 601 605 641 655 658 664 665 695 700 705 725 726 742 745 746 753 754 789 798 802 806 807 810 817 818 845 854 862 867 868 892 893 899 914 925 944 965 1005 1068 1076 1131 1136 1138 1139 1160 1169 1209 1216 1221 1226 1229 1231 1238 1250 1257 1271 1282 1284 1288 1311 1319 1320 1338 1341 1349 1352 1370 1371 1376 1377 1382 1523 1524 1676 1706 1768 1774 1795 1839 1843 1858 1876 1907 1920 1929 1930 2121 2182 2184 2209 2213 2225 2388 2410 2412 2427 2438 2454 2474 2481 2505 2511 2521 2522 2526 2534 2568 2646 2654 2669 2700 2733 2742 2764 2800 2840 2847 2873 2880 2934 2938 2972 2998 3013 3017 3029 3045 3046 3049 3059 3072 3073 3088 3101 3136 3147 3160 3165 3168 3185 3196 3198 3207 3213 3214 3230 3231 3246 3247 3262 3273 3285 3297 3310 3314 3328 3329 3335 3339 3357 3391 3398 3400 3402 3434 3436 3437 3469 3470 3481 3492 3504 3587 3590 3642 3658 3693 3695 3696 3718 3719 3725 3727 3775 3799 3806 3816 3820 3995 3999 4008 4033 4092 4160 4598


The code that generated the above:


 #! /usr/bin/env tclkit

 mk::file open db wikit.tkd -readonly

 puts " [mk::view size db.pages] pages"
 puts " [llength [mk::select db.pages date 0]] pages have never been filled in"
 puts " [mk::view size db.archive] archive entries"
 puts " [mk::view size db.refs] hyperlinks"

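 # Count outgoing (froms) and incoming (tos) links per page, skipping links
 # that involve ids below 10 (presumably Wikit's special pages) or that
 # point at a page that was never filled in (date == 0).
 mk::loop c db.refs {
   lassign [mk::get $c from to] f t
   if {$f < 10 || $t < 10 || [mk::get db.pages!$t date] == 0} continue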
   if {![info exists froms($f)]} {
     set froms($f) 0
   }
   incr froms($f)
   if {![info exists tos($t)]} {
     set tos($t) 0
   }
   incr tos($t)
 }

 puts " [array size froms] pages have hyperlinks to others"
 puts " [array size tos] pages are being hyperlinked from others"

 set leaves {}
 foreach {k v} [array get tos] {
   if {![info exists froms($k)]} {
     lappend leaves $k
   }
 }

 puts " [llength $leaves] leaves (pages not referencing any other wiki pages):"
 puts [join [lsort -integer $leaves] "\t "]

 set orphans {}
 mk::loop c db.pages 10 {
   set i [mk::cursor pos c]
   if {[mk::get $c date] != 0 && ![info exists tos($i)]} {
     lappend orphans $i
   }
 }

 puts " [llength $orphans] orphans (pages not referenced from any others):"
 puts [join [lsort -integer $orphans] "\t "]
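
To make the two definitions concrete, here is a minimal sketch of the same classification on a tiny in-memory data set, with no Metakit involved. The proc name classify and the sample page ids and links are made up for illustration; it assumes Tcl 8.5 or later (built-in lassign, and incr creating missing variables).

```tcl
# Hypothetical helper: pages is a list of filled-in page ids,
# refs is a list of {from to} pairs standing in for rows of db.refs.
proc classify {pages refs} {
    foreach ref $refs {
        lassign $ref f t
        incr froms($f)   ;# outgoing link count per page
        incr tos($t)     ;# incoming link count per page
    }
    # Leaves: pages that are linked to but link to nothing themselves.
    set leaves {}
    foreach t [array names tos] {
        if {![info exists froms($t)]} { lappend leaves $t }
    }
    # Orphans: filled-in pages that nothing links to.
    set orphans {}
    foreach p $pages {
        if {![info exists tos($p)]} { lappend orphans $p }
    }
    return [list [lsort -integer $leaves] [lsort -integer $orphans]]
}

# Made-up sample: page 14 has no links in either direction.
lassign [classify {10 11 12 13 14} {{10 11} {10 12} {11 12} {13 10}}] leaves orphans
puts "leaves: $leaves"
puts "orphans: $orphans"
```

On this sample, page 12 is the only leaf and pages 13 and 14 are orphans. Note that page 14, which neither links nor is linked to, shows up as an orphan but not as a leaf, because leaves are drawn only from pages that appear as link targets.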

06oct02 jcw - Wikit has been modified to enhance its performance when rendering pages with lots of embedded page references, including pages such as Recent Changes. This has a more-than-order-of-magnitude effect. This is especially noticeable in local mode, but CGI also benefits.


escargo - 19 Nov 2002: Are there any built-in functions for trimming the unfilled pages? To me it seems like many of them must have been created by accident, probably by new people who created pages with typographical errors in their names. I would think that a mark/sweep or generational approach, marking pages as dead in one phase and then reaping them in a later phase, would be both straightforward and useful.


A function to generate a list of all [links] that point to completely blank pages might be nice as well.

Sure ... <insert sound of flexing fingers> ... here ya go!

 mk::file open db wikit.tkd -readonly
 puts [mk::select db.pages page ""]

Remember, though, they aren't necessarily candidates for immediate deletion: first make sure the refs view no longer contains any links to those pages.
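
As a rough sketch of that check, the blank page ids can be filtered against the link targets. This is plain Tcl working on lists, not a Wikit function: the proc name deletionCandidates is made up, and it assumes the blank ids and the {from to} pairs have already been collected (for instance with mk::select and an mk::loop over db.refs, as in the script above).

```tcl
# Hypothetical helper: return the ids in emptyIds that no link points to.
# emptyIds: list of blank page ids
# refs:     list of {from to} pairs, standing in for rows of db.refs
proc deletionCandidates {emptyIds refs} {
    # Remember every page id that is the target of at least one link.
    foreach ref $refs {
        set linked([lindex $ref 1]) 1
    }
    # Keep only the blank pages that are not link targets.
    set result {}
    foreach id $emptyIds {
        if {![info exists linked($id)]} { lappend result $id }
    }
    return $result
}

# Made-up sample: pages 20 and 22 are still linked to, 21 is not.
puts [deletionCandidates {20 21 22} {{10 20} {11 22}}]
```

With the sample data this prints 21, the only blank page that nothing links to. This is more or less the "mark" half of a mark/sweep pass; the actual removal is left out here.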


jdc 28-apr-2008

Running the above script (with the archive line removed) on a recent wikit.tkd gives the following:

 21023 pages
 10806 pages have never been filled in
 83414 hyperlinks
 9549 pages have hyperlinks to others
 9522 pages are being hyperlinked from others

 258 leaves (pages not referencing any other wiki pages):

    2     4     9   331   422   423   424   426   427   431
  433   568   570   574   575   577   597   598   599   629
  651   665   744   832   905  1087  1289  1311  1352  1357
 1509  1541  1653  1747  1778  1779  1780  1782  1783  1785
 1786  1804  1868  1891  1929  2017  2040  2111  2118  2152
 2156  2194  2209  2416  2481  2487  2539  2810  2939  3088
 3115  3294  3353  3435  3438  3472  3473  3476  3664  3763
 3881  3975  4040  4278  4310  4311  4314  4315  4316  4318
 4322  4371  4450  4469  4480  4499  4500  4540  5118  5241
 5731  5790  5903  5936  6030  6031  6088  6263  6577  8331
 8345  8377  8383  8416  8439  8631  8651  8734  8741  8842
 8858  8874  9044  9059  9189  9243  9296  9369  9385  9565
 9567  9579  9659  9682  9730  9766  9804  9807  9850  9989
10382 10676 10686 10729 10738 10748 10753 10756 10987 11012
11240 11302 11307 11497 11609 11630 11646 11662 11723 11807
11854 11863 11997 12000 12083 12126 12205 12582 12697 12698
12712 12843 12886 12964 12979 12981 13011 13012 13063 13108
13127 13291 13333 13339 13580 13732 14371 14380 14451 14453
14501 14503 14515 14699 14724 14725 14757 14763 14826 14830
14838 14862 14896 14925 14953 15020 15145 15174 15246 15275
15334 15343 15562 15572 15575 15698 15759 15845 15989 15998
15999 16035 16073 16093 16131 16132 16329 16620 16621 16647
16688 16707 16889 16947 17128 17159 17249 17324 17361 17480
17509 17526 17602 17624 17753 17974 17987 17999 18009 18058
18070 18143 18193 18436 19606 19656 19658 19673 19815 19888
20739 20786 20792 20794 20821 20873 20877 21019            

 363 orphans (pages not referenced from any others):

 1136  1731  2167  2219  3965  3969  3979  4823  5725  5873
 6147  6240  7106  8207  8317  8356  8372  8434  8504  8508
 8544  8650  8690  8701  8773  8848  8851  8859  8870  8877
 8879  8916  8992  9005  9017  9057  9110  9142  9241  9266
 9295  9320  9329  9350  9473  9492  9496  9564  9580  9629
 9640  9651  9667  9694  9845  9858 10034 10040 10317 10412
10499 10501 10603 10606 10653 10662 10665 10666 10667 10671
10672 10673 10680 10695 10739 10773 10873 10895 10946 11004
11108 11181 11243 11298 11334 11359 11390 11409 11463 11464
11468 11469 11480 11481 11491 11546 11565 11614 11629 11638
11756 11757 11763 11764 11802 11809 11836 11837 11844 11874
11932 11937 11977 12069 12071 12106 12125 12216 12223 12259
12263 12335 12396 12417 12474 12511 12536 12540 12541 12552
12553 12678 12696 12713 12799 12806 12819 12826 12835 12868
12870 12877 12878 12880 12884 12888 12892 12896 12904 12906
12915 12961 12962 12972 13001 13003 13006 13020 13035 13060
13102 13159 13167 13322 13334 13345 13366 13367 13406 13420
13468 13470 13538 13564 13566 13579 13612 13625 13627 13633
13650 13663 13664 13665 13673 13694 13703 13717 13718 13752
13759 13917 13922 13926 14012 14069 14083 14085 14099 14102
14134 14302 14319 14374 14486 14494 14499 14511 14533 14536
14541 14558 14568 14589 14627 14628 14651 14679 14767 14769
14790 14827 14892 14904 14984 14992 15007 15019 15059 15119
15133 15136 15138 15168 15175 15179 15187 15235 15241 15255
15257 15311 15329 15340 15357 15359 15365 15366 15367 15368
15369 15371 15373 15374 15377 15378 15380 15381 15382 15383
15384 15420 15550 15616 15617 15618 15634 15637 15645 15684
15706 15716 15890 15924 15952 15964 15980 15991 16012 16029
16040 16065 16075 16097 16103 16110 16112 16113 16134 16147
16215 16229 16248 16250 16267 16269 16271 16273 16279 16284
16286 16291 16292 16318 16326 16678 16679 16723 16739 17003
17025 17046 17113 17118 17182 17250 17290 17334 17335 17365
17500 17538 17544 17545 17546 17547 17549 17550 17553 17555
17563 17564 17565 17588 17592 17708 17719 17799 17862 17957
17965 18145 19536 19595 19911 20677 20689 20788 20789 20790
20797 20798 20799 20800 20801 20802 20803 20804 20805 20806
20808 20809 20811