TkChatistics

TR - This is fun! Since TkChat logs are plain Tcl files whose lines can be evaluated as commands, we can use them to compute some statistics, so-called TkChatistics. The simplest case is the number of posts per person on a given day. The following code does this (assuming the corresponding log file is accessible in the directory where the script is run):

 proc m {time author msg} {
        #
        # reads one line of TkChat log and accumulates author's post count
        #
        # time -> a timestamp
        # author -> author of post
        # msg -> message posted
        #
        # This procedure uses a global array 'count' to collect the data.
        # Each element in the array has the author as key
        # and the number of posts as the value.
        #
        global count
        if {$author eq ""} return
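        if {$author eq "ijchain"} {
                # posts relayed by the bridge: skip status lines starting with "*"
                if {[string index $msg 0] eq "*"} return
                # the real author is the leading "<nick>" of the message
                set index [string first > $msg]
                # no ">" found: not a relayed post, so do not count an empty author
                if {$index < 0} return
                set author [string range $msg 0 $index]
        }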
        if {! [info exists count($author)]} {
                set count($author) 1
        } else {
                incr count($author) 1
        }
 }


 proc tkChatistics {mode} {
        #
        # print statistics 
        #
        # mode -> the statistics type. Can be
        #           dayCount -> number of posts per person a day
        #           ... -> whatever you come up with!
        #
        # This procedure uses the global array 'count' for extracting
        # the data to print.
        #
        global count
        switch -- $mode {
                dayCount {
                        set max 0
                        # start with an empty list so an empty log does not raise an error
                        set mylist {}
                        # convert data to a nested list:
                        foreach item [array names count] {
                                lappend mylist [list $count($item) $item]
                                if {$count($item) > $max} {set max $count($item)}
                        }

                        # scale
                        set scale 5
                        set fieldWidth [expr {$max/$scale}]

                        foreach item [lsort -index 0 -integer $mylist] {
                                set num [lindex $item 0]
                                set author [lindex $item 1]
                                set stars [string repeat * [expr {$num/$scale}]]
                                puts "[format %${fieldWidth}s $stars] $num $author"
                        }
                }
                default {puts "Not implemented. You have to write this yourself ..."}
        }
 }

 ############### main code ###############
 array unset count
 source 2008-05-14.tcl
 tkChatistics dayCount

This is the output:

                           1 <suchenwi>
                           1 <tcleval>
                           1 dkf_Away
                           1 <CIA-37>
                           1 <nhalkic>
                           2 miguel
                           2 <erider>
                           3 stever
                           3 teo
                           4 rmax
                           4 <yeeling>
                           4 <jccampbell>
                           4 rfoxmich
                         * 6 <Setok>
                         * 6 nem
                         * 6 <NewHandFromCN>
                         * 6 <RockShox>
                         * 6 gerry
                         * 7 <hat1>
                         * 8 jdc
                        ** 10 steveo
                        ** 10 aku
                        ** 10 de
                        ** 11 <hat0>
                        ** 12 gwlester
                        ** 13 Zarutian
                        ** 14 drh
                       *** 15 emiliano
                       *** 17 kostix
                       *** 18 <Intel4004>
                      **** 20 <Phantom-X>
                      **** 23 <aspect>
                     ***** 27 <my007ms>
                     ***** 29 stu
                    ****** 33 GPS
                    ****** 33 jccampbell
                    ****** 34 jenglish
                   ******* 35 <miniK0bo>
                   ******* 37 arjen
                   ******* 38 suchenwi
                   ******* 38 dgp
                  ******** 43 mjanssen
                  ******** 43 colin
                 ********* 45 <gpolo>
                 ********* 47 <Mazzachre>
                 ********* 49 tclguy
                 ********* 49 <peterc>
             ************* 66 patthoyts
           *************** 76 dkf_
         ***************** 85 dkf
    ********************** 113 ∆∆
 ************************* 126 kbk

Quite an interesting distribution. The "<>" around some names indicates that these people were connected via a bridge and were not using TkChat.
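To see how a log drives the m proc, and where the "<>" names come from, here is a minimal stand-alone sketch. The log lines and timestamps are made up for illustration (the real log format may differ in detail), and the m proc is a trimmed-down copy of the one above with a guard for messages that contain no ">".

```tcl
# Trimmed-down copy of the m proc from above.
proc m {time author msg} {
    global count
    if {$author eq ""} return
    if {$author eq "ijchain"} {
        # bridge status lines start with "*" and are skipped
        if {[string index $msg 0] eq "*"} return
        # relayed posts look like "<nick> text"; keep "<nick>" as the author
        set index [string first > $msg]
        if {$index < 0} return
        set author [string range $msg 0 $index]
    }
    if {![info exists count($author)]} {
        set count($author) 1
    } else {
        incr count($author)
    }
}

array unset count

# Hypothetical log content: sourcing a log file evaluates lines like these.
set log {
    m 2008-05-14T08:00:01Z kbk {good morning}
    m 2008-05-14T08:00:09Z ijchain {<aspect> hello from IRC}
    m 2008-05-14T08:00:15Z ijchain {*** aspect leaves}
    m 2008-05-14T08:01:02Z kbk {anyone seen dkf?}
}
eval $log

# kbk ends up with 2 posts, <aspect> with 1; the "***" line is ignored
parray count
```

The bridge user itself never shows up in the result, only the nicks it relays, which is why those names keep their angle brackets.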


Ideas

The next step would be to fetch the logs directly from tclers.tk and compute more comprehensive statistics, such as:

  • average number of posts per day per person
  • relation of number of posts to size of posts
  • print the data with real plots using Plotchart, BLT, whatever
  • ...
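As a starting point for the first idea, the average number of posts per day per person could be sketched roughly like this. It assumes the log files have already been saved locally; averagePerDay is a hypothetical helper name, and the m proc here is a compact stand-in for the one above.

```tcl
# Compact stand-in for the m proc above (same counting logic).
proc m {time author msg} {
    global count
    if {$author eq ""} return
    if {$author eq "ijchain"} {
        if {[string index $msg 0] eq "*"} return
        set index [string first > $msg]
        if {$index < 0} return
        set author [string range $msg 0 $index]
    }
    if {![info exists count($author)]} {set count($author) 0}
    incr count($author)
}

# averagePerDay (hypothetical helper): takes a list of local log files,
# one per day, and returns a flat list "author average author average ...".
proc averagePerDay {logFiles} {
    global count
    set days [llength $logFiles]
    if {$days == 0} {return {}}
    array set total {}
    foreach file $logFiles {
        # reset the per-day counter, then evaluate the log at global level
        array unset count
        uplevel #0 [list source $file]
        foreach author [array names count] {
            if {![info exists total($author)]} {set total($author) 0}
            incr total($author) $count($author)
        }
    }
    set result {}
    foreach author [lsort [array names total]] {
        # authors silent on some days still get divided by all days
        lappend result $author [expr {double($total($author)) / $days}]
    }
    return $result
}

# Example call (the first file name is made up):
#   averagePerDay {2008-05-13.tcl 2008-05-14.tcl}
```

Dividing by the total number of days rather than by the days a person was active is a design choice; the other variant would need a per-author day counter.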

So, enhance and have fun! (Now back to work ...)

NEM: You could string trim the user names to remove punctuation, as some of the usernames are the same (e.g. I assume dkf and dkf_ are both dkf!)
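A rough sketch of that suggestion, with the counts taken from the output above. normalizeAuthor and mergeCounts are hypothetical helper names, and the set of characters to trim ("<", ">" and "_") is an assumption. Note that trimming the angle brackets also merges bridge users with TkChat users of the same name, and that a name like dkf_Away is not caught by trimming alone.

```tcl
# normalizeAuthor (hypothetical helper): strip leading and trailing
# "<", ">" and "_" from a user name.
proc normalizeAuthor {author} {
    return [string trim $author "<>_"]
}

# mergeCounts (hypothetical helper): fold an array of per-author counts
# into a key/value list keyed by the normalized name.
proc mergeCounts {countVar} {
    upvar 1 $countVar count
    array set merged {}
    foreach author [array names count] {
        set key [normalizeAuthor $author]
        if {$key eq ""} continue
        if {![info exists merged($key)]} {set merged($key) 0}
        incr merged($key) $count($author)
    }
    return [array get merged]
}

# A few entries from the day's output above
array set sample {dkf 85 dkf_ 76 dkf_Away 1 <suchenwi> 1 suchenwi 38}
array set merged [mergeCounts sample]

# dkf and dkf_ are merged, as are <suchenwi> and suchenwi;
# dkf_Away stays separate
parray merged
```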

PT: the bridge actually accumulates statistics as a running total as well. So I get something like the following (the numbers are the count of lines posted):

  bin/print_stats | head -12
 35890 suchenwi
 30040 dkf
 29849 [email protected]
 26165 Colin
 22871 mjanssen
 21287 kbk
 20538 GPS
 20397 dgp
 19591 stevel
 16938 hypnotoad
 15757 miguel
 14113 patthoyts