Tag Cloud Generator

David Beckemeyer 2007-03-06 - This is pretty basic stuff, but I've received so many cool things from this site, I wanted to give somethign back. I know OpenACS has Tag Clouds, but below is some very simple stand-alone Tcl without any dependencies that folks may find useful.

Given an Array of Counts for the tags, it generates a resulting array with the 'bucket' for each tag within the range specified 1 thru N (default 7). These can then be used with CSS classes such as "tagsz1" "tagsz2" and so on.

This is essentially a port of the PHP snippit I saw here: http://www.hawkee.com/snippet.php?snippet_id=1485

Here's the procedure:

 proc cloud_tags { cloudname {range 7} } {
   upvar ${cloudname}_counts cloud_counts
   upvar ${cloudname}_buckets cloud_buckets
 
   set tag_sizes $range
 
   set taglist {}
   foreach tag [array names cloud_counts] {
     lappend taglist [list $cloud_counts($tag) $tag]
   }
   set total_tags [llength $taglist]
   if {$total_tags} {
 
 #  Start with the sorted list of tags and divide by the number of font
 #  sizes (buckets).  Then proceed to put an even number of tags into each
 #  bucket.  The only restriction is that tags of the same count can't
 #  span 2 buckets, so some buckets may have more tags than others.
 #  Because of this, the sorted list of remaining tags is divided by the
 #  remaining 'buckets' to evenly distribute the remainder of the tags and
 #  to fill as many 'buckets' as possible up to the largest font size.
 
     set min_tags [expr {$total_tags / $tag_sizes}]
 
     set bucket_count 1
     set bucket_items 0
     set tags_set 0
     foreach tagdata [lsort -integer -index 0 $taglist] {
       set tag_count [lindex $tagdata 0]
       set tag [lindex $tagdata 1]
 
 #  If we've met the minimum number of tags for this class and the current
 #  tag does not equal the last tag, we can proceed to the next class.
 
       if {$bucket_items >= $min_tags && $last_count != $tag_count && $bucket_count < $tag_sizes} {
         incr bucket_count
         set bucket_items 0
 
 #  Calculate a new minimum number of tags for the remaining buckets. 
       
         set remaining_tags [expr {$total_tags - $tags_set}]
         set min_tags [expr {$remaining_tags / $bucket_count}]
       }
 
 # Set the tag to the current class.
 
       set cloud_buckets($tag) $bucket_count
       incr bucket_items
       incr tags_set
 
       set last_count $tag_count
     }
   }
 }

Simple test/demo. This reads a list of tags (keywords) from stdin, counts them, and then calls the above cloud_tags procedure and generates the HTML fragment Tag Cloud

 proc add_tag { cloudname tag } {
   upvar ${cloudname}_counts cloud_counts
 
   if {[info exists cloud_counts($tag)]} {
     incr cloud_counts($tag)
   } else {
     set cloud_counts($tag) 1
   }
   return $cloud_counts($tag)
 }
 
 while {[gets stdin tag] > 0} {
   add_tag testcloud $tag
 }
 
 # only use sizes 1-7
 cloud_tags testcloud 7
 
 puts {<div style="float:left; text-align:center; margin:0px 20px 0px 20px;">}
 foreach tag [lsort -dictionary [array names testcloud_counts]] {
   puts "<span class=\"tsxl$testcloud_buckets($tag)\">$tag</span>"
 }
 puts {</div>}

The page would need CSS classes with styles for the font sizes, such as:

 .tsxl1 {font-size: 85%;}
 .tsxl2 {font-size: 95%;}
 .tsxl3 {font-size: 105%;}
 .tsxl4 {font-size: 120%;}
 .tsxl5 {font-size: 130%;}
 .tsxl6 {font-size: 140%;}
 .tsxl7 {font-size: 150%;}

Perhaps somebody will find this useful.