Counting comments in a source

In my humble opinion, quality control begins with simply evaluating how many comment lines a program contains. Here is a simple GUI to perform this task. It prints:

  • The total line count
  • The comment line count
  • The number of comment lines per 100 lines of code

-- Sarnold 13/01/2006

So, do you believe "more comments == high quality" or "fewer comments == high quality"? One can effectively argue both cases.

The real metric, IMO, is not how many comments, but how many useful comments. Sadly, there's no tool to give us that metric.


 # CommentCount.tcl - https://wiki.tcl-lang.org/15263 - Counting comments in a source
 # Simple GUI for counting comments in a file with a Tcl-source, it shows :
 # * The total line count
 # * Count of lines classified as code & comment
 # * Comment-Percentage: How many comment-lines exist for each 100 lines of code 

 #########1#########2#########3#########4#########5#########6#########7#####

  package require Tk

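  proc LinesNComments {file} {
  #: Read file, count lines and comment-lines
      set fd [open $file r]
      set comments   0
      set lines      0
      set code_lines 0
      # a comment: optional blanks, one or more hashes, then at least one word
      set re {[[:blank:]]*#+[[:blank:]]*[[:alnum:]]+}
      # test the result of gets, so that no phantom line is counted at EOF
      while {[gets $fd line] >= 0} {
          incr lines
          # comment line containing at least one word
          if {[regexp "^$re" $line]} {
              incr comments
              continue
          }
          # after-command comment: counts as a comment and as a code line
          if {[regexp ";$re" $line]} {
              incr comments
          }
          # skip comment lines with no words
          if {[regexp {^[[:blank:]]*#} $line]} {
              continue
          }
          incr code_lines
      }
      close $fd
      return [list $lines $code_lines $comments]
  }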

  proc Inspect {} {
  #: Inspect a file with Tcl source, show results in messagebox
      set file [tk_getOpenFile]
      if {$file eq ""} { return }   ;# dialog was cancelled
      if {![file exists $file]} {
        bell; tk_messageBox -title "Error" -message "No such file: $file"; return
      }
      foreach {lines code comments} [LinesNComments $file] {}
      if {$code==0} { 
        set percentage 0 
      } else {
        set percentage [expr {double($comments)*100/$code}] 
      }
      set    msg "File $file :\n"
      append msg "$lines lines of text: $code code lines and $comments comment lines,\n"
      append msg [format "for 100 lines of code there are %5.1f lines of comments." $percentage]
      tk_messageBox -title $::title -message $msg
  }

  #: Main :
  set title "CommentChecker"
  wm  title . $title
  label  .label1  -text "Check the use of comments in Tcl files"
  button .inspect -text "Inspect file" -command Inspect 
  button .quit    -text "Quit"         -command exit
  pack .label1 .inspect .quit

HJG After catching the divide-by-zero (e.g. for an empty file), I also did some beautifications: titles for messageboxes, formatting of percentage, 'doc-strings' for the procedures...

Here is a small demo-program for testing the above comment-counter:

  if 0 {
    This block never gets executed, so we can use it for comments:
    Demoprog #001
    Created 2006-01-16
    (c) by me
  }
  #
  catch {console show}  ;# When run from wish: open textmode-console
  puts "Hello World !"  ;# The traditional greeting
  # EOF #

This gives 10 text lines: 8 code lines and 2 comment lines --> 25% comments

Note: I think the check for 'comment lines with no words' is broken.

Sarnold I changed the program again: there are now 10 lines, 8 lines of code and 3 comments in the demo program. I define the rules as follows:

  • Comment lines (lines that begin with a hash) are valid if they contain at least one word
  • After-command comments (a comment that follows a command terminated by a semicolon) are considered valid
  • Comment lines are not counted as 'code lines'. A code line is any non-comment line, or a line that contains an after-command comment
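The rules above can be checked without the GUI. What follows is only a minimal sketch, assuming Tcl 8.5 or later (for lassign). ClassifyLine is a hypothetical helper that is not part of CommentCount.tcl; it applies the same regular expressions, in the same order, as LinesNComments, and is run here against the demo program held in a string.

```tcl
# Hypothetical helper (not part of CommentCount.tcl): classify one line.
# Returns a two-element list {isCode isComment}.
proc ClassifyLine {line} {
    # same pattern as in LinesNComments: hashes followed by at least one word
    set re {[[:blank:]]*#+[[:blank:]]*[[:alnum:]]+}
    # rule 1: a comment line with at least one word is a valid comment
    if {[regexp "^$re" $line]} {
        return {0 1}
    }
    # rule 2: an after-command comment is a valid comment
    set isComment [regexp ";$re" $line]
    # rule 3: a line starting with a hash is never a code line
    if {[regexp {^[[:blank:]]*#} $line]} {
        return [list 0 $isComment]
    }
    # everything else is code (blank lines land here too)
    return [list 1 $isComment]
}

# The demo program from this page, one line per list element after split
set demo {if 0 {
  This block never gets executed, so we can use it for comments:
  Demoprog #001
  Created 2006-01-16
  (c) by me
}
#
catch {console show}  ;# When run from wish: open textmode-console
puts "Hello World !"  ;# The traditional greeting
# EOF #}

set lines 0
set code 0
set comments 0
foreach line [split $demo \n] {
    lassign [ClassifyLine $line] isCode isComment
    incr lines
    incr code $isCode
    incr comments $isComment
}
# guard against divide-by-zero, as Inspect does
set per100 [expr {$code == 0 ? 0.0 : double($comments) * 100 / $code}]
puts [format "%d lines, %d code lines, %d comments, %.1f comments per 100 code lines" \
    $lines $code $comments $per100]
```

With these rules the demo program yields 10 lines, 8 code lines and 3 comments, that is 37.5 comment lines per 100 lines of code. The lone "#" line is counted neither as code nor as a comment.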

Now we can go back to the discussion about software quality ...