Version 14 of Reading JPEG image dimensions

See jpeg for more details.

original code from "Odeen", mailto:[email protected]

 proc GetJPGDimensions { filename } {
    set fd [ open $filename ]
    fconfigure $fd -translation binary
    set ch1 "00"
    while { ! [ eof $fd ] } {
       binary scan [ read $fd 1 ] "H2" ch2
       if { ( $ch1 == "ff" ) && ( $ch2 >= "c0" ) && ($ch2 <= "c3" ) } {
          binary scan [ read $fd 7 ] "x3SS" height width
          close $fd
          return [ list $height $width ]
       }
       set ch1 $ch2
    }
    close $fd
    error "Couldn't find JPG header for $filename"
 }

It turns out that the above code does not work with images from most digital cameras, which usually embed a thumbnail for display on the LCD. The new code below should work on any JPEG that follows the standard. It comes from a script that I wrote to produce thumbnailed HTML indexes and captioned display pages automatically. You can find the entire script at http://perrigoue.com. Many thanks to Odeen for his help.

We need to find the SOF (start of frame) marker. The marker is a two-byte code, ffcx, where x is a value from 0 to 3. As Odeen noted, we can't just search for "ffcx", because most digital cameras embed a thumbnail for display on the camera's LCD. The embedded thumbnail contains all the same markers as the full-size image, so a plain search for "ffcx" finds the thumbnail's marker first and reads the wrong dimensions. We avoid this by reading the length bytes of each segment and skipping ahead by that many bytes to the next marker, and we keep doing this until we find the right frame. This makes A LOT more sense if you read the JPEG FAQ at: http://www.faqs.org/faqs/jpeg-faq/part1/
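The skip-by-length walk described above can be tried on a byte string in memory. This is only a minimal sketch: the name jpeg_dims_from_bytes and the hand-built test data are made up for illustration, and it assumes every segment before the SOF carries a length field (no standalone markers).

```tcl
# Hypothetical helper: find the frame dimensions in JPEG data held in a
# byte string by hopping from marker to marker.
proc jpeg_dims_from_bytes {data} {
    set len [string length $data]
    # every JPEG should start with the SOI marker ff d8
    if {$len < 2 || ![binary scan $data H4 soi] || $soi ne "ffd8"} {
        error "not JPEG data"
    }
    set pos 2
    while {$pos + 4 <= $len} {
        # marker prefix, marker type, 16-bit big-endian segment length
        binary scan $data "@${pos}H2H2S" prefix type seglen
        if {$prefix ne "ff"} {
            error "expected a marker at offset $pos"
        }
        if {$type eq "ff"} {
            # extra "ff" fill byte in front of the marker, step over it
            incr pos
            continue
        }
        if {[string match {c[0-3]} $type]} {
            # SOF layout after the marker: length(2) precision(1)
            # height(2) width(2)
            if {[binary scan $data "@[expr {$pos + 5}]SS" height width] != 2} {
                error "truncated SOF segment"
            }
            # "S" is signed, so mask to get unsigned 16-bit values
            return [list [expr {$height & 0xffff}] [expr {$width & 0xffff}]]
        }
        # not an SOF: skip the marker (2 bytes) plus the segment, whose
        # length counts its own two bytes
        incr pos [expr {2 + ($seglen & 0xffff)}]
    }
    error "no SOF marker found"
}

# Hand-built test data (an assumption for illustration, not a real photo):
# SOI, an APP1 segment whose payload is a fake 120x160 thumbnail header,
# then the start of the full-size SOF with height 1200 and width 1600.
set jpeg [binary format H* \
    "ffd8ffe1000dffd8ffc0000b08007800a0ffc000110804b0064003"]

# prints "1200 1600": the thumbnail's SOF inside APP1 is skipped
puts [jpeg_dims_from_bytes $jpeg]
```

A plain search for "ffc0" in the same data would hit the thumbnail header first and report 120 160.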

 proc get_jpg_dimensions {filename} {

  # open the file -- no need for write access
  set img [open $filename r]
  # set to binary mode - VERY important
  fconfigure $img -translation binary

  # read in first two bytes (byte1 stays empty if the file is too short)
  set byte1 ""
  binary scan [read $img 2] "H4" byte1
  # check to see if this is a JPEG; every JPEG should start with "ffd8"
  if {$byte1!="ffd8"} {
    close $img
    # raise an error instead of exiting so the caller can handle it
    error "$filename is not a valid JPEG file!"
  }

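  # cool, it's a JPG so let's loop through the whole file, going from
  # marker to marker.
  while { ![eof $img]} {

    # skip ahead to the next "ff"; stop at end of file, where read
    # returns nothing and binary scan leaves byte1 unchanged
    while {$byte1!="ff" && ![eof $img]} {
      binary scan [read $img 1] "H2" byte1
    }

    # we found the next marker, now read in the marker type byte,
    # throw out any extra "ff"'s
    while {$byte1=="ff" && ![eof $img]} {
      binary scan [read $img 1] "H2" byte1
    }

    # ran off the end of the file without finding a complete marker
    if {[eof $img]} break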


    # if this is the "SOF" marker then get the data
    if { ($byte1>="c0") && ($byte1<="c3") } {
      # it is the right frame. read in a chunk of data containing the
      # dimensions, skipping the length (2 bytes) and precision (1 byte)
      binary scan [read $img 7] "x3SS" height width
      close $img
      # "S" is signed, so mask to get unsigned 16-bit values, then
      # return the dimensions in a list
      return [list [expr {$height & 0xffff}] [expr {$width & 0xffff}]]
    } else {

      # this is not the "SOF" marker, read in the length of this
      # segment
      binary scan [read $img 2] "S" offset
      # "S" is signed, so mask to 16 bits; the length includes its own
      # two bytes so we need to subtract them
      set offset [expr {($offset & 0xffff) - 2}]
      # move ahead to the next marker
      seek $img $offset current
    } ;# end else

  } ;# end while
  # we didn't find an "SOF" marker, return zeros for error detection
  set height 0
  set width 0
  close $img
  return [list $height $width]

 } ;# end proc

Note: This code leaks channels (some closes are missing). Note 2: Can we assume that this got fixed? Some close statements have been added since this note was placed.
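One way to rule out leaked channels is to do the open and close in a single wrapper, so that every exit path closes the channel. The following is a minimal sketch, not part of the original script: with_binary_channel and first_two_bytes are hypothetical names, and the wrapper relies only on catch and return -code error.

```tcl
# Hypothetical helper: open a file in binary mode, call a command prefix
# with the channel appended, and close the channel whether or not the
# command raised an error.
proc with_binary_channel {filename cmd} {
    set chan [open $filename r]
    fconfigure $chan -translation binary
    # catch traps an error from the callback so that we still reach close
    set code [catch {uplevel 1 [linsert $cmd end $chan]} result]
    close $chan
    if {$code == 1} {
        # re-raise the original error, keeping its stack trace
        return -code error -errorinfo $::errorInfo \
            -errorcode $::errorCode $result
    }
    return $result
}

# Example callback (also hypothetical): return the first two bytes of
# the file as four hex digits.
proc first_two_bytes {chan} {
    binary scan [read $chan 2] H4 hex
    return $hex
}
```

With these definitions, with_binary_channel photo.jpg first_two_bytes should return "ffd8" for a JPEG file (photo.jpg is a placeholder name). A dimension reader written to take an already open channel could be passed in the same way, and would then need no close calls of its own.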

http://zdnet.com.com/2100-1104-945735.html is a discussion about patent issues over JPEG algorithms.