yahoogroups-reader - a terminal application for reading archived Yahoo Groups files, for instance these ones:


This command-line application lets you read old Yahoo Groups mailing lists in your terminal, provided they were public at the time, by downloading their old archive files.

To read a group's mailings, do the following:

  • search for metadata using the terms yahoogroups GROUPNAME
  • if you have a hit for an archive, click on the search result
  • on the right you see the files available for download
  • click on the "WEB ARCHIVE GZ FILES" section
  • download the GROUPNAME.ID.warc.gz file for your group
  • gunzip the file, for instance with gzip -d GROUPNAME.ID.warc.gz
  • then use the script warc-reader.tcl like this: warc-reader.tcl GROUPNAME.ID.warc | less
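
Once the file is unpacked, a quick sanity check is to count how many raw message records it contains before paging through the output. The following is only a minimal sketch: countRawMessages is a hypothetical helper that is not part of warc-reader.tcl, it reuses the header pattern the reader script matches, and it assumes Tcl 8.6 or later for try/finally.

```tcl
# Hypothetical helper (not part of warc-reader.tcl): count the records whose
# WARC-Target-URI header points at a ".../message/N/raw" resource.
proc countRawMessages {filename} {
    set fh [open $filename r]
    set count 0
    try {
        while {[gets $fh line] >= 0} {
            # same header pattern the reader script uses for raw messages
            if {[regexp {^WARC-Target-URI.+group/([^/]+)/message/([0-9]+)/raw} $line]} {
                incr count
            }
        }
    } finally {
        close $fh
    }
    return $count
}

# Run as: tclsh count-raw.tcl GROUPNAME.ID.warc  (script name is made up)
if {[info exists argv] && [llength $argv] > 0} {
    puts "raw messages: [countRawMessages [lindex $argv 0]]"
}
```

A count of 0 suggests the file is either not the expected archive or still compressed.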


#!/usr/bin/env tclsh
package require json

proc processWarc {filename} {
    # HTML entities found in the archived e-mails and their replacements
    set html_mapping {&quot; {"} &apos; ' &amp; & &lt; < &gt; > &#92; \\ &#39; '} ;#"
    if {![file exists $filename]} {
        return -code error "Error: File '$filename' does not exist!"
    }
    if {[catch {open $filename r} infh]} {
        puts stderr "Cannot open $filename: $infh"
    } else {
        set flag null
        while {[gets $infh line] >= 0} {
            if {[regexp {^WARC-Target-URI.+group/([^/]+)/message/([0-9]+)/info} $line -> group msg]} {
                # info record: nothing to print
                set flag info
            } elseif {[regexp {^WARC-Target-URI.+group/([^/]+)/message/([0-9]+)/raw} $line -> group msgid]} {
                # raw record: start collecting its JSON payload
                set json ""
            } elseif {[regexp {^."topicId"} $line]} {
                # first line of the JSON payload
                append json "$line\n"
                set flag raw
            } elseif {$flag eq "raw" && [regexp {[a-z]} $line]} {
                # continuation line of the JSON payload
                append json "$line\n"
            } elseif {$flag eq "raw" && [regexp {^WARC/} $line]} {
                # next record starts: decode and print the collected message
                set d [json::json2dict $json]
                set msgflag false
                puts "[string repeat ## 40]\nMessage: $msgid"
                foreach msg [split [dict get $d rawEmail] "\n"] {
                    # skip the mail headers that precede the Date: line
                    if {[regexp {^Date: } $msg]} {
                        set msgflag true
                    }
                    if {$msgflag} {
                        set msg [string map $html_mapping $msg]
                        if {[regexp {=.$} $msg]} {
                            # soft line break: join with the next line
                            puts -nonewline "[string range $msg 0 end-2]"
                        } else {
                            puts $msg
                        }
                    }
                }
                set flag null
            }
        }
        close $infh
    }
}

if {[info exists argv] && [llength $argv] > 0} {
    processWarc [lindex $argv 0]
} else {
    puts "Usage: [info script] WARCFILE"
}
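
The per-line handling inside the script can be tried in isolation. The sketch below is only an illustration: decodeLines is a hypothetical helper that mirrors the entity replacement and the joining of soft line breaks, but returns the text instead of printing it, and the sample string is made up. It needs no extra packages.

```tcl
# Hypothetical helper mirroring the per-line handling in processWarc.
proc decodeLines {rawEmail} {
    set html_mapping {&quot; {"} &apos; ' &amp; & &lt; < &gt; > &#92; \\ &#39; '} ;#"
    set out ""
    foreach line [split $rawEmail "\n"] {
        # replace HTML entities with the characters they stand for
        set line [string map $html_mapping $line]
        if {[regexp {=.$} $line]} {
            # soft line break: drop "=" plus the trailing character and
            # continue on the same output line
            append out [string range $line 0 end-2]
        } else {
            append out $line "\n"
        }
    }
    return $out
}

# made-up sample: two entities and one soft line break ("=" before CR LF)
set sample "Tom &amp; Jerry said &quot;hi&quot; and this line is wra=\r\npped."
puts -nonewline [decodeLines $sample]
```

Note that the pattern =.$ assumes lines end in a carriage return after splitting on newlines; on text without carriage returns it would also match a line that merely ends in "=" plus one ordinary character.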

Here is a screenshot from the terminal:




Please discuss here.

DDG - 2023-03-30 - I could not find a tool that decodes the JSON-encoded messages in a usable way, so I wrote my own.