Version 3 of Wiki format to HTML

Updated 2002-06-01 14:50:23

Richard Suchenwirth - One of the core Wiki functionalities is to translate its own simple markup "language" into HTML. During ten days of offline Christmas vacation, I still wanted to prepare some Wiki pages. Being both impatient and blessed with more time than usual, I wrote myself the following converter. It doesn't handle all "WikiML" features, but

  • preformatted vs. regular text,
  • links (every link simply points to '.' - this is a previewer only!),
  • rulers, bullets, bold and italic

are all there and seem to work similarly to the Real Wiki. Enjoy!
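
For reference, a page exercising these features might look like the sample below. This is only a sketch: the markup forms are the ones the converter recognizes, while the file name demo.txt and the page text are made up for illustration.

```tcl
# Sample page text (made up for illustration). The first line supplies the
# title; lines starting with whitespace are preformatted, except bullets,
# which start with exactly three spaces, an asterisk and a space.
set sample {TITLE: Demo page
Regular text with '''bold''' and ''italic'' words and a [link].
Doubled brackets such as [[this]] come out as literal brackets.
----
   * first bullet
   * second bullet

 indented lines are shown preformatted
}

# Save it under a hypothetical name so a converter can read it back.
set fn demo.txt
set fp [open $fn w]
puts -nonewline $fp $sample
close $fp
```

Passing demo.txt to the converter should then produce demo.htm next to it.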

 proc wiki2html fn {
        set fp [open $fn]
        set text [read $fp [file size $fn]]
        regsub {[\n\t ]+$} $text "" text
        set s [split $text \n]
        close $fp
        if {[regexp {TITLE: (.+)} [lindex $s 0] -> title]} {
                set s [lreplace $s 0 0]
        } else {set title $fn}
        set ofn [file rootname $fn].htm
        set fp [open $ofn w]
        set pre 0; set p 0
        puts $fp "<HTML><HEAD><TITLE>$title</TITLE></HEAD><BODY><H1>$title</H1>"
        regsub -all {\\\n} $s "\\ \n"  s ;# keep the look of "\" continuation
        regsub -all {\&}   $s {\&amp;}  s
        regsub -all <      $s {\&lt;}  s
        regsub -all >      $s {\&gt;}  s
        foreach i $s {
                if {!$pre} {
                        set it 0; set todo 1
                        while {$todo} {
                                set todo [regsub ''' $i <[expr {$it?"/":""}]B> i]
                                set it [expr {1-$it}]
                        }
                        set it 0; set todo 1
                        while {$todo} {
                                set todo [regsub '' $i <[expr {$it?"/":""}]I> i]
                                set it [expr {1-$it}]
                        }
                }
                if {![regexp {^[ \t]} $i] || [regexp {^   \* } $i]} {
                        regsub -all {\[\[} $i \x81 i
                        regsub -all {\]\]} $i \x82 i
                        regsub -all {\[}   $i {<A HREF=.>} i
                        regsub -all {\]}   $i {</A>} i
                        regsub -all \x81   $i {[} i
                        regsub -all \x82   $i {]} i
                }
                if {!$pre && [regsub {^   \* } $i <LI> i]} {
                        set p 0
                } elseif {!$pre && [string trim $i]==""} {
                        if {!$p} {puts $fp <P>}; set p 1; continue
                } elseif {$pre && [string trim $i]==""} {
                        puts $fp ""; continue
                } elseif {!$pre && $i=="----"} {
                        puts $fp <HR>; continue
                } elseif {[regexp "^ " $i]||[regexp "^\t" $i]} {
                        if {!$pre} {set pre 1; puts $fp <PRE>}; set p 0
                } else {
                        if {$pre} {set pre 0; puts $fp </PRE>}; set p 0
                }
                puts $fp $i
        }
        if {$pre} {puts $fp </PRE>}
        set now [clock format [clock seconds]]
        puts $fp "<HR><I>Converted by wiki2html on $now</I>"
        puts $fp "</BODY></HTML>"
        close $fp
 }
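
The bold and italic handling relies on a small toggling idea: each successive marker is replaced in turn by an opening and then a closing tag. A minimal standalone sketch of that idea follows; the helper name toggleTags is hypothetical and not part of the converter, and the marker is treated as a regular expression, as in the original loops.

```tcl
# Hypothetical helper: replace successive matches of $marker in $line
# with alternating <TAG> and </TAG>.
proc toggleTags {line marker tag} {
    set closing 0
    while 1 {
        # Pick the opening or the closing tag for this occurrence
        set sub [expr {$closing ? "</$tag>" : "<$tag>"}]
        # regsub without -all replaces only the first remaining match
        if {![regsub -- $marker $line $sub line]} break
        set closing [expr {!$closing}]
    }
    return $line
}

set line "'''bold''' and ''italic'' text"
set line [toggleTags $line ''' B]  ;# bold first, so ''' is not read as ''
set line [toggleTags $line '' I]
puts $line  ;# prints: <B>bold</B> and <I>italic</I> text
```

As in the converter, an odd number of markers on a line leaves the last tag unclosed.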

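The link handling uses a placeholder trick: doubled brackets are parked on two characters assumed not to occur in the text, single brackets become anchors, and the parked characters are then restored as literal brackets. A minimal sketch, with a hypothetical helper name linkify and with string map swapped in for the regsub calls in the parking and restoring steps:

```tcl
# Hypothetical helper mirroring the link step of the converter.
proc linkify {line} {
    # Park [[ and ]] on \x81 and \x82 (assumed absent from the input)
    set line [string map [list "\[\[" \x81 "\]\]" \x82] $line]
    # Remaining single brackets delimit links; the target is always "."
    regsub -all {\[} $line {<A HREF=.>} line
    regsub -all {\]} $line {</A>} line
    # Restore the parked characters as single literal brackets
    return [string map [list \x81 "\[" \x82 "\]"] $line]
}

puts [linkify {See [Tcl] and the list [[a b]]}]
# prints: See <A HREF=.>Tcl</A> and the list [a b]
```
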
Andreas Kupries: Note that the TIP format is a close derivative of WikiML and that Donal Fellows wrote several converters for it (to HTML, to TXT, to XML). See his website. See also the tiprender project at SourceForge: http://sourceforge.net/projects/tiprender/