Wikit Reference Formatting

This page started out as a modification to format.tcl to allow labels to be specified in references. It has since grown to change the way paragraphs are interpreted. Unfortunately, it isn't easy to split the changes into two separate patches.

8 Oct 2003: This page has fallen further behind; please see AKG Wikit.

The new formatting rules are:

  1. You can refer to another page by putting its name in square brackets like this: [TITLE]. To specify a label use: [TITLE | LABEL]. Note that the separator is " | " (space, vertical bar, space).
  2. URLs will automatically be recognized and converted into a hyperlink: http://your.site/contents.html . To specify a label use [http://your.site/contents.html | LABEL]. Once again, the separator is " | " (space, vertical bar, space). The label will be displayed within square brackets.
  3. Paragraphs are ended by a blank line. A carriage return in the source indicates that a new line should be started in the current paragraph, i.e. a soft return. This affects standard text and list items (bullets, numbered lists and tagged lists).
  4. Headings can be entered with the format "==X Heading ===", where X is the heading level between 1 and 6.
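
As a rough illustration of rules 1, 2 and 4, the sketch below splits a bracketed reference into title and label and recognises a heading line. The proc names splitRef and headingLevel are made up for this example and are not part of Wikit; the regular expressions follow the ones used in format.tcl further down.

```tcl
# Hypothetical helpers, for illustration only (not part of Wikit).

# Split "TITLE | LABEL" at the " | " separator (space, bar, space).
# Without a usable separator the title doubles as the label.
proc splitRef {text} {
    if {[regexp {^(.*)\s\|\s(.*)$} $text -> title label]} {
        set title [string trim $title]
        set label [string trim $label]
        if {$title ne "" && $label ne ""} {
            return [list $title $label]
        }
    }
    return [list $text $text]
}

# Return {level title} for a "==X Heading ===" line with X in 1..6,
# or an empty list if the line is not a heading.
proc headingLevel {line} {
    if {[regexp {^==([1-6]) (.*) ===$} $line -> level title]} {
        return [list $level [string trim $title]]
    }
    return {}
}

puts [splitRef "Wikit | the wiki engine"]
puts [splitRef "http://your.site/contents.html | LABEL"]
puts [splitRef "Plain Title"]
puts [headingLevel "==2 Installation ==="]
```

Note that "[TITLE|LABEL]" without the surrounding spaces is not split, so page titles that contain a bare vertical bar keep working.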

My personal belief is that the changed paragraph interpretation is still fairly intuitive but provides much more flexible control over the final appearance. It will change the formatting of existing pages, but not so much as to break them. I expect that not everyone will agree with the changes.
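
To make the paragraph change concrete, here is a minimal sketch of how plain text lines could be classified under rule 3. It only mimics the idea behind the paragraph handling in format.tcl; the proc name classifyLines is an assumption made for this example, and list items, quoted text and headings are ignored.

```tcl
# Hypothetical sketch, not part of Wikit.
# T = first line of a new paragraph
# L = soft return (new line inside the current paragraph)
# E = empty line, which ends the paragraph
proc classifyLines {text} {
    set out [list]
    set startPara 1 ;# true at the very start and after every empty line
    foreach line [split $text \n] {
        if {[string trim $line] eq ""} {
            lappend out E
            set startPara 1
        } elseif {$startPara} {
            lappend out T
            set startPara 0
        } else {
            lappend out L
        }
    }
    return $out
}

puts [classifyLines "first line\nsecond line\n\nnew paragraph"]
```

For the four input lines above this prints "T L E T": the second line continues the first paragraph on a new line, and only the blank line starts a new paragraph.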

As you can see, this will implement the changes suggested by jcw below (thanks for the suggestions).

I've made a wikit.kit available for download; please see AKG.

The patch is against Wikit version: 2003/08/20 19:06:22 68261-69214

Cheers, AKG


format.tcl

The changes to format.tcl are so extensive that I've just included the whole file.

 # -*- tcl -*-
 # Formatter for wiki markup text, CGI as well as GUI

 package provide Wikit::Format 1.0

 namespace eval Wikit::Format {
     namespace export TextToStream StreamToTk StreamToHTML StreamToRefs \
                             StreamToUrls

     # In this file:
     #
     # proc TextToStream {text} -> stream   
     # proc StreamToTk {stream infoProc} -> {{tagged-text} {urls}}   
     # proc StreamToHTML {stream cgiPrefix infoProc} -> {{html} {urls}}
     # proc StreamToRefs {stream infoProc} -> {pageNum ...}
     # proc StreamToUrls {stream} -> {url type ...}
     #
     # The "Text"   format is a Wiki-like one you can edit with a text editor.
     # The "Tk"     format can insert styled text information in a text widget.
     # The "HTML"   format is the format generated for display by a browser.
     # The "Refs"   format is a list with details about embedded references.
     # The "Urls"   format is a list of external references, bracketed or not.
     # The "Stream" format is a Tcl list, it's only used as interim format.

     # =========================================================================

     ### More format documentation

     # =========================================================================

     #
     # Ad "Tk")     This is a list of pairs {text taglist text taglist ...}
     #              which can be directly inserted into any text widget.
     #
     # Ad "Stream") This is the first time that the stream format is documented.
     #
     #     The base format is that of a list of pairs {cmd arg cmd arg ...}
     #     The available commands fall into three categories [x]:
     #
     #     1. Data carriers
     #     2. Visual markers
     #     3. Structural markers
     #
     #     [x] In the previous incarnation of this stream format the categories
     #         were essentially all mixed up and jumbled together. For example
     #         the command 'T' crossed category 1 and 3, introducing a new para-
     #         graph and also carrying the first fragment of text of said para-
     #         graph. That made for difficult creation and difficult interpreta-
     #         tion. It is the separation of the categories which makes the re-
     #         organized format much easier to generate and convert (<=> simpler
     #         code, which is even faster). (Not to mention the eviction of
     #         treating [, ], {, }, and \ as special characters. They are not).
     #
     #     Ad 1)   The empty string and 'g', 'u', 'x' and 'y'. The first is for
     #             text, the others specify the various possible links.
     #
     #             Cmd   Argument
     #             ------------------------------------------------------
     #             {}    The text to display
     #             g     Name/Title of referenced wiki page, optionally
     #                   followed by " | LABEL"
     #             u     external URL, was unbracket'ed in sources
     #             x     external URL, bracket'ed in sources
     #             y     external URL, bracket'ed with label in sources
     #             ------------------------------------------------------
     #
     #     Ad 2)   Currently only two: 'b' and 'i' for bold and italic emphasis.
     #             The argument specifies if the emphasis is switched on or off.
     #             The permitted values are 0 (off) and 1 (on).
     #
     #     Ad 3)   These are the markers for the various distinctive sections
     #             in wiki markup.
     #
     #             Cmd     'Begin'                  Argument
     #             ------------------------------------------------------
     #             T       Paragraph                Nesting level
     #             E       Empty line               Nesting level
     #             L       Line                     Nesting level
     #             Q       Quoted line              Nesting level
     #             U       List item (unordered)    Nesting level
     #             O       List item (enumerated)   Nesting level
     #             I       List item (term)         Nesting level
     #             D       List item (term def)     Nesting level
     #             H       Horizontal rule          Line-width
     #             H1..H6  Heading, level 1 to 6    Nesting level
     #             ------------------------------------------------------
     #
     #             Note: The current frontend renderer provides only nesting
     #                   level 0 and a line-width 1. The current backend
     #                   renderers ignore this information.
     #

     # =========================================================================
     # =========================================================================

     ### Frontend renderer                         :: Wiki Markup ==> Stream ###

     # =========================================================================
     # =========================================================================

     ## Basic operation: Each line is classified via regexes and then handled
     ## according to its type. Text lines are coalesced into paragraphs, with
     ## some special code to deal with the boundary between normal text and
     ## verbatim quoted text. Each collected line is then separated into chunks
     ## of text, highlighting commands and links (wiki page / external). This is
     ## then added to the internal representation.

     proc TextToStream {text} {
         # Based upon ideas from the kiwi renderer. One step rendering into
         # the internal representation without a script as intermediate step.

         set irep      [list] ; # Internal representation generated here.
         set empty_std 1      ; # Boolean flag. Set if the preceding line was empty.
         set first_line 1     ; # Boolean flag first line of text.
                                # Used to set initial paragraph if first line
                                # of text is STD.
         # puts "Text=$text.\n"
         foreach line [split $text \n] {
             # Per line, classify it and extract the main textual information.
             foreach {tag depth txt aux} [linetype $line] break ; # lassign
             # puts "tag: $tag, depth: $depth, txt: $txt, aux: $aux"
             # Classification tags
             #
             # UL, OL, DL = Lists (unordered/bullet, ordered/enum,
             #                     definition/itemized)
             # PRE        = Verbatim / Quoted lines
             # HR         = Horizontal rule
             # E          = Empty line
             # STD        = Standard text

             ## Whenever we encounter a special line, not quoted, any
             ## preceding empty line has no further effect.
             #
             switch -exact -- $tag {
                 HR - UL - OL - DL {set empty_std 0}
                 default {}
             }

             ## Now process the lines according to their types.
             #
             # Tag | depth         | txt             | pfx           | aux
             # ----+---------------+-----------------+---------------+---------------
             # UL  | nesting level | text of item    | before bullet | bullet
             # OL  | nesting level | text of item    | before bullet | bullet
             # DL  | nesting level | term definition | before bullet | term
             # PRE | 1             | text to display |
             # HR  | 0             | text of ruler   |
             # STD | 0             | text to display |
             # E   | 0             | none            |
             # Hx  | 0 (x = level) | header text     |
             # ----+---------------+-----------------+---------------+---------------

             # HR     - Trivial
             # UL, OL - Mark their beginning and then render their text
             #        - like a normal paragraph.
             # DL     - Like list item, except that there are two different
             #          parts of text we have to render, term and term definition.
             # PRE    - Quoted text is searched for links, but nothing
             #          more. An empty preceding line is added to the
             #          quoted section to keep it at a distance from the
             #          normal text coming before.
             # STD    - Render lines as they are encountered.  Each line of input
             #          text starts a line of output text.  An empty line ends the
             #          paragraph.

             switch -exact -- $tag {
                 HR  {
                     lappend irep H 1
                     set empty_std 1   ; # HR forces a new paragraph, which is equivalent to an
                                         # empty line.
                 }
                 UL  {lappend irep U 0 ; render $txt}
                 OL  {lappend irep O 0 ; render $txt}
                 DL  {
                     lappend irep I 0 ; render $aux
                     lappend irep D 0 ; render $txt
                 }
                 PRE {
                     # Each verbatim line becomes a Q entry of its own. Only
                     # links are detected in its text; other markup is left
                     # untouched.

                     lappend irep Q 0
                     if {$txt != {}} {rlinks $txt}
                 }
                 STD {
                     if {$empty_std} {
                         lappend irep T 0
                     } else {
                         lappend irep L 0
                     }
                     render $txt
                     set empty_std 0
                 }
                 E {
                     lappend irep E 0
                     set empty_std 1
                 }
                 H1 - H2 - H3 - H4 - H5 - H6 {
                     lappend irep $tag $depth ; render $txt
                 }
                 default {
                     error "Unknown linetype $tag"
                 }
             }
         }

         return $irep
     }

     proc linetype {line} {
         # Categorize a line of wiki text based on indentation and prefix

         set line [string trimright $line]

         #
         # If the line is empty, return linetype E.
         # 
         if {[string length $line] == 0} {
             return [list E 0 $line]
         }

         ## Compat: retain tabs ...
         ## regsub -all "\t" $line "    " line
         #
         ## More compat'ibility ...
         ## The list tags allow non-multiples of 3 if the prefix contains at
         ## least 3 spaces. The standard wiki accepts anything beyond 3 spaces.
         ## Keep the kiwi regexes around for future enhancements.

         foreach {tag re} {
             UL        {^(   + {0,2})(\*) (\S.*)$}
             OL        {^(   + {0,2})(\d)\. (\S.*)$}
             DL        {^(   + {0,2})([^:]+):   (\S.*)$}
             UL        {^(   +)(\*) (\S.*)$}
             OL        {^(   +)(\d)\. (\S.*)$}
             DL        {^(   +)([^:]+):   (\S.*)$}
         } {
             # Compat: Remove restriction to multiples of 3 spaces.

             if {[regexp $re $line - pfx aux txt] } {
                 #    && string length $pfx % 3 == 0
                 return [list $tag [expr {[string length $pfx]/3}] $txt $aux]
             }
         }

         # Compat: Accept a leading TAB as marker for quoted text too.

         if {([string index $line 0] == " ") || ([string index $line 0] == "\t")} {
             return [list PRE 1 $line]
         }
         if {[regexp {^-{4,}$} $line]} {
             return [list HR 0 $line]
         }
         if {[regexp {^==([1-6]) (.*) ===$} $line whole level title]} {
             return [list "H$level" 0 [string trim $title]]
         }

         return [list STD 0 $line]
     }

     proc rlinks {text} {
         # Convert everything which looks like a link into a link. This
         # command is called for quoted lines, and only quoted lines.

         upvar irep irep

         # Compat: (Bugfix) Added " to the regexp as proper boundary of an url.
         set re {\m(https?|ftp|news|mailto|file):(\S+[^\]\)\s\.,!\?;:'>"])}
         set txt 0
         set end [string length $text]

         ## Find the places where an url is inside of the quoted text.

         foreach {match dummy dummy} [regexp -all -indices -inline $re $text] {
             # Skip the inner matches of the RE.
             foreach {a e} $match break
             if {$a > $txt} {
                 # Render text which was before the url
                 lappend irep {} [string range $text $txt [expr {$a - 1}]]
             }
             # Render the url
             lappend irep u [string range $text $a $e]
             set txt [incr e]
         }
         if {$txt < $end} {
             # Render text after the last url
             lappend irep {} [string range $text $txt end]
         }
         return
     }

     proc render {text} {
         # Rendering of regular text: links, markup, brackets.

         # The main idea/concept behind the code below is to find the
         # special features in the text and to isolate them from the normal
         # text through special markers (\0\1...\0). As none of the regular
         # expressions will match across these markers later passes
         # preserve the results of the preceding passes. At the end the
         # string is split at the markers and then forms the list to add to
         # the internal representation. This way of doing things keeps the
         # difficult stuff at the C-level and avoids having to repeatedly
         # match and process parts of the string.

         upvar irep irep
         variable codemap

         ## puts stderr \]>>$irep<<\[
         ## puts stderr >>>$text<<<

         # Detect page references, external links, bracketed external
         # links, brackets and markup (hilites).

         # Complex RE's used to process the string
         set pre   {\[([^\]]*)]}  ; # page references ; # compat
         set lre   {\m(https?|ftp|news|mailto|file):(\S+[^\]\)\s\.,!\?;:'>"])} ; # links
         set blre  "\\\[\0\1u\2(\[^\0\]*)\0\\\]"
         set bllre "\\\[\0\1u\2(\[^\0\]*)\0\\s\\|\\s(\[^\\]]\*)\\\]"

         # " - correct emacs hilite

         # Order of operation:
         # - Remap double brackets to avoid their interference.
         # - Detect embedded links to external locations.
         # - Detect bracketed links to external locations. (This uses the
         #   fact that such links are already specially marked to make it
         #   easier.)
         # - Detect references to other wiki pages.
         # - Render bold and italic markup.
         #
         # Wiki pages are done last because there is a little conflict in
         # the RE's for links and pages: Both allow usage of the colon (:).
         # Doing pages first would render links to external locations
         # incorrectly.
         #
         # Note: The kiwi renderer had the order reversed, but also
         # disallowed colon in page titles. Which is in conflict with
         # existing wiki pages which already use that character in titles
         # (f.e. [COMPANY: Oracle]).

         # Make sure that double brackets do not interfere with the
         # detection of links.
         regsub -all {\[\[} $text {\&!} text

         ## puts stderr A>>$text<<*

         # Isolate external links.
         regsub -all $lre $text "\0\1u\2\\1:\\2\0" text
         ## puts stderr C>>$text<<*

         # External links in brackets are simpler cause we know where the
         # links are already.
         # 
         # First handle external links with labels
         # 
         regsub -all $bllre $text "\0\1y\2\\1 | \\2\0" text
         #
         # Now external links without labels.
         # 
         regsub -all $blre $text "\0\1x\2\\1\0" text
         ## puts stderr D>>$text<<*

         # Now handle wiki page references
         regsub -all $pre $text "\0\1g\2\\1\0" text
         ## puts stderr B>>$text<<*

         # Hilites are transformed into on and off directives.
         # This is a bit more complicated ... Hilites can be written
         # together and possibly nested once, so it has to make sure that
         # it recognizes everything in the correct order!

         # Examples ...
         # {''italic'''''bold'''}         {} {<i>italic</i><b>bold</b>}
         # {'''bold'''''italic''}         {} {<b>bold</b><i>italic</i>}
         # {'''''italic_bold'''''}        {} {<b><i>italic_bold</i></b>}

         # First get all un-nested hilites
         while {
             [regsub -all {'''([^']+?)'''} $text "\0\1b+\0\\1\0\1b-\0" text] ||
             [regsub -all {''([^']+?)''}   $text "\0\1i+\0\\1\0\1i-\0" text]
         } {}

         # And then the remaining ones. This also captures the hilites
         # where the highlighted text contains single apostrophes.

         regsub -all {'''(.+?)'''} $text "\0\1b+\0\\1\0\1b-\0" text
         regsub -all {''(.+?)''}   $text "\0\1i+\0\\1\0\1i-\0" text


         # Normalize brackets ...
         set text [string map {&! [ ]] ]} $text]

         # Listify and generate the final representation of the paragraph.

         ## puts stderr *>>$text<<*

         foreach item [split $text \0] {
             ## puts stderr ====>>$item<<<

             set cmd {} ; set detail {}
             foreach {cmd detail} [split $item \2] break
             set cmd [string trimleft $cmd \1]

             ## puts stderr ====>>$cmd|$detail<<<

             switch -exact -- $cmd {
                 b+    {lappend irep b 1}
                 b-    {lappend irep b 0}
                 i+    {lappend irep i 1}
                 i-    {lappend irep i 0}
                 default {
                     if {$detail == {}} {
                         # Pure text
                         if {$cmd != ""} {
                             lappend irep {} $cmd
                         }
                     } else {
                         # References.
                         # 2003-06-20: remove whitespace clutter in page titles
                         regsub -all {\s+} [string trim $detail] { } detail
                         lappend irep $cmd $detail
                     }
                 }
             }

             ## puts stderr ======\]>>$irep<<\[
         }
         ## puts stderr ======\]>>$irep<<\[
         return
     }

     #
     # Split the supplied text into a Title (reference) and Label.
     # 
     # The input text is assumed to be in the form "Title | Label".
     #
     # Thanks to JCW (https://wiki.tcl-lang.org/jcw) for the code.
     # 
     proc SplitTitle {text} {
         if {[regexp {(.*)\s\|\s(.*)} $text - a b]} {
             set a [string trim $a]
             set b [string trim $b]
             if {$a ne "" && $b ne ""} {
               return [list $a $b]
             }
         }
         return [list $text $text]
     }

     # =========================================================================
     # =========================================================================

     ### Backend renderer                                   :: Stream ==> Tk ###

     # =========================================================================
     # =========================================================================

     # Output specific conversion. Takes a token stream and converts this into
     # a three-element list:
     #    $result - A list of text fragments and tag-lists, 
     #              as described at the beginning as the "Tk" format. 
     #    $urls -  A list of triples listing the references found in the page.
     #             This second list is required because some information 
     #             about references is missing from the "Tk" format. 
     #             And adding them into that format would make the 
     #             insertion of data into the final text widget ... 
     #             complex (which is an understatement IMHO). 
     #             Each triple consists of: url-type (g, u, x, y),
     #             page-local numeric id of url (required for and used in tags) 
     #             and reference text, in this order.
     #    $eims -  The third list is a list of embedded images 
     #             (i.e. stored in "images" view), to be displayed in text 
     #             widget.

     # Note: The first incarnation of the rewrite to adapt to the new
     # "Stream" format had considerable complexity in the part
     # assembling the output. It kept knowledge about the last used
     # tags and text around, using this to merge runs of text having
     # the same taglist, thus keeping the list turned over to the text
     # widget shorter. Thinking about this I came to the conclusion
     # that removal of this complexity and replacing it with simply
     # unconditional lappend's would gain me time in StreamToTk, but
     # was also unsure how much of a negative effect the generated
     # longer list would have on the remainder of the conversion (setup
     # of link tag behaviour in the text widget, insertion into the
     # text widget). Especially if the drain would outweigh the gain.
     # As can be seen from the code chosen here, below, I found that
     # the gain through the simplification was much more than the drain
     # later. I gained 0.3 usecs in this stage and lost 0.13 in the
     # next (nearly double time), overall gain 0.17.

     proc StreamToTk {s {ip ""}} {
         #             ; # State of renderer
         set urls   "" ; # List of links found
         set eims   "" ; # List of embedded images
         set result "" ; # Tk result
         set state  T  ; # Assume a virtual paragraph in front of the actual data
         set pType  T  ; # Paragraph type.
         set count  0  ; # Id counter for page references
         set xcount 0  ; # Id counter for bracketed external references
         set number 0  ; # Counter for items in enumerated lists
         set b      0  ; # State of bold emphasis   - 0 = off, 1 = on
         set i      0  ; # State of italic emphasis - 0 = off, 1 = on

         foreach {mode text} $s {
             # puts "pType=$pType, state=$state, mode=$mode, text=$text, b=$b, i=$i"
             switch -exact -- $mode {
                 {}    {
                     if {$text == {}} {continue}
                     lappend result $text [tagFor $pType $state $b $i]
                 }
                 b - i {set $mode $text ; # text in {0,1}}
                 g {
                     set split [SplitTitle $text]
                     set title [lindex $split 0]
                     set label [lindex $split 1]
                     set     n    [incr count]
                     lappend urls g $n $title
                     set     tags [set base [tagFor $pType $state $b $i]]
                     lappend tags url g$n

                     if {$ip == ""} {
                         lappend result $label $tags
                         continue
                     }

                     set info [lindex [$ip $title] 2]

                     if {$info == "" || $info == 0} {
                         lappend result \[ $tags $label $base \] $tags
                         continue
                     }

                     lappend result $label $tags
                 }
                 u {
                     set n [incr count]
                     lappend urls u $n $text

                     set tags [tagFor $pType $state $b $i]
                     if {[lindex $tags 0] == "fixed"} {
                             lappend tags urlq u$n
                     } else {
                             lappend tags url u$n
                     }

                     lappend result $text $tags
                 }
                 x {
                     # support embedded images if present in "images" view
                     set iseq ""
                     if {[regexp {\.(gif|jpg|png)$} $text - ifmt]} {
                         set iseq [mk::select wdb.images url $text -count 1]
                         if {$iseq != "" && [info commands eim_$iseq] == ""} {
                             if {$ifmt == "jpg"} { set ifmt jpeg }
                             catch { package require tkimg::$ifmt }
                             catch {
                                     image create photo eim_$iseq -format $ifmt \
                                             -data [mk::get wdb.images!$iseq image]
                             }
                         }
                     }
                     if {[info commands eim_$iseq] != ""} {
                         #puts "-> $xcount $text"
                         lappend result " " eim_$iseq
                         lappend eims eim_$iseq
                     } else {
                         set n [incr xcount]
                         lappend urls x $n $text
                         # Why are both $tags and $base necessary?
                         set     tags [set base [tagFor $pType $state $b $i]]
                         lappend tags url x$n
                         lappend result \[ $base $n $tags \] $base
                     }
                 }
                 y {
                     set split [SplitTitle $text]
                     set title [lindex $split 0]
                     set label [lindex $split 1]
                     set n [incr xcount]
                     lappend urls y $n $title
                     # Why are both $tags and $base necessary?
                     set     tags [set base [tagFor $pType $state $b $i]]
                     lappend tags url y$n
                     lappend result \[ $base $label $tags \] $base
                 }
                 Q {
                     set number 0 ;# reset counter for items in enumerated lists
                     # use the body tag for the space before a quoted string
                     # so they don't get a gray background.
                     # puts "Q: Added vertical space [vspace $state $mode]."
                     lappend result [vspace $state $mode] [tagFor T T 0 0]
                     set state $mode
                     set pType $mode
                 }
                 T - I - D {
                     set number 0 ;# reset counter for items in enumerated lists
                     # puts "TID: Added vertical space [vspace $state $mode]."
                     lappend result [vspace $state $mode] [tagFor $pType $mode 0 0]
                     set state $mode
                     set pType $mode
                 }
                 L {
                     # puts "L: Added vertical space [vspace $state $mode]."
                     lappend result [vspace $state $mode] [tagFor $pType $mode 0 0]
                     set state $mode
                 }
                 E {
                     # puts "E: Added vertical space [vspace $state $mode]."
                     lappend result [vspace $state $mode] [tagFor $pType $mode 0 0]
                     set state $mode
                     set pType $mode
                 }
                 U {
                     # puts "U: Added vertical space [vspace $state $mode]."
                     lappend result \
                             "[vspace $state $mode]   *\t" [tagFor $pType $mode 0 0]
                     set state $mode
                     set pType $mode
                 }
                 O {
                     # puts "O: Added vertical space [vspace $state $mode]."
                     lappend result \
                             "[vspace $state $mode]   [incr number].\t" [tagFor $pType $mode 0 0]
                     set state $mode
                     set pType $mode
                 }
                 H {
                     # puts "H: Added vertical space [vspace $state $mode]."
                     lappend result \
                             [vspace $state $mode] [tagFor $pType T 0 0] \
                             \t                   [tagFor $pType H x x] \
                             \n                   [tagFor $pType H 0 0]
                     set state $mode
                     set pType $mode
                 }
                 H1 - H2 - H3 - H4 - H5 - H6 {
                     lappend result [vspace $state $mode] [tagFor $pType $mode 0 0]
                     set state $mode
                     set pType $mode
                 }
             }
             # puts "result=$result\n\n\n"
         }

         list [lappend result "" body] $urls $eims
     }

     # Map from the tagcodes used in StreamToTk above to the taglist
     # used in the text widget the generated text will be inserted into.

     proc tagFor {state mode b i} {

         if {"$mode$b$i" == "Hxx"} {
             return "hr thin"
         }

         switch -exact -- $mode {
             Q {
                 set result "fixed"
             }
             H {
                 set result "thin"
             }
             U {
                 set result "ul"
             }
             O {
                 set result "ol"
             }
             I {
                 set result "dt"
             }
             D {
                 set result "dl"
             }
             L {
                 switch -exact -- $state {
                     T - E {
                         set result "body"
                     }
                     Q {
                         set result "fixed"
                     }
                     H {
                         set result "thin"
                     }
                     U - O - I - D {
                         set result "li"
                     }
                 }
             }
             H1 {
                 set result "h1"
             }
             H2 {
                 set result "h2"
             }
             H3 {
                 set result "h3"
             }
             H4 {
                 set result "h4"
             }
             H5 {
                 set result "h5"
             }
             H6 {
                 set result "h6"
             }
             default {     ; # T - E
                 set result "body"
             }
         }
         switch -exact -- $b$i {
             01 {
                 append result " i"
             }
             10 {
                 append result " b"
             }
             11 {
                 append result " bi"
             }
         }
         return $result
     }

     #
     # Define amount of vertical space used between each logical section of text.
     #
     proc vspace {last current} {
         set lookup "$last$current"
         set count 1
         switch -exact -- $lookup {
             TT - UQ - OQ - IQ - DQ - QU - QO - QI - \
             UH - OH - IH - DH - HH {
                 set count 2
             }
             QH {
                 set count 3
             }
         }
         return [string repeat \n $count]
     }

     variable  vspace1
     proc vs {last current n} {
         variable vspace1
         set vspace1($last$current) [string repeat \n $n]
         return
     }
     vs T T 2  ;vs T L 1  ;vs T E 1  ;vs T Q 1  ;vs T U 1  ;vs T O 1  ;vs T I 1  ;vs T D 1  ;vs T H 1
     vs L T 1  ;vs L L 1  ;vs L E 1  ;vs L Q 1  ;vs L U 1  ;vs L O 1  ;vs L I 1  ;vs L D 1  ;vs L H 1
     vs E T 1  ;vs E L 1  ;vs E E 1  ;vs E Q 1  ;vs E U 1  ;vs E O 1  ;vs E I 1  ;vs E D 1  ;vs E H 1
     vs Q T 1  ;vs Q L 1  ;vs Q E 1  ;vs Q Q 1  ;vs Q U 2  ;vs Q O 2  ;vs Q I 2  ;vs Q D 1  ;vs Q H 3
     vs U T 1  ;vs U L 1  ;vs U E 1  ;vs U Q 2  ;vs U U 1  ;vs U O 1  ;vs U I 1  ;vs U D 1  ;vs U H 2
     vs O T 1  ;vs O L 1  ;vs O E 1  ;vs O Q 2  ;vs O U 1  ;vs O O 1  ;vs O I 1  ;vs O D 1  ;vs O H 2
     vs I T 1  ;vs I L 1  ;vs I E 1  ;vs I Q 2  ;vs I U 1  ;vs I O 1  ;vs I I 1  ;vs I D 1  ;vs I H 2
     vs D T 1  ;vs D L 1  ;vs D E 1  ;vs D Q 2  ;vs D U 1  ;vs D O 1  ;vs D I 1  ;vs D D 1  ;vs D H 2
     vs H T 1  ;vs H L 1  ;vs H E 1  ;vs H Q 1  ;vs H U 1  ;vs H O 1  ;vs H I 1  ;vs H D 1  ;vs H H 2
     rename vs {}

     # =========================================================================
     # =========================================================================

     ### Backend renderer                                 :: Stream ==> HTML ###

     # =========================================================================
     # =========================================================================

     # Output specific conversion. Takes a token stream and converts this
     # into HTML. The result is a 2-element list. The first element is the
     # HTML to render. The second element is a list of triplets listing all
     # references found in the stream (each triplet consists of a reference
     # type, a page-local numeric id and the reference text).

     proc StreamToHTML {s {cgi ""} {ip ""}} {
         set result ""
         set state E           ; # bogus empty line as initial state.
         set paragraphType E   ; # Paragraph types are T, Q, U, O, I, D
         set count 0
         variable html_frag

         foreach {mode text} $s {
             # puts "state=$state, mode=$mode, text=$text"
             switch -exact -- $mode {
                 {}    {append result [quote $text]}
                 b - i {append result $html_frag($mode$text)}
                 g {
                     set split [SplitTitle $text]
                     set title [lindex $split 0]
                     set label [lindex $split 1]
                     if {$cgi == ""} {
                         append result "\[[quote $label]\]"
                         continue
                     }
                     if {$ip == ""} {
                         # no lookup, turn into a search reference
                         append result \
                                 $html_frag(a_) $cgi$title $html_frag(tc) \
                                 [quote $label] $html_frag(_a)
                         continue
                     }

                     set info [$ip $title]
                     foreach {id name date} $info break

                     if {$id == ""} {
                         # not found, don't turn into an URL
                         append result "\[[quote $label]\]"
                         continue
                     }

                     regsub {^/} $id {} id
                     if {$date > 0} {
                         # exists, use ID
                         append result \
                                 $html_frag(a_) $id $html_frag(tc) \
                                 [quote $label] $html_frag(_a)
                         continue
                     }

                     # missing, use ID -- editor link on the brackets.
                     append result \
                             $html_frag(a_) $id $html_frag(tc) \[ $html_frag(_a) \
                             [quote $label] \
                             $html_frag(a_) $id $html_frag(tc) \] $html_frag(_a)
                 }
                 u {
                     append result \
                             $html_frag(a_) $text $html_frag(tc) \
                             [quote $text] $html_frag(_a)
                 }
                 x {
                     if {[regexp {\.(gif|jpg|png)$} $text]} {
                         append result $html_frag(i_) $text $html_frag(tc)
                     } else {
                         append result \
                                 \[ $html_frag(a_) $text $html_frag(tc) \
                                 [incr count] $html_frag(_a) \]
                     }
                 }
                 y {
                     set split [SplitTitle $text]
                     set title [lindex $split 0]
                     set label [lindex $split 1]
                     append result \
                             \[ $html_frag(a_) $title $html_frag(tc) \
                             [quote $label] $html_frag(_a) \]
                 }
                 T - Q - I - D - U - O - E - H {
                     append result [htmlFrag $paragraphType $state $mode]
                     set state $mode
                     set paragraphType $mode
                 }
                 L {
                     append result [htmlFrag $paragraphType $state $mode]
                     set state $mode
                 }
                 H1 - H2 - H3 - H4 - H5 - H6 {
                     append result [htmlFrag $paragraphType $state $mode]
                     set state $mode
                     set paragraphType $mode
                 }

             }
         }
         # Close off the last section.
         append result $html_frag(${paragraphType}_)
         # Get rid of spurious newline at start of each quoted area.
         regsub -all "<pre>\n" $result "<pre>" result
         list $result {}
     }

     proc quote {q} {
         regsub -all {&} $q {\&amp;}  q
         regsub -all {"} $q {\&quot;} q ; # "
         regsub -all {<} $q {\&lt;}   q
         regsub -all {>} $q {\&gt;}   q
         regsub -all {&amp;(#\d+;)} $q {\&\1}   q
         return $q
     }

     # Define inter-section tagging, used between each logical section of text.

     variable  html_frag
     proc vs {last current text} {
         variable html_frag
         set      html_frag($last$current) $text
         return
     }

     #
     # Return the html fragment to transition from one type
     # of paragraph to the next.
     # If the current line type is L, we really want to 
     # transition from the last "significant" line type, i.e.
     # the paragraph type.
     # 
     proc htmlFrag {pType last current} {
         variable html_frag

         # puts -nonewline "htmlFrag $last $current = "
         if {($last == "_") || ($current == "_")} {
             # puts "$html_frag($last$current)"
             return $html_frag($last$current)
         }

         set result ""
         set lookup "$last$current"
         switch -exact -- $lookup {
             EE {
                 set result "<br>"
             }
             QQ {
                 set result "\n"
             }
             UU - OO {
                 set result "</li>\n<li>"
             }
             DI - II {
                 set result "<dt>"
             }
             DD {
                 set result "<dd>"
             }
             TL - LL - QL - UL - OL - IL - DL {
                 set result "<br>"
             }
             LT - LQ - LU - LO - LI - LD - LE {
                 set result [htmlFrag $pType $pType $current]
             }
         }

         if {$result == ""} {
             set result "$html_frag(${last}_)$html_frag(_$current)"
         }

         # puts "$result"
         return $result
     }



     #
     # Fragments when entering a paragraph type.
     #
     vs _ T           <p>
     vs _ L            {}
     vs _ E            {}
     vs _ Q         <pre>
     vs _ U      <ul><li>
     vs _ O      <ol><li>
     vs _ I      <dl><dt>
     vs _ D          <dd>
     vs _ H "<hr size=1>"
     vs _ H1         <H1>
     vs _ H2         <H2>
     vs _ H3         <H3>
     vs _ H4         <H4>
     vs _ H5         <H5>
     vs _ H6         <H6>

     #
     # Fragments when leaving a paragraph type.
     #
     vs T _       </p>
     vs L _         {}
     vs E _         {}
     vs Q _     </pre>
     vs U _ </li></ul>
     vs O _ </li></ol>
     vs I _         {}
     vs D _      </dl>
     vs H _         {}
     vs H1 _     </H1>
     vs H2 _     </H2>
     vs H3 _     </H3>
     vs H4 _     </H4>
     vs H5 _     </H5>
     vs H6 _     </H6>

     rename vs {}

     array set html_frag {
         a_ {<a href="}  b0 </b>
         _a {</a>}       b1 <b>
         i_ {<img src="} i0 </i>
         tc {">}         i1 <i>
     } ; # "

     # =========================================================================
     # =========================================================================

     ### Backend renderer                                 :: Stream ==> Refs ###

     # =========================================================================
     # =========================================================================

     # Output specific conversion. Extracts all wiki internal page references
     # from the token stream and returns them as a list of page id's.

     proc StreamToRefs {s ip} {
         array set pages {}

         foreach {mode text} $s {
             if {![string equal $mode g]} {continue}

             set split [SplitTitle $text]
             set title [lindex $split 0]
             set label [lindex $split 1]
             set info [$ip $title]
             foreach {id name date} $info break
             if {$id == ""} {continue}

             regexp {[0-9]+} $id id
             set pages($id) ""
         }

         array names pages
     }

     # Output specific conversion. Extracts all external references
     # from the token stream and returns them as a list of urls.

     proc StreamToUrls {s} {
         array set urls {}
         foreach {mode text} $s {
             if {$mode eq "u"} { set urls($text) imm }
             if {$mode eq "x"} { set urls($text) ref }
         }
         array get urls
     }

 } ;# end of namespace
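
The enter/leave fragment table used by htmlFrag can be tried out on its own. The following is only a minimal sketch, not part of the patch: the names frag and transition are made up for illustration, the table is cut down to four paragraph types, and the special handling of L (soft return) lines is left out. It shows the basic idea that a few same-type transitions get a dedicated fragment, and everything else falls back to "leave the old type, then enter the new one".

```tcl
# Hypothetical, cut-down fragment table (not the one from format.tcl):
# "_X" is emitted when entering paragraph type X, "X_" when leaving it.
array set frag {
    _T <p>        T_ </p>
    _Q <pre>      Q_ </pre>
    _U <ul><li>   U_ </li></ul>
    _E {}         E_ {}
}

# Return the HTML needed to move from paragraph type $last to $current.
# Same-type transitions continue the open element; any other pair
# falls through to the default below the switch.
proc transition {last current} {
    global frag
    switch -exact -- $last$current {
        UU { return "</li>\n<li>" }
        QQ { return "\n" }
        EE { return "<br>" }
    }
    # Default: close the old paragraph type, then open the new one.
    return "$frag(${last}_)$frag(_$current)"
}

puts [transition E T]   ;# <p>
puts [transition T U]   ;# </p><ul><li>
puts [transition U Q]   ;# </li></ul><pre>
```

Keeping the fragments in a table means a new paragraph type only needs one "enter" and one "leave" entry, plus a switch arm if it needs special same-type behaviour.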

web.tcl

 238a239,241
 >         cgi_import_as Action editAction
 >         # Only actually save the page if the user selected "Save"
 >         if {$editAction == "Save"} {
 239a243
 >         }
 339c343
 <             cgi_puts [h2 [Wiki - $N]]
 ---
 >             cgi_puts [h1 [Wiki - $N]]
 351c355,358
 <               submit_button "=  Save  "
 ---
 >               # Create Save and Cancel buttons
 >               submit_button "Action=Save"
 >               cgi_puts " [nbspace] "
 >               submit_button "Action=Cancel"
 376c383
 <             cgi_puts [h2 "References to [Wiki - $N]"]
 ---
 >             cgi_puts [h1 "References to [Wiki - $N]"]
 419c426
 <             if {!$noTitle} { h2 $Title }
 ---
 >             if {!$noTitle} { h1 $Title }
 422c429
 <               isindex
 ---
 >               isindex "prompt=Enter the search phrase.  Append an asterisk (*) to search page contents as well: "

utils.tcl

 199c199
 <           append result "''Older entries omitted...''"
 ---
 >           append result "\n''Older entries omitted...''\n"
 204,205c204,205
 <         append result "'''[clock format $date -gmt 1 \
 <                 -format {%B %e, %Y}]'''\n"
 ---
 >         append result "\n'''[clock format $date -gmt 1 \
 >                 -format {%B %e, %Y}]'''\n\n"

gui.tcl

 183d182
 <       
 186d184
 <       
 334a333,335
 >         font create wikit_h1 -family $family -size [expr $default + 6] -weight bold
 >         font create wikit_h2 -family $family -size [expr $default + 4] -weight bold
 >         font create wikit_h3 -family $family -size [expr $default + 2] -weight bold
 398a400,405
 >         $D tag configure h1 -font wikit_h1 -lmargin1 3 -lmargin2 3 -foreground $Color::linkFg
 >         $D tag configure h2 -font wikit_h2 -lmargin1 3 -lmargin2 3 -foreground $Color::linkFg
 >         $D tag configure h3 -font wikit_h3 -lmargin1 3 -lmargin2 3 -foreground $Color::linkFg
 >         $D tag configure h4 -font wikit_h3 -lmargin1 3 -lmargin2 3 -foreground $Color::linkFg
 >         $D tag configure h5 -font wikit_h3 -lmargin1 3 -lmargin2 3 -foreground $Color::linkFg
 >         $D tag configure h6 -font wikit_h3 -lmargin1 3 -lmargin2 3 -foreground $Color::linkFg
 404a412
 >         $D tag configure li -font wikit_default -lmargin1 30 -lmargin2 30 -tabs 30

03sep03 jcw - I have two comments on this. Main one is that of the three ways in which linking is supported in this wiki, I suggest only adding the "| text" label trick on urls of the form:

  ... [http://blah | blurb] ...

The reason for this is that right now, one always knows what clicking on a link does:

  1. if it looks like an url, that's where it goes
  2. if it has no brackets, hyperlinks are always within this wiki
  3. if it shows in brackets, then you cannot assume the link is local

The label rules you're introducing break this model. I'd rather not introduce the ability to have a link show as a page title which turns out not to be one, nor for example to click on what looks like an url, but ends up being a completely different one. Hence my suggestion to only allow labeling bracketed external urls.

My other comment is about the use of pipe as separator. Sounds like a good choice, but to break fewer cases of current URLs having "|" in them, I suggest splitting on "<space> | <space>", and taking the very last such occurrence. Code to do this could be:

    # Split "url | tag" into two parts, if they exist.
    # Based on code written by AKG, see https://wiki.tcl-lang.org/9733

    proc SplitTitle {text} {
      if {[regexp {(.*)\s\|\s(.*)} $text - a b]} {
        set a [string trim $a]
        set b [string trim $b]
        if {$a ne "" && $b ne ""} {
          return [list $a $b]
        }
      }
      return [list $text $text]
    }
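
To see what "the very last such occurrence" means in practice, here is a small stand-alone check. It is only a sketch: the input strings are invented examples, and the copy of SplitTitle below passes $a and $b (the variables, not the literal letters) to string trim. The greedy first (.*) is what makes the split happen at the last " | ".

```tcl
# Stand-alone copy of SplitTitle for experimentation.
proc SplitTitle {text} {
    if {[regexp {(.*)\s\|\s(.*)} $text - a b]} {
        set a [string trim $a]
        set b [string trim $b]
        if {$a ne "" && $b ne ""} {
            return [list $a $b]
        }
    }
    # No usable separator: the text serves as both target and label.
    return [list $text $text]
}

# Invented sample inputs: a labelled url, a url containing a bare "|",
# a text with two separators, and a plain page title.
foreach input {
    "http://your.site/contents.html | My site"
    "http://your.site/a|b.html"
    "http://your.site/x | y | Final label"
    "Some Page"
} {
    puts [SplitTitle $input]
}
```

The second input is returned unchanged because its "|" has no surrounding spaces, and the third splits into "http://your.site/x | y" and "Final label".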

If someone wants to pursue this and figure out how to make the changes in format.tcl, please do... I'll be happy to integrate such a patch.

09sep03 CM - I have a patch to offer, but it does not implement the same spec :-). I have added label support only for "http://" type URLs (i.e., unbracketed), but I fully agree with your remark that it should be possible to distinguish between a wiki page and an external link. I have added some CSS styling to make this distinction (different colors and a small icon for labelled external links), but the idea of surrounding them with brackets like the numbered pages is actually quite nice too (my code would need to be fixed for this though). I also prefer an "@"-separated syntax, because on other wikis we see a mix of "url | label" and "label | url" conventions and people tend to get confused about which order to use. We've been using the "@" convention internally for quite some time now and people get accustomed to it easily. Also, while labels for URLs seem "nice to have", I do not see the need for labels on wiki pages: that would mean the page name is broken, and seems quite un-wiki to me.. :-)

JC, let me know if you are interested in some of the features I have been porting from my very old modified wikit-impl (from 2001) to the newest code. I detailed the list of ports in Ideas for Wikit enhancements in early August. I am sorry not to have patches available feature by feature, as I basically dedicated some time this summer to getting as much as I could into the current code. Also, I do not expect all my experiments to be accepted into the official wikit, but tell me if some features appeal to you.