Version 1 of Converting the Tcl tutorial to Jupyter notebooks

Updated 2020-07-20 19:46:38 by jos

Arjen Markus (19 July 2020) With the realisation of the tcljupyter kernel for creating and working with Jupyter notebooks for Tcl, it became possible to use that technology to make the Tcl tutorial interactive. Of course, you do not want to do this manually, so I wrote a small program to convert the Wiki text to Jupyter notebooks.

It is not perfect yet and there are a number of things you need to be aware of: the conversion may be automated, but there is still work to be done to make sure that the notebooks work as intended. For instance, a notebook runs in a single interpreter that keeps its state, and the Tcl code is executed as is. An innocent statement in the tutorial like:

    puts $varName

is meant to illustrate the syntax, but when it is run in the notebook, it produces an error message because the variable has not been set. So either turn such a fragment into plain, non-executable code or provide a value for the variable varName.
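The effect can be reproduced in any Tcl interpreter. The fragment below is only a sketch: the variable name undefinedName and the value assigned to varName are made up for the illustration, and the "cells" are simply consecutive pieces of code evaluated in the same interpreter, which is the situation described above.

```tcl
# "Cell 1": give the variable a value, so that the statement can be run.
# (The value is hypothetical, it only serves to make the example executable.)
set varName "Hello, notebook"
puts $varName

# "Cell 2": the interpreter keeps its state, so varName is still known here.
puts "varName is still set: [info exists varName]"

# "Cell 3": a variable that was never set causes an error.
# catch is used here so that the error message can be inspected.
set status [catch { puts $undefinedName } message]
puts "status = $status, message = $message"
```

The last line should report a status of 1 and a message saying that there is no such variable, which is essentially what the notebook shows when the unmodified tutorial fragment is run.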

Anyway, the conversion is on its way - I fully intend to keep the Wiki text pages as the one and only source and have the conversion done automatically when necessary.

Here is the script so far:

# convjson.tcl --
#     Attempt to convert the Wiki tutorial to JSON-styled Jupyter notebooks
#
#     Conventions:
#     ======tcl introduces a code block that should be run
#     ======none introduces output from a code block and will be skipped
#     ====== is treated as any text that needs to be shown "as is" (a trifle ambiguous, but the Wiki offers no further choice)
#
#     Note that this is used in the tutorial pages: the "======" blocks are shown as Tcl code in the Wiki, but show up as
#     plain preformatted text in the Jupyter notebooks.
#
#     TODO:
#     - get the one picture to show up correctly
#     - references
#     - get rid of the extra newline between blocks
#

# textToMarkdown --
#     Convert Wiki meta-characters to Markdown
#
# Arguments:
#     string          String to be treated
#
# Result:
#     String with converted meta-characters
#
# Note:
#     Hyperlinks to other pages in the tutorial require special care
#
proc textToMarkdown {string} {

    #
    # Check for a table header ...
    #
    if { [string first "%|" $string] >= 0 && [string first "%|%" $string] < 0 } {
        set string  [string map {% ""} $string]
        set string2 [regsub -all {\|[^|]+} $string {|---}]

        return "$string\\n\",\n   \"$string2"
    }

    #
    # Handle links
    #
    set string [string map {\[\[ ` \]\] `} $string] ;# Avoid [[ and ]] causing an endless loop

    while { [string first "\[" $string] >= 0 } {
        set pos1 [string first "\[" $string]
        set pos2 [string first "\]" $string $pos1]

        # Stop at an unmatched "[" - otherwise the loop would never end
        if { $pos2 < 0 } {
            break
        }

        set substring [string range $string $pos1 $pos2]
        set subtext   ""

        if { [string first "%|%" $substring] >= 0 } {
            set post1     [string first "%|%" $substring]
            set subtext   [string range $substring [expr {$post1 + 3}] end-4]
            set substring [string range $substring 1 [expr {$post1 - 1}]]

            if { [string first "Tcl Tutorial Lesson" $substring] >= 0 } {
                set substring "Tcl[lindex $substring end].ipynb"
            }
            if { [string first "Tcl Tutorial Index" $substring] >= 0 } {
                set substring "Tcl0.ipynb"
            }
        } else {
            # No URL given - construct the URL from the title
            set subtext   [string range $substring 1 end-1]
            set substring "http://wiki.tcl-lang.org/[string map {" " +} $subtext]"
        }
        #
        # Construct the link
        #
        set string "[string range $string 0 [expr {$pos1-1}]]~$subtext^($substring)[string range $string [expr {$pos2+1}] end]"
    }

    return [string map {\" \\\" ''' * ~ \[ ^ \] '' * \\ \\\\} $string]
}

# codeToMarkdown --
#     Convert double quotes to Markdown - escape
#
# Arguments:
#     string          String to be treated
#
# Result:
#     String with converted meta-characters
#
proc codeToMarkdown {string} {
    return [string map {\" \\\" \\ \\\\} $string]
}


set input [lindex $argv 0]
if { $input eq "" } {
    puts "Usage: $argv0 name-of-wiki-file (no extension)"
    exit
}

set infile   [open $input.wiki]
set outfile  [open $input.ipynb w]
set codefile [open $input.tcl w]

puts $outfile "{"
puts $outfile " \"cells\": \["

set code      0
set text      1
set start     1
set copycode  0
set starttext 0

while { [gets $infile line] >= 0 } {
    puts ">>$line"
    #
    # Replace tables by lines
    #
    if { [string range $line 0 3] eq "!!!!" } {
        puts $outfile "    \"_ _ _ _ _\\n\","
        continue
    }
    if { $line eq "----" } {
        puts $outfile "    \"\\n\","
        puts $outfile "    \"_ _ _ _ _\\n\","
        continue
    }

    #
    # Skip discussion and subsequent output example
    #
    if { [string first "discussion>>" $line] > 0 } {
        continue
    }
    if { $line eq "======none" } {
        while { [gets $infile line] >= 0 } {
            if { $line eq "======"} {
                break
            }
        }
        continue
    }

    #
    # Convert headers
    #
    if { [string range $line 0 2] eq "***" } {
        set line "## [string trim $line *] "
    }

    if { [string range $line 0 1] eq "**" } {
        set line "# [string trim $line *] "
    }

    #
    # Introduce code ...
    #
    if { $line eq "======" || $line eq "======tcl" } {
        if { $text } {
            puts $outfile "  \"\\n\""
            puts $outfile "  \]"
            puts $outfile " \},"
        }

        set text 0
        set code [expr {!$code}]

        if { ! $code } {
            set text 1
            set start 1
            puts $outfile "  \"\\n\""
            puts $outfile "  \]"
            puts $outfile " \},"
            continue
        } else {
            set start 1
        }

        if { $line eq "======tcl" } {
            set copycode  1
            set prefix    ""
            set starttext 0
        } else {
            set copycode  0
            set prefix    "    "
            set starttext 1
        }
    }

    #
    # Write a text cell ...
    #
    if { ($text && $start) || ($code && $starttext) } {
        set start     0
        puts $outfile " \{"
        puts $outfile "  \"cell_type\": \"markdown\","
        puts $outfile "  \"metadata\": {},"
        puts $outfile "  \"source\": \["
        if { $starttext } {
            set starttext 0
            continue
        }
    }

    #
    # Write a code cell ...
    #
    if { $code && $start && ! $starttext } {
        set start 0
        puts $outfile " \{"
        puts $outfile "  \"cell_type\": \"code\","
        puts $outfile "  \"metadata\": {},"
        puts $outfile "  \"execution_count\": null,"
        puts $outfile "  \"outputs\": \[\],"
        puts $outfile "  \"source\": \["
        continue
    }

    if { $text } {
        puts $outfile "   \"[textToMarkdown $line]\\n\","
    } else {
        puts $outfile "   \"$prefix[codeToMarkdown $line]\\n\","
        if { $copycode } {
            puts $codefile "$line"
        }
    }
}

#
# Close it
#
puts $outfile "   \"\\n\""
puts $outfile "   \]"
puts $outfile "  \}"
puts $outfile " \],"
puts -nonewline $outfile \
{ "metadata": {
  "kernelspec": {
   "display_name": "Tcl",
   "language": "tcl",
   "name": "tcl_kernel-0.0.1"
  },
  "language_info": {
   "file_extension": ".tcl",
   "mimetype": "txt/x-tcl",
   "name": "tcl",
   "version": "8.6.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
puts $outfile "\}"

close $infile
close $outfile
close $codefile
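
A note on the escaping done by codeToMarkdown: it takes care of double quotes and backslashes, but JSON also requires control characters such as tabs to be escaped inside a string. The procedure below is a minimal sketch of a slightly more complete escape. The name jsonEscape is hypothetical and it is not part of the script above; it handles only the control characters most likely to turn up in tutorial code.

```tcl
# jsonEscape --
#     Escape a string so that it can be put between double quotes in JSON
#     (hypothetical helper, a sketch only)
#
# Arguments:
#     string          String to be treated
#
# Result:
#     String with backslashes, double quotes, tabs, carriage returns
#     and newlines escaped
#
# Note:
#     Other control characters would need a \uXXXX escape, which is
#     not handled here
#
proc jsonEscape {string} {
    return [string map [list \\ \\\\ \" \\\" \t \\t \r \\r \n \\n] $string]
}

# The tab between a and b becomes the two characters \t
puts [jsonEscape "puts \"a\tb\""]
# prints: puts \"a\tb\"
```

Whether this is needed depends on the Wiki text: as long as the code blocks contain no tabs, the existing procedure is sufficient.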