Wibble wish list

AMG: This page lists to-dos, feature requests, wishes, etc. for the Wibble web server.


Current wishes

Character encoding

AMG: Wibble currently doesn't support character encodings (neither Accept-Charset: nor Content-Type: charset); it's hard-coded to only use ISO8859-1. It should at least support UTF-8.

I think the zone handlers should be able to encode their responses however they like, so long as they correctly identify the encoding, then a general facility can convert encodings if necessary. Or I could standardize on UTF-8, but this becomes a problem when serving files from disk which may not be UTF-8. If the client will accept ISO8859-1 and the file is in ISO8859-1, it doesn't make sense to read the file, convert it to UTF-8, convert it back to ISO8859-1, then send it. Just use chan copy.

By the way: I don't like how HTTP crams charset into the Content-Type: header; I'd rather it have a separate header, so that I can parse and update it more easily. There's a lot not to like about HTTP. :^)

AMG, 2010-11-06: I enhanced header parsing so that charset and other parameters are available as dict entries. Nothing actually uses charset yet, but this was a necessary and very difficult first step.

AMG, 2011-04-25: But more is required. Not only does Wibble need to know how to interpret the text coming from the client, it also needs to know how to convert its outgoing text into the character set to be sent to the client. This means that the [process] command will have to peek at the response header content-type charset, then use the [encoding] command to do the conversion. The trouble there is that the response headers are currently HTTP-formatted, which (as far as I can tell) is designed to be difficult to parse. Instead I'd like for the response headers generated by the zone handlers to be formatted in the same way as the request headers, so that Wibble can easily get at that data, charset in particular. An [enheader] command (described below [L1 ]) will be used by [process] to convert the response dictionary to HTTP immediately prior to sending to the client.

AMG, 2011-05-09: Duh! I can just use [chan configure -encoding] to set the encoding when reading from and writing to the socket. Now I need to make a table mapping between HTTP charsets and Tcl encodings.
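
The charset-to-encoding table mentioned above could start out as something like the following. This is only a sketch: the proc name charset2encoding and the choice of aliases are assumptions made for illustration, not part of Wibble, and the table is nowhere near a complete IANA charset list.

```tcl
# Hypothetical helper (not part of Wibble): translate an HTTP charset
# name into a Tcl encoding name, or return "" if Tcl has no match.
proc charset2encoding {charset} {
    set name [string tolower [string trim $charset]]
    # A few hand-picked aliases; extend as needed.
    set aliases {us-ascii ascii latin1 iso8859-1 shift_jis shiftjis}
    if {[dict exists $aliases $name]} {
        set name [dict get $aliases $name]
    } elseif {[regexp {^iso-8859-(\d+)$} $name -> n]} {
        # iso-8859-N is spelled iso8859-N in Tcl.
        set name iso8859-$n
    } elseif {[regexp {^windows-(\d+)$} $name -> n]} {
        # windows-NNNN is spelled cpNNNN in Tcl.
        set name cp$n
    }
    # Only report encodings this Tcl installation actually provides.
    if {$name in [encoding names]} {
        return $name
    }
    return ""
}
```

The result could then be handed to [chan configure $socket -encoding ...]; an empty result would mean falling back to some default or refusing the charset.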

Cross site requests

dzach: Here is another one: Wibble at present throws an error if it receives a cross-site request [L2 ], [L3 ] with a "preflighted" OPTIONS method. A minimal cross-site request implementation would silently accept the OPTIONS method (and the other HTTP/1.1 methods Wibble does not currently handle) instead of throwing the error, and would present a simple cross-site request interface to the browser (i.e. handle the Origin and Access-Control-Allow-Origin headers). This could possibly be done using the rawheader field in the request data.
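
As a rough illustration of the header logic such an interface would need, here is a self-contained sketch. The proc name corsheaders, its arguments, and the particular methods and max-age it advertises are all assumptions; it does not touch Wibble's request or response dictionaries, and wiring it into a zone handler is left open.

```tcl
# Hypothetical helper (not part of Wibble): compute the cross-site
# response headers for one request. $allowed is a list of origins
# that are permitted to make cross-site requests.
proc corsheaders {method origin allowed} {
    # No Origin header, or an origin we don't trust: add nothing.
    if {$origin eq "" || $origin ni $allowed} {
        return {}
    }
    set headers [dict create \
        Access-Control-Allow-Origin $origin \
        Vary Origin]
    # A preflight arrives as an OPTIONS request; answer it with the
    # permitted methods and headers instead of raising an error.
    if {$method eq "OPTIONS"} {
        dict set headers Access-Control-Allow-Methods "GET, POST, OPTIONS"
        dict set headers Access-Control-Allow-Headers Content-Type
        dict set headers Access-Control-Max-Age 600
    }
    return $headers
}
```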

Zone handlers

AMG: I should write more zone handlers for sessions, Content-Type:, etc. This page is getting fairly long, so they'll be collected on a separate page of this Wiki. Also the current zone handlers are very bare-bones, particularly dirlist, and maybe could stand to be improved. Originally I wrote them to be as short and simple as possible so that they clearly demonstrate the concept without being cluttered with features.

AMG, 2010-11-06: Again, see Wibble zone handlers. There's not much on it at this point, but hopefully that will change.

Caching

AMG: I'm not necessarily thinking of caching entire pages; I wonder if I can be a bit more general than that. Perhaps I cache zone handler outcomes, using the incoming request as part of the key. (See memoizing.)

One tricky part is identifying the rest of the key, that is to say, all other inputs that play a part in determining the outcome. A closely related issue is identifying the irrelevant parts of the request. For example, if the socket name affects the outcome, there's no point in caching, since that changes all the time. Except for debugging, I can't think of a reason why the socket name would matter. Likewise a dependency on the peer's network name would impede caching, and sometimes that dependency exists legitimately.
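
To make the "rest of the key" problem concrete, here is a toy memoizer in which the caller names the request keys that may influence the outcome; everything else (such as the socket name) is left out of the cache key. The names cachedoutcome, outcomecache, and slowhandler are invented for this sketch, and the request dictionaries are simplified stand-ins for what Wibble actually passes around.

```tcl
# Hypothetical memoizer. $relevant lists the request keys (as key
# paths) that are allowed to influence the outcome.
set outcomecache [dict create]

proc cachedoutcome {handler request relevant} {
    global outcomecache
    # Build the cache key from the handler plus the relevant inputs.
    set key [list $handler]
    foreach path $relevant {
        if {[dict exists $request {*}$path]} {
            lappend key $path [dict get $request {*}$path]
        }
    }
    # Run the handler only on a cache miss.
    if {![dict exists $outcomecache $key]} {
        dict set outcomecache $key [{*}$handler $request]
    }
    return [dict get $outcomecache $key]
}

# Demonstration handler that counts how often it really runs.
set runs 0
proc slowhandler {request} {
    incr ::runs
    return "page for [dict get $request uri]"
}

# Two requests that differ only in the (irrelevant) socket name.
set r1 {socket sock5 uri /index.html header {accept-language en}}
set r2 {socket sock9 uri /index.html header {accept-language en}}
cachedoutcome slowhandler $r1 {uri {header accept-language}}
cachedoutcome slowhandler $r2 {uri {header accept-language}}
puts $runs   ;# prints 1: the second request was served from the cache
```

If socket were added to the list of relevant keys, every request would miss, which is the "no point in caching" case described above.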

I have to be very careful with this one, because caching can incur overhead that exceeds the cost of recalculating from scratch every time. In my line of work (real-time simulation), this is nearly always the case. For that reason I don't want to experiment with caching until performance becomes a problem and other optimizations cease to bear fruit.

AMG, 2010-11-06: [wibble::template] now caches the scripts it generates.

AMG, 2011-05-09: I removed the template caching some time later. I didn't like Wibble writing into the docroot.

AMG, 2011-05-13: Cache control is a separate issue which requires thought.

sdw, 2012-01-02: You could try the approach Tclhttpd uses. It looks at the mod time of the template vs. the generated static file. If the template is newer, the html file is replaced. Tclhttpd also defines a command "Doc_Dynamic" which, if called from within the template, prevents the static html page from being generated.

AMG: Wibble used to do something similar. It cached tmpl files as script files. I removed this feature because I was uncomfortable with Wibble writing into the docroot. I could add it back, with the ability to customize where the cached files are stored, plus the ability to cache the output of the script files. It should be possible to configure staticfile to look in multiple places for the file, simply by putting staticfile into the zone handler list more than once, each time with a different docroot.

The larger issue is dependency tracking. Generated pages depend on more than their source scripts or templates. To do this right, the script or template would need to be able to declare dependencies, whose timestamps and/or checksums would have to be stored alongside the cached file. This is tricky, hence I left it out of Wibble for now. Wibble's goals change over time (it's transitioning from being educational to being practical), so I may add this someday.
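
A timestamp-only version of that check might look like the sketch below. The proc name isstale is hypothetical, the list of dependencies is assumed to be declared somehow by the script or template, and checksums are not covered at all.

```tcl
# Hypothetical staleness test: the cached file must be regenerated if
# it is missing, or if any declared source (template, included file,
# data file, ...) is missing or newer than the cached file.
proc isstale {cached sources} {
    if {![file exists $cached]} {
        return 1
    }
    set built [file mtime $cached]
    foreach source $sources {
        if {![file exists $source] || [file mtime $source] > $built} {
            return 1
        }
    }
    return 0
}
```

With a single-element source list this reduces to the Tclhttpd-style comparison of template and generated file modification times.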

Reviewing my original comments... it may be useful to cache the outcome of the zone handler search system, rather than (or in addition to) simply caching files. Not sure. Profiling is necessary to see if this is worthwhile.

Virtual hosts

AMG: (I don't really have a use for this at the moment, but I'll add it to the list while I'm thinking about it.) The Host: HTTP header can be used to serve multiple "sites" from a single server process listening on a single IP address. Wibble can do this by prepending the hostname to the zone name. But I want this to be optional, with the ability to script the mapping from hostname to zone prefix. I mean, it would suck to require that the entire zone handler table be replicated for each of "example.com" and "web.example.com" and "www.example.com" and "example.com." and every other variant name of the same site. Also it doesn't make sense to pay for this feature when it's not needed.

sdw: Actually... the ability to run multiple chains of handlers would be very useful for one application I have. The Linode running www.etoyoc.com also runs a bevy of other websites: my website, one for a folk festival I operate, one for my wife's business, another for the podcast of a friend, and if I could get away with it, I would hand off the hosting of several fossil repositories through Fossil's CGI interface.

Instead of defining a new handler chain for each unique host name, why not devise a decision that Wibble makes before domain handling, allowing it to pick from a pattern list to determine which chain responds?

  wibble::zone_match etoyoc.org etoyoc.org
  wibble::zone_match etoyoc.com etoyoc.com
  wibble::zone_match etoyoc.com www.etoyoc.com
  wibble::zone_match etoyoc.com *.etoyoc.net
  wibble::zone_match hypnotoad yoda.etoyoc.com
  wibble::zone_match hypnotoad *.seandeelywoods.com
  wibble::zone_match folkfest camping.etoyoc.com
  wibble::zone_match folkfest pff.etoyoc.com

  # By default, we use the fallback "no zone given" handler
  #
  set root /opt/httpd/sites/etoyoc.com
  ::wibble::handle /vars vars
  ::wibble::handle / dirslash root $root
  ::wibble::handle / indexfile root $root indexfile index.html
  ... 

  # To define the handlers for a virtual host, we signal that
  # we are using a different zone
  #
  set root /opt/httpd/sites/etoyoc.org
  ::wibble::zone_set etoyoc.org
  ::wibble::handle /vars vars
  ::wibble::handle / dirslash root $root
  ::wibble::handle / indexfile root $root indexfile index.html

AMG: This can work, though I have a few other ideas which are similar. I'll explore later, when I have some free time. For now I'll say that it may be simpler for [zone_set] to take a list of virtual host name match patterns, rather than accept a name which is itself an index into another table. Also, is [zone_set] really the name you want? I think [vhost_set] would be better.
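
One way to picture the "list of match patterns" idea is a lookup that maps the Host: header straight to a zone prefix. Everything here is an assumption made for illustration: the proc name vhostprefix, the flat {patterns prefix} table layout, and the normalization rules are not an existing Wibble API.

```tcl
# Hypothetical lookup: $table is a flat list of {patterns prefix}
# pairs, checked in order; the first matching glob pattern wins.
proc vhostprefix {table host} {
    # Normalize: lower case, drop any :port and any trailing dot.
    set host [string tolower $host]
    regsub {:\d+$} $host {} host
    set host [string trimright $host .]
    foreach {patterns prefix} $table {
        foreach pattern $patterns {
            if {[string match $pattern $host]} {
                return $prefix
            }
        }
    }
    # No match: the caller falls back to the default zone handlers.
    return ""
}

set vhosts {
    {etoyoc.org}                             etoyoc.org
    {etoyoc.com www.etoyoc.com *.etoyoc.net} etoyoc.com
    {yoda.etoyoc.com *.seandeelywoods.com}   hypnotoad
    {camping.etoyoc.com pff.etoyoc.com}      folkfest
}
puts [vhostprefix $vhosts WWW.Etoyoc.com:8080]   ;# prints etoyoc.com
puts [vhostprefix $vhosts pff.etoyoc.com.]       ;# prints folkfest
```

Normalizing the host name first means variants such as "example.com." or mixed-case spellings don't need their own table entries.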

Gzipped payload

dzach: I use the following snippet inside ::wibble::process, copied from CMcC's code at the end of gzip:

 } elseif {[dict exists $response header content-encoding] && 
   [dict get $response header content-encoding] eq "gzip"
 } {
   set gzip [binary format "H*iH*" "1f8b0800" [clock seconds] "0003"]
   set content [dict get $response content]
   append gzip [zlib deflate $content]
   append gzip [binary format i [zlib crc32 $content]]
   append gzip [binary format i [string length $content]]
   set response [dict merge $response { 
     header {Vary Accept-Encoding Accept-Ranges bytes}
   }]
   dict set response content $gzip
   set size [string length [dict get $response content]]

 } elseif {[dict exists $response content]} {

To use it, just add a content-encoding gzip header to the response, e.g.:

  dict set response header content-encoding "gzip"

It requires the zlib package and costs only a negligible number of milliseconds. It would be nice to add gzip capability to Wibble.

AMG: This will only work when content is defined in the response dictionary. If contentfile or contentchan are used instead, there needs to be some kind of channel stacking mechanism so that the data can be gzipped during streaming.

AMG, 2011-05-09: Oh hey, there's [zlib push gzip]! That should do the trick.

AMG, 2011-05-13: Well, not quite. Streaming from a contentfile or contentchan precludes calculation of the compressed size, even when the uncompressed size is known in advance. This suggests two modes of operation: buffered and streamed. Buffered mode behaves like sending straight from the content key, as in the above code snippet: the entire file is read into a buffer and compressed before anything is sent to the client. Streamed mode compresses during sending, but the content-length response header isn't sent and the range request header isn't honored. Now, how would Wibble know which mode to use?
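
The two modes can be sketched with Tcl 8.6's built-in zlib command. The proc names gzipbuffered and gzipstreamed are made up for this example, and the streamed version uses a blocking [chan copy] for brevity, which a coroutine-based server would want to replace with the -command form.

```tcl
# Buffered mode: compress the whole body first, so the compressed
# size is known and a content-length header can be sent.
# Returns a two-element list: compressed body and its size in bytes.
proc gzipbuffered {content} {
    set body [zlib gzip $content]
    return [list $body [string length $body]]
}

# Streamed mode: stack a gzip transform on the outgoing channel and
# copy through it. The compressed size is not known in advance.
proc gzipstreamed {in out} {
    chan configure $in -translation binary
    chan configure $out -translation binary
    zlib push gzip $out
    chan copy $in $out
    # Closing $out afterwards flushes the transform and writes the
    # gzip trailer (CRC and uncompressed length).
}
```

Note that [zlib gzip] produces the gzip header and trailer itself, so the buffered case does not need the hand-assembled header from the snippet above.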

CGI, FastCGI

AMG: It should be possible to implement CGI and FastCGI as zone handlers.

Others?

AMG: Insert feature request here.


Granted wishes

Connection: Close

dzach 2011-3-30 I think wibble has to honor header "Connection: close" when requested and close the socket immediately. This happens e.g. when wibble is used as a proxy behind another server which only speaks HTTP/1.0 to the backend, in my case nginx [L4 ] (with a proxy module), and inserts the above header into the stream. Unless wibble closes the socket, nginx keeps the client connection alive (this side speaks HTTP/1.1) without sending anything and then times out.

A fast workaround is to replace in ::wibble::process:

  # Flush the outgoing buffer.
  chan flush $socket

with:

  # Flush the outgoing buffer.
  if {[dict exists $request header connection] &&
    [dict get $request header connection] eq "close"} {
      chan close $socket
      break
  } else {
    chan flush $socket
  }

AMG: Fast workaround? That's more like a complete solution! I dropped it straight into my development version of Wibble, and it worked perfectly. It fixed a bug I just noticed when streaming music to Winamp: Winamp hangs at the end of each song, waiting for the server to close the connection.

AMG: I wound up implementing this a bit differently. Now [defaultsend] returns false on Connection: Close, which causes [process] to terminate the connection. A custom sendcommand should do similar.
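
For a custom sendcommand, the test itself might be factored out like this. The proc name wantsclose is hypothetical; the request layout (header, then connection) follows the workaround above. The lower-casing is an addition, on the grounds that HTTP connection options are case-insensitive, so "Close" and "close" should be treated alike.

```tcl
# Hypothetical helper: true if the client asked for the connection to
# be closed after this response.
proc wantsclose {request} {
    expr {[dict exists $request header connection]
        && [string tolower [dict get $request header connection]] eq "close"}
}
```

A sendcommand could then finish with something like "return [expr {![wantsclose $request]}]" so that [process] terminates the connection when asked to.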

Content type

AMG: The Wibble static file server doesn't send Content-Type:, so the client has to guess. (Internet Explorer second-guesses the text/plain Content-Type: anyway.) I think a contenttype zone handler might be able to fill the gap, but I'm not sure what heuristics it should use to detect the type.

makr (2009-11-19): A pure-Tcl solution could employ fumagic (fileutil::magic). tmag on the other hand is an extension built on top of libmagic (only available for Unix so far).

AMG: Sounds good, thanks for the pointer. So that's two possible approaches, and I can think of a third one: file extension matching. Since zones stack, it's possible (but untested) to override contenttype (or its arguments) for individual files, by declaring single-file zones. Anyway, I might try implementing all three, once I have made a separate Wiki page for extra zone handlers.

AMG, 2010-11-06: Now I have the separate Wiki page! See Wibble zone handlers.

AMG, 2011-04-25: I made a cheesy content-type zone handler that operates by matching the file extension present in the request URI. It does this for all requests (in its zone), even those that ultimately turn out to be 404'ed, etc. It simply sets the content-type response header and passes on to the next zone handler. The trouble is that the current published version of Wibble has a bug that causes the zone handlers to ignore an inherited response dictionary. It's a really easy fix, but I haven't gotten around to publishing it yet. I was hoping to move to Fossil before the next published version (see [L5 ]), but I'm not getting the support I need in order to make that happen. Eventually I'll just bite the bullet and make another release on this wiki, even though that means more work for me in the future when (if) I move to a "proper" revision control system.

AMG: Published! Wow, I held on to that fix far too long.
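
For readers who want to see the file-extension approach in isolation, here is a minimal sketch. It is not the published zone handler: the proc name guesstype, the short type table, and the default type are assumptions, and it works on a bare URI string rather than on Wibble's state dictionary.

```tcl
# Hypothetical extension-based lookup. Returns a MIME type for the
# URI, or $default when the extension is missing or unknown.
proc guesstype {uri {default application/octet-stream}} {
    set types {
        .html text/html   .htm  text/html   .txt  text/plain
        .css  text/css    .js   application/javascript
        .png  image/png   .jpg  image/jpeg  .jpeg image/jpeg
        .gif  image/gif   .mp3  audio/mpeg
    }
    # Ignore any query string, then look at the final extension.
    set path [lindex [split $uri ?] 0]
    set ext [string tolower [file extension $path]]
    if {[dict exists $types $ext]} {
        return [dict get $types $ext]
    }
    return $default
}
```

Like the handler described above, this looks only at the name, so it will happily report a type for a URI that later turns out to be a 404.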

WebSocket

AMG: I have yet to research the WebSocket page written by AGB. It may be worth incorporating into Wibble, or at least updating to match the latest version. I want to see how much of it can be made into zone handlers and how much of it requires modifications to the core.

jbr - 20110430 - Andy, there are notes on the WebSocket page describing a patch I made to allow Wibble to release the socket so that it can be used for the direct client-server connection. Can you add an official API that would allow this? Thanks.

AMG: Here's some code you should be able to add to your application that will do what you want, without the need to modify Wibble itself. I recognize that it's kind of a hack, but it should be good enough until I properly integrate WebSockets with Wibble. Sorry, I can't completely test it; I don't have any browsers that support WebSockets.

proc ::wibble::abortclient {} {
    upvar #2 cleanup cleanup
    set cleanup [lsearch -exact -all -inline -not $cleanup {chan close $socket}]
    return -code 7
}

This removes the socket close call from the current coroutine's list of cleanup scripts, then it returns with a weird error code that's not caught anywhere. I don't believe there's a need for an explicit handler for code 7, since it's not necessary to set a $keepalive variable.

Please let me know if this does what you need.

jcw 2011-05-01 - Had to tweak it a little, but this works:

proc ::wibble::abortclient {} {
  upvar #1 cleanup cleanup
  set cleanup [lsearch -exact -all -inline -not $cleanup {chan close $socket}]
  return -code 6 {status 200}
}

Idea: if cleanup were a dict, then you could clean up things in it using some nice mnemonic tag.

AMG, 2011-05-09: I have a completely different idea for WebSockets and Server-Sent Events which I have yet to implement. It's very simple though. Look for it the next time I have free time... this Friday perhaps?

AMG, 2011-05-13: Well, I implemented it. I don't think it's quite ready for release yet, needs testing. I added an optional "sendcommand" key to the response dictionary. If set, its value is used as a command prefix to send the response to the client. The socket, request dictionary, and response dictionary are appended as arguments. After doing any custom I/O that it wants, the sendcommand must return true if processing is to continue and false if the connection is to be closed. It's free to loop, read, write, open additional sockets, whatever it wishes. Since it runs inside a coroutine, it shouldn't do any blocking operations directly; that's what [getline], [getblock], and [icc] are for.
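
To illustrate the calling convention just described, here is a toy sendcommand. The proc name sendtext and its leading text argument are invented for this sketch; the socket, request, and response arguments are the ones Wibble is described as appending to the command prefix. It writes directly with [chan puts], which is acceptable for a tiny response but is not how a long-running sendcommand should do its I/O.

```tcl
# Hypothetical sendcommand: send a short plain-text reply by hand.
proc sendtext {text socket request response} {
    # Encode first so that the length is counted in bytes.
    set body [encoding convertto utf-8 $text]
    chan configure $socket -translation binary
    chan puts -nonewline $socket "HTTP/1.1 200 OK\r\n"
    chan puts -nonewline $socket "Content-Type: text/plain; charset=utf-8\r\n"
    chan puts -nonewline $socket "Content-Length: [string length $body]\r\n"
    chan puts -nonewline $socket "Connection: close\r\n\r\n"
    chan puts -nonewline $socket $body
    chan flush $socket
    # False: ask for the connection to be closed afterwards.
    return 0
}

# A zone handler would presumably install it as a command prefix:
#   dict set response sendcommand [list sendtext "Hello, world!"]
```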

As for WebSockets, I had a look at the Wikipedia page [L6 ] and was quite put off by it. The protocol doesn't seem to be quite ready. There have been incompatible changes, more changes are likely to come, and some browsers that implemented WebSockets removed or disabled support because of security concerns. This isn't to say that I won't support WebSockets nor that I will never implement it, only that I'm not going to do it today. ;^) This new sendcommand feature ought to make it possible for anyone to add WebSocket support, or Server-Sent Events for that matter.

AMG: Published! See WebSocket for the code. It's not integral to Wibble, but you can add it to your application. There's a custom zone handler to link Wibble to the new WebSocket code which does initial handshaking and other protocol junk, then chains to your custom WebSocket application on various events.

User-defined error handling

jcw 2011-03-25 - It would be nice to be able to override the error handler in some standard way one day. The current ::wibble::process always generates a stack-trace, but sometimes you just want to redirect to a normal page, with perhaps an error box inserted (as well as logging and/or emailing the techie-info). Perhaps errors could be "caught" by a specially-marked zone handler?

AMG: The zone handler approach sounds interesting. The error handler exists outside of the zone handler loop (the I/O loop too), since it's there to catch my coding errors more than anything else, and coding errors can happen anywhere! I don't see a good way to reenter the loop and try again. Maybe instead have two levels of error handling, one to catch Wibble coding errors and one to catch zone errors. Then the error information can be placed in a special "error" zone (no leading slash means that it's distinct from all possible URIs). What do you think?

jcw - I can't quite wrap my mind around the different trade-offs, to be honest. Ideally, you'd want to be able to catch errors in different ways, i.e. various zone handlers may need specific error handling (an Ajax zone would not return an error page, for example). Then again, maybe the obvious solution is also the most flexible for such cases: try / on-error within the zone handlers themselves. Hm - I think a single settable handler might then be sufficient after all, perhaps simply a ::wibble::onerror proc to set the error handler? Or even simpler: define and call ::wibble::onerror - anyone can then redefine it, in the same way as ::wibble::log works.

AMG: That last sentence sounds good to me. If you like, you can make your [onerror] call selected zone handler procs directly, e.g. to put standard headers and footers on the page. Since you're customizing the code, and you want to have a simple path for error handling, dynamic dispatch isn't needed.

AMG, 2011-05-13: I implemented a [panic] procedure that can be customized or replaced, similar to [log]. It takes many arguments: options port socket peerhost peerport request response. options is the error options dictionary, and the others are the same as like-named arguments and dictionary keys found elsewhere in Wibble. A custom [panic] procedure shouldn't rely on zone handlers or any other such feature; it should be as simple as possible. If you want common behavior between [panic] and your normal zone handlers, have them use common utility and page formatting procedures. [panic] is directly responsible for sending the bad news to the client, as well as logging whatever is appropriate.
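
A replacement [panic] in that spirit might look roughly like the following. Only the argument list comes from the description above; the body is a guess at reasonable behavior (log the details, send a generic 500 page), not Wibble's default implementation.

```tcl
# Hypothetical custom panic: keep the details in the server log and
# send the client a generic error page with no stack trace.
namespace eval ::wibble {}
proc ::wibble::panic {options port socket peerhost peerport request response} {
    # -errorinfo is a standard key in Tcl's error options dictionary.
    if {[dict exists $options -errorinfo]} {
        set info [dict get $options -errorinfo]
    } else {
        set info "unknown error"
    }
    puts stderr "panic: $peerhost:$peerport on port $port\n$info"
    set body "Internal server error\n"
    # The socket may already be broken, so ignore write failures.
    catch {
        chan puts -nonewline $socket "HTTP/1.1 500 Internal Server Error\r\n"
        chan puts -nonewline $socket "Content-Type: text/plain\r\n"
        chan puts -nonewline $socket "Content-Length: [string length $body]\r\n"
        chan puts -nonewline $socket "Connection: close\r\n\r\n$body"
        chan flush $socket
    }
}
```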

[enheader]

AMG: Wibble has a [deheader] command to decode HTTP headers to a much more manageable key/value list format. Most of the values get further decoding in order to transform all formatting and framing characters into semantically equivalent Tcl dicts and lists. [deheader] is called by [getrequest] to process data received from the client. This makes it easy for zone handlers to interpret header data.

Going the other direction, there's much less infrastructure. The response headers generated by the zone handlers are mere key/value lists, and the zone handlers are responsible for formatting and framing the values. Most of the time this is okay, but it can get dicey when quoting is required. I'll have to do some more research to identify other problematic cases.

Anyway, this asymmetry displeases me. I'm considering writing an [enheader] command that does the opposite of [deheader], such that if $http is a valid HTTP header string, [enheader [deheader $http]] produces either $http again or something semantically equivalent. [getresponse] or [process] will call this command, and the zone handlers can format their response headers in the same way they read their request headers.

AMG: This is now implemented in my development version of Wibble, though I will test it some more.

AMG, 2011-05-13: Looking good! However, this change breaks pretty much all existing zone handlers. :^( The response headers are no longer specified directly in HTTP format, but rather in Wibble's internal format. This makes it possible for Wibble to inspect and act upon the response headers before sending, e.g. to see what charset is being used.
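
As a small taste of what the encoding direction involves, here is a sketch that turns one parameterized header value back into HTTP syntax, including the quoting case mentioned above. It assumes the internal form is a dict whose "" key holds the main value and whose other keys are parameters (as suggested by the content-type "" lookups elsewhere on this page); the proc name enparams is invented, and the real [enheader] handles far more than this.

```tcl
# Hypothetical encoder for a single parameterized header value, e.g.
# {"" text/html charset utf-8}  ->  text/html; charset=utf-8
proc enparams {value} {
    set out [dict get $value ""]
    dict for {name param} $value {
        if {$name eq ""} continue
        # Parameter values that are not plain HTTP tokens must be sent
        # as quoted strings, with backslash and quote escaped.
        if {![regexp {^[-!#$%&'*+.^_`|~0-9A-Za-z]+$} $param]} {
            set param "\"[string map {\\ \\\\ \" \\\"} $param]\""
        }
        append out "; $name=$param"
    }
    return $out
}
```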

JSONRPC

MaxJarek: Wibble does not support the application/json-rpc content type. Here is a proposal:

...
} application/json-rpc {
    set post [json::json2dict [dehex $data]]
} application/x-www-form-urlencoded - default {
...

With this code Wibble is ready for a JSON-RPC server implementation.

AMG: This requires [package require json], correct? The json package isn't currently required by Wibble, and I plan to keep that optional. Adding code like you suggest above makes it required. I'd like to add this feature, but I need to find a way to have JSON be an optional plugin. Hmm.

Parsing the POST body is the last thing that's done by [getrequest], so it comes immediately before [getresponse], i.e. the zone handler system. This means it should be possible to move POST body parsing to a zone handler, which would make it easy for the user to configure without modifying the Wibble core. [getrequest] can just store the rawpost key into the request dictionary, which a zone handler can parse to produce the post key, to be read by later zone handlers. A user who wishes to add JSON or whatever else can add more zone handlers to support new POST content-types, without having to modify the core.

Sound good? You can actually do this right now, though at a performance penalty: [getrequest] will run [dequery] as part of the default content-type handler. Hmm, there's a simpler solution. Maybe I should change the default content-type handler to simply not generate a post key. That would also work.

package require json
proc ::wibble::zone::post_json-rpc {state} {
    dict with state request {}
    if {$method eq "POST" && [dict exists $header content-type ""]
     && [dict get $header content-type ""] eq "application/json-rpc"} {
        dict set state request post [json::json2dict [dehex $rawpost]]
        nexthandler $state
    }
}

Then add ::wibble::handle / post_json-rpc to the start of your zone handler list.

I considered putting in extra json "" keys between post and the data, but I don't think that's necessary. I thought I might need it when I saw that I had xml "" for text/xml, but that's required since post should be a dict, but xml data is just plain text produced by [dehex], so [dumpstate] and others get confused. I assume from the name that [json::json2dict] produces a dict, so there's no problem here.

Feedback appreciated!

MaxJarek: I changed the default content-type handler to not generate a post key and changed my json_rpc.proxy code to adopt your proposal. This works great.

You're right, with this configuration JSON is an optional plugin.

AMG: Cool, glad to hear it works. I updated my development copy of Wibble to also not generate a post key, only a rawpost key, when it encounters an unknown content-type.

By the way, there's a subtlety you may not have noticed. When no content-type is supplied by the browser, that's supposed to be the same as application/x-www-form-urlencoded. The reason I did the [if] trickery in the first argument to [switch] was to make the lack of content-type result in empty string, which I now key on instead of default.

By the way #2, I'm considering switching over to the [dict getnull] command I defined on the [dict get] page, since there are several places where Wibble treats missing values as if they had empty string.
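
The idea behind such a command, written here as a standalone proc rather than as a [dict] subcommand, is roughly as follows. The name getnull is used only for this sketch; the version on the [dict get] page may differ in detail.

```tcl
# Sketch: like [dict get], but a missing key yields the empty string
# instead of an error. At least one key is required.
proc getnull {dictionary key args} {
    if {[dict exists $dictionary $key {*}$args]} {
        return [dict get $dictionary $key {*}$args]
    }
    return ""
}
```

With this, a missing content-type can be handled as in "switch [getnull $header content-type ""] {...}", keying on the empty string rather than on a separate existence test.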

AMG: This update has been incorporated into the latest version of Wibble.

More basic examples

JM: Is it possible to have some more basic examples (for the rest of us trying this for the first time)?
You know, some forms showing how to process the query parameters, or some more templates.
I am very interested in this project but I cannot completely understand the big picture.

AMG: Sure thing, I'll post something this weekend. I'll gladly take your request, if you have something in particular you would like the forms and templates to do.

10-Feb-2012 JM I would say:

  • an echo page, you know a simple form that returns the entered values as a response.
  • believe it or not...a static page mapped to /about /gallery /whatever
  • any other that shows the basic possibilities

Is it a good idea to create a page called "Wibble Questions"?
Thanks for sharing.

AMG: Yeah, that's a great idea! I'll get to it when I have a bit more time. For now, I posted an example form [L7 ]. It echoes the values back in the form of default values. Look at the page source in your browser to confirm.

There are several ways to get a page to show up at "/about" (for example):

  1. You can make a subdirectory in your docroot called "about", and put a file called "index.html" in there. (You can call the file "index.html.tmpl" or "index.html.script", but it won't necessarily be static.)
  2. You can make a file (instead of directory) called "about" at the top level of your docroot, or you can call it "about.tmpl" or "about.script".
  3. You can define a staticfile zone handler on "/about" that maps to any directory you want.
  4. You can do all sorts of tricks with custom zone handlers.

The built-in "/vars" is a tremendous resource, both for quick reference and to help debug your forms. Just use <form action="/vars"> and Wibble will give you a nicely formatted display of everything it got from your browser, structured in exactly the same way you would access it from a template, script, or custom zone handler. You might consider modifying my example to add action="/vars" to the <form> attributes, just to see what I'm talking about. You can also simply call the [vars] command from your zone handler, template, or script.

AMG: Regarding "Wibble questions"... how would that be distinct from "Wibble discussion"? It would make sense to have separate pages for design discussion and usage questions; that's probably what you're going for. Maybe it should be called "Wibble help" or "Wibble support". What do you think?

JM 11 Feb 2012 - First, let me tell you that your example gives me what I needed. It works as described. (I also tried using /vars for the form's action, and I am now on track to better understand zone handlers.)
And yes, I think separate design discussions and usage discussions will be useful for two different types of audience: those with more programming experience and, on the other hand, people who (like me) just want to use the server and learn along the way.
"Wibble Help" sounds good to me, I may be the first to post a question there (^:

AMG: Go for it! See: Wibble help.

SSL

AMG: I really don't know how this is done...

jbr 2010-12-18 Here is a bit of code I tested a couple weeks ago, cribbed from A Server Template....

 package require Trf
 package require tls

 proc wibble::secure { certfile keyfile args } {
     ::tls::init -certfile $certfile -keyfile $keyfile \
         -ssl2 1 -ssl3 1 -tls1 0                       \
         -require 0 -request 0 {*}$args

     proc ::wibble::socket { args } {
         ::tls::socket {*}$args
     }
 }

AMG: Thanks. I guess I still need to research the meaning of all those options, especially $certfile and $keyfile.

MaxJarek: My modified ::wibble::listen for https:

# Listen for incoming connections.
proc ::wibble::listen {port {type http}} {
    if {$type eq "https"} {set socket "tls::socket"} else {set socket "socket"}
    $socket -server [list apply {{port socket peerhost peerport} {
        coroutine $socket ::wibble::process $port $socket $peerhost $peerport
    } ::wibble} $port] $port
}

In startup file:

...
package require tls
tls::init -keyfile server.key -certfile server.pem -require 0
wibble::listen 8080 https
vwait forever

AMG: Cool! This can also work as a plugin, so that Wibble doesn't always require tls. Here's my take on it, just a touch different than your approach:

proc ::wibble::listen {port {socketcommand socket}} {
    {*}$socketcommand -server [list apply {{port socket peerhost peerport} {
        coroutine $socket ::wibble::process $port $socket $peerhost $peerport
    } ::wibble} $port] $port
}
[...]
wibble::listen 8080 ::tls::socket

Now any custom socket command can be plugged in. It must take the -server option and a port number. Extra options can be prepended by including them in the socketcommand argument, which is a command prefix.

Jarek: Yes, this is better. This opens the way for transports other than socket.

AMG: This update has been incorporated into the latest version of Wibble.