**Wibble web server**

[AMG]: Wibble is a small, [pure-Tcl] web server inspired by [Wub], [DustMote], [Coronet], and [Templates and subst]. One fine day I wanted to put together a site using Wub, but I needed help and couldn't find [CMcC] or [JDC] in the [Tcl chatroom]. The need to hack would not be denied! So I wrote this.

This code is intended to be customized for your application. Start by changing the zone handlers and root directory. Feel free to create your own zone handlers; it's very easy to do so. Just make another [proc].

***Name***

"Wibble" is similar in sound to "Wub", and according to the Jargon File [http://www.catb.org/jargon/html/W/wibble.html], it is one possible pronunciation of "[www]".

***Zone handlers***

Zones are analogous to domains in Wub and [TclHttpd]. Zones are matched against the [URI] by prefix, with an important exception for directory names. I use the name "zone" instead of "domain" because I don't want any confusion with [DNS] domain names. Someday I may add support for virtual hosts via the Host: header, which means that DNS domain/host names will be included in zone names.

Handlers can be stacked, so a zone is defined by a combination of handlers which are executed in sequence. Each handler receives request and response [dict]ionaries as its last two arguments. The request dictionary is derived from the HTTP request and augmented with configuration options and a few extra parameters. The extra parameters indicate the match prefix and suffix, plus the (possible) filesystem path of the requested object. The response dictionary passed to the handler is a tentative response that the handler can update, replace, or simply ignore.

The handler then returns using the [[nexthandler]] or [[sendresponse]] command. [[nexthandler]] takes an even number of arguments, which are alternating request/response pairs to pass to subsequent handlers.
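A minimal sketch of the handler calling convention described above, not Wibble's actual code. The [[nexthandler]] and [[sendresponse]] procs below are hypothetical stand-ins that merely record how they were called (the real commands belong to Wibble and may work quite differently), and the '''hello''' handler is invented for illustration.

```tcl
# Hypothetical stand-ins for Wibble's [nexthandler] and [sendresponse].
# They only mimic the calling convention; they are NOT the real commands.
proc nexthandler {args} {
    # Accepts zero or more alternating request/response pairs.
    return [list nexthandler {*}$args]
}
proc sendresponse {response} {
    # Accepts exactly one argument: the final response dict.
    return [list sendresponse $response]
}

# Hypothetical zone handler.  The request and response dicts arrive as
# the last two arguments.  It answers /hello itself and passes every
# other request along unchanged.
proc hello {request response} {
    if {[dict get $request path] eq "/hello"} {
        dict set response status 200
        dict set response header content-type text/plain
        dict set response content "Hello, world!\n"
        sendresponse $response
    } else {
        nexthandler $request $response
    }
}

# Try it with hand-built dicts (a real request dict has many more keys).
puts [hello {path /hello} {}]
puts [hello {path /other} {status 404}]
```

The handler only touches the '''path''' key of the request and the '''status''', '''header''', and '''content''' keys of the response, so it stays independent of whatever else the server puts into the two dictionaries.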
[[sendresponse]] takes only one parameter: the final response dict to send to the client.

Zones also stack. For example, if no handlers for zone /foo return a response, then the handlers for / are tried. Just as the handlers within a zone must be specified in the order they are to be executed, the zones themselves must be specified in order of decreasing specificity. To inhibit this stacking behavior, be sure that a default handler is defined for the zone, e.g. '''notfound'''.

The '''$wibble::zones''' variable defines the zones and their handlers. '''$wibble::zones''' is structured as a dict mapping from each zone prefix to its list of handlers. Each handler is a list of positional arguments (the first of which is the handler command name) and a list of key/value arguments which are merged into the request dictionary. The [[wibble::handle]] command is used to update this variable.

Statically, the zone handlers form a stack or list. But dynamically (during program execution), the zone handlers can branch from a list into a tree, which is traversed in a breadth-first manner to search for a response to send to the client. The tree branches whenever [[nexthandler]] is given more than two arguments; each pair forms a new alternative handler stack operating on a modified request/response pair. When [[nexthandler]] is given zero arguments, the "node" is a leaf node, the tip of a dead branch; the request/response pair that was passed to the handler is removed from consideration.

***Request dictionary***

   * '''socket''': The name of the Tcl [channel] that is connected to the client.
   * '''peerhost''': Network address of the client.
   * '''peerport''': TCP port number of the client.
   * '''method''': HTTP method (GET, PUT, POST, HEAD, etc.).
   * '''uri''': HTTP URI, including query string.
   * '''path''': Path of client-requested object, excluding query string.
   * '''protocol''': HTTP/1.0 or HTTP/1.1, whatever the client sent.
   * '''header''': Dictionary of HTTP header fields from the client.
   * '''rawheader''': List of HTTP header lines.
   * '''query''': Dictionary of query string elements.
   * '''rawquery''': Query string in text form.

***Extra parameters merged into request dictionary***

   * '''prefix''': Zone prefix name.
   * '''suffix''': Client-requested object path, sans prefix and query string.
   * '''fspath''': Object path with '''root''' option prepended. Only defined if '''root''' option is defined.

***Configuration options merged into request dictionary***

   * '''root''': Filesystem directory corresponding to zone root directory.
   * '''indexfile''': Name of "index.html" file to append to directory requests.

Support for configuration options varies between zone handlers. Zone handlers can also take positional configuration options by including them in the command argument to [[wibble::handle]], which is actually a [list].

***Response dictionary***

   * '''status''': The numeric HTTP status to send. 200 is OK, 404 is Not Found, etc.
   * '''header''': Dictionary of HTTP header fields to send to the client.
   * '''content''': Message body to send to the client.
   * '''contentfile''': Name of a file containing the message body to send to the client.

***Predefined zone handlers***

   * '''vars''': Echo request dictionary plus any extra or optional arguments.
   * '''dirslash''': Redirect directory requests lacking a trailing slash.
   * '''indexfile''': Add '''indexfile''' to directory requests.
   * '''static''': Serve static files (not directories).
   * '''template''': Serve data generated from .tmpl files.
   * '''dirlist''': Serve directory listings.
   * '''notfound''': Send 404.

----

**TODO/Wish list**

***Character encoding***

Wibble currently doesn't support character encodings (neither Accept-Charset: nor Content-Type: charset); it's hard-coded to only use ISO8859-1. It should at least support UTF-8.
I think the zone handlers should be able to encode their responses however they like, so long as they correctly identify the encoding, then a general facility can convert encodings if necessary. Or I could standardize on UTF-8, but this becomes a problem when serving files from disk which may not be UTF-8. If the client will accept ISO8859-1 and the file is in ISO8859-1, it doesn't make sense to read the file, convert it to UTF-8, convert it back to ISO8859-1, then send it. Just use [chan copy].

By the way: I don't like how HTTP crams charset into the Content-Type: header; I'd rather it have a separate header, so that I can parse and update it more easily. There's a lot not to like about HTTP. :^)

***Content type***

The Wibble '''static''' file server doesn't send Content-Type:, so the client has to guess. ([Internet Explorer] second-guesses the text/plain Content-Type: anyway.) I think a '''contenttype''' zone handler might be able to fill the gap, but I'm not sure what heuristics it should use to detect the type.

[makr] (2009-11-19): A pure-[Tcl] solution could employ [fumagic] ('''fileutil::magic'''). [tmag] on the other hand is an [extension] built on top of ''libmagic'' (only available for [Unix] so far).

[AMG]: Sounds good, thanks for the pointer. So that's two possible approaches, and I can think of a third one: file extension matching. Since zones stack, it's possible (but untested) to override '''contenttype''' (or its arguments) for individual files, by declaring single-file zones. Anyway, I might try implementing all three, once I have made a separate Wiki page for extra zone handlers.

***Zone handlers***

I should write more zone handlers for sessions, Content-Type:, etc. This page is getting fairly long, so they'll be collected on a separate page of this Wiki. Also the current zone handlers are very bare-bones, particularly '''dirlist''', and maybe could stand to be improved.
Originally I wrote them to be as short and simple as possible so that they clearly demonstrate the concept without being cluttered with features.

***Caching***

I'm not necessarily thinking of caching entire pages; I wonder if I can be a bit more general than that. Perhaps I cache zone handler outcomes, using the incoming request as part of the key. (See [memoizing].) One tricky part is identifying the rest of the key, that is to say, all other inputs that play a part in determining the outcome. A closely related issue is identifying the irrelevant parts of the request. For example, if the socket name affects the outcome, there's no point in caching, since that changes all the time. Except for debugging, I can't think of a reason why the socket name would matter. Likewise a dependency on the peer's network name would impede caching, and sometimes that dependency exists legitimately.

I have to be very careful with this one, because caching can incur overhead that exceeds the cost of recalculating from scratch every time. In my line of work (real-time simulation), this is nearly always the case. For that reason I don't want to experiment with caching until performance becomes a problem and other optimizations cease to bear fruit.

***SSL***

I really don't know how this is done...

***File uploads***

I'll be needing this for a project, but I haven't done any research on it yet.

***Others?***

Insert feature request here.

----

**Sample index.html.tmpl**

======
% dict set response header content-type text/html
$uri
% set rand [expr {rand()}]
% if {$rand > 0.5} {
random=[format %.3f $rand] > 0.5
% } else {
random=[format %.3f $rand] <= 0.5
% }
time/date=[clock format [clock seconds]]
milliseconds=[clock milliseconds]
clicks=[clock clicks]
% if {![dict exists $query noiframe]} {