[AMG]: I archive older discussion for [Wibble] on this page. For current discussion, see [Wibble discussion]. <>

----

[CMcC] likes this, it's neat and minimal, but flexible and fully functional. A couple of observations as they arise: all the header keys have to be set to a standard case after you parse them out of the request stream, as the spec doesn't require a client to use a standard case.

[AMG]: HTTP allows case insensitivity? Damn. Case insensitivity will be the death of us all! HTTP specifications (or at least clients) probably require a specific case for the server, which unfortunately is neither all-lowercase nor all-uppercase. What a pain!

[CMcC] AFAIK you can return whatever you like (case-wise) from the server, so no ... no requirement. It's all case-insensitive for the field names.

[AMG]: Still, I must be able to deal with clients which assume HTTP is case sensitive and that it requires the case style shown in the examples. Most folks only read the examples, so they draw this conclusion: [http://cr.yp.to/proto/design.html]. Just look at Wibble itself! It fails when the client uses unexpected letter case in the request headers! I didn't spot where the specification allowed case independence, and none of the examples suggested this to me.

[AMG]: Update: I now force all request headers to lowercase and normalize all known response headers to the "standard" case observed at [http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html]. Unrecognized response headers are left untouched.

[CMcC]: That's consistent with the networking principle (be tolerant of what you accept, consistent in what you provide). Wub, FWIW, sends everything out in lowercase (IIRC) on the principle of 'screw 'em if they can't take a joke.'

[CMcC]: Also, I'm not sure what you do with multiple query args which have the same name; you have to be prepared for that. And do you handle query args without value? Unsure.

[AMG]: An earlier revision supported multiple query arguments with the same name, plus query arguments without values. Then I decided those two features weren't really important to me, and that it was simpler to just require that sites made using Wibble wouldn't depend on them. But if you can give me a compelling application for multiple like-named and argument-less queries, I'll re-add support. For now, later query arguments replace like-named earlier query arguments, and query arguments without a value are taken as having the empty string as their value. My earlier solution was for queries with arguments to be in a key-value list, then for all query argument names (even those without values) to be in a second list, sorted by the order in which they appeared in the URI.

[CMcC] yeah, it's much easier if you can ignore that requirement. Wibble's free to do that, Wub's not (sadly.)

[AMG]: How does Wub make use of argument-less query arguments and repeated query argument names?

[CMcC]: it's all set up in the Query module: a decoded query is parsed into a dict which contains a list of metadata and values per query element, and accessors return the number of values, the metadata, and the specified element. Of course, most of the time in use I just flatten the dict into an a-list and ignore all the detail.
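
A minimal sketch of the query decoding behaviour [AMG] describes above: later like-named arguments replace earlier ones, and arguments without a value get the empty string. The proc names and the %XX decoding are illustrative assumptions, not the parser on the main [Wibble] page:

======
# Illustrative sketch: decode a raw query string into a dict where later
# like-named arguments replace earlier ones and arguments without a value
# get the empty string.
proc parsequery {rawquery} {
    set query {}
    foreach chunk [split $rawquery &] {
        if {$chunk eq ""} {
            continue
        }
        # Split "name=value" on the first "=" only; no "=" means no value.
        if {[regexp {^([^=]*)(?:=(.*))?$} $chunk -> name value]} {
            dict set query [unhex $name] [unhex $value]
        }
    }
    return $query
}

# Minimal URL decoder: "+" means space and %XX is a hex escape.  Brackets
# and backslashes in the input are quoted so that [subst] only evaluates
# the [format] calls introduced by the regsub.
proc unhex {str} {
    set str [string map [list + { } \\ \\\\ \[ \\\[ \] \\\]] $str]
    regsub -all {%([0-9a-fA-F]{2})} $str {[format %c 0x\1]} str
    return [subst -novariables $str]
}

# parsequery "a=1&a=2&b&c=3"  →  a 2 b {} c 3
======
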
[CMcC]: Adding virtual host support is trivial as you've noted, you just need to combine the host with your zone regexp. I note you don't fall back to HTTP/1.0 (not really necessary, I guess.)

[AMG]: I have the beginnings of support for falling back to HTTP/1.0, in that I remember the protocol advertised by the client. In the future I can use that information.

[CMcC] I really wouldn't bother - there's no real need to support HTTP/1.0 IMHO - the only existing client still using it is the Tcl client (and that should be fixed soon.) Again, Wub doesn't have the option of taking the sensible path.

[AMG]: I'll have to check a couple of streaming music players to see if they all grok HTTP/1.1. They would have to if they support seeking.

[CMcC]: nor do you stop processing input on POST/PUT as the spec requires (you ought to make sure this is done, as some things require it.) Your pipeline processing requires run-to-completion of each request's processing, I think, but there are definitely cases where you would not want this (they're not common, but when you get such a requirement there's no way around it, I think) so that's a limitation, although not a show-stopper.

[AMG]: I don't have any experience with POST/PUT. I just put in the few rudiments I could figure out from the HTTP specification. I'll have to read up on POST/PUT in more detail.

[CMcC] the spec says you can't process any requests (and the client oughtn't to send any requests) on a pipeline until the POST/PUT is handled completely. It's subtle, but it's (just) conceivable that something could be screwed up by it. Since your approach is to do things which make sense for most apps, you could probably get away with it by just documenting the behaviour.

[AMG]: Wibble processes each request immediately after reading it. Its main loop is: get request, compute response, send response, repeat. Computing a response for a POST/PUT necessarily involves committing its side effects to whatever database is used. So subsequent requests remain unread, waiting in the receive buffer, until Wibble is completely finished with the POST/PUT, or any other type of request.

[CMcC]: it's more for situations like when you have asynchronous background type processing. Updating a DB, for example.

[AMG]: I don't think I'll be doing any asynchronous background processing within a single connection. Someday I do plan to process multiple connections in parallel as separate processes or threads. But that's a completely separate issue.

[CMcC]: I like the way zone handlers stack, but the necessity of returning a list is less good, IMHO - I prefer the default case to be easy. I'd consider using a [[return -code]] to modify the return behaviour, or perhaps using a pseudo response key element to specify further processing.

[AMG]: I haven't done much with [[return -code]], so it hadn't occurred to me. That's an interesting idea, thanks. I think I'll change it to return the operation as the return code and the operand as the return value.

[CMcC] yah, you might want to map the normal error codes from Tcl (including Tcl_OK) to reasonable values (e.g. Tcl_OK=>200, Tcl_Error=>500)

[AMG]: I wound up using [[return -opcode]] (wrapped by the [[operation]] sugar proc), which puts a custom "-opcode" key in the -options dict, then I receive this opcode using [catch]. The purpose of [[return -code]] is already defined, and it requires integers or a predefined enumeration, so I decided not to mess with it. Also the purpose of this general mechanism is not to report errors or status, but rather to tell the response generator what operation to take next: modify request, send response, etc. I do map error to HTTP 500 using '''[try] {...} on error {...}''', then I print the error options dictionary to both stderr and (if possible) to the client socket. On error, I always close the client socket, forcing tenacious clients to reconnect, which is almost like rebooting a computer following a crash.
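
A rough sketch of that mechanism. The proc names, the -level choice, and the 500 reply format are illustrative assumptions; the actual [[operation]] proc and dispatch loop live on the main [Wibble] page:

======
# Sketch only; the real [[operation]] proc and dispatch loop differ in
# detail.  [operation] stores a custom -opcode entry in the return options
# dictionary and returns from the zone handler that called it (-level 2),
# so a handler can say, for example, "operation sendresponse $response"
# anywhere in its body.
proc operation {opcode operand} {
    return -level 2 -opcode $opcode $operand
}

# The dispatcher recovers the opcode and operand with [catch].
proc invoke {handler request} {
    set code [catch {{*}$handler $request} operand options]
    if {$code != 0} {
        return -options $options $operand   ;# let real errors propagate
    } elseif {[dict exists $options -opcode]} {
        return [list [dict get $options -opcode] $operand]
    } else {
        return [list pass $operand]         ;# no opcode: treat as a pass
    }
}

# Errors escaping the zone handlers become an HTTP 500: report the error
# options on stderr and (best effort) to the client, then close the socket.
proc respond {sock handler request} {
    try {
        lassign [invoke $handler $request] opcode operand
        # ... act on $opcode: modify the request, send the response, etc. ...
    } on error {result options} {
        puts stderr [dict get $options -errorinfo]
        catch {puts $sock "HTTP/1.1 500 Internal Server Error\n"}
        catch {puts $sock [dict get $options -errorinfo]}
        close $sock   ;# force tenacious clients to reconnect
    }
}
======
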
[CMcC]: I think the idea of starting a new dictionary for the response is pretty good (as it means you don't have to filter out the old stuff,) but I'm not sure that doing so retains enough data for fully processing the original request. Do you pass the dict to the zone handlers converted to a list? That's not so good, as it causes the dict to shimmer.

[AMG]: Both the request and response dictionaries are available everywhere in the code. They're just in separate variables. Yeah, I convert the dict to a list by means of [{*}] into [args]. If that causes shimmering, I'll change the zone handlers to accept a single normal argument. By the way, extra non-dict arguments can be passed to the zone handler by making the command name a list. This makes it possible to use [namespace ensemble] commands, etc. as zone handlers.

[AMG]: Update: I have made this change. The shimmering is eliminated.

[CMcC]: I'm not sure that the idea of jamming new requests into the pipeline is a good one.

[AMG]: It was the best solution I could think of for handling index.html in the face of the template generation. With the example setup, if there is a file called index.html in the directory being requested, '''static''' will serve it straight from disk. If not, '''template''' will try to make one from index.html.tmpl. And--- very important!--- if that doesn't work, '''dirlist''' will generate a listing. If '''indexfile''' simply replaced requests for a directory with requests for index.html, '''dirlist''' could never trigger. And if '''indexfile''' only did this replacement if index.html actually existed on disk, '''template''' would not be used to generate the index.html. I couldn't think of any other way to get all these different handlers to work together.

[CMcC] this is one of the subtle and intriguing differences between the Wub and Wibble architectures - firstly you don't transform the request dict, you create a new one, and as a consequence you have to keep the original request around, and as a consequence of that you have to be able to rewrite the current request (if I understand it correctly.) Those are all (slightly) negative consequences of that architectural decision. The upside is that you don't have to keep track of protocol and meta-protocol elements of the response dict as tightly as Wub does - Wub has to cleanse the fields which make no sense in a response, and that's time-consuming and unsightly - Wibble doesn't, and can also easily treat those elements using [[[dict] with]], which is a positive consequence of the decision.

[AMG]: Keeping the original request is easy and natural for me; all I had to do was use two variables: `set response [[getresponse $request]]`. To be honest, I didn't notice that Wub operated by transmuting the request dictionary into the response dictionary, so I didn't try to emulate that specific behavior. Instead it made sense to generate a new response from the request: from the perspective of a packet sniffer, that is what all web servers do. Also I enjoy the ability to not only rewrite requests, but also to create and delete alternative requests which are processed in a specific priority order. Branch both ways! The goal is to get a response, and handlers are able to suggest new requests which might succeed in eliciting a response. Or maybe they won't, but the original request will. Rewriters can leap without looking: they don't have to predict whether the rewritten request will succeed. And in the '''indexfile'''/'''template'''/'''static'''/'''dirlist''' case, '''indexfile''' doesn't have the power to make this prediction.
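
A tiny sketch of the '''indexfile''' behaviour just described, reusing the [[operation]] sugar sketched above. The "path" key and the operand shape are assumptions, and the real zone handler on the main [Wibble] page differs:

======
# Hypothetical sketch, not the real indexfile handler: when a directory is
# requested, queue a rewritten request for its index.html ahead of the
# original.  If no later handler (static, template) can serve index.html,
# the original directory request is still there for dirlist to answer.
proc indexfile {request} {
    set path [dict get $request path]   ;# "path" is an assumed key name
    if {[string index $path end] eq "/"} {
        set rewritten $request
        dict set rewritten path ${path}index.html
        operation prependrequest $rewritten
    }
    # Otherwise return normally and let the next zone handler have a look.
}
======
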
[CMcC]: this took me a couple of read-throughs to get. You are expecting zone handlers which would otherwise fail to re-write the request to something they expect might succeed. It worries me that you may end up with more responses than requests (which would be disastrous) and I'm not sure what you do to prevent this (except you only ever have the latest request around, right? Because you don't model a pipeline directly, because you don't try to suspend responses?)

[AMG]: Yes, zone handlers can rewrite the request (or create a new request) that might succeed. It's not possible to get more responses than requests, since processing stops when the first valid response is obtained. The stacking order of zone handlers must be configured such that the first response is also the desired response. For example, putting '''dirlist''' before '''indexfile''' will prevent index.html from ever being served unless it is explicitly requested.

[CMcC]: One thing to bear in mind in the rewriting of requests: if you silently rewrite fred to fred/index.html in the server, next time the client requests fred, your server has to go through exactly the same process. Another way to do it is to have the fred request result in a response which says that the fred content has moved to fred/index.html. That way, the client and proxies can remember the relocation, and will ask for fred/index.html when they want fred, so the client does the work for you. So I'm not certain that your processing model is an unqualified good idea (nor am I certain it's not - the layering effect is powerful.)

[AMG]: This does not involve rewriting requests. To implement this behavior, the zone handler sends a Found or Moved or Whatever response to the client, which might then make an entirely new request unless it broke, lost interest, or found index.html in cache. It's up to the site administrator whether to rewrite requests within the server or to redirect the client. For an example of this kind of redirection, look at '''dirslash'''. Personally, I don't like instructing the client to make a new request for index.html, since I think it's ugly to have "index.html" at the end of the URL.

[CMcC]: You should probably add gzip as an output mode, if requested by the client, as it speeds things up.

[AMG]: I figured [gzip] can wait until later. It's more important for me to bang out a few sites using this. Also I need to look into multithreading so that clients don't have to wait on each other.

[CMcC] [gzip]'s easy to add, and well worth adding. I should probably get around to adding Range to Wub, too.

[AMG]: Okay, I'll look into [gzip] and [zlib deflate]. Wibble never sends chunked data, so it should be as easy as you say. I'll just look at the response headers to see if I need to compress before sending. Wibble doesn't support multipart ranges. I doubt any web servers do; it's complicated and it's worthless. Clients are better off making multiple pipelined individual range requests.
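
A hedged sketch of the gzip idea, using Tcl 8.6's built-in [zlib]. The dict layout (a lowercased request header dict, a content entry holding the body) is an assumption for illustration; the encoding support that later went into [Wibble] is more involved than this:

======
# Sketch: gzip an in-memory response body when the client's Accept-Encoding
# mentions gzip.  Assumes request headers were lowercased on input and that
# the response carries its body under "content"; a real check must also
# honour qvalues, and Content-Length must describe the compressed bytes.
proc maybegzip {request response} {
    if {[dict exists $request header accept-encoding]
     && [string match *gzip* [dict get $request header accept-encoding]]
     && [dict exists $response content]} {
        set body [encoding convertto iso8859-1 [dict get $response content]]
        dict set response content [zlib gzip $body]
        dict set response header Content-Encoding gzip
    }
    return $response
}
======
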
[AMG]: Update: I'm not sure how encodings and ranges are supposed to interact. Content-Length gives the number of bytes being transmitted; that much is clear. What about Content-Range? Do its byte counts reflect the encoded or unencoded data? And the request Range--- surely its byte counts are for the unencoded data.

[CMcC]: I completely ignored Range stuff, so I don't know. Guessing they say 'you can't encode a range with anything but none', but for all I know they give a harder-to-implement stricture.

[AMG]: I'll just make a few guesses then see if it works with [Firefox].

[AMG]: I think I'll ignore the qvalues; they're tough to parse and kind of dumb. Why would a client advertise that it accepts gzip but prefers uncompressed? Or why would it give something a qvalue of 0 rather than simply not listing it?

[CMcC]: yeah, you'd think the spec would just say 'list the things in the order you would prefer them' instead of the whole q= thing. I dunno, there are lots of anomalies. For example, a lot of clients claim to be able to accept content of type */*, but then users complain if you take 'em at their word :)

[AMG]: If I ever get "Accept-Encoding: *" then I will encode the response using lzip [http://web.archive.org/web/20050514074112/http%3a//lzip.sourceforge.net/]. It'll be awesome. :^)

[CMcC]: All in all, good fun. I still wish you'd applied this to Wub, and improved it, rather than forking it, but oh well. I wish you'd been around when I was developing Wub, as some of your ideas would have (and could still) contribute to Wub's improvement. I definitely like the simplicity of your processing loop, although I think that Wub Nub's method of generating a switch is faster and probably better (having said that, it's harder to get it to handle stacking.)

[AMG]: Yeah, I wish the same. I really haven't spent much time on Wibble. I wrote it in one afternoon, then a week later spent a couple of hours tidying it up for publication. I don't have a huge amount of time to spend on this sort of thing, so when I hack, I hack ''furiously!'' And I really couldn't wait for you and [JDC] to answer my [Wub] questions. Sorry! :^) I invite you to absorb as much as you like into Wub. If you give me direction and I have time, I'll gladly help. Now that this code is written, I think it should stay up as a separate project. I think it fills a slightly different niche than Wub. Both are web servers, of course. But Wub is a large and complete server made up of many files, whereas Wibble is a smallish server hosted entirely on the Wiki. That makes it a lot more accessible and useful for educational and inspirational purposes, kind of like [DustMote]. Maybe think of it as Wub-lite, a gateway or gentle introduction to some of the concepts that undergird Wub. Thank you for your comments.

[CMcC] You're welcome - as I say, this is interesting. It's interesting to see where you've taken the request- and response-as-dict paradigm, and it's also interesting to see how you've used [coroutine]s - very clean indeed. Wub has two coros per open connection, and a host of problems with keeping them synchronised. The idea was to keep protocol syntax and semantics distinct, and therefore to make the server more responsive. I'm scratching my head, wondering whether to move to a single coro per pipeline, as Wibble does, but I have to think through the implications. It's good to see Wibble, because you started with [dict] and [coroutine] as given and evolved it with them in mind, whereas Wub evolved in mid-stream to take advantage of them; Wibble seems to make better considered use of the facilities as a consequence.
I would definitely advise keeping Wibble as a distinct project - it addresses the problem of a minimal server (like Dustmote et al.) but still tries to provide useful functionality (unlike Dustmote et al.) I'd be interested to see what you make of Wub/TclHttpd's Direct domain functionality.

[AMG]: I started with [Coronet], which you and [MS] wrote. I dropped everything I didn't need, merged [[get]] and [[read]], absorbed [[terminate]] and $maxline into [[get]], eliminated the initial [yield]s in [[get]], renamed [[do]] to [[process]] and [[init]] to [[accept]], changed [[accept]] to use the [socket] name as the [coroutine] name, and defined the readability handler before creating the coroutine. I did that last step because the coroutine might [close] the socket and [return] without yielding. Yeah, for an instant I have the readability handler set to a command that doesn't yet exist (or, in case of return without yield, will never exist), but this is safe since I don't call [update].

[CMcC]: Noticed that small window, but you're right, it can't ever be open. It's interesting to see the iterative evolution: [Coronet] was cribbed from Wub, you adapted and targeted it, and now I'm considering cribbing your adaptation to simplify Wub. :)

[AMG]: I will try to look into Direct later today. I see Wub has directns and directoo; I am guessing that these map URIs to command invocations, possibly in the manner of [Zope].

[CMcC]: not familiar with Zope, but yes, the Direct domain maps URLs to proc elements of a namespace or method elements of a TclOO object, and maps query elements to formal parameters by name. It's quite powerful.

[AMG]: Then it's like Zope. Disclaimer: It's been a long time since I used Zope, and I never did anything more with it than play with the examples.

[AMG]: Hey, for a fun read, check out [http://www.and.org/texts/server-http]. Does Wub support individual headers split across multiple lines?

[CMcC]: Yes, Wub supports that. You sort of have to, for things like Cookies, which can (and do) span several lines of text. You just have to remember the last field you filled in, and if your key-value regexp has no key, you append that to the immediately prior field. Not too hard.

[AMG]: I've been thinking about cookies. Multiple cookies get posted as multiple Cookie: or Set-Cookie: headers or lines, which the current Wibble code can't handle. I could fix headers in the general case to support multiple lines, or I could add special support for cookies. I think I'll go with your approach of appending; it sounds much easier than adding a special case.
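
A small sketch of the appending approach just agreed on: a header line that does not look like "key: value" is treated as a continuation of the previous field, and a repeated field such as Cookie: is appended to rather than overwritten. The separator choices are assumptions for illustration:

======
# Sketch: fold continuation lines and repeated header fields into a dict of
# lowercased keys.  A line with no "key:" prefix extends the previous field;
# a repeated key (e.g. several Cookie: lines) is appended with "; ".
proc parseheaders {lines} {
    set headers {}
    set lastkey ""
    foreach line $lines {
        if {[regexp {^([^\s:]+):\s*(.*)$} $line -> key value]} {
            set lastkey [string tolower $key]
            if {[dict exists $headers $lastkey]} {
                dict append headers $lastkey "; $value"
            } else {
                dict set headers $lastkey $value
            }
        } elseif {$lastkey ne ""} {
            dict append headers $lastkey " [string trim $line]"
        }
    }
    return $headers
}

# parseheaders {{Cookie: a=1} {Cookie: b=2} { c=3}}  →  cookie {a=1; b=2 c=3}
======
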
[CMcC]: Having read the RFC walkthrough you posted, I'll have to amend my answer to 'Yes, Wub supports that, but not all the stupid variants possible.' If a client sends stuff that looks like chaff, I'm ok with returning next-to-nothing. There's a school of thought, btw, which holds that you can best identify malicious spiders and bots by their poor implementation of the RFC, and so being ruthless in applying the standard can help cut your malware load. It's mainly about not being permissive with things which don't send Host: etc. Anything that cuts spiders short is OK with me.

[CMcC]: is considering modifying Httpd to have a single coro per connection. It's split into two largely for historical reasons (Wub used to parse headers in one thread and process in another, but there seemed to be no good performance reason - it was optimising for the wrong, and uncommon, case.) You do need to be able to defer responses, and sometimes delay processing of subsequent requests pending completion; WubChain relies upon it, and it's a documented and reasonable requirement. I'm also considering moving a lot of the request pre-processing stuff into a distinct proc. Things like Cache, Block and Honeypot are done in-line, which is marginally more efficient than processing the whole request before making decisions, but I suspect the gains in code cleanliness more than compensate for a few cycles sacrificed.

[AMG]: I'll gladly trade a few milliseconds for increased code accessibility. The more people are able to understand the code, the more people are able to suggest architectural improvements that will yield more substantial performance boosts.

----

[AMG]: Something I need to think about is allowing multiple zone handlers to contribute to a response. For example, the '''static''' handler doesn't set content-type, but maybe a '''contenttype''' handler can fill that part in by inferring from the request URI or the response content or contentfile. That is, if content-type wasn't already set by some other handler. Another example: a '''setcookie''' zone handler might want to inject a cookie or two into the response. The way to do this is by defining a new opcode in addition to "sendresponse". Perhaps "updateresponse"? Or maybe I split "sendresponse" into two separate operations: "setresponse" and "finish". Then I give zone handlers the ability to return multiple opcodes. Also zone handlers probably need to see the under-construction response so they can modify it without outright replacing it. That would simply be a second parameter. As for actually implementing multiple opcodes: I would need to drop the trick with the custom -opcode return options dictionary. I see now that it doesn't actually provide any benefit, since I already have [[operation]] to wrap around [return]. One possibility is for [[operation]] to return a [list] or dict of opcode/operand pairs. Another is to have some more fun with [coroutine]s, where the zone handlers can [yield] as many opcode/operand pairs as needed before finally returning. Perhaps the latter approach can open the door to a more interesting interplay between zone handlers. I'll have to think about it.

[CMcC] Wub passes a kind of 'blackboard' dict around: it starts as a lightly annotated request dict, and ends up as a response dict. Each step of the way it's transformed. Wibble passes a request dict in, and expects a response dict back. Wub has a series of filters after response generation, in the Convert module, and makes good use of the transformation model. Wibble could do something similar by allowing modifications to the request dict (which it does) to be response fields. Then you could use a series of zone calls to progressively build up a response. I'm not recommending this, I'm merely suggesting it as a possible approach.

[AMG]: I did a little thinking about the computer science theory behind Wibble's quest for a response. The zone handlers form a tree which branches every time a ''prependrequest'' or ''replacerequest'' is done. ''deleterequest'' and ''sendresponse'' establish leaf nodes. Wibble performs a breadth-first search of this tree. When it encounters a ''sendresponse'' node, it terminates its search and sends the HTTP response proposed by the node. If there is no ''sendresponse'' node in the tree, it sends HTTP 501. Now that I know what my own code is doing :^), I can invent more flexible alternative implementations of the same algorithm. One thing comes to mind: each zone handler returns a list of request-response pairs. The length of this list indicates the branching behavior of the tree. If it's empty, the node is a leaf, like ''deleterequest''. If there's one request and one response, the tree doesn't branch, like with ''replacerequest'' and ''pass''. If there are multiple pairs, the tree branches, like with ''prependrequest''; except that it's possible to branch to more than two children, and the list order gives the BFS priority order. If there's one response but no request, the node is a leaf and it's time to send, like ''sendresponse''.

[AMG]: Okay, done! This works.
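
A rough sketch of the request/response-pair convention just described (the later scheme that replaced the -opcode sketch above). The helper names, the handler list argument, and the empty-request test are illustrative assumptions, not the code that ended up on the main [Wibble] page:

======
# Each zone handler takes a request/response pair and returns a list of
# pairs: zero pairs deletes the request, one pair passes or replaces it,
# several pairs branch, and a pair with an empty request is a finished
# response.  The first finished response encountered is sent.
proc getresponse {request zonehandlers} {
    set pairs [list [list $request {}]]
    foreach handler $zonehandlers {
        set next {}
        foreach pair $pairs {
            lassign $pair req resp
            lappend next {*}[{*}$handler $req $resp]
        }
        foreach pair $next {
            lassign $pair req resp
            if {![dict size $req]} {
                return $resp
            }
        }
        set pairs $next
    }
    return [dict create status 501 content "not implemented"]
}
======
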
----

[APN] Why is the encoding of the output channel set to ISO8859-1? Should it not be UTF-8?

[AMG]: I don't remember! Does anyone around here know why someone would want to use ISO8859-1 in this situation?

[AMG]: I finally figured it out. Character sets other than ISO8859-1 require a proper charset parameter in the Content-Type response header. Also, when there is an Accept request header, the server should try its best to honor it. I didn't want to deal with any of this, so I stuck with ISO8859-1. Support for other character sets is a possible future enhancement.

''The "charset" parameter is used with some media types to define the character set (section 3.4) of the data. When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined to have a default charset value of "ISO-8859-1" when received via HTTP. Data in character sets other than "ISO-8859-1" or its subsets MUST be labeled with an appropriate charset value. [http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1]''

----

[MAKR] (2009-10-12): I just stumbled over this nice little thing ... I'd like to know what the license of this code is. Would it be possible to categorize it under the same license as Tcl's?

[AMG]: Sure, no problem. Wibble uses the same license as Tcl.

----

[jcw] 2009-11-11: Thanks Colin for mentioning this page. Wow, Andy, this is a fascinating server - and a great way for me to dive into new 8.6 stuff... I was wondering why you need to pass around both requests and responses separately. Couldn't each newly-created response dict have the request as one of its entries? The same request would then be shared by each tentative response you create. If responses are derived from other responses (not sure that happens), the original request would be in there, but buried a bit deeper.

[AMG]: Thanks for the words of encouragement! Tonight I will contemplate the request/response dichotomy. For now, I just noticed that wibble::filejoin probably doesn't work right when a directory name has a trailing dot. For example, "foo./bar" might mutate into "foobar". I'll verify this as well.

[AMG]: Updates: I fixed wibble::filejoin by removing it--- it's actually not needed. I incorporated [APN]'s wibble::getline. I put in some license information. As for keeping request and response separate, I think this makes the zone handlers easier to write. They receive two arguments, the input request dict (which tells them what to do) and the output response dict (which they update). What benefit is there to merging the two arguments? I want to keep the dict structure as flat as I can, but I also want to avoid collisions between the request and response dicts. The simplest way to accomplish both is to make these two be separate variables.
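
For illustration, here is what a '''contenttype''' zone handler of that two-argument shape might look like, using the pair-list return convention sketched earlier. The header and contentfile key names are assumptions, and the real Wibble zone handlers differ:

======
# Sketch: fill in a content-type the static handler left unset, inferring
# it from the file extension of the response's contentfile entry.
proc contenttype {request response} {
    if {![dict exists $response header Content-Type]
     && [dict exists $response contentfile]} {
        switch -glob -- [dict get $response contentfile] {
            *.html  {set type text/html}
            *.css   {set type text/css}
            *.png   {set type image/png}
            default {set type text/plain}
        }
        dict set response header Content-Type $type
    }
    # Hand the request/response pair back for further processing.
    return [list [list $request $response]]
}
======
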
[APN]: Updates: fixed non-chunked form handling (getline should have been getblock). I now use wibble in place of [tclhttpd] in [BowWow]. Smaller and more malleable.

[AMG]: I put that bug there to test your vision. :^) Actually I bet it crashed the first time you did a non-chunked post. I was so busy testing chunked transfers that I forgot to test non-chunked! Also, thanks for taking the leap; I would be delighted if you shared your experiences using Wibble. I haven't actually used it for anything yet, even though I had plans.

----

[AMG]: [MS], I am curious about your edit (changing wibble::handle to use [variable] instead of a namespace qualifier). I had thought that using namespace qualifiers is faster when the variable only gets used once or twice. Is there something I'm missing?

[MS] Try sourcing wibble from a different namespace: ''namespace eval foo source /path/to/wibble''. The new version works, the old one doesn't. Since 8.5 [variable] is bcc'ed and does not incur the old perf penalty.

[AMG]: Thanks! Good to know. :^)

----

[jnc]: In regards to multiple form elements of the same name, it comes in very handy when dealing with lists of items. Two examples: you are editing a parent and you have a table below containing . When you get the data in your web app, it typically is a list. Therefore you can loop through the data easily. Now, another very helpful situation is when using list data again, and this time with checkboxes to select elements. Say you have a list of emails from your pop3 box that you are previewing the From/Subject for. The idea is you are going to delete these messages before downloading them into your mail program. Again, you have a list: . Now, with checkboxes, you only get the data submitted via the web browser if the item is checked. So, in the end, you get a list of only the email_id's that the user wants to delete. The programmer can easily do: foreach id $delete_email_ids { ... }

[AMG]: That's (part of) what '''rawquery''' is for. (I admit that '''rawquery''' isn't easy to pick apart.) Since I anticipate that queries will most frequently be accessed like a [dict], the '''query''' element of the request dictionary is a dict. If it turns out that like-named form elements are frequently needed, I can make a '''listquery''' or similar element that is like '''query''' except constructed using [[[lappend]]] instead of [[[dict set]]]. Both would be compatible with [[[foreach]]] and [[[dict get]]]. Because of this compatibility, I could also make '''query''' itself be a key-value list instead of a dict. However, this would result in a copy of '''query''' shimmering to dict every time [[[dict get]]] is used to get a query variable. Suggestions?

[jcw] - Just one... fix [dict] so it keeps the data structure as a list, and stores the hash somewhere else ;)

[AMG]: I'm not entirely sure how that would work. Using [[dict set]] should overwrite existing elements, so I would have to use [[lappend]] in order to support multiple like-named elements. But later, if I access '''query''' using [[dict get]], a temporary dict-type copy would be constructed. You're suggesting (tongue-in-cheek) that the hash table not be temporary but somehow get attached to the list-type [Tcl_Obj] for future use by [[dict]]. [Cloverfield] has ideas in that direction, but [Tcl] doesn't work that way. Unless... perhaps the list and dict types can be partially unified to reduce the cost of shimmering. It would be like a free trade treaty between neighboring nations. :^) Conceptually, that would make sense because list and dict are so similar, especially since dict was changed (in Tcl 8.5 alpha) to preserve element order. Practically, there are obstacles, for example changing or extending the C API.
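
A small illustration of the duality [AMG] describes: a query built with [lappend] is an ordinary key-value list, so [foreach] sees every like-named element while [dict get] sees only the last one (at the cost of the list shimmering to a dict). The variable names are just for the example:

======
# Build a query as a flat key-value list, keeping duplicates.
set query {}
foreach {name value} {email_id 7 email_id 9 action delete} {
    lappend query $name $value
}

# List access: every like-named element is visible.
foreach {name value} $query {
    if {$name eq "email_id"} {
        puts "delete message $value"     ;# prints 7, then 9
    }
}

# Dict access: later duplicates shadow earlier ones (and the list shimmers
# to a dict internally).
puts [dict get $query email_id]          ;# prints 9
======
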
----

[AMG]: For the last couple of days, I've been working on parsing the HTTP headers into lists and dicts and such. This is actually very complicated because the HTTP header grammar is an unholy abomination, and that's only counting the part that's documented! I'll post my work soon, but it's not going to be pretty. Also I hope to add support for file uploads through POST (not just PUT), since that is required for my project. This will also add the complexity of MIME-style multipart messages, possibly nested. It all makes me want to vomit. HTTP works better as a human language than as a computer language, like it was optimized for producing deceptively simple examples, not genuinely simple code. To paraphrase Dan Bernstein [http://cr.yp.to/qmail/guarantee.html]: "There are two types of interfaces in the world of computing: good interfaces and user interfaces."

[jcw] 2010-02-24 - I'd like to try bringing TclOO into this thing, so that some of the state can be managed as object variables, and to simplify adding utility code which can be used in each request (also mixins, forwards, etc). My previous code was based on an httpd.tcl rewrite (see [http://code.jeelabs.org/viewvc/svn/jeelabs/branches/jeemon2009/library.vfs/code/Httpd.tcl] and [http://code.jeelabs.org/viewvc/svn/jeelabs/branches/jeemon2009/library.vfs/code/Web.tcl] - each sourced in its own namespace), but wibble makes a more powerful substrate to build on. My idea would be to create a separate TclOO class, with objects named the same as the coroutines to allow a simple mapping without having to change this clean core. Andy, have you considered this option? Any thoughts on how to best bring it together?

[AMG]: Sorry, I don't have time now. :^( But I am interested. I hadn't thought of using [TclOO], since I've never used TclOO in a project before. I really would like to see how it can improve Wibble. First I want to finish the list+dict project we discussed in email (I have code now!!), then I'll get back to the HTTP header parsing (I have about half of that code done). However, I am stupidly starting another art project, plus I have a massive work schedule which will soon include another two weeks' "vacation" from both the Internet and my personal computers. Oh well, Wibble started as a project to benefit the person I'm doing the art for, so I guess it's okay. :^) Perhaps this weekend I will have time to review your code to see how TclOO is used.

[jcw] - No worries. I can save you some time though: the links I mentioned don't use TclOO, so you can ignore them :) - I just think by now it has become a good option to consider. I'll try some things out.

----

[AMG]: I found this project by [JCW] which uses Wibble: [http://news.jeelabs.org/2010/03/02/jeemon-as-web-server/] [http://cafe.jeelabs.net/sw/jeemon/]

----

[AMG]: Let's see, what's news... [AGB] wrote [WebSocket], which is an add-on to Wibble. I'd like to research it more thoroughly and see if I can better integrate it. Last Christmas or so I wrote a [[deheader]] proc that attempts to not only split the header into lists but also break the elements into lists and dicts. The main thing holding me back from posting it is its complexity; I'd like to find a way to trim it down.
At the moment it's 87 lines replacing what used to be 9 lines. [[deheader]] was the first step in my quest to support file uploads; I need to slay the multipart [MIME] dragon before I can declare victory. Maybe PUT too. Also I need to do some work with content-type: and accept:, including mapping between the encoding names used by HTTP and Tcl. All that qvalue garbage will be interesting to sort out. [APN] has made some customizations to Wibble in [BowWow], which I should consider for inclusion. Earlier today I changed the way [[nexthandler]] and [[sendresponse]] work; instead of [[[return] -level]], they do [[return -code]] with custom codes caught by [[[try]]]. That works better. Hmm, [JCW] earlier suggested I research [TclOO]; I did some reading on that front. For now I think I'll just keep everything as lists and dicts, rather than making objects.

[bch]: AMG -- put up a [fossil] repo and let others help share the load.

[AMG]: Maybe someday, but not while I'm in the middle of a major edit. I'm making good progress on parsing headers. The regexps are quite hairy due to handling quotes and backslashes, but they pass all my tests so far. Again, I'd really like to find a way to unify the headers; right now there's a fair bit of duplication between the routines that handle each of the following:

   * Cache-Control, Pragma: key=val,key=val list
   * Connection, Content-Encoding, Content-Language, If-Match, If-None-Match, Trailer, Upgrade, Vary, Via, Warning: elem,elem list
   * Accept, Accept-Charset, Accept-Encoding, Accept-Language, Content-Disposition, Content-Type, Expect, TE, Transfer-Encoding: elem;param=val;param=val,elem;param=val;param=val list
   * Cookie: list that allows both , and ; as separators, weirdness with $
   * everything else: single element

I suspect I can't unify them, since HTTP is quite irregular at heart. I got the qvalues taken care of. It took a whopping eighteen lines, which I'm not happy about. Basically I sort the Accept-* stuff by decreasing qvalue. Why oh why doesn't HTTP simply do that directly!?
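
For illustration, a compact stand-in for the eighteen lines mentioned above: split an Accept-* value into elem;param=val elements and sort by decreasing qvalue. The proc name is hypothetical, and this ignores quoting and most of the irregularities just listed:

======
# Sketch: rank the elements of an Accept-* header value by decreasing
# qvalue, dropping anything with q=0.
proc rankaccept {value} {
    set ranked {}
    foreach elem [split $value ,] {
        set parts [split [string trim $elem] {;}]
        set name [string trim [lindex $parts 0]]
        set q 1.0
        foreach param [lrange $parts 1 end] {
            if {[regexp {^\s*q\s*=\s*([0-9.]+)\s*$} $param -> val]} {
                set q $val
            }
        }
        if {$q > 0} {
            lappend ranked [list $name $q]
        }
    }
    lmap pair [lsort -real -decreasing -index 1 $ranked] {lindex $pair 0}
}

# rankaccept "gzip;q=0.8, identity;q=0.2, deflate"  →  deflate gzip identity
======

<> Wibble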