Name for the concept of placing the contents of a whole website into a single file and serving it directly from there. By Neil Madden.
Neil already uses something close to this for his personal website, based on Metakit and tDOM: he stores XML and converts it to HTML. This is currently done offline. Once this is changed to online generation, the StarSite is complete.
AK: Related to this is a notion by BrowseX. This browser allows the retrieval of a website and its storage in a zip archive. This is usually meant for easy transfer of a website, but also allows display of the archive via BrowseX, without unpacking it (IIRC - AK).
AK: Obvious extensions of the concept.
Depending on the exact nature of how web pages are stored in the StarSite this can have significant overlap with the code base providing the Tcl'ers Wiki. Especially as some discussed extensions to it would allow the storage of not only Wiki markup, but other types of data as well, like images, HTML, or XML. The similarity should be clear by now.
AK: [*] I should explain. My first association was that the code providing access to the contents of the StarSite was part of the StarSite Metakit, in a VFS, making the StarSite a StarKit, or StarPack. As the Wiki codebase shows, that doesn't have to be the case. The StarKit containing the code can be distinct from the Metakit database containing the StarSite.
Regarding editing: For HTML this might have to be a free-form editor. For XML we can use StarDOM. Also note that the AlphaTk editor already has a Wiki mode. Extending this to HTML and XML modes might be simple. This implies that the StarKit providing access to the StarSite does not need to have the editor embedded in it, although that is an option too. It just has to have a way of invoking editors that we can hook our preferred editor into.
NEM - Yes, I was thinking along these lines. Some other things I am considering are:
Lots to think about. I think that getting the authentication/access-control stuff right will be the toughest bit. Looking at Zope, everything seems to be an "object" (including users, scripts, static content), which can have access permissions granted to it. I know adding security features might seem like overkill, but it needs to be there from the start if anyone wants to use it (which I will). Adding it in later would be a bit hackish.
NEM - Excellent summary. This is exactly what I was planning. My main interest was in XML/XSLT generation of content, but really anything should be possible. The StarSite would sit on the server, and intercept requests using the PATH_TRANSLATED variable. So, for instance, in my website currently, the script xml.cgi can be invoked like http://www.tallniel.co.uk/cgi-bin/xml.cgi/home.xml which grabs the home.xml file and applies the necessary stylesheets to it. Likewise, images could also be requested and returned from the database. Having MetaKit as the backend allows for sophisticated searching and user interface options (session management, personalization etc). Mirroring a site would be a case of copying one highly-compressed MetaKit datafile. I find this concept quite exciting.
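A minimal sketch of the request interception described above, not the actual xml.cgi. It assumes the extra path after the script name (/home.xml in the example URL) is available in the standard CGI variable PATH_INFO; PATH_TRANSLATED is that same path mapped onto the server's file system. The dict `store` and the proc `respond` are hypothetical stand-ins for the single-file database and its lookup, and no stylesheet is applied.

```tcl
# Hypothetical in-memory stand-in for the single-file database:
# resource path -> {mimetype content}
set store [dict create \
    /home.xml   {text/xml  "<page><title>Home</title></page>"} \
    /about.html {text/html "<h1>About</h1>"}]

# Build a CGI response (headers, blank line, body) for one resource path.
proc respond {store pathInfo} {
    if {![dict exists $store $pathInfo]} {
        return "Status: 404 Not Found\nContent-Type: text/plain\n\nNo such resource: $pathInfo\n"
    }
    lassign [dict get $store $pathInfo] mimetype content
    return "Content-Type: $mimetype\n\n$content\n"
}

# In a real CGI script the path would come from the environment, e.g.:
#   puts -nonewline [respond $store $::env(PATH_INFO)]
```

Images and other binary content would be served the same way, with the output channel switched to binary translation before writing.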
Note that the Wikit is a case of putting the contents of a website into a file. I see above that Starsite would include a web server and, rather than using a markup style and conversion like the wikit does right now, would use xml as the markup and tdom or tclxml as the conversion software. Another difference appears to be that wikit is about content management, in a sense, in that visitors to the web site have the ability to update the pages. What other differences are envisioned?
Well - as far as I am aware, wikit only allows the inclusion of the textual content in one file. The StarSite concept takes this a bit further, by allowing images, media etc to be stored in the same file, as well as other information (e.g. a user database). The idea of a starsite, as I (NEM), envision it, is that it should be able to do whatever a normal website can do, but with the added advantage of having everything in one file. So, you could, theoretically, put a wikit inside a StarSite. That is how I see it developing. At the moment it is nothing but this collection of ideas. When things start to reach a more coherent state, I (and any others who wish to join me) will sit down and start making it. The ability to update a StarSite (or parts thereof) over the web, is a feature I would like to include. The XML references are just there as that is what I like to create my site in. However, I feel StarSite should be broader than that. It should be a means of encapsulating a whole web site, with various common functionality available to make things easier (collaborative editing, authentication, session management, data storage etc). In the simplest case, a person would fire it up at home and use the Tk GUI to add static content (HTML, pictures etc). When finished, they would simply ftp the file to the webspace they use (in a cgi directory), and it would just work (just like starkits - no hassle installation). Alternatively, it could run as its own webserver, for intranets and the like. StarKits solve installation problems for regular applications. StarSites would solve it for web applications.
AK:
NEM 30Nov2002: Latest brainstorming on this (flow of control of a request coming into a starsite):
This is just some brainstorming, and hasn't been thought through to the bitter end. I quite like the design, but I'm willing to take criticism to perfect this in the design stage. Consider this, a request for comments! Neil Madden.
30nov02 jcw - Interesting ideas... can you elaborate on the usage scenarios? Is this for deployment, i.e. creating a complete site and shipping it? Or more to keep things manageable and self-contained?
Note that authorization per dir/subdir is supported in Apache through ".htaccess" - if tclhttpd has similar capabilities, that might be a very quick path to add such features to StarSite, since tclhttpd can work with (as well as *in*) VFS.
Currently, it is not easy to extend VFS with extra info such as a mime type, even though Metakit could easily deal with it. The reason is that the VFS layer opens with a certain layout, which would lose any fields added. Hm, having said that - it's probably possible to open, and immediately reset the layout to include those fields again - data would not be lost. But this leads to another problem: how to make VFS aware of fields such as a mime type. My hunch is that you're best off maintaining a separate data structure for mime types. If stored as a Metakit view, it could still be in the same VFS file (i.e. starkit), with some tricky hacking. If you'd like an example of how to store other views in a starkit, next to the VFS file system tree, let me know.
30nov02 NEM - To answer your questions in order: I see starsite as being able to create self-contained sites and then deploy them as complete items, but also to allow editing after they have been deployed. I envision creating a general web interface which allows creating new sections etc. This could be enabled or disabled, even on a per-section basis. Also, section handlers running under appropriate permissions (the permissions of whoever is accessing them, not whoever created them) could update content, and add new content, to that section.
Authorization: I intend to make this fully customizable. Someone could write an authorization routine which uses .htaccess, for instance (although I don't know how this would work in a VFS). Other authorization methods could be used as well. For instance, for my own personal site, I would probably use a custom login procedure, as I do not like .htaccess and I only have CGI (with no SSL). I was thinking about including tclhttpd into the basic starsite so that it can run standalone. It would also be able to run in a cgi environment.
VFS: Yes, currently it would not be easy to add mime-types etc to a VFS. My usage of the term VFS was perhaps confusing, as I was thinking more in general terms of accessing a sort-of filesystem through an API, rather than particularly using Tcl's VFS layer. It would be nice to use it, but I'm not sure how useful it would be (the API commands would probably be quite different to open, read, close etc).
Examples would be nice. I think I could figure it out, but probably best to find out the best way of doing it.
30nov02 jcw - Ah, ok, hence the reference to Zope - a site, ready to be filled in by content providers, comes to mind. One more thought: maybe WebDAV makes sense in this context? I'll try to come up with an example for storing data alongside VFS in the same file, one such use would be to have wikit store its pages in the same file.
1dec02 NEM - WebDAV is certainly very interesting. I'll read through the RFC and see which bits make sense here (probably a large proportion of it). If I can use an existing standard, then so much the better.
30 Nov 2002 escargo - I don't know if it is practical, but perhaps your permissions could be more general. In a Multics system, one of the permissions was append. That meant that you could not modify existing content, but you could add to the end. (This applied to directories, but the notion should be transferable to other domains.) This would be more of a notes file capability than a wiki capability, but still might be worth considering.
1dec02 NEM - Good idea. The permissions were off the top of my head. Append is a good one. Can we come up with a complete list? Maybe it would be possible to allow a site maintainer to define their own set of permissions? Hmm.. I think a reasonable list would probably be best. Time to think of use-cases, I guess.
25jan03 NEM - A complete list of permissions for the initial version will be:
This would be specified as a single byte:
 page  section
 raecd cda
In general, anonymous users would have page read access only (permission 10000000), whereas a webmaster would have 11111111. People could be designated as editors of a section with permission 11110000, i.e. they have the ability to read, edit and create pages, but cannot delete pages or change permissions.
Another item for the implementation will be to associate a lock with each section, so that updating of the database can be done safely, and with a per-section lock granularity. This could be upped to a per-page lock, if deemed necessary (I think per-section will be acceptable for most sites). This locking will be done automatically in the VFS layer, so section handlers need not worry about it. To start with, locking will be implemented for a single-threaded tclhttpd implementation. Later work will expand this to work with threads, and CGI. CGI is the most difficult, as without marshalling all accesses through a single process it is difficult to perform effective locking. Lock files would have to be used (for CGI), but these are nasty. A possible implementation would have all updates written as separate files into a directory, and then a separate process would lock the whole database and apply all the changes at some point in time (for instance, when the webmaster logs in and runs a command).
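A minimal sketch of how such a permission byte could be checked. The bit names are assumptions: the letters are read as read, append, edit, create, delete for pages, followed by create, delete and a third bit for sections that is taken here to mean "change permissions" (called section-admin). The procs `permsFromBits` and `hasPerm` are hypothetical names, not part of any existing StarSite code.

```tcl
# Assumed bit names, most significant bit first.
set permNames {
    page-read page-append page-edit page-create page-delete
    section-create section-delete section-admin
}

# Convert a string such as "11110000" into an integer in the range 0..255.
proc permsFromBits {bits} {
    if {![regexp {^[01]{8}$} $bits]} {
        error "expected 8 binary digits, got \"$bits\""
    }
    set value 0
    foreach bit [split $bits ""] {
        set value [expr {($value << 1) | $bit}]
    }
    return $value
}

# Return 1 if the named permission bit is set in perms, else 0.
proc hasPerm {perms name} {
    global permNames
    set idx [lsearch -exact $permNames $name]
    if {$idx < 0} {
        error "unknown permission \"$name\""
    }
    expr {($perms >> (7 - $idx)) & 1}
}

set editor [permsFromBits 11110000]
puts "editor may create pages: [hasPerm $editor page-create]"
puts "editor may delete pages: [hasPerm $editor page-delete]"
```

Storing the permissions as one small integer keeps each user/section entry compact, at the cost of fixing the set of permissions in advance.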
Time to get coding...
26jan03 NEM - Well, starting coding has brought me round to a new implementation idea:
Generalize Starsite into a Persistent, Authenticated Object System
After spending some time yesterday contemplating design issues, I have hit upon a design which I think could be useful. Instead of writing starsite as an application with a secure database API, why not write a secure, persistent web application framework and then implement StarSite in that system?
The details:
 class Foo {
     field title
     field body
     method foo {args} {
         # Accessing a member field:
         $this get title
         # Accessing a specific mimetype
         $this get title -mimetype text/html
         # Setting a field
         $this set body "<h1>Hello, World!</h1>"
         # Setting for particular mimetype
         $this set body "<header>Hello, World!</header>" -mimetype text/xml
         # What is the mimetype requested?:
         $this mimetype
         # Change the mimetype (changes the HTTP output headers as well)
         $this mimetype "text/xml"
         # Append to a field
         $this append title "<h2>This is a comment</h2>"
         # etc.
     }
     method foo image/gif {args} {
         # Override the general foo method for image/gif mimetype requests
     }
 }
Objects can be instantiated in a hierarchy. There is always one main object for the site, which is the starsite object (or a different object if a different web application is being created). This is similar to the Tk widget/object structure, with "starsite" or whatever being the root object which always exists. You can then do:
Foo /starsite/myfoo
and this will create an object of type Foo, named "myfoo", under the starsite object.
Calls to the objects are handled by processing incoming HTTP requests. For instance, if a request came in to:
http://my.server.com/cgi-bin/starsite/myfoo/foo?arg1=hello&arg2=world
This would cause a lookup to see what the most specific object being requested is. In this example it is "myfoo" (otherwise it would be "starsite" - the root). So, a call is made to the "foo" method of the object "myfoo", like so:
$myfoo foo {arg1 hello} {arg2 world}
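A minimal sketch of that lookup, under stated assumptions: the path is what remains after the CGI script name has been stripped (/starsite/myfoo/foo in the example), the dict `objects` is a hypothetical registry of URL paths to object names, and the procs `resolve` and `buildCall` are illustrative names. Query values are not URL-decoded here.

```tcl
# Hypothetical registry: URL path -> object name.
set objects [dict create /starsite starsite /starsite/myfoo myfoo]

# Find the longest registered prefix of path.
# Returns a two-element list: the object path and the remaining path parts.
proc resolve {objects path} {
    set parts [split [string trim $path /] /]
    for {set n [llength $parts]} {$n > 0} {incr n -1} {
        set candidate /[join [lrange $parts 0 [expr {$n - 1}]] /]
        if {[dict exists $objects $candidate]} {
            return [list $candidate [lrange $parts $n end]]
        }
    }
    error "no object found for \"$path\""
}

# Turn a path and a query string into the call that would be made.
proc buildCall {objects path query} {
    lassign [resolve $objects $path] objPath rest
    set call [list [dict get $objects $objPath] {*}$rest]
    foreach pair [split $query &] {
        if {$pair eq ""} continue
        set eq [string first = $pair]
        if {$eq < 0} {
            lappend call [list $pair ""]
        } else {
            lappend call [list [string range $pair 0 [expr {$eq - 1}]] \
                               [string range $pair [expr {$eq + 1}] end]]
        }
    }
    return $call
}

puts [buildCall $objects /starsite/myfoo/foo "arg1=hello&arg2=world"]
```

When no more specific object is registered, the remaining path parts fall through to the root object, which matches the fallback to "starsite" described above.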
There are a couple of details here too:
It will be possible to set up objects as redirections:
starsite::redirect /starsite/newfoo -> /starsite/myfoo
would cause all method invocations on /starsite/newfoo to result in an HTTP redirect to /starsite/myfoo.
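A minimal sketch of such a redirect table. This is a hypothetical implementation of the `starsite::redirect` command shown above, not existing code; the helper `starsite::redirectFor`, the exact-path matching and the choice of a 302 status are all assumptions.

```tcl
namespace eval starsite {
    variable redirects [dict create]   ;# old object path -> new object path
}

# Register a redirect. The literal "->" argument mirrors the syntax above.
proc starsite::redirect {from arrow to} {
    variable redirects
    if {$arrow ne "->"} {
        error "usage: starsite::redirect from -> to"
    }
    dict set redirects $from $to
}

# Return CGI-style redirect headers for path, or "" if it is not redirected.
proc starsite::redirectFor {path} {
    variable redirects
    if {![dict exists $redirects $path]} {
        return ""
    }
    return "Status: 302 Found\nLocation: [dict get $redirects $path]\n\n"
}

starsite::redirect /starsite/newfoo -> /starsite/myfoo
```

A request handler would call `starsite::redirectFor` before dispatching and send the returned headers instead of invoking a method when the result is non-empty.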
A mapping is maintained between hierarchy positions (URLs) and objects. This is a strictly one-to-one mapping.
Each object has the following properties associated with it:
The initial method of authentication will be based on usernames and passwords. All the object system cares about is having a user context in which to run objects. To this end, the method by which the username and password are sent to the object system is left up to the specific application. This could be simple HTTP Basic Authentication, or it could be a secure connection over SSL. When a user authenticates, a session will be created for that user and returned. This session consists of a constructed safe interpreter with all the necessary API aliases created in it. The application can then set up some sort of session key (again, how this is done is an application implementation detail - could be a cookie, or some other, more secure session key). All requests for that session are processed within the context of that interpreter. This allows per-session data to be held as well (in global variables). The system will be configurable to destroy sessions after a given amount of time of inactivity.
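A minimal sketch of such sessions, using the core Tcl commands `interp create -safe`, `interp alias`, `interp eval` and `interp delete`. The `session` namespace, its procs and the `whoami` alias are hypothetical, and the counter-based key is a placeholder only; a real key would have to be unguessable.

```tcl
namespace eval session {
    variable sessions [dict create]   ;# key -> dict with interp and lastUsed
    variable counter 0
}

# Example API command exposed to sessions; it runs in the main interpreter.
proc session::Whoami {user} { return $user }

# Create a session for an already authenticated user and return its key.
proc session::create {user} {
    variable sessions
    variable counter
    set key "s[incr counter]"          ;# placeholder, NOT a secure key
    set child [interp create -safe]
    # Only explicitly aliased commands reach outside the safe interpreter.
    interp alias $child whoami {} ::session::Whoami $user
    dict set sessions $key [dict create interp $child lastUsed [clock seconds]]
    return $key
}

# Evaluate a script inside the session's interpreter.
proc session::run {key script} {
    variable sessions
    if {![dict exists $sessions $key]} {
        error "unknown or expired session \"$key\""
    }
    dict set sessions $key lastUsed [clock seconds]
    interp eval [dict get $sessions $key interp] $script
}

# Destroy sessions that have been idle for more than maxIdle seconds.
proc session::expire {maxIdle {now ""}} {
    variable sessions
    if {$now eq ""} { set now [clock seconds] }
    dict for {key info} $sessions {
        if {$now - [dict get $info lastUsed] > $maxIdle} {
            interp delete [dict get $info interp]
            dict unset sessions $key
        }
    }
}
```

Global variables set by one request remain in the session's interpreter for the next one, which gives the per-session data mentioned above; calling `session::expire` periodically implements the inactivity timeout.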
Phew! This is another brainstorming session. It is quite possible that I am just creating another method of creating web applications, when I should be sticking to standards like WebDAV. To this end, I'm going to read up on WebDAV and SOAP a lot more before implementing this. I quite like the idea though of designing web applications purely around the logic of the site, providing content in whatever forms you want, and then letting the system worry about which content is suitable for which client. An extension to this would be to write content in XML and the system deals with selecting and applying XSLT stylesheets. However, the method proposed above is more general and can deal with selecting static content such as images, which are not easily expressed in XML (although there is SVG...). As an XML parser will have to be present to handle SOAP and WebDAV requests, providing XML and XSLT capabilities would be pretty easy.
Well, once again - comments welcomed! Cheers, Neil.
NEM 22May2003 - Haven't looked at this in a while. Now my final exams are coming to a close, I'd better start thinking about implementing all these ideas. In the meantime, I'm going to play around with as many open source/free projects of a similar nature as I can (both Tcl things like Apache Rivet and OpenACS/AOLserver, and also non-Tcl stuff like Zope), to get a feel for what is useful in such a beast. Please read the above and comment, although this page is getting pretty big now. Please note, also, that I'm not at all sure if the above design is good anymore. Some tell-it-like-it-is criticism would be welcome!
escargo - I had been asking about e-mail notification of changes to wiki pages for this wiki, but part of the problem is that it realistically requires authentication before people can sign up (since you want to prevent people from signing up anyone but themselves for notifications). You are already planning on doing authentication. You could extend your design (or allow hooks that could be used to extend it) so that if a wiki page changed (or one of your more generalized objects), then an action could be triggered (in my case, an e-mail notification could be generated). This would require user data to include an e-mail address; objects would have to have change listeners; somewhere there would be per-object metadata (for the listener list and the application data needed to keep the mail notification list). This is just "bells and whistles" as far as your desired features are concerned, but I thought this might give you an idea of how somebody might want to build on what you are doing.
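A minimal sketch of the change-listener hook suggested above. All names are hypothetical: the `notify` namespace, its procs, the example object path and address are for illustration only, and `notify::fakeMail` just records a line in a log instead of sending mail.

```tcl
namespace eval notify {
    variable listeners [dict create]   ;# object path -> list of command prefixes
    variable log {}
}

# Register a command prefix to be called when the object at path changes.
proc notify::listen {path cmdPrefix} {
    variable listeners
    dict lappend listeners $path $cmdPrefix
}

# Called by the object system after a change; runs every registered listener.
proc notify::changed {path user} {
    variable listeners
    if {![dict exists $listeners $path]} return
    foreach cmd [dict get $listeners $path] {
        # {*} expands the stored command prefix (Tcl 8.5 or later).
        {*}$cmd $path $user
    }
}

# Stand-in for sending an e-mail notification.
proc notify::fakeMail {address path user} {
    variable log
    lappend log "to $address: $path changed by $user"
}

notify::listen /starsite/wiki/42 [list notify::fakeMail reader@example.org]
notify::changed /starsite/wiki/42 alice
```

Because listeners are plain command prefixes, an application author could add mail notification without any support for it in the base system.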
NEM - Good idea. This can probably be added in by the application author, if I design things right, and wouldn't have to be present in the base system. Nevertheless, I'll bear this in mind. Right now, I'm thinking performance. The problem with my "everything's an object" system above is that it adds unnecessary overhead to things which are just static content (images, static HTML etc). For these, I'm thinking of object wrappers which are automatically created when a feature is needed, but not before. This way if a piece of data is just returned without any processing, it has no overhead. Ahh... much more thinking is needed on this.
PT 23Jun2003: I've recently been looking at Twiki which does support e-mail notification. There the user's home page contains the e-mail address to use and each 'web' (a twiki term which basically means a sub-tree of the entire wiki) can be configured to enable e-mail notification on a per-user basis by adding usernames to the relevant page. Authentication is provided by having the user registration script automatically append a suitably constructed line to the site's .htpasswd file. In this wiki, changes tend to require authentication so it's always possible to know who made what change.
I'm not certain that we need such overhead in this site, but Twiki has been designed to be useful in a corporate intranet environment, where traceability is important. As a side note - Twiki is reasonably easy to install under unix and windows.