SCGI

'''SCGI''' is a protocol by which Web applications talk to a Web server.  It is much simpler than its competitor [FastCGI], but does not support serving multiple requests over one [TCP] connection.  See the Description section below for more information.



** Implementations **

*** HTTP servers that implement SCGI ***

   * [Apache]
   * [Lighttpd]
   * [nginx]
   * [IIS] with the [ISAPI SCGI extension for IIS%|%ISAPI SCGI extension]

*** Tcl SCGI servers ***

   * `httpd::server.scgi` in [httpd (Tcllib)]
   * [Tanzer]
   * [tcl-scgi]
   * [Wapp]
   * [Woof!]



** Description **

[MJ] - SCGI (Simple Common Gateway Interface) [http://en.wikipedia.org/wiki/SCGI] is a replacement for [CGI] which has the benefit that all requests can be handled by a single instance of the SCGI server (the Tcl script in this case) eliminating the overhead of starting a new process for every request. Its goals are similar to [FastCGI] but the protocol between client (the webserver) and server (the script) is much simpler.
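
To make the "much simpler" claim concrete, here is a minimal sketch of what the web server sends for one request, following the SCGI protocol text: the headers as NUL-terminated name/value pairs wrapped in a netstring (`length:data,`), followed by the raw body. `scgi_encode` is a hypothetical helper name used only for illustration, and the sketch assumes that header names, values and body contain only single-byte characters.

```tcl
# Sketch only: build the bytes of one SCGI request.
# scgi_encode is a hypothetical name, not part of any library.
proc scgi_encode {headers body} {
    set nul [format %c 0]
    # The protocol requires CONTENT_LENGTH first and an SCGI header with value 1.
    set fields [list CONTENT_LENGTH [string length $body] SCGI 1 {*}$headers]
    set block ""
    foreach field $fields {
        # Every name and every value is terminated by a NUL byte.
        append block $field $nul
    }
    # Netstring framing: "<length>:<headers>," and then the body.
    return "[string length $block]:$block,$body"
}

set request [scgi_encode {REQUEST_METHOD POST REQUEST_URI /deepthought} \
    "What is the answer to life?"]
```

For this input the header block is 70 bytes long, so the request starts with `70:`. Written to a socket connected to an SCGI server, such a string can serve as a small test client.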

An advantage over Tcl modules embedded in Apache ([websh], [rivet], [mod_tcl]) is that it separates the Tcl part from the webserver, allowing a restart of the Tcl script without restarting the webserver or vice-versa.

The code below implements a simple SCGI server in Tcl that echoes the request information back as its response. It can easily be extended to fit your own purpose by overriding [[scgi::handle_request sock headers body]]. The code has some 8.5-isms, but it should not be too difficult to make it 8.4 compatible.

I am not completely happy with the redefinition of the fileevent handlers that's going on (I am not sure if it's very elegant or a terrible hack), but I can't see another way to prevent the use of [global] variables containing the data already read. Comments are welcome.  



** Example implementation **

======
package require html

namespace eval scgi {
    proc listen {port} {
        socket -server [namespace code connect] $port
    }

    proc connect {sock ip port} {
        fconfigure $sock -blocking 0 -translation {binary crlf}
        fileevent $sock readable [namespace code [list read_length $sock {}]]
    }

    proc read_length {sock data} {
        append data [read $sock]
        if {[eof $sock]} {
            close $sock
            return
        }
        set colonIdx [string first : $data]
        if {$colonIdx == -1} {
            # we don't have the headers length yet
            fileevent $sock readable [namespace code [list read_length $sock $data]]
            return
        } else {
            set length [string range $data 0 $colonIdx-1]
            set data [string range $data $colonIdx+1 end]
            read_headers $sock $length $data
        }
    }

    proc read_headers {sock length data} {
        append data [read $sock]

        if {[string length $data] < $length+1} {
            # we don't have the complete headers yet, wait for more
            fileevent $sock readable [namespace code [list read_headers $sock $length $data]]
            return
        } else {
            set headers [string range $data 0 $length-1]
            set headers [lrange [split $headers \0] 0 end-1]
            set body [string range $data $length+1 end]
            set content_length [dict get $headers CONTENT_LENGTH]
            read_body $sock $headers $content_length $body
        }
    }

    proc read_body {sock headers content_length body} {
        append body [read $sock]

        if {[string length $body] < $content_length} {
            # we don't have the complete body yet, wait for more
            fileevent $sock readable [namespace code [list read_body $sock $headers $content_length $body]]
            return
        } else {
            handle_request $sock $headers $body
        }
    }
}

proc handle_request {sock headers body} {
    array set Headers $headers

    parray Headers
    puts $sock "Status: 200 OK"
    puts $sock "Content-Type: text/html"
    puts $sock ""
    puts $sock "<HTML>"
    puts $sock "<BODY>"
    puts $sock [::html::tableFromArray Headers]
    puts $sock "<H3>Body</H3>"
    puts $sock "<PRE>$body</PRE>"
    if {$Headers(REQUEST_METHOD) eq "GET"} {
        puts $sock {<FORM METHOD="post" ACTION="/scgi">}
        foreach pair [split $Headers(QUERY_STRING) &] {
            lassign [split $pair =] key val
            puts $sock "$key: [::html::textInput $key $val]<BR>"
        }
        puts $sock "<BR>"
        puts $sock {<INPUT TYPE="submit" VALUE="Try POST">}
    } else {
        puts $sock {<FORM METHOD="get" ACTION="/scgi">}
        foreach pair [split $body &] {
            lassign [split $pair =] key val
            puts $sock "$key: [::html::textInput $key $val]<BR>"
        }
        puts $sock "<BR>"
        puts $sock {<INPUT TYPE="submit" VALUE="Try GET">}
    }
    puts $sock "</FORM>"
    # close BODY only after all content has been written
    puts $sock "</BODY>"
    puts $sock "</HTML>"
    close $sock
}

scgi::listen 9999
vwait forever
======
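
As the description notes, the server is meant to be adapted by overriding `scgi::handle_request`. A minimal sketch of such an override follows; the plain-text reply and its wording are invented for illustration. Because the procs inside the `scgi` namespace resolve `handle_request` in their own namespace first, a proc defined there takes precedence over the global one above. It has to be defined before the script enters `vwait forever`.

```tcl
namespace eval scgi {}   ;# no-op if the server code is already loaded

# Hypothetical replacement handler: replies with a short plain-text message.
proc scgi::handle_request {sock headers body} {
    set uri "(unknown)"
    if {[dict exists $headers REQUEST_URI]} {
        set uri [dict get $headers REQUEST_URI]
    }
    # The response starts with CGI-style headers, then an empty line.
    puts $sock "Status: 200 OK"
    puts $sock "Content-Type: text/plain"
    puts $sock ""
    puts $sock "Hello from Tcl, you asked for $uri"
    puts $sock "The request body was [string length $body] bytes long."
    # Closing the socket tells the web server that the response is complete.
    close $sock
}
```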



** Discussion **

[MJ] 20071220 - Instead of reading the length one byte at a time, I changed the code to read as much as possible. This may or may not perform better, but at least it fixes a DoS attack when the part before the first : is sent very slowly. That would result in a very tight while loop being executed, pegging the CPU at 100%.

[MJ] - For file uploads the current implementation is not ideal. Here you should really override the ''read_body'' proc to fcopy the socket to a local file. Generally the code above can be expanded a bit to make integration into your app easier.
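
A rough sketch of that idea, not a tested part of the server: the replacement `read_body` below writes whatever part of the body has already arrived to a temporary file and lets `fcopy` transfer the rest in the background. The helper name `upload_done`, the use of `file tempfile` (Tcl 8.6), and the decision to hand the file path to `handle_request` in place of the body are all assumptions made for this example.

```tcl
namespace eval scgi {}   ;# no-op if the server code is already loaded

# Sketch: replacement for scgi::read_body that spools the body to a temp file.
proc scgi::read_body {sock headers content_length body} {
    # fcopy takes over the socket, so drop the readable handler first.
    fileevent $sock readable {}

    set out [file tempfile path]   ;# Tcl 8.6+
    fconfigure $out -translation binary
    # Part of the body may already have arrived together with the headers.
    puts -nonewline $out $body

    set remaining [expr {$content_length - [string length $body]}]
    if {$remaining <= 0} {
        upload_done $sock $headers $out $path 0 0
        return
    }
    fcopy $sock $out -size $remaining -command \
        [namespace code [list upload_done $sock $headers $out $path $remaining]]
}

# fcopy appends the number of bytes copied and, on failure, an error message.
proc scgi::upload_done {sock headers out path expected copied {err {}}} {
    close $out
    if {$err ne "" || $copied != $expected} {
        # Copy failed or the peer closed early: discard the partial upload.
        file delete $path
        close $sock
        return
    }
    # Assumption of this sketch: the handler gets the file path, not the body.
    handle_request $sock $headers $path
}
```

A handler used with this variant would open the file itself and delete it once it is done with it.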

[sdw] - 2007-04-02 - And a "standard" template for your apps can be found at [Tcl Web Object Standards]

[APN] - [Woof!] uses a descendant of the above code for its SCGI support. Note the above code does not protect against malformed (and malicious) protocol input. Will update here once I fix Woof.

[MJ] - Usually a webserver forms the SCGI requests and I think it's a fair assumption that those requests are valid. But because it has been a while since I looked at this, what would malicious protocol input be?

[APN] I overlooked that the requests come from your own webserver, so you are right. By malicious, I meant input that would cause DoS attacks: e.g. sending a header length greater than the actual data would cause the above code to spike to 100% CPU, which I think is easily fixed by an EOF check. Thanks for this code BTW, as it is likely to be [Woof!]'s preferred web server interface mechanism as it supports Apache (with mod_scgi), nginx (mod_scgi), lighttpd (built-in) and IIS (with isapi_scgi).

[MJ] - The request length is also determined by the server, so if that forms the requests correctly, that's not really a problem either. Of course an [[eof]] check never hurts. Also you are very welcome, I am glad this is useful to someone. For me it was just a nice small project.
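
The EOF check discussed in this exchange could look roughly like the sketch below. It restates `read_headers` and `read_body` from the example implementation with one addition: when the data is still incomplete and the peer has closed the connection, the socket is closed instead of re-arming the handler. The helper name `gone` is made up for this example.

```tcl
namespace eval scgi {}   ;# no-op if the server code is already loaded

# Hypothetical helper: close the socket and return 1 if the peer has hung up.
proc scgi::gone {sock} {
    if {[eof $sock]} {
        close $sock
        return 1
    }
    return 0
}

proc scgi::read_headers {sock length data} {
    append data [read $sock]
    if {[string length $data] < $length+1} {
        # Incomplete: give up at EOF, otherwise wait for more data.
        if {[gone $sock]} return
        fileevent $sock readable [namespace code [list read_headers $sock $length $data]]
        return
    }
    set headers [lrange [split [string range $data 0 $length-1] \0] 0 end-1]
    set body [string range $data $length+1 end]
    read_body $sock $headers [dict get $headers CONTENT_LENGTH] $body
}

proc scgi::read_body {sock headers content_length body} {
    append body [read $sock]
    if {[string length $body] < $content_length} {
        # Same rule for the body: a short body followed by EOF ends the request.
        if {[gone $sock]} return
        fileevent $sock readable [namespace code [list read_body $sock $headers $content_length $body]]
        return
    }
    handle_request $sock $headers $body
}
```

Without such a check, a socket at EOF stays readable, so the handler would fire again and again without ever receiving more data.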

[MS] - I do wonder what webservers do with a POST request having a form variable ''CONTENT_LENGTH=987654321''. IIUC, the SCGI protocol [http://python.ca/scgi/protocol.txt] forbids form variables named ''CONTENT_LENGTH'' and ''SCGI'', as it forbids duplicate header names and those two are obligatory. Also ''REQUEST_METHOD'' and ''REQUEST_URI'' are likely to get you in trouble; any others?

[APN] Form variables are not sent as HTTP headers. They are part of the content. The SCGI restriction refers to HTTP headers only.

<<categories>> Protocol | Internet | Web