[Jacob Levy] 05/30/2003: Recently there was some discussion on comp.lang.tcl about whether Tcl is vulnerable to DoS attacks and what, if anything, to do about it.

First of all, the term DoS is shorthand for '''denial of service'''. A denial of service results when someone malicious succeeds in preventing legitimate users from accessing and using a shared resource. For example, a denial of service attack could prevent people from withdrawing money from an ATM, if the ATM back end was somehow prevented from processing legitimate transactions.

Most denial of service attacks mounted via the internet involve sending massive amounts of bogus data to a server accessible via the net, thereby burying legitimate data in a pile of garbage. To find the legitimate data, the server has to examine and reject all the bogus data, which slows things down tremendously. Often, when you hear about DoS attacks, you also hear about a variant called DDoS, in which a perpetrator uses a lot of other computers on the net to mount a coordinated attack, making it less likely that the server will detect that the attack is even occurring.

Security attacks are often evaluated and analyzed in terms of their practicality and effectiveness. Practicality measures how much work it takes to mount the attack; if it's not a lot of work, the attack is deemed practical. Effectiveness is a measure of how successful the attack is at achieving its goal, which could range from obtaining information that should stay protected to preventing legitimate access to a shared resource.

The concern about Tcl and DoS attacks was raised by Scott Crosby in a note on the Tcl Core mailing list; you can read his note here: http://sf.net/mailarchive/forum.php?thread_id=2467138&forum_id=3854, and Scott and Dan Wallach wrote a paper, describing what they see as potential DoS vulnerabilities, here: http://www.cs.rice.edu/~scrosby/hash/.
Here's my take: I think that the paper is significant, and not "drivel", because they showed that, with very little input bandwidth, they were able to exploit a flaw in the deterministic algorithm used in an example application so that the application effectively ceased to function. That makes the attack practical, and it means that other, legitimate users could not make use of this shared resource: a DoS attack. One of the example applications mentioned is Perl 5; there is no reason to believe that Tcl 8.4, for example, would be invulnerable to a practical attack along these lines.

I do think this is serious and worthy of consideration. For a Tcl based networked server, there is a DoS risk here if you let a remote, untrusted application determine the hash keys you're going to use. That happens, as was pointed out, when you parse MIME headers, when you parse HTTP headers, and (a new one from me) when you represent XML as nested arrays. In fact, even an assignment to an element of an array (or any local variable) causes a hash lookup in Tcl.

Fortunately, it's very easy to fix this DoS risk: instead of making Tcl's array resizing algorithm deterministic, let it choose, at Tcl_Init time, a resize factor in a bounded range. This effectively defeats the attack, because the remote attacker has no way to predict what inputs will clog a specific invocation of Tcl. Of course they could still get lucky and hit a sequence of inputs that causes a particular invocation to lock up tight. However, a sufficiently large range for the resize factor is enough to make the attack impractical. To make the array resizing algorithm even more resistant to this attack, allow it to choose a resize factor each time an array is created, and store the resize factor in the array metadata.
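To make the concern about attacker-chosen hash keys concrete, here is a minimal sketch in script form. It assumes a multiply-by-9 string hash of the kind Tcl 8.x's C core is generally described as using; the procs `toyHash` and `collidingKeys` are hypothetical names invented for this illustration, not part of Tcl, and the sketch assumes Tcl 8.5 or later for the integer arithmetic. With such a hash, the two-character blocks "ab" and "bY" hash alike, so any concatenation of those blocks does too, and a remote sender can generate as many colliding keys as it likes.

```tcl
# Toy model of a deterministic string hash: h = h*9 + charcode (mod 2^32).
# ASSUMPTION: this mirrors the general shape of the hash in Tcl 8.x's C core;
# it is reproduced in script purely for illustration.
proc toyHash {s} {
    set h 0
    foreach c [split $s ""] {
        scan $c %c code
        set h [expr {($h * 9 + $code) & 0xFFFFFFFF}]
    }
    return $h
}

# Build 2^n distinct keys of length 2n out of the blocks "ab" and "bY".
# Both blocks contribute the same amount to the hash, so every key built
# this way lands on the same hash value.
proc collidingKeys {n} {
    set keys [list ""]
    for {set i 0} {$i < $n} {incr i} {
        set next {}
        foreach k $keys {
            lappend next ${k}ab ${k}bY
        }
        set keys $next
    }
    return $keys
}

set keys [collidingKeys 4]
set hashes {}
foreach k $keys {
    lappend hashes [toyHash $k]
}
puts "[llength $keys] keys, [llength [lsort -unique $hashes]] distinct hash value(s)"
```

If header names like these were used as array indices, they would all fall into one bucket chain under this toy hash, which is why an unpredictable per-process (or per-array) parameter makes the inputs hard to precompute.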
Now, as to whether it's worthwhile to fix this: since Tcl accepts data as code, and allows you to execute strings received from untrusted sources, it is already exposed to DoS and hack and crash attacks, even without this specific vulnerability. Even if you run all untrusted code in a safe interpreter, you're still open to DoS attacks. For an example, try this simple code in a safe interpreter and wait for the Tcl prompt to return (hint: it won't :)):

    set i [interp create -safe]
    $i eval {
        proc donothing {} {}
        for {} {true} {} {donothing}
    }

If some hacker sent this code to your networked server, and you accepted it for execution, and followed all the rules about executing untrusted code only in safe interpreters, your application would become catatonic to the world. This seems to me to be an even more reliable and low bandwidth method for DoS than what Scott Crosby and Dan Wallach propose. Maybe after we fix '''this''' problem, it'll be worthwhile to fix the hash table DoS attack also :)

----

[AM] 2003-06-03 In the chatroom this was discussed as well, and one idea that was put forward is to see how chroot() and setrlimit() could be used as a model to limit the possibilities of a potentially malicious or careless script.

It was noted that chroot() could be implemented via a script: redefine the [[open]] command in the slave interpreter. (Actually, you would have to redefine [[cd]] and [[pwd]] as well, and the details are a bit complicated, as you do not want a file name such as "../../myfile.inp" to compromise the starting point. Or on Windows, "c:/c:/myfile.inp", where you only strip off one copy of "c:/" ... Oh well, just some thoughts.)

The infinite loop, as noted on c.l.t. later on, could be broken by watching the command count. Here too, there are gory details to consider, but it can be done.
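As a sketch of what "watching the command count" can look like: Tcl releases later than this discussion (8.5 onward) provide `interp limit`, which lets the master cap the number of commands a slave may execute. The proc name `runWithCommandLimit` below is a hypothetical helper made up for this example, and the budget of 1000 commands is an arbitrary choice.

```tcl
# Evaluate an untrusted script in a fresh safe interpreter, allowing it at
# most $maxCommands further command invocations (requires Tcl 8.5+).
# Returns a two-element list: the catch return code and the result/message.
proc runWithCommandLimit {script maxCommands} {
    set i [interp create -safe]
    # The command limit is an absolute count, so add the budget to whatever
    # the slave has already executed.
    set used [interp eval $i {info cmdcount}]
    interp limit $i command -value [expr {$used + $maxCommands}]
    set rc [catch {interp eval $i $script} msg]
    interp delete $i
    return [list $rc $msg]
}

# The endless loop from above now comes back with an error instead of
# hanging the master.
set outcome [runWithCommandLimit {
    proc donothing {} {}
    for {} {1} {} {donothing}
} 1000]
puts "return code: [lindex $outcome 0], message: [lindex $outcome 1]"
```

A command budget bounds the work a script can do, not the wall-clock time it takes, so a single slow command still counts as one.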
''[DKF] notes (on 04-Jun-2003) that Tcl already provides the tools for ensuring that nothing bad happens when doing a chroot()-alike, through the [[ [file] normalize]] command, as well as safe interpreters as normal.''

----

[Jacob Levy] 2003-06-03 I'm waiting for Kevin Kenny to describe another idea he was proposing, a new [['''timelimit''' ''script ms'']] command.

----

'''DOS''' also sometimes refers to MS-DOS.

----

[Jacob Levy] 2003-06-07 Recently there was a thread on comp.lang.tcl about a supposed bug in safe slaves, where an "update" command made it appear that safe slaves somehow break out of an infinite loop.

The scenario reported as a "bug" is that the master evaluates a script in a safe slave, and destroys the slave interpreter if control flow ever returns to the master. The slave's script is an infinite "while" loop, and on each iteration the script calls "update". The problematic behavior observed was that the slave was destroyed somehow even though control was never supposed to return to the master.

The person reporting the problem said that it only appears in "wish", not in "tclsh". This difference in behavior can probably be attributed to the fact that the event loop is inactive in tclsh whereas it is active in wish. The "update" command probably allowed a nested script to start in the slave, and when control flow returned from it to the master, the slave was destroyed. This makes it appear that the outer script somehow returned.

Also, recently we had a discussion on how to implement resource limits for computations in general, in particular in the context of evaluating untrusted scripts in safe interpreters. One idea floated in that discussion was to place a clock time limit on the computation, so that if control does not return by the time limit, the computation is terminated. When a limit is hit, any nested time limits are unwound until the one that was hit is reached.
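For the simple, non-nested case, a clock time limit of this kind can be sketched with the `interp limit` command that later Tcl releases (8.5 onward) provide. This is not the [[timelimit]] command mentioned above, only an approximation of the idea; `runWithTimeLimit` is a hypothetical helper name, and the one-second budget is arbitrary.

```tcl
# Evaluate an untrusted script in a fresh safe interpreter, terminating it
# once roughly $seconds of wall-clock time have elapsed (requires Tcl 8.5+).
# Returns a two-element list: the catch return code and the result/message.
proc runWithTimeLimit {script seconds} {
    set i [interp create -safe]
    # The time limit is an absolute point in time, in seconds since the
    # epoch, not a duration.
    interp limit $i time -seconds [expr {[clock seconds] + $seconds}]
    set rc [catch {interp eval $i $script} msg]
    interp delete $i
    return [list $rc $msg]
}

# An endless loop is cut off after about a second.
set outcome [runWithTimeLimit {
    proc donothing {} {}
    while {1} {donothing}
} 1]
puts "return code: [lindex $outcome 0], message: [lindex $outcome 1]"
```

Because the deadline is attached to the interpreter rather than to one particular script, everything evaluated in that slave shares the same limit.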
Now consider what happens when you place a time limit on a script that calls "update". The call to "update" does not return until all pending events are processed. If you use the event loop to start untrusted scripts, e.g. those arriving on a socket, this script will cause nested untrusted computations to be started in the slave, and then, when the time limit hits for the outer script, the nested computations are unwound as part of servicing the time limit. In essence, this time limits the nested computations with the time limit of the outer script, denying service to them. Note that the outer script is malicious but the nested ones may be legitimate.

I don't have a solution for this problem (yet).

----

[Category Acronym]