Version 17 of DoS

Updated 2003-09-04 00:02:30

Jacob Levy 05/30/2003: Recently there was some discussion on comp.lang.tcl about whether Tcl is vulnerable to DoS attacks and what to do about it, if anything.

First of all, the term DoS is shorthand for denial of service. A denial of service results when someone malicious succeeds in preventing legitimate users from accessing and using a shared resource. For example, a denial of service attack could prevent people from withdrawing money from an ATM, if the ATM back end processing was somehow prevented from processing legitimate transactions. Most denial of service attacks mounted via the internet involve sending massive amounts of bogus data to a server accessible via the net, thereby burying legitimate data in a pile of garbage. The result is that to find the legitimate data, the server has to examine and reject all the bogus data, which tremendously slows things down. Often, when you hear about DoS attacks, you also hear about a variant called DDoS, in which a perpetrator uses a lot of other computers on the net to mount a coordinated attack, making it less likely that the server will detect that the attack is even occurring.

Security attacks are often evaluated and analyzed in terms of their practicality and effectiveness. Practicality measures how much work it takes to mount the attack; if it's not a lot of work, the attack is deemed practical. Effectiveness is a measure of how successful the attack is at achieving its goal, which could range from obtaining information that should stay protected or preventing legitimate access to a shared resource.

The concern about Tcl and DoS attacks was raised by Scott Crosby in a note on the Tcl Core mailing list; you can read his note here: http://sf.net/mailarchive/forum.php?thread_id=2467138&forum_id=3854 , and Scott and Dan Wallach wrote a paper, describing what they see as potential DoS vulnerabilities, here: http://www.cs.rice.edu/~scrosby/hash/ .

Here's my take: I think that the paper is significant, and not "drivel", because they showed that with, very little input bandwidth consumption, they were able to exploit a flaw in the deterministic algorithm used in an example application so that that application ceased to effectively function. That makes the attack practical, and this means that other, legitimate users, could not make use of this shared resource -- a DoS attack. One of the other example applications that was mentioned is Perl 5; there is no reason to believe that Tcl 8.4, for example, would be invulnerable to a practical attack along these lines.

I do think this is serious and worthy of consideration. For a Tcl based networked server, there is a DoS risk here if you let a remote, untrusted application determine the hash keys you're going to use. That happens, as was pointed out, when you parse mime headers, when you parse HTTP headers, and (a new one from me) when you represent XML as nested arrays. In fact, even an assignment to an element of an array (or any local variable) causes a hash in Tcl.

Fortunately, it's very easy to fix this DoS risk: instead of making Tcl's array resizing algorithm deterministic, let it choose, at Tcl_Init time, a resize factor in a bounded range. This effectively defeats the attack, because the remote attacker has no way to predict what inputs will clog a specific invocation of Tcl. Of course they could still get lucky and hit a sequence of inputs that causes a particular invocation to lock up tight. However, a sufficiently large range for the resize factor is sufficient to make the attack impractical. To make the array resizing algorithm even more resistant to this attack, allow it to choose a resize factor each time an array is created, and store the resize factor in the array metadata.

Now, as to whether it's worthwile to fix this: Since Tcl accepts data as code, and allows you to execute strings received from untrusted sources, it is already exposed to DoS and hack and crash attacks, even without this specific vulnerability. Even if you run all untrusted code in a safe interpreter, you're still open to DoS attacks. For an example, try this simple code in a safe interpreter and wait for the Tcl prompt to return (hint: it won't :)):

    set i [interp create -safe]
    $i eval {
      proc donothing {} {}
      for {} {true} {} {donothing}
    }

If some hacker sent this code to your networked server, and you accepted it for execution, and followed all the rules about executing untrusted code only in safe interpreters, your application would become catatonic to the world. This seems to me to be an even more reliable and low bandwidth method for DoS than what Scott Crosby and Dan Wallace propose.

Maybe after we fix this problem, it'll be worthwile to fix the hash table DoS attack also :)


AM 2003-06-03 In the chatroom this was discussed as well, and one idea that was put forward is to see how chroot() and setrlimit() could be used as a model to limit the possibilities of a potentially malicious or careless script.

It was noted that chroot() could be implemented via a script: redefine the [open] command in the slave interpreter.

(Actually, you would have to redefine [cd] and [pwd] as well and the details are a bit complicated, as you do not want a file name as "../../myfile.inp" to compromise the starting point. Or on Windows, "c:/c:/myfile.inp", and you only strip off one copy of "c:/" ... Oh well, just some thoughts.)

The infinite loop, as noted on the c.l.t. later on, could be broken by watching the command count. Here too, there are gory details to consider, but it can be done.

DKF notes (on 04-Jun-2003) that Tcl already provides the tools for ensuring that nothing bad happens when doing a chroot()-alike, through the [ file normalize] command, as well as safe interpreters as normal.


Jacob Levy 2003-06-03 I'm waiting for Kevin Kenny to describe another idea he was proposing, a new [timelimit script ms] command.


Jacob Levy 2003-06-07 Recently there was a thread on comp.lang.tcl about a supposed bug in safe slaves where an "update" command made it appear that safe slaves somehow break out of an infinite loop. In the master, if control flow ever returns to it, the slave interpreter is destroyed.

The scenario reported as a "bug" is that the master evaluates a script in a safe slave. The slave's script is an infinite "while" loop, and on each iteration the script calls "update". The problematic behavior observed was that the slave was destroyed somehow even though control was never supposed to return to the master. The person reporting the problem said that the problem only appears in "wish", not in "tclsh". This difference in the behavior can probably be attributed to the fact that the event loop is inactive in tclsh whereas it is active in wish. The "update" command probably allowed a nested script to start in the slave, and when control flow returned from it to the master the slave was destroyed. This makes it appear that the outter script somehow returned.

Also, recently we had a discussion on how to implement resource limits for computations in general, in particular in the context of evaluating untrusted scripts in safe interpreters. One idea floated in that discussion was to place a clock time limit on the computation, so that if control does not return by the time limit, the computation is terminated. Also, nested time limits are unwound until the time limit that hit is reached.

Now consider what happens when you place a time limit on a script that calls "update". The call to "update" does not return until all pending events are processed. If you use the event loop to start untrusted scripts e.g. those arriving on a socket, this script will cause nested untrusted computations to be started in the slave, and then when the time limit hits for the outter script, the nested computations are unwound as part of servicing the time limit. This in essence has the effect of time limiting these nested computations with the time limit of the outter script, denying service to these nested computations. Note that the outter script is malicious but the nested ones may be legitimate.

I dont have a solution for this problem (yet). Is it possibly a security bug that "update" is present in a safe slave? Or perhaps it should be somehow limited in scope if invoked from a safe interpreter?

DKF: (9-Jun-2003) This is a problem with time-limits for limiting execution, and there's not much you can do about it. However, the way you resolve these things is to mark the slave interpreter as expired, which stops further processing of commands within it. OK, it doesn't shut the interpreter down immediately (though in fact, deleting the interpreter shouldn't have any really bad effects - Tcl's careful like that) but just prevents further execution of commands within it. It's up to the other interpreter to exit from its processing before the [while 1 update] can respond, but that's reasonable enough.


DKF: An interesting reference for doing something about the hash table problem that was discussed earlier this year in c.l.t [REF] is [L1 ]. The code isn't suited to being used as-is, but seems to have at least some of the spectral properties we'd like. And it is fairly cheap too.


US: (4-Sep-2003) A DoS attack can very well be handled in pure Tcl!

I've prepared a small example server, which is able to handle and limit infinite while loops. You may set a time limit (in milliseconds) and a maximum of iterations through the loop. Loops may even be nested. I'm sure the code still needs some polishing, but it works. It should be easy to extend this capability to for and foreach loops.

Please try it out, try to initiate a DoS. If you accomplish a DoS, please tell me how, I'll try to improve the code. (No, redefinition of while in terms of other loop constructs is not the way to fool this server. The other loops are not yet protected, but in the final version they will be.)

All puts statements are just for illustration purposes, they aren't needed.

 proc lim_while {chan cond body} {
   global next_loop_id cmd slave maxnum maxtim
   puts "Slave: while \{$cond\} \{$body\}"
   set loop_id while#[incr next_loop_id]
   set cmd($loop_id,time) [expr {[clock clicks -milliseconds] + $maxtim}]
   set cmd($loop_id,runs) $maxnum
   set new_cond "\[check_limit $loop_id\] && ($cond)"
   interp invokehidden $slave($chan) while $new_cond $body
 }

 proc check_limit {loop_id} {
   global cmd
   if {$cmd($loop_id,runs)} {
     incr cmd($loop_id,runs) -1
   } else {
     error "$loop_id exceeded maxnum"
   }
   if {[clock clicks -milliseconds] > $cmd($loop_id,time)} {
     error "$loop_id timed out"
   }
   return 1
 }

 proc connect {chan addr port} {
   global slave
   fconfigure $chan -buffering line
   fileevent $chan readable [list rcv $chan]
   set slave($chan) [interp create -safe]
   interp hide $slave($chan) while
   interp alias $slave($chan) while {} lim_while $chan
   interp alias $slave($chan) check_limit {} check_limit
   puts "Host $addr connected at port $port ($chan)"
 }

 proc rcv {chan} {
   global cmd slave
   if {[gets $chan line] == -1} {
     close $chan
     interp delete $slave($chan)
     unset cmd($chan)
     return
   }
   append cmd($chan) $line "\n"
   if {[info complete $cmd($chan)]} {
     if {[catch {$slave($chan) eval $cmd($chan)} res]} {
       puts "Illegal instruction ${res}."
       set res "Illegal instruction ignored."
     }
     puts $chan $res
     set cmd($chan) ""
   }
 }

 set maxnum 100
 set maxtim 1000
 set next_loop_id 0

 set server [socket -server connect 4711]
 vwait forever
 close $server
 exit

DOS also sometimes refers to MSDOS. The resemblance to a denial-of-service attack is uncanny... ;^)


Category Acronym