Version 35 of gets

Updated 2013-11-15 02:54:00 by APN

gets - Read a single line from a channel http://www.tcl.tk/man/tcl8.5/TclCmd/gets.htm

gets channelId
gets channelId variable

Reads a single line from the specified channel. In the first form, the characters of the line (with the exception of the end-of-line character) are returned as the result of the command. In the second form the characters of the line are written into the variable and the length of the line is returned instead.

When applied to a blocking channel the command blocks until a line is complete or EOF is encountered. If the command is applied to a non-blocking channel and cannot read a complete line, the first form returns an empty string. The second form returns -1 and, according to the manual page, leaves an empty string in the variable. Use fblocked and eof to tell an incomplete line apart from end of file.
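A small demonstration of the non-blocking behaviour described above. This is only a sketch: it assumes Tcl 8.6 or later for [chan pipe], and the names rd, wr, first, count, waiting and count2 are made up for illustration.

```tcl
# Create an in-process pipe and make the read side non-blocking.
lassign [chan pipe] rd wr
fconfigure $rd -blocking 0
fconfigure $wr -buffering none

puts -nonewline $wr "partial"   ;# no end-of-line sent yet
set first   [gets $rd]          ;# first form: returns an empty string
set count   [gets $rd line]     ;# second form: returns -1
set waiting [fblocked $rd]      ;# 1: a line is incomplete, this is not EOF

puts $wr " line"                ;# the end-of-line arrives
set count2  [gets $rd line]     ;# now returns 12; line is "partial line"

close $wr
close $rd
```

Note that the partial data is not lost: it stays buffered in the channel until the rest of the line arrives.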

Do not use this command when working with binary data. It will try to recognize end-of-line characters no matter what, even inside packets.
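For binary data, read with an explicit byte count is the usual alternative. The sketch below is hypothetical: read_packet is an invented name, and the 4-byte big-endian length prefix is an assumed framing chosen only for illustration.

```tcl
# Hypothetical helper: read one length-prefixed packet from a blocking channel.
# Assumed framing: a 4-byte big-endian length followed by that many bytes.
proc read_packet {chan} {
    # Binary translation disables end-of-line and encoding conversion.
    fconfigure $chan -translation binary
    set header [read $chan 4]
    if {[string length $header] < 4} {
        return -code error "short read on packet header"
    }
    binary scan $header I len
    # The payload may contain newline bytes; read does not care.
    return [read $chan $len]
}
```

Because the payload is fetched by length, a newline byte inside a packet is treated like any other byte.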

If you're using gets in a loop, and want to stop when you reach the end of the file, use the following structure:

while {[gets $filestream line] >= 0} {
    # do what you like here
}

When structuring your code this way (and if your channel is blocking – the default) you do not need to use the eof command to detect when you've read all the data.
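Put together with open and close, the loop might look like the sketch below. read_lines is a hypothetical helper name, not part of Tcl.

```tcl
# Hypothetical helper: return the lines of a text file as a list.
proc read_lines {path} {
    set f [open $path r]
    set lines {}
    # gets returns -1 at end of file, which ends the loop.
    while {[gets $f line] >= 0} {
        lappend lines $line
    }
    close $f
    return $lines
}

# By contrast, testing eof before reading typically runs the body once too often:
#   while {![eof $f]} { lappend lines [gets $f] }   ;# tends to append a final ""
```

The eof flag is only raised after a read attempt has hit the end of the file, which is why the second form usually picks up a spurious empty line.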

APN 2013-11-15 The issues in this section (DoS etc.) have been addressed in newer Tcl versions (8.5+) with the chan pending command.

George Peter Staplin: Using gets with a socket is a BAD IDEA. tclhttpd uses gets (as do some of its modules), and it is trivial to make it panic on a unix-like system. On a Windows system, which may not have a ulimit on the memory tclhttpd can allocate, it may be even worse. Sadly, this has been known since 2001 (see below).

20040721 CMcC In defence of tclhttpd: tcl core module http and tcllib modules comm, ftpd, ftp, irc, nntp, pop3, pop3d, smtpd and ident all seem to suffer from precisely the same problem.

20060730 CMcC was thinking about why we don't see this problem in the wild, and remembered that all of the above protocols have per-transaction timeouts. For example, tclhttpd expects a completed header within a defined period after a connection occurs. It is this timeout, not the available address space, which limits the length of a line an attacker can send in most cases. This is not to say that gets shouldn't be fixed, but there is a simple preventative.

From the Tcl'ers chat on Oct 24, 2001:

dgp: I reported long ago that tclhttpd was vulnerable to a DoS due to gets slurping up data until it sees a newline. I guess that weakness in gets has never been addressed.

bbh: is it a weakness in gets or a weakness in an app using gets instead of read ?

dgp: Well, Brent Welch replied and said that the solution he would have to implement would be effectively writing his own safe gets in terms of read.

From the Tcl'ers Wiki Sep 20, 2002:

GPS: This bug with gets could be solved I suspect by adding a -maxchars flag to gets. For example:

 set res [gets -maxchars 100 $chan data]

If more than 100 chars are read then gets should return -1 or something like that. This would only be for the usage of gets with the optional variable argument.

MC 29 Oct 2006: I've proposed a [chan available] command (TIP #287) to give programmers a tool they can use to introspect the amount of buffered (but as of yet unread) data on a channel. This would allow applications enough new introspection capabilities to implement their own policy for handling excessively long input lines, while still retaining the same [gets] semantics. (In a readable fileevent callback, where one should be testing for fblocked already, you could check whether [chan available $sock] > $limit and take appropriate action if it is.)
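With the chan pending command mentioned above (Tcl 8.5+), the policy MC describes could be sketched roughly as follows. gets_limited, its status words, and the limit parameter are all hypothetical names invented for this example; the channel is assumed to be non-blocking.

```tcl
# Hypothetical helper for a non-blocking channel: try to read one line, but
# report "toolong" once more than $limit bytes are buffered without a complete
# line. Returns a two-element list: a status word and the line read (if any).
proc gets_limited {chan limit} {
    if {[gets $chan line] >= 0} {
        return [list line $line]
    }
    if {[eof $chan]} {
        return [list eof ""]
    }
    # No complete line yet: ask how much unread input Tcl is holding.
    if {[chan pending input $chan] > $limit} {
        return [list toolong ""]
    }
    return [list wait ""]
}

# Assumed usage inside a readable fileevent callback:
#   lassign [gets_limited $sock 4096] status line
#   switch -- $status {
#       line    { ...process $line... }
#       toolong -
#       eof     { close $sock }
#       wait    { }
#   }
```

This only bounds the data buffered while waiting for an end-of-line. A long line that arrives complete in one piece is still returned whole, so an application that cares may also want to check [string length $line].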


RS 2005-08-25: Here's how to temporarily disable echoing of the characters input to gets. You need stty, which is part of Linux and Cygwin, so it works even on Windows: (thanks MNO for the stty tip!)

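 proc userpasswd _arr {
    # Alias the caller's array to the empty name, so elements read as (-user).
    upvar 1 $_arr ""
    if {![info exists (-user)]} {set (-user) [prompt "username:"]}
    if {![info exists (-passwd)]} {
        exec stty -echo                ;# turn terminal echo off
        set (-passwd) [prompt "password:"]
        exec stty echo                 ;# turn it back on
        puts ""
    }
 }
 # Print a prompt, then return one line read from stdin.
 proc prompt string {
    puts -nonewline "$string "
    flush stdout
    gets stdin
 }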

More concentrated, here's the "gets with no echo" functionality by itself:

 proc gets'noecho {} {
     exec stty -echo
     gets stdin line
     exec stty echo
     puts ""
     set line
 }

MHo: I think it should be possible to handle the echo state on Windows without installing Cygwin....


See gets workaround for a solution when [gets stdin] won't work, e.g. on W95 and PocketPC.


From comp.lang.tcl, thanks to Alex, a drop-in replacement for gets with an extra timeout argument:

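 proc gets_timeout {ch vline timeout} {
    upvar 1 $vline line
    # ::_gt($ch) is set to 1 by the timer and to 2 when the channel is readable.
    set id [after $timeout [list set ::_gt($ch) 1]]
    set blo [fconfigure $ch -blocking]
    set old [fileevent $ch readable]
    fconfigure $ch -blocking 0
    fileevent $ch readable [list set ::_gt($ch) 2]
    set err NONE
    while {1} {
        vwait ::_gt($ch)
        if {$::_gt($ch) == 1} {
            set err TIMEOUT
            break
        }
        set n [gets $ch line]
        if {$n < 0} {
            # Incomplete line: keep waiting for more data or for the timeout.
            if {[fblocked $ch]} continue
            set err EOF
        }
        after cancel $id
        break
    }
    # Restore the previous readable handler and blocking mode.
    fileevent $ch readable $old
    fconfigure $ch -blocking $blo
    unset -nocomplain ::_gt($ch)
    switch -- $err {
        NONE    {return $n}
        TIMEOUT {error TIMEOUT}
        EOF     {return -1}
    }
 }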

AMG: I'd like to see an option added to [gets] to override the end-of-line characters. When this option is in use, the delimiter character probably should be retained in the output so the program can tell which delimiter was read, or if a delimiter was read at all before hitting EOF. I guess it would work a bit like getdelim().

Here's some code I use right now that comes close.

# Read from $chan until one of the characters in $delims is encountered.
proc read_delim {chan delims} {
    set result ""
    while {1} {
        set char [read $chan 1]
        if {$char eq ""} {
            error EOF
        } elseif {[string first $char $delims] == -1} {
            append result $char
        } else {
            return $result
        }
    }
}

Pie in the sky: Allow the definition of a "line" to be specified as a regular expression. However, I doubt the Tcl RE code is flexible enough to operate on a stream of data as well as a random-access buffer whose size is known in advance.