Version 8 of bexec

Updated 2014-05-19 01:28:18 by RLE

samoc: How to do binary-safe "exec":

proc bexec {command input} {
    # Execute shell "command", send "input" to stdin, return stdout.
    # Ignores stderr (but "2>@1" can be part of "command").
    # Supports binary input and output. e.g.:
    #     set flac_data [bexec {flac -} $wav_data]

    # Run "command" in background...
    set f [open |$command {RDWR BINARY}]
    fconfigure $f -blocking 0

    # Connect read function to collect "command" output...
    set ::bexec_done.$f 0
    set ::bexec_output.$f {}
    fileevent $f readable [list bexec_read $f]

    # Send "input" to command...
    puts -nonewline $f $input
    close $f write

    # Wait for read function to signal "done"...
    vwait ::bexec_done.$f

    # Retrieve output...
    set result [set ::bexec_output.$f]
    unset ::bexec_output.$f
    unset ::bexec_done.$f

    fconfigure $f -blocking 1
    try {
        close $f
    } trap {CHILDSTATUS} {options info} {
        dict set info -errorinfo $result
        return -options $info $result
    }

    return $result
}


proc bexec_read {f} {
    # Accumulate output in ::bexec_output.$f.

    append ::bexec_output.$f [read $f]
    if {[eof $f]} {
        fileevent $f readable {}
        set ::bexec_done.$f 1
    }
}

PYK 2014-05-11: The code above does not need to go to the effort of setting the channel to non-blocking, as Tcl will manage the buffer behind the scenes. Forget about setting the channel to non-blocking, dispense with the vwait, send the desired data into the channel, close the write side of the channel, and then read the output. For code that does need to use non-blocking channels, it's generally best not to interfere with the event loop by using vwait as the code above does. Instead, consider structuring the code such that it works in an event-oriented manner. See also example of reading and writing to a piped command, which provides a template for conducting an interactive conversation with another process.
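A minimal sketch of the simpler approach described above, assuming Tcl 8.6 (needed for the half-close "close $f write"). The name bexec_blocking is hypothetical. A plain blocking write like this can stall if the child starts producing output before it has consumed all of its input and the pipe buffers fill up, so treat it as suitable only for small inputs or for commands that read everything before they write:

```tcl
# Hypothetical helper: blocking variant with no fileevent and no vwait.
proc bexec_blocking {command input} {
    # Open the pipeline for reading and writing in binary mode.
    set f [open |$command {RDWR BINARY}]

    # Send all input, then half-close so the child sees EOF on its stdin.
    puts -nonewline $f $input
    close $f write

    # Read everything the child wrote to stdout.
    set result [read $f]

    # Closing a blocking pipeline waits for the child and raises an
    # error if it exited with a non-zero status.
    close $f
    return $result
}
```

For example, "bexec_blocking {cat} $data" should return $data unchanged when the data is small.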

samoc 2014-05-16: It would be great to have a simpler way to execute a command in the background. However, I can't see how to make it work for large binary data without fconfigure $f -blocking 0. Example (platform::identify > macosx10.9-x86_64):

set wav [download http://foo.net/bar.wav]
set flac [bexec {flac - --totally-silent} $wav]

In this case fconfigure $f -blocking 0 is required. Without non-blocking mode, "puts -nonewline $f $input" never returns. My assumption is that flac reads some (but not all) of the input from its stdin side of the pipe, then tries to write some output to its stdout side of the pipe. There is no-one reading flac's output pipe yet, so it blocks on output. We're blocked on sending input to flac, it is blocked on sending output back to us. Deadlock.

I expect that this behaviour is common for filter type programs that deal with large amounts of data. It is impractical for these programs to buffer the entire input in RAM, so they process a chunk at a time and send the output to stdout as they go.

WRT being "event-oriented". The aim here is to expose a simple, blocking, non-event oriented interface for executing an external command. The whole point is to hide the messy event processing detail.

It would be nice if exec flac - << $wav worked. But exec is broken for binary input/output. -- See http://tip.tcl.tk/259.html

samoc 2014-05-19: Comments from PYK below correctly point out that if channels are passed to exec, they can be set to BINARY mode before exec is called and will handle binary data just fine. I believe my example above was misleading in its original form (set wav [read [open foo.wav {RDONLY BINARY}]]). I hope the revised version makes my intentions clearer, i.e. I have large binary data in RAM, _not_ in a file (wav and flac are just an example). I need to pass the data to stdin of an "exec"ed process. I definitely don't want to have to make a tmp file on disk.

PYK 2014-05-16: exec flac - <@$wavchan works fine for me. Do you have any evidence to support the claim that "exec is broken for binary input/output"? It is best to avoid vwait. Below are three different methods for piping binary data to flac. The third method polls and uses after, but even that is preferable to vwait if the script author doesn't want to design the code around file events.

#!/usr/bin/env tclsh

#method 1:  redirection of stdin and stdout
set chan [open ztest1.wav {RDONLY BINARY}]
exec flac - <@$chan 2>@stderr >ztestout1.flac

#method 2:  redirect data into flac, read flac output into Tcl,  then write to channel 
seek $chan 0
set chan2 [open |[list flac - <@$chan] {RDONLY BINARY}]
set chanout [open ztestout2.flac {BINARY CREAT WRONLY}]
while {![eof $chan2]} {
    puts -nonewline $chanout [read $chan2 32768]
}
close $chan2
close $chanout

#method 3: read input, write to flac, read flac output, write to channel
seek $chan 0
set chan2 [open |[list flac -] r+]
chan configure $chan2 -translation binary -blocking 0
set chan3 [open ztestout3.flac {BINARY CREAT WRONLY}]
set done 0
while 1 {
    if {!$done} {
        if {[eof $chan]} {
            close $chan2 write
            set done 1
        } else {
            set data [read $chan 32768]
            puts -nonewline $chan2 $data 
        }
    }
    set data [read $chan2 32768]
    if {$data eq {}} {
        if {[eof $chan2]} {
            break
        }
    } else {
        puts -nonewline $chan3 $data 
    }
    after 1
}

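close $chan
# Switch back to blocking mode so that close waits for flac to exit.
chan configure $chan2 -blocking 1
close $chan2
close $chan3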

samoc 2014-05-19: Thanks for the feedback PYK.

Re exec being broken for binary data, see reference to relevant TIP above.

Re making channels BINARY, see new comment above.

Re vwait: I am intrigued. Is vwait considered bad? As far as I can see vwait is the Tcl way of doing select(2), i.e. making sure the OS does not schedule the process until a condition is met. This is essential for creating scalable systems. If I have a loop that polls every 1 ms, and I have a few hundred instances blocked on IO, pretty soon the polling and context switching gets expensive.
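To make the comparison concrete, here is a rough sketch of method 3 with the 1 ms polling loop replaced by readable/writable file events. The proc names (filter_chan, filter_feed, filter_drain) and the ::filter_done array are hypothetical, and Tcl 8.6 is assumed. The vwait is there only to keep the interface blocking; the process sleeps until a channel is actually ready:

```tcl
# Hypothetical helper: stream channel $in through "command" into channel $out.
# Both $in and $out are assumed to be open and already in binary mode.
proc filter_chan {command in out} {
    set pipe [open |$command r+]
    chan configure $pipe -translation binary -blocking 0

    # Feed input whenever the pipe can accept more, and drain output
    # whenever the child has produced some.
    chan event $pipe writable [list filter_feed $in $pipe]
    chan event $pipe readable [list filter_drain $pipe $out]

    # Sleep in the event loop until filter_drain signals EOF.
    vwait ::filter_done($pipe)
    unset ::filter_done($pipe)

    # Back to blocking mode so that close waits for the child.
    chan configure $pipe -blocking 1
    close $pipe
}

proc filter_feed {in pipe} {
    set data [read $in 32768]
    if {$data ne {}} {
        puts -nonewline $pipe $data
    }
    if {[eof $in]} {
        # No more input: stop the writable events and send EOF to the child.
        chan event $pipe writable {}
        close $pipe write
    }
}

proc filter_drain {pipe out} {
    puts -nonewline $out [read $pipe]
    if {[eof $pipe]} {
        chan event $pipe readable {}
        set ::filter_done($pipe) 1
    }
}
```

With the channels from method 3, the call would look like "filter_chan {flac -} $chan $chan3".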

Re "design the code around fileevent", I believe that is how my original "bexec" works (see "fileevent $f readable..."). But file event only works if the event loop gets run. My understanding is that "vwait" is the way to run the event loop. Is there a better way?

RLE (2014-05-18): If you look down to DKF's comment on the Tcl event loop page you'll find this statement:

DKF: Tcl doesn't run the event loop by default, so idiomatically people do this in their pure Tcl scripts to start the event loop:

 vwait forever

If you are not running Tk, doing a vwait is the only way to get the event loop started. Now, of course, you should not nest vwait calls, and if you are running Tk, you don't need vwait because the event loop runs by default.
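One way to avoid nesting vwait calls is to hand the result to a callback and leave a single top-level vwait (or Tk) to run the event loop. The following is only a sketch under that assumption: bexec_async, bexec_async_read, on_done and the ::bexec_async_output array are hypothetical names, Tcl 8.6 is assumed, and "cat" stands in for a real filter command:

```tcl
# Hypothetical callback-based variant of bexec: returns immediately and
# later invokes "callback status output" from the event loop.
proc bexec_async {command input callback} {
    set f [open |$command {RDWR BINARY}]
    fconfigure $f -blocking 0
    set ::bexec_async_output($f) {}
    fileevent $f readable [list bexec_async_read $f $callback]

    # Queue the input; Tcl flushes it in the background.
    puts -nonewline $f $input
    close $f write
}

proc bexec_async_read {f callback} {
    append ::bexec_async_output($f) [read $f]
    if {[eof $f]} {
        set result $::bexec_async_output($f)
        unset ::bexec_async_output($f)

        # Blocking close collects the exit status; status is 0 on success.
        fconfigure $f -blocking 1
        set status [catch {close $f}]
        uplevel #0 [linsert $callback end $status $result]
    }
}

# Usage: the only vwait is the top-level one that runs the event loop.
proc on_done {status data} {
    set ::got $data
    set ::finished 1
}
bexec_async {cat} "a\x00b" on_done
vwait ::finished
```

Under Tk the final vwait would not be needed, since the event loop is already running.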