HaO: This page is dedicated to the asynchronous socket connect, started by 'socket -async'. See the socket page for a general description.
It also serves as a communication page for development and compares TCL 8.5.15-16, TCL 8.6.1-4 and future versions.
Async connect got more complicated in TCL 8.6, as multiple destination IPs are internally supported (due to IPV6 or DNS lookup resulting in multiple IPs).
See my talk at ETCL 2014 [L1].
The typical use case for a background connect is to install a writable event to get notified when the connect terminates. If there is an additional connect timeout, it is cancelled in the writable event handler.
Typical code:

    proc Connected {aid h} {
        # cancel timeout
        after cancel $aid
        # check connect success or fail
        set error [fconfigure $h -error]
        if {$error ne ""} {
            catch {close $h}
            return
        }
        # disable writable event as it will come again and again if nothing written here
        fileevent $h writable ""
        # do something with the socket
        puts $h "HELO"
        flush $h
        # install readable event to process reply
        fileevent $h readable [namespace code [list Receive $h]]
    }
    proc Timeout {h} {
        # Connect timeout
        catch {close $h}
    }
    # Receive function is not shown here and may be derived from the example below
    set h [socket -async $host $port]
    set aid [after 10000 [namespace code [list Timeout $h]]]
    fileevent $h writable [namespace code [list Connected $aid $h]]
    vwait forever
If there is no need to get notified on a successful connect and no connect timeout is needed, one may use a readable event only.
Attention: on Windows this did not work before 8.5.16 and in 8.6.1 due to bugs:

    proc Receive {h} {
        # check connect success or fail
        set error [fconfigure $h -error]
        if {$error ne ""} {
            catch {close $h}
            return
        }
        # get read data and process it
        if {[catch {gets $h} data]} {
            # read error
            catch {close $h}
            return
        }
        if {[eof $h]} {
            # other side disconnected
            catch {close $h}
            return
        }
        # now do something with the data...
    }
    set h [socket -async $host $port]
    fileevent $h readable [list Receive $h]
    # if a message is needed by the server after the connect, send it now non-blocking
    # It will be automatically sent when the connect succeeds
    fconfigure $h -blocking 0
    puts $h "HELO"
    flush $h
    vwait forever
Another use case is to start multiple connects, do something else and then check the connect state, all in a linear program without the event queue. An example is a test whether multiple servers are alive.
Example program:

    set h [socket -async $host $port]
    # do something else which needs time
    # check if failed. Start also next try of multiple IPs of $host
    set error [fconfigure $h -error]
    if {$error ne ""} {
        # connect failed
        catch {close $h}
        return
    }
    # do something else which needs time
    # check if failed. Start also next try of multiple IPs of $host
    set error [fconfigure $h -error]
    if {$error ne ""} {
        # connect failed
        catch {close $h}
        return
    }
    # nothing to do, so do the rest synchronously
    # this blocks !
    if {[catch {
        puts $h "HELO"
        flush $h
        set Data [gets $h]
        close $h
    } error]} {
        # connect failed
        catch {close $h}
    }
The next example requires the option 'fconfigure -connecting', which is specified in TIP 427 and present as a hidden feature since Tcl 8.6.2, public in 8.6.4. It reports whether the connection process is still running. This makes it possible to write the previous example without blocking commands.
Example program:

    set h [socket -async $host $port]
    while {[fconfigure $h -connecting]} {
        # do something else which needs time
    }
    # connection process terminated - check if failed
    set error [fconfigure $h -error]
    if {$error ne ""} {
        # connect failed
        catch {close $h}
        return
    }
    # do something with the connected socket
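
The "are multiple servers alive" use case mentioned above could be sketched as follows. This is only a minimal sketch and not code from the Tcl core or test suite: the names probeServers, targets and maxMs are my own, it assumes Tcl 8.6.4+ for the public 'fconfigure -connecting', and the 'after 10' merely stands in for "something else which needs time".

```tcl
# Hypothetical helper: probe several {host port} pairs in parallel
# without entering the event loop. Returns a dict target -> status.
proc probeServers {targets {maxMs 5000}} {
    set pending [dict create]
    set result [dict create]
    foreach target $targets {
        lassign $target host port
        # The DNS lookup is synchronous, so 'socket -async' itself may fail.
        if {[catch {socket -async $host $port} h]} {
            dict set result $target $h
        } else {
            dict set pending $target $h
        }
    }
    set deadline [expr {[clock milliseconds] + $maxMs}]
    while {[dict size $pending] > 0 && [clock milliseconds] < $deadline} {
        dict for {target h} $pending {
            # Each fconfigure call also advances the connect to the next IP.
            if {[fconfigure $h -connecting]} {
                continue
            }
            set err [fconfigure $h -error]
            if {$err eq ""} {
                set err ok
            }
            dict set result $target $err
            # Close right away, also after the first reported error.
            catch {close $h}
            dict unset pending $target
        }
        after 10 ;# placeholder for "something else which needs time"
    }
    # Whatever is still connecting at the deadline counts as timed out.
    dict for {target h} $pending {
        dict set result $target timeout
        catch {close $h}
    }
    return $result
}
```

A call like `probeServers {{www.example.com 80} {localhost 30001}}` would return a dict that maps each pair to "ok", an error message or "timeout".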
'socket -async' first performs a synchronous DNS lookup of the host.
Then the connect is started in the background.
Version | Status |
---|---|
8.5.15 | ok |
8.5.16+ | ok |
8.6.1 unix | ok, requires event loop |
8.6.1 win | only first IP (broken) |
8.6.2 | ok* |
8.6.3 | ok* |
8.6.4+ | ok |
ideas | may be moved in own thread to not require event loop and not to pause between connect tries when command driven |
* See below: Bug c6ed4acfd8
In TCL8.6, running the event loop allows the connect to continue with the next try or to fail finally. It is not absolutely necessary, as all other socket commands also advance the connect process.
The event queue may also initiate a pending background flush when the socket is successfully opened.
As a start point for all other commands: if a failed async connect socket is not closed after the first reported error, bad things like unreported errors etc. may happen.
Please close an async socket connect after the first reported error.
The writable event fires when the async connect terminates with success or error.
'fconfigure -error' may be used in the event procedure to check if the connect was successful.
Version | Status |
---|---|
8.5.15 | ok, see bugs |
8.5.16+ | ok |
8.6.1 win | only first IP (broken) |
8.6.1 unix | ok |
8.6.2+ | ok |
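
The writable event, the connect timeout and the advice to close the socket after the first reported error may be combined in one helper. The following is a minimal sketch under my own assumptions: asyncConnect, asyncConnectDone and the callback convention (socket handle, error message) are hypothetical names and not part of Tcl.

```tcl
# Hypothetical helper: connect in the background, call
#   {*}$callback $handle $errorMessage
# exactly once. On failure the handle is "" and the socket is closed.
proc asyncConnect {host port timeoutMs callback} {
    if {[catch {socket -async $host $port} h]} {
        # Synchronous failure (e.g. DNS lookup): report via the event loop.
        after 0 [list {*}$callback "" $h]
        return
    }
    set aid [after $timeoutMs \
        [list asyncConnectDone $h "" $callback "connect timeout"]]
    fileevent $h writable [list asyncConnectDone $h $aid $callback ""]
    return $h
}

proc asyncConnectDone {h aid callback err} {
    if {$aid ne ""} {
        after cancel $aid
    }
    # The writable event would fire again and again, so remove it.
    fileevent $h writable ""
    if {$err eq ""} {
        set err [fconfigure $h -error]
    }
    if {$err ne ""} {
        # Close after the first reported error.
        catch {close $h}
        set h ""
    }
    uplevel #0 [list {*}$callback $h $err]
}
```

The callback then only has to check whether the error message is empty before it starts to use the handle.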
The readable event fires when the async connect terminates with error.
On a successful connect, it fires only if data is received.
'fconfigure -error' may be used in the event procedure to check if the connect was successful.
Version | Status |
---|---|
8.5.15 unix | ok |
8.5.15 win | only works when also writable event installed, see bugs |
8.5.16+ | ok |
8.6.1 win | only first IP and only with writable event (broken) |
8.6.1 unix | ok |
8.6.2+ | ok |
Remark: a puts may be delayed to a following flush.
A blocking read or write operation (gets, read, puts, flush) waits synchronously until the async connect has terminated.
On success, the operation is performed.
On connect failure, the error "socket is not connected" is returned. The reason for the connect failure may be investigated using fconfigure -error.
Version | Status |
---|---|
8.5.15 unix | ok. Instead of "socket is not connected", "broken pipe" may be reported. |
8.5.15 win | ok |
8.5.16+ unix | ok |
8.5.16+ win | ok |
8.6.1 win | only first IP tested (broken). |
8.6.1 unix | ok. Instead of "socket is not connected", "broken pipe" may be reported. |
8.6.2+ | ok |
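
The blocking case may be sketched like this: the channel stays in blocking mode, so the flush waits for the end of the connect, and on failure 'fconfigure -error' is asked for the reason. The helper name sendGreeting and its return convention are my own assumptions for illustration; the exact error text depends on the version, as listed in the table above.

```tcl
# Hypothetical helper: returns {1 handle} on success, {0 reason} on failure.
proc sendGreeting {host port greeting} {
    if {[catch {socket -async $host $port} h]} {
        # e.g. DNS lookup failure, reported synchronously
        return [list 0 $h]
    }
    # The channel is still blocking: the flush waits until the
    # async connect has terminated.
    if {[catch {
        puts $h $greeting
        flush $h
    } msg]} {
        # msg is typically "... socket is not connected";
        # the reason for the connect failure comes from -error.
        set reason [fconfigure $h -error]
        if {$reason eq ""} {
            set reason $msg
        }
        # Close after the first reported error.
        catch {close $h}
        return [list 0 $reason]
    }
    return [list 1 $h]
}
```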
Remark: a puts may be delayed to a following flush.
With a non-blocking channel, the async connect state is checked or continued (next IP) in a non-blocking way.
A pending flush, if any, is executed in the background automatically when the connection is established and the event queue is running.
Possible results:
Number | Condition | Action |
---|---|---|
NB1 | async connect still in progress | write operation is buffered and scheduled for background flush. Read operation returns empty string |
NB2 | async connect succeeded | operation is directly executed |
NB3 | async connect failed | Error "socket is not connected" is returned |
Implementation status:
Version | Status |
---|---|
8.5.15 unix | ok. Instead of "socket is not connected", "broken pipe" may be reported. |
8.5.15 win | ok |
8.5.16+ unix | ok. Instead of "socket is not connected", "broken pipe" may be reported. |
8.5.16+ win | ok |
8.6.1 win | only first IP (broken) |
8.6.1 unix | ok. Instead of "socket is not connected", "broken pipe" may be reported. |
8.6.2+ | ok |
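
Cases NB1 and NB2 can be tried out with a small self-contained script: a local server records the first received line while the client issues a non-blocking puts and flush directly after 'socket -async'. This is only a sketch; the procedure names (demoAccept, demoRead, demoBackgroundFlush), the global variable ::received and the 2 second guard timer are my own assumptions.

```tcl
# Server side: remember the first line received.
proc demoAccept {chan addr port} {
    fileevent $chan readable [list demoRead $chan]
}
proc demoRead {chan} {
    set ::received [gets $chan]
    close $chan
}

# Client side: write before the connect is known to have succeeded.
proc demoBackgroundFlush {} {
    set ::received ""
    set srv [socket -server demoAccept -myaddr 127.0.0.1 0]
    set port [lindex [fconfigure $srv -sockname] 2]
    set h [socket -async 127.0.0.1 $port]
    fconfigure $h -blocking 0
    # NB1: buffered and scheduled for background flush, or
    # NB2: sent directly if the connect has already succeeded.
    puts $h "HELO"
    flush $h
    # The event loop delivers the data; the timer only guards the demo.
    set timer [after 2000 {set ::received "<timeout>"}]
    vwait ::received
    after cancel $timer
    close $h
    close $srv
    return $::received
}
```

No fileevent is installed on the client socket, so the data can only arrive through the automatic background flush or a direct write.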
A close while connection is in progress or after a successful connection should succeed.
A close after a failed connection succeeds.
If a background flush is pending (or already resulted in an internal error), an error is returned.
Version | Status |
---|---|
8.5.15 | ok. Empty error message may appear. |
8.5.16+ | ok |
8.6.1 | ok. Empty error message may appear. |
8.6.2+ | ok |
eof should be active:
Version | Status |
---|---|
8.5.15 | ok |
8.5.16+ | ok |
8.6.1 | ok |
8.6.2+ | ok |
Any fconfigure command on the socket continues the connect process.
Version | Status |
---|---|
8.6.1 win | no |
8.6.1 unix | no |
8.6.2+ | ok |
A final connect error should be returned by 'fconfigure -error'. No error should be flagged while the connect is still in progress.
Implementation status:
Version | Status |
---|---|
8.5.15 unix | ok. |
8.5.15 win | ok. Small bug: Failed socket connect error is reported indefinitely |
8.5.16+ unix | ok. |
8.5.16+ win | ok. Small bug: Failed socket connect error is reported indefinitely |
8.6.1 win | result of first tested IP (broken) |
8.6.1 unix | The errors of all tested IPs show temporarily up. The connect process may be disturbed. |
8.6.2+ | ok |
Fixing the small bug that a connect error is repeated indefinitely may introduce compatibility issues for programs which rely on that behaviour.
'fconfigure -sockname': the local address of the socket connection. Returns a list of IP, name and port.
The return value is documented as undefined while an async connect is running.
Implementation status:
Version | Status |
---|---|
8.5.x | returns something like "0.0.0.0 0.0.0.0 51063" |
8.6.2 | returns the addresses of the connect tries which show up temporarily. Typically ::1, then 127.0.0.1 |
8.6.3 | returns the empty string |
'fconfigure -peername': the destination address. Returns a list of IP, name and port.
Implementation status:
Version | Status |
---|---|
8.5.x | returns information of tried IP while connecting. Error if connection failed |
8.6.1 win | returns information of first tried IP. Error if first connect try failed |
8.6.2 | reflects connection process, may return temporary IPs or temporary errors |
8.6.3 | returns the empty string |
'fconfigure -connecting' returns 1 if the connection process is still running, 0 otherwise. Introduced and described in TIP 427.
Implementation status:
Version | Status |
---|---|
8.5.x | not supported |
8.6.1 | not supported |
8.6.2 | present as hidden feature |
8.6.4 | public feature |
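
As the option is not available in all versions, a script may probe for it at run time instead of comparing version numbers. A minimal sketch, where the helper name hasConnectingOption is my own and any socket channel is assumed to be suitable for the probe:

```tcl
# Hypothetical helper: 1 if 'fconfigure -connecting' is understood
# for the socket channel h (hidden or public), 0 otherwise.
proc hasConnectingOption {h} {
    expr {![catch {fconfigure $h -connecting}]}
}
```

Where it returns 0, one may fall back to the writable event or to polling 'fconfigure -error' as in the examples at the top of this page.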
I thought that transferring a socket to another thread while it is connecting would surely end in an undetected connection (or connection error).
But in fact, everything worked on my Windows machine using thread 2.7.1 (current trunk):
Test logs. Successful connection:

    % package require Thread 2.7.1
    % set t [thread::create]
    tid000016FC
    % set h [socket -async www.google.com 80]
    sock01AE4608
    % thread::transfer $t $h
    % thread::send $t "fconfigure $h -error"
    % thread::send $t "puts $h GETS"
    %
and connect error:

    % set t [thread::create]
    tid000014C0
    % #set h [socket -async www.google.com 80]
    % set h [socket -async localhost 30001]
    sock01AE4708
    % #fconfigure $h -unsupported1 1
    % fconfigure $h -blocking 0
    % thread::transfer $t $h
    % thread::send $t "fconfigure $h -error"
    connection refused
    % thread::send $t "close $h"
    %
TCL8.6.1 only tries to connect to the first of possibly multiple IP addresses. This may cause serious connect issues, especially with IPv6.
This is fixed in branch bug-13d3af3ad5 which also serves as main branch to fix all bugs in TCL8.6.1 and to test enhancements too.
Two issues:
Version | Status |
---|---|
8.5.15 | bug present |
8.5.16+ | fixed |
8.6.1 | bug present |
8.6.2+ | fixed |
The test for issue 1 is timing dependent and may miss the issue on some machines. The test for issue 2 is to run the teapot client massively, see the bug description.
Version | Status |
---|---|
8.5.15 | bug present |
8.5.16+ | fixed |
8.6.1 | bug present |
8.6.2+ | fixed |
The test is difficult, as an async connect must fail after a puts is issued on the channel.
Idea: write a dummy channel driver which may be set to an error state by fconfigure -seterror and where the readable/writable state may be set. So one could:

    set h [open dummy]
    fconfigure $h -seterror EWOULDBLOCK
    fileevent $h writable {set x writable}
    fconfigure $h -blocking 0
    puts $h abc
    fconfigure $h -setwritable 1
    vwait x
    catch {close $h} e d
Version | Status |
---|---|
8.5.15 win | bug present |
8.5.16+ win | Fixed |
8.6.1 win | bug present |
8.6.2+ win | Fixed |
Version | Status |
---|---|
8.6.1 | ok |
8.6.2 win | Bug introduced |
8.6.3 win | Bug present |
8.6.4 win | Fixed |
(Bug 42d50ebd) Many tests have become timing dependent. Here is my proposal for discussion on how to possibly cure that:
As found out yesterday, it is no longer possible to pin down the moment when a socket connect fails. Example:

    set h [socket -async localhost [randport]]
    # This needs two "updates" to fail, one for ::1, one for 127.0.0.1
    # Background connect to ::1 started
    fconfigure $h -blocking 0
    # if connect to ::1 already failed, connect to 127.0.0.1 starts
    puts $h Hi
    flush $h
    # if connect to 127.0.0.1 already failed, this shows the error "connection refused"
    # if connect to ::1 already failed, connect to 127.0.0.1 starts
For most tests, we need the connect procedure to fail after the flush.
So I propose to:
Thus, the test above may go like this:

    # Switch auto-continue off
    set h [socket -async localhost [randport]]
    testsocket testflags $h 1
    close $h
    # Now do the test setup:
    set h [socket -async localhost [randport]]
    fconfigure $h -blocking 0
    puts $h Hi
    flush $h
    fileevent $h writable {set ::x 0}
    # switch auto-continue on to have normal operation
    testsocket testflags $h 0
    vwait x
This is implemented in fossil branch robust-async-connect-tests
The same way, we could also do the test for tclIO.c "background error but no error message".
There are still a couple of test failures on CentOS and on FreeBSD documented in ticket 13d3af3ad5 .
If a socket connect fails, the error in the latest connect stage should be returned. This would prioritize "access denied" (e.g. socket in use) before "network unreachable" (no route).
Project stage for Win and Unix.
This is already implemented for Unix server sockets.
The Win TCL stubs table contains an entry for TclWinGetSockOpt() which returns the info from getsockopt().
In TCL8.5, the result of fconfigure -error was always the return value of the system call getsockopt(). In TCL8.6, a connect failure is cached in a variable and returned by fconfigure -error. Eventually, this should also be done by the routine called by the TclWinGetSockOpt stubs entry.
This stub entry seems to date from the times of Windows 98, where a WinSock2.dll may not have been present. There is no known usage of it. Thus it was decided to leave it as deprecated and to remove it for Tcl9.0.
When one puts pending data while connecting:

    set h [socket -async $host $port]
    fconfigure $h -blocking 0
    puts $h "HELO"
    flush $h
    vwait forever

this data is automatically sent when the connection is available.
I have no idea how this works, but it seems to work.
If there is a writable event, ok, I see the entry point for the framework, but without one?
This is a marker for me to investigate this.
A proposed test looks roughly like this (translated from a German e-mail to rmax):

We need:
- a machine with IPv4 and IPv6.
- the information whether IPv4 or IPv6 is tried first. In the following, IPv6 is tried first (as on my machine).

Server and checker:

    proc accept {s m p} {
        set ::s $s
        set ::x [gets $s]
        # no close here, as it may also trigger processing
    }
    set server [socket -server accept -myaddr 127.0.0.1 30000]
    vwait x
    set x
    # -> x must be "Hi"
    close $server
    close $s

Client in a separate process:

    set h [socket -async localhost 30000]
    fconfigure $h -blocking 0
    puts $h Hi
    # possibly triggers the second connect attempt, but first returns EWOULDBLOCK to the framework...
    flush $h
    # up to here nothing is sent, as the second connect attempt is still running.
    # Now the connect happens and the flush is executed automatically in the background. How? No idea, but it works for me....
    after 2000 {set w 1}
    vwait w
    # no close, the data must arrive without a close...
    # also no fileevent, as that triggers a background flush too.

In the windows API, there are two system functions which basically do all the work done in the tcl connect loop:
Those functions are only available for Vista+ (Desktop Applications) and Windows 8.1 (all Applications) (whatever that means).
Jan Nijtmans has sent me the following pointer to make code dependent on the availability of windows features: [L2 ]
Using them would increase performance within the connect procedure, as no wait for the event loop etc. is necessary. In addition, the loop over IPs could be removed, which makes a whole bunch of things easier.
rmax - I don't think this is The Right Thing to do for now, because the WSAConnectBy*() functions don't seem to allow non-blocking operation, so they would only be usable for blocking connects. This means we'd still need the looping and event loop stuff for [socket -async]. Another reason why we couldn't drop the loops even from the synchronous case is that we probably still want to support Windows versions before Vista.
So on the bottom line, we wouldn't save any of the current code, but add a lot of complexity, because the code would have to decide when these convenience functions can be used and load them. It would also increase testing effort, because different Windows versions are needed to test the different code paths.
I'd rather suggest to invest that time into unifying the two loops from the Windows and Unix platform code into a portable convenience function that goes into generic/tclIOSock.c, so that future changes don't have to be done twice.
HaO Thanks Reinhard (also for the chat session). After reading the docs, it seems to be usable only for synchronous operation and lacks the options '-myaddr/-myport'. So using those functions is not an option.
For me, the final goal for the command 'socket -async' is:
and I hoped I could reach those aims without putting the whole connect process in its own thread. Apparently, this is not the case using those functions. They block and they don't have the full functionality.
Thanks to Wojciech Kocjan for his book Tcl 8.5 Network Programming and for the discussions which taught this network stuff to me.