**Purpose**

This page is dedicated to the asynchronous socket connect, started by 'socket -async'. See the [socket] page for a general description.

It also serves as a communication page for development and compares TCL 8.5.15, TCL 8.5.16, TCL 8.6.1, TCL 8.6.2 and future versions.

Async connect got more complicated in TCL 8.6, as multiple destination IPs are internally supported (due to IPV6 or a DNS lookup resulting in multiple IPs). See my talk at EuroTcl 2014 [http://www.eurotcl.tcl3d.org/presentations/EuroTcl2014-Oehlmann-Socket.pdf].

**Use cases**

***Background connect and notification about connect***

The typical use case for a background connect is to install a writable event to get notified about the connect. If there is an additional connect timeout, it is canceled when the writable event fires.

Typical code:

======tcl
proc Connected {aid h} {
    # cancel timeout
    after cancel $aid
    # check connect success or fail
    set error [fconfigure $h -error]
    if {$error ne ""} {
        catch {close $h}
        return
    }
    # disable writable event as it will come again and again if nothing written here
    fileevent $h writable ""
    # do something with the socket
    puts $h "HELO"
    flush $h
    # install readable event to process reply
    fileevent $h readable [namespace code [list Receive $h]]
}
proc Timeout {h} {
    # Connect timeout
    catch {close $h}
}
# Receive function is not shown here and may be derived from the example below
set h [socket -async $host $port]
set aid [after 10000 [namespace code [list Timeout $h]]]
fileevent $h writable [namespace code [list Connected $aid $h]]
vwait forever
======

***Background connect and only readevent***

If there is no need to get notified on a successful connect and no connect timeout is needed, one may use a readable event only.
Attention, this did not work on Windows before 8.5.16 and 8.6.2 due to bugs:

======tcl
proc Receive {h} {
    # check connect success or fail
    set error [fconfigure $h -error]
    if {$error ne ""} {
        catch {close $h}
        return
    }
    # get read data and process it
    if {[catch {gets $h} data]} {
        # read error
        catch {close $h}
        return
    }
    if {[eof $h]} {
        # other side disconnected
        catch {close $h}
        return
    }
    # now do something with the data...
}
set h [socket -async $host $port]
fileevent $h readable [list Receive $h]
# if a message is needed by the server after the connect, send it now non-blocking
# It will be automatically sent when the connect succeeds
fconfigure $h -blocking 0
puts $h "HELO"
flush $h
vwait forever
======

***async connect and blocking operation***

A use case is to start multiple connects, do something else and then process the connect state, all in a linear program without event queue. An example is a test whether multiple servers are alive.

Example program:

======tcl
set h [socket -async $host $port]
# do something else which needs time
# check if failed. Start also next try of multiple IPs of $host
set error [fconfigure $h -error]
if {$error ne ""} {
    # connect failed
    catch {close $h}
    return
}
# do something else which needs time
# check if failed. Start also next try of multiple IPs of $host
set error [fconfigure $h -error]
if {$error ne ""} {
    # connect failed
    catch {close $h}
    return
}
# nothing to do, so do the rest synchronously
# this blocks !
if {[catch {
    puts $h "HELO"
    flush $h
    set Data [gets $h]
    close $h
} error]} {
    # connect failed
    catch {close $h}
}
======

***async connect and no event queue***

This example requires the command 'fconfigure -connecting' which is proposed in [http://www.tcl.tk/cgi-bin/tct/tip/427%|%TIP 427%|%] and present as a hidden feature in TCL 8.6.2. It reports whether the connection process is still running. This allows the example above to be written without blocking commands.
Example program:

======tcl
set h [socket -async $host $port]
while {[fconfigure $h -connecting]} {
    # do something else which needs time
}
# connection process terminated - check if failed
set error [fconfigure $h -error]
if {$error ne ""} {
    # connect failed
    catch {close $h}
    return
}
# do something with the connected socket
======

**Command behaviour**

***socket -async***

'socket -async' ''host'' first does a synchronous DNS lookup. Then the connect is started as a background process.

   * In TCL8.5, this terminates without any interaction by background processes.
   * In TCL8.6, the event loop or command invocation is required to check multiple IPs.

%|Version|Status|%
&|8.5.15|ok|&
&|8.5.16+|ok|&
&|8.6.1 unix|ok, requires event loop|&
&|8.6.1 win|only first IP (broken)|&
&|8.6.2+|ok|&
&|ideas|may be moved in own thread or to one OS call (Vista+) to not require event loop and not to pause between connect tries when command driven|&

***update,vwait***

In TCL8.6, running the event loop allows the connect to continue with the next try or to fail finally. It is not absolutely necessary, as all other socket commands also advance the connect process.

The event queue may also initiate a pending background flush when the socket is successfully opened.

***close on error***

As a starting point for all other commands: if a failed async connect socket is not closed after the first reported error, bad things such as unreported errors may happen. ''Please'' close an async socket connect after the first reported error.

***fileevent writable***

Fires when the async connect terminates with success or error. 'fconfigure -error' may be used in the event procedure to check whether the connect was successful.

%|Version|Status|%
&|8.5.15|ok, see bugs|&
&|8.5.16+|ok|&
&|8.6.1 win|only first IP (broken)|&
&|8.6.1 unix|ok|&
&|8.6.2+|ok|&

***fileevent readable***

Fires when the async connect terminates with error. On a successful connect, it fires only if data is received.
'fconfigure -error' may be used in the event procedure to check whether the connect was successful.

%|Version|Status|%
&|8.5.15 unix|ok|&
&|8.5.15 win|only works when also writable event installed, see bugs|&
&|8.5.16+|ok|&
&|8.6.1 win|only first IP and only with writable event (broken)|&
&|8.6.1 unix|ok|&
&|8.6.2+|ok|&

***blocking gets,read,puts,flush***

Remark: a puts may be delayed to a following flush.

The async connect is completed synchronously. On success, the operation is performed. On connect failure, the error "socket is not connected" is returned. The reason for the connect failure may be investigated using ''fconfigure -error''.

%|Version|Status|%
&|8.5.15 unix|ok. Instead of "socket is not connected", "broken pipe" may be reported.|&
&|8.5.15 win|ok|&
&|8.5.16+ unix|ok|&
&|8.5.16+|ok|&
&|8.6.1 win|only first IP tested (broken).|&
&|8.6.1 unix|ok. Instead of "socket is not connected", "broken pipe" may be reported.|&
&|8.6.2+|ok|&

***non blocking gets,read,puts,flush***

Remark: a puts may be delayed to a following flush.

The async connect state is checked or continued (next IP) in a non-blocking way. Any pending flush is executed in the background automatically when the connection is established and the event queue is running.

Possible results:

%|Number|Condition|Action|%
&|NB1|async connect still in progress|write operation is buffered and scheduled for background flush.<<br>>Read operation returns empty string|&
&|NB2|async connect succeeded|operation is directly executed|&
&|NB3|async connect failed|Error "socket is not connected" is returned|&

Implementation status:

%|Version|Status|%
&|8.5.15 unix|ok. Instead of "socket is not connected", "broken pipe" may be reported.|&
&|8.5.15 win|ok|&
&|8.5.16+ unix|ok. Instead of "socket is not connected", "broken pipe" may be reported.|&
&|8.5.16+ win|ok|&
&|8.6.1 win|only first IP (broken)|&
&|8.6.1 unix|ok. Instead of "socket is not connected", "broken pipe" may be reported.|&
&|8.6.2+|ok|&

***close***

A close while the connection is in progress or after a successful connection should succeed. A close after a failed connection succeeds. If a background flush is pending (or already resulted in an internal error), an error is returned.

%|Version|Status|%
&|8.5.15|ok. Empty error message may appear.|&
&|8.5.16+|ok|&
&|8.6.1|ok. Empty error message may appear.|&
&|8.6.2+|ok|&

***eof***

eof should be active:

   * After a read on a socket closed from the other side.
   * It is never active for a connecting async socket and may not be used to detect the connection status.

%|Version|Status|%
&|8.5.15|ok|&
&|8.5.16+|ok|&
&|8.6.1|ok|&
&|8.6.2+|ok|&

***fconfigure***

Any fconfigure command on the socket continues the connect process.

%|Version|Status|%
&|8.6.1 win|no|&
&|8.6.1 unix|no|&
&|8.6.2+|ok|&

***fconfigure -error***

A final connect error should be returned by 'fconfigure -error'. No error should be flagged while the connection is running.

Implementation status:

%|Version|Status|%
&|8.5.15 unix|ok.|&
&|8.5.15 win|ok. Small bug: Failed socket connect error is reported indefinitely|&
&|8.5.16+ unix|ok.|&
&|8.5.16+ win|ok. Small bug: Failed socket connect error is reported indefinitely|&
&|8.6.1 win|result of first tested IP (broken)|&
&|8.6.1 unix|The errors of all tested IPs show up temporarily. The connect process may be disturbed.|&
&|8.6.2+|ok|&

Fixing the small bug that a connect error is reported indefinitely may introduce compatibility issues for programs which rely on that behaviour.

***fconfigure -sockname***

The local IP of the socket connection. Returns a list of IP, Name, Port. The return value is documented as undefined while an async connect is running.

Implementation status:

%|Version|Status|%
&|8.5.x|returns something like "0.0.0.0 0.0.0.0 51063"|&
&|8.6.2|returns the addresses of the connect tries which show up temporarily. Typically ::1, then 127.0.0.1|&
&|8.6.3|returns the empty string|&

***fconfigure -peername***

The destination IP. Returns a list of IP, Name, Port.

Implementation status:

%|Version|Status|%
&|8.5.x|returns information of tried IP while connecting. Error if connection failed|&
&|8.6.1 win|returns information of first tried IP. Error if first connect try failed|&
&|8.6.2|reflects connection process, may return temporary IPs or temporary errors|&
&|8.6.3|returns the empty string|&

***thread::transfer***

I thought that transferring a socket while connecting would certainly end in an undetected connection (or error).
But in fact, everything worked on my Windows machine using thread 2.7.1 (current trunk):

<<discussion>> test logs

Successful connection:

======tcl
% package require Thread
2.7.1
% set t [thread::create]
tid000016FC
% set h [socket -async www.google.com 80]
sock01AE4608
% thread::transfer $t $h
% thread::send $t "fconfigure $h -error"
% thread::send $t "puts $h GETS"
%
======

and connect error:

======tcl
% set t [thread::create]
tid000014C0
% #set h [socket -async www.google.com 80]
% set h [socket -async localhost 30001]
sock01AE4708
% #fconfigure $h -unsupported1 1
% fconfigure $h -blocking 0
% thread::transfer $t $h
% thread::send $t "fconfigure $h -error"
% thread::send $t "fconfigure $h -error"
connection refused
% thread::send $t "close $h"
%
======

<<discussion>>

**TIPs**

***TIP 427: fconfigure -connecting***

[http://www.tcl.tk/cgi-bin/tct/tip/427%|%TIP 427%|%]: A new option '-connecting' should return 1 if the connection is still in progress.

Reason for that: there are use cases for socket -async without an event loop. Example: within a web server, verify multiple servers for connectivity. Open many sockets in the background, do some other calculations, then check with "-connecting" whether the connect has terminated.

In TCL 8.6.2, an implementation is present but hidden.

As the TIP is currently not updateable, here is the new TIP source data:

<<discussion>> TIP 427 raw contents

**Abstract**

======none
This TIP describes a method to introspect the asynchronous connection process by an extension of the '''fconfigure''' interface in addition to '''fileevent writable'''. This will enable better control over the asynchronous connection process, even in cases where the event loop is not in use.
======

**Body**

======none
~ Rationale

The '''socket''' core command supports two ways to establish a client socket, ''synchronous'' and ''asynchronous''. In synchronous mode (which is the default) the command does not return until the connection attempt has completed (established or failed). In asynchronous mode ('''-async option''') the command returns after DNS lookup and the connection is established in the background. This is useful in situations where it is undesirable that a process or thread blocks for completing a synchronous connection attempt. Classically, an asynchronously connecting socket would indicate that it had connected (or failed to connect) by becoming writeable, which '''fileevent writable''' can be used to detect.

A DNS name may have multiple IP addresses associated, e.g. for IPv4/IPv6 dual stack hosts or for fail safety or load balancing reasons as it is the case for google.com as of this writing. In Tcl 8.5 the socket command only tried to connect to a single IPv4 address that was randomly picked from the list returned by DNS. In Tcl 8.6, the socket command tries to connect to all the IP addresses of a DNS name in turn until one succeeds or all have failed.

With the current implementation, the driver needs calls from the tcl framework to continue with this connection process, which is done by any call into the driver which may be invoked from the script level by calling read, puts, flush or fconfigure.

The usage of '''socket -async''' is seen as helpful even without the event loop. An example is an application, which checks a list of hosts for a connection. The application may start many background socket connects, do something else, and then collect the results. Without the event loop (i.e., a '''fileevent writable'''), there is no non-blocking way to discover if the asynchronous connect has completed.

In addition, the following future points may be considered:

 * The connection process may internally get delegated to its own thread; this would allow the connection process to be asynchronous.

 * A future Windows implementation may use the Vista+ API ''WSAConnectByList'' (once we do not support Windows XP any more). Using this, no own looping over the addresses is necessary. It allows the connection process to be a single OS call, but does not allow inspection of the different connection steps.

~ Proposed Change

~~ Introspection Command to Inspect a Running Asynchronous Connect

An additional introspection function should inform if the asynchronous connect is running or if it has terminated:

 > '''fconfigure''' ''channel'' '''-connecting'''

This option returns '''1''' as long as a socket is still in the process of connecting asynchronously and 0 when the asynchronous connection has completed (succeeded or failed) or the socket was opened synchronously.

~~ Non-Event Loop Operation

If the event loop runs, the state machine of a (possibly multiple-address try) async connection proceeds within an internal callback.

~ Alternatives

~ Implementation

This is currently implemented in trunk as hidden feature. The branch tip-427 only makes this official.

~ Copyright

This document has been placed in the public domain.
======

<<discussion>>

***TIP 428: Produce Error Dictionary from 'fconfigure -error'***

A possibility to return the full POSIX information of a background error was drafted. Discussed solutions:

   * fconfigure -throwerror: throws the stored background error or does nothing if no error is present
   * fconfigure -options: returns a dict as 'catch {} e d'
   * fconfigure -error d: returns a dict in variable d similar to 'catch {...} e d'. ([http://www.tcl.tk/cgi-bin/tct/tip/428%|%TIP 428%|%])

The implementation is in fossil branch [https://core.tcl.tk/tcl/timeline?r=tip-428%|%tip-428%|%].

**Bugs**

***Win TCL8.6.1 only tries first IP***

[https://core.tcl.tk/tcl/tktview?name=13d3af3ad5%|%Bug 13d3af3ad5%|%]

TCL8.6.1 only tries the first of possibly multiple IP addresses to connect. This may cause serious connect issues, especially with IPV6.

This is fixed in branch [https://core.tcl.tk/tcl/timeline?r=bug-13d3af3ad5&unhide%|%bug-13d3af3ad5%|%] which also serves as the main branch to fix all bugs in TCL8.6.1 and to test enhancements.
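A minimal sketch of how the symptom of the first-IP bug could be observed from script level. It assumes a dual-stack machine where `localhost` resolves to ::1 before 127.0.0.1 (that order is an assumption and depends on the resolver), and a server that listens on the IPv4 loopback only. The helper name `tryConnect`, the array `::done` and the 5 second timeout are hypothetical names chosen for illustration; they are not part of Tcl or of the bug report.

======tcl
# Server side: listen on IPv4 loopback only; port 0 lets the OS pick a free port.
proc accept {chan addr port} {
    close $chan
}
set server [socket -server accept -myaddr 127.0.0.1 0]
set port [lindex [fconfigure $server -sockname] 2]

# Hypothetical helper: connect asynchronously, wait for the writable event
# (or a timeout) and return the final connect error, "" on success.
proc tryConnect {host port {timeoutMs 5000}} {
    set h [socket -async $host $port]
    set aid [after $timeoutMs [list set ::done($h) timeout]]
    fileevent $h writable [list set ::done($h) writable]
    vwait ::done($h)
    after cancel $aid
    fileevent $h writable {}
    if {$::done($h) eq "timeout"} {
        set err "timeout"
    } else {
        set err [fconfigure $h -error]
    }
    unset ::done($h)
    catch {close $h}
    return $err
}

# If only the first address (::1) is tried, the connect is expected to fail
# although the server is reachable on 127.0.0.1.
set result [tryConnect localhost $port]
close $server
puts "connect result: '$result'"
======

An empty result means that one of the addresses of `localhost` accepted the connection. On a machine where `localhost` resolves to 127.0.0.1 only, the sketch cannot distinguish a fixed from a broken version.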
***Win connect ignored***

[https://core.tcl.tk/tcl/info/336441ed59%|%Bug 336441ed59%|%]

Two issues:

   1. When a connect terminates too quickly, before the notifier is ready, the connect is ignored and the socket waits forever for it.
   1. A call of puts, gets or read while connecting briefly switched off the connect notification.

%|Version|Status|%
&|8.5.15|bug present|&
&|8.5.16+|fixed|&
&|8.6.1|bug present|&
&|8.6.2+|fixed|&

The test for issue 1 is timing dependent and may miss the issue on some machines.

The test for issue 2 is to run the teapot client massively, see the bug description.

***Empty error message on close on failed background flush***

[https://core.tcl.tk/tcl/info/97069ea11a%|%Bug 97069ea11a%|%]

%|Version|Status|%
&|8.5.15|bug present|&
&|8.5.16+|fixed|&
&|8.6.1|bug present|&
&|8.6.2+|fixed|&

<<discussion>> Test proposal

The test is difficult, as an async connect must fail after a puts is issued on the channel.

Idea: write a dummy channel driver, which may be set to an error state by fconfigure -seterror and where the readable/writable state may be set. So one could:

======tcl
set h [open dummy]
fconfigure $h -seterror EWOULDBLOCK
fileevent $h writable {set x writable}
fconfigure $h -blocking 0
puts $h abc
fconfigure $h -setwritable 1
vwait x
catch {close $h} e d
======

<<discussion>>

***No readable event on async socket connect failure***

[https://core.tcl.tk/tcl/info/581937ab1e%|%Bug 581937ab1e%|%]

%|Version|Status|%
&|8.5.15 win|bug present|&
&|8.5.16+ win|Fixed|&
&|8.6.1 win|bug present|&
&|8.6.2+ win|Fixed|&

**ToDo's**

***Correct trunk***

On trunk, test socket-14.2 is failing for me on CentOS. This is tracked in [https://core.tcl.tk/tcl/tktview/13d3af3ad5b8e8eb20d673962d442dd64a40af40%|%ticket 13d3af3ad5%|%]. FreeBSD and Windows do not fail.

***Robust tests***

Many tests have become timing dependent. Here is my proposal, for discussion, to possibly cure that:

As found out yesterday, it is no longer possible to pin down the moment when a socket connect fails.
Example:

======tcl
set h [socket -async localhost [randport]]
# This needs two "updates" to fail, one for ::1, one for 127.0.0.1
# Background connect to ::1 started
fconfigure $h -blocking 0
# if connect to ::1 already failed, connect to 127.0.0.1 starts
puts $h Hi
flush $h
# if connect to 127.0.0.1 already failed, this shows the error "connection refused"
# if connect to ::1 already failed, connect to 127.0.0.1 starts
======

For most tests, we need the connect procedure to fail after the flush. So I propose to:

   * Add a test command "testsocket testflags $h bool" which sets a channel flag to not continue the connect on any command.

Thus, the test above may go like this:

======tcl
# Switch auto-continue off
set h [socket -async localhost [randport]]
testsocket testflags $h 1
close $h
# Now do the test setup:
set h [socket -async localhost [randport]]
fconfigure $h -blocking 0
puts $h Hi
flush $h
fileevent $h writable {set ::x 0}
# switch auto-continue on to have normal operation
testsocket testflags $h 0
vwait x
======

This is implemented in fossil branch [https://core.tcl.tk/tcl/timeline?r=robust-async-connect-tests&nd%|%robust-async-connect-tests%|%]

In the same way, we could also do the test for tclIO.c "background error but no error message".

There are still a couple of test failures on CentOS and on FreeBSD documented in [https://core.tcl.tk/tcl/tktview/13d3af3ad5b8e8eb20d673962d442dd64a40af40%|%ticket 13d3af3ad5%|%].

***prioritize connect errors and return most appropriate***

If a socket connect fails, the error in the latest connect stage should be returned. This would prioritize "access denied" (e.g. socket in use) before "network unreachable" (no route).

Status: project stage for Win and Unix. This is already implemented for Unix server sockets.

***TclWinGetSockOpt() stubs entry may return wrong state***

The Win TCL stubs table contains an entry for '''TclWinGetSockOpt()''' which returns the info from '''getsockopt()'''.
In TCL8.5, the result of '''fconfigure -error''' was always the return value of the system call '''getsockopt()'''. In TCL8.6, a connect failure is cached in a variable and returned by '''fconfigure -error'''. Possibly, this should also be done by the routine called by the '''TclWinGetSockOpt''' stubs entry.

The purpose of this stubs entry seems to date from the times of Windows 98, where a WinSock2.dll may not be present. There is no known usage of it. Thus it was decided to leave it as deprecated and to remove it for Tcl9.0.

<<discussion>> send pending data when connected

When one puts pending data while connecting:

======tcl
set h [socket -async $host $port]
fconfigure $h -blocking 0
puts $h "HELO"
flush $h
vwait forever
======

this data is automatically sent when the connection is available. I have no idea how this works, but it seems to work. If there is a writable event, ok, I see the entry point for the framework, but without one? This is a marker for me to investigate this.

A proposed test is a bit like this (from an e-mail to rmax):

======tcl
# We need:
# - a machine with IPV4 and IPV6.
# - information whether IPV4 or IPV6 is tried first.
#   In the following, IPV6 is tried first (as on my machine).

# Server and checker:
proc accept {s m p} {
    set ::s $s
    set ::x [gets $s]
    # no close here, as it may also trigger processing
}
set server [socket -server accept -myaddr 127.0.0.1 30000]
vwait x
set x
# -> x must be "Hi"
close $server
close $s

# Client in a separate process:
set h [socket -async localhost 30000]
fconfigure $h -blocking 0
puts $h Hi
# possibly triggers the second connect try, but first returns EWOULDBLOCK to the framework...
flush $h
# up to here nothing is sent, as the second connect try is still running.
# Now the connect happens and the flush is executed automatically in the background.
# How? No idea, but it works for me....
after 2000 {set w 1}
vwait w
# no close, the data must arrive without a close...
# also no fileevent, as that also triggers a background flush.
======

<<discussion>>

**Thanks**

Thanks to [Wojciech Kocjan] for his book [BOOK Tcl 8.5 Network Programming] and discussions which taught this network stuff to me.

<<categories>>Tcl syntax | Arts and crafts of Tcl-Tk programming | Command | Networking | Channel