HOWTO
A. Kupries
TclIODriver Andreas Computer Laboratories
(Me, myself and I)
November 14, 2000
!!!!!!
'''The Tcl I/O system as seen by a driver (channel type)'''
!!!!!!
**Abstract**
!!!!!!
''This document describes the I/O system used in the Tcl core as it is seen from a driver implementing a channel type.''
!!!!!!
**Table of Contents**
===
1. '''Introduction'''
2. '''Main facilities in the core'''
3. '''Writing a channel driver'''
3.1 '''InstanceData'''
3.2 '''Creation of channels'''
3.2.1 '''Creation of a base channel'''
3.2.2 '''Creation of a transformation'''
3.3 '''Destruction of channels'''
3.4 '''Accessing the channel downstream'''
3.5 '''The driver in detail'''
3.5.1 '''GetHandleProc'''
3.5.2 '''SetOptionProc'''
3.5.3 '''GetOptionProc'''
3.5.4 '''SeekProc'''
3.5.5 '''BlockModeProc'''
3.5.6 '''CloseProc'''
3.5.7 '''InputProc'''
3.5.8 '''OutputProc'''
3.5.9 '''WatchProc'''
3.5.10 '''HandlerProc'''
3.5.11 '''FlushProc'''
'''References'''
'''Author's Address'''
A. '''Glossary'''
B. '''Acknowledgements'''
===
**1. Introduction**
The main concept of the I/O system used by the Tcl core is the
abstract notion of channels unifying different paths for
communication and the accompanying split of this subsystem into two
layers, one generic in nature, the other handling the specialities
of the various communication channels.
It is this second layer which is the home of the drivers
implementing channel types and thus bridging the gap between the
generic layer and the operating system providing the actual
facilities for communication. Its interface to the generic layer is
what we will describe here.
Before embarking on this task, a few other things are worth noting. In
the beginning the I/O system had drivers only for things
like files, pipes and sockets. But with the inclusion of the stacked
channel patch into the core in 8.2 we now have the situation that
two different types of drivers can be written: base (fundamental,
bottom) drivers like the ones mentioned before, and
transformations (also called filtering channels). Both types will be
described in this document, but whereas the properties regarding base
drivers are valid across the various versions of the core, the
statements regarding transformations apply only to Tcl 8.4 and
beyond. The reason for this restriction is that the interface to
transformations (and their semantics) differs considerably between
the various versions of the core, and while documenting the
differences is not impossible (only tedious) I am currently not in
the mood for this boring task. People who want to know how to
support a transformation across versions are hereby directed to take
a look at the Trf[http://www.purl.org/NET/akupries/soft/trf/] extension and the various compile-time and
run-time tricks and decisions it uses to do so.
**2. Main facilities in the core**
The main entrance to understanding a channel driver is the
Tcl_ChannelType[http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm#M6] structure as it lists all the functionality a
driver has to implement for a correct integration into the (I/O)
core.
These function vectors are explained later, in Section 3.5.
Of the many channel-related functions in the public API only some
are of interest to a channel driver.
A principal API used by all drivers is
* Tcl_NotifyChannel[http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm]
which allows a driver to communicate with the notification/event
subsystem and to post events when it is readable or writable.
For the creation of new channels two different APIs are available,
one for each type of driver, base and transformation:
* Tcl_CreateChannel[http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm] and
* Tcl_StackChannel[http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm]
Whereas the first function creates an independent channel, the second
pushes the new transformation over an existing channel, thus
forming a stack of transformations with a base driver at the bottom.
This latter function is complemented by two more functions, one
to remove the topmost transformation from its stack, the other to
retrieve the channel below a transformation:
* Tcl_UnstackChannel[http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm] and
* Tcl_GetStackedChannel[http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm]
All these functions (and some more) will be explained later in more
detail.
Right now it is also important to know that the following guarantees
are made by the core with respect to channels and their possible
stacking:
1. Stacking a transformation on a channel given through a Tcl_Channel token (a reference) will invalidate neither this token nor any other reference held by some C code. When writing to a channel represented by such an older token the I/O system will automatically compensate for the stack above it.
1. Only the topmost channel in a stack will do EOL-translation, UTF <-> XX encoding and buffering. All channels below will neither buffer, nor translate EOL, nor encode UTF.
1. Events posted by a channel are always filtered through all the channels above before being handed to channel handlers, either in C or in Tcl. All channels the event is passing through are allowed to absorb it in the process of their own work. This means that transformations are allowed and able to talk to and negotiate in an asynchronous manner with their counterpart on the other side of their channel before allowing the higher layers their turn with respect to events. An example of a transformation requiring such a facility is the implementation of the TLS/SSL protocol[http://www.sensus.org/tcl/] which has to set up the secure channel before allowing the normal communication.
1. The above also implies that the topmost channel in a stack is always notified last.
1. A transformation channel automatically starts out with the same blocking mode as the channel it replaces.
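The event-filtering guarantee can be pictured with a small standalone model. This is only a sketch and not code from the core: every `Toy...` name is invented for the illustration, and the "handshake" is a stand-in for whatever negotiation a real transformation (such as TLS) performs. Each channel above the one posting the event gets to look at the event mask and returns the part it lets through, so the topmost user of the stack is served last.

```c
#include <stddef.h>

#define TOY_READABLE 1
#define TOY_WRITABLE 2

struct ToyChannel;

/* Returns the events allowed to continue upward; 0 absorbs everything. */
typedef int (ToyHandlerProc)(struct ToyChannel *self, int mask);

typedef struct ToyChannel {
    const char        *typeName;
    ToyHandlerProc    *handlerProc;   /* NULL: events pass through unchanged */
    struct ToyChannel *above;         /* NULL for the topmost channel        */
    int                handshakeDone; /* private state of the toy transform  */
} ToyChannel;

/* Post 'mask' from 'origin' and filter it through all channels above.
 * The return value is what finally reaches the handlers on top. */
static int ToyNotify(ToyChannel *origin, int mask)
{
    ToyChannel *c;

    for (c = origin->above; c != NULL && mask != 0; c = c->above) {
        if (c->handlerProc != NULL) {
            mask = c->handlerProc(c, mask);
        }
    }
    return mask;
}

/* A toy transformation that keeps all events to itself until its
 * (pretend) handshake has completed. */
static int ToyHandshakeHandler(ToyChannel *self, int mask)
{
    if (!self->handshakeDone) {
        if (mask & TOY_READABLE) {
            self->handshakeDone = 1;  /* assume one readable event suffices */
        }
        return 0;                     /* absorbed during negotiation */
    }
    return mask;                      /* afterwards: transparent */
}
```

With a base channel whose `above` points at such a transformation, the first readable event posted by the base is absorbed and only the second one reaches the top of the stack.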
**3. Writing a channel driver**
***3.1 InstanceData***
Whenever a channel is created the core will not only get a reference
to the structure containing the references to the driver procedures,
i.e. the channel type, but a reference to a structure allocated by
the caller as well. This reference is given to all driver procedures
when called for that particular channel. The internals of this
structure are known only to the channel driver; the core will just
pass the reference around. This allows us to associate the specific
state of the driver with the channel.
The following information should be present in the instance data.
The names are only the ones I gave them; you are free to choose
your own identifiers:
''Tcl_Channel channel'';: is a backlink from the instanceData to the channel. Without this link the driver will be unable to access and manipulate its channel as all driver procedures are called only with the instance data as argument. The token to store is the result value of Tcl_CreateChannel[9] (or Tcl_StackChannel[10]). <<br>><<br>>All transformations need access to the channel below them so that they are able to read from and/or write to it. No additional item is required, just call Tcl_GetStackedChannel (Section 3.4) with this item as argument to obtain the necessary token.
''Tcl_TimerToken timer'';: Such a timer is necessary if the driver is able to buffer processed data the generic I/O layer has no knowledge of. It will be used to flush out such data. See Section 3.5.9 for more explanations. Transformations usually do such buffering.
''int flags'';: Transformations have to remember the current blocking mode to handle EOF on input correctly. See Section 3.5.7 for more explanations. Base channels on the other hand usually don't have a need for this as they can propagate this information to the operating system. Exceptions are channel types like memchan[11] which hold all their information in memory.
''int mask'';: Transformations have to remember the current interest in events; see Section 3.5.9 for more explanations. Other channel types may use this too to detect and skip calls which do not change the mask.
***3.2 Creation of channels***
Every channel has to have a creation command at the Tcl level and an
equivalent procedure at the C level. Depending on the type of the
channel the procedure has to use either Tcl_CreateChannel[12] or
Tcl_StackChannel[13] to create and configure the generic part of the
new channel. These two cases are described in the next two
subsections.
****3.2.1 Creation of a base channel****
The main call to create a new base channel is Tcl_CreateChannel[14].
Before calling it the channel-type-specific creation procedure has to
* process its arguments for possible options,
* create the specific channel structure (clientdata),
* create a name for the channel, like 'sockXX', 'fileN', etc. and
* initialize the specific channel structure.
After the creation of the channel it might be necessary
to configure it. For example, the channel might be non-blocking by default.
Note that the backlink to the channel has to be initialized with the
result of the call to Tcl_CreateChannel[15].
'''Creation procedure skeleton for a base channel'''
===
int XX_CreateChannel(interp, objc, objv, cd) {
''process arguments'' /* objc, objv */
name = ''generate_name_for_new_channel()'';
clientData = Tcl_Alloc(...);
''initialize clientData''... /* name, state, config ... */
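/* mask: TCL_READABLE and/or TCL_WRITABLE, depending on the channel */
clientData->channel = Tcl_CreateChannel(&chan_type, name, clientData, mask);
Tcl_RegisterChannel(interp, clientData->channel); /* make it visible to scripts */
''configure the channel according to arguments, if necessary''
Tcl_SetResult(interp, name, TCL_VOLATILE);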
return TCL_OK;
}
===
The ''chan_type'' in the code above is the structure containing the
references to the driver procedures for the base channel.
****3.2.2 Creation of a transformation****
In contrast to base channel types, the creation procedure of a
transformation must not use Tcl_CreateChannel[16] as that would
create a new and separate channel. It has to use
Tcl_StackChannel[17] instead. This procedure takes
as one of its parameters a reference to an existing channel and
creates a new channel structure holding the state of the
transformation. A token for this new structure is returned. When
later accessing the old channel, i.e. the one the transformation was
stacked upon, via Tcl_Read/Write et al., the system will
automatically redirect such calls to the top of the stack. In other
words, all Tcl_Channel tokens stay valid, independent of where they
are in a stack, yet no backdoors are opened. The latter is not
completely true, but we will come to this later on.
Other things, like the creation and initialization of the necessary
clientData for the transformation, have to be done as usual.
The backlink to the channel of the transformation has to be
initialized with the result of the call to Tcl_StackChannel[18].
'''Creation procedure skeleton for a transformation'''
===
int XX_CreateTransformation(interp, objc, objv, cd) {
old_channel = find(handle(objv[[1]));
clientData = Tcl_Alloc (...);
''initialize clientData...''
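/* mask: TCL_READABLE and/or TCL_WRITABLE, as supported by old_channel */
clientData->channel = Tcl_StackChannel(interp, &trans_type, clientData, mask, old_channel);
Tcl_SetResult(interp, Tcl_GetChannelName(old_channel), TCL_VOLATILE);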
return TCL_OK;
}
===
The ''trans_type'' in the code above is the structure containing the
references to the driver procedures for the transformation.
***3.3 Destruction of channels***
Destruction of channels is done with either
Tcl_UnregisterChannel[19] or Tcl_UnstackChannel[20]. Both of them
can be called with any channel and will always compensate if the
channel was part of a stack. The first always destroys all channels
in a stack, from top to bottom, whereas the second will always
destroy just the topmost channel of a stack. Both procedures are
equivalent if there is only one channel in the stack.
As Tcl_UnregisterChannel[21] knows that the whole stack of channels
is in destruction it does not deal with events anymore, except for
destroying the internal data structures supposed to deal with them.
But it does ask the various channels in the stack to flush buffered
information (on the write side) down the stack so that nothing which
is stuck is lost. This is not possible for information in the
upward/read buffers, as there is no ultimate receiver for them, so
these bytes are lost.
Tcl_UnstackChannel[22] does mainly the same as
Tcl_UnregisterChannel[23] above, except that it takes action to keep
the event-system up and running. Again information in the generic
read-buffers is lost, but for a reason: Anything in the input queue
and the push-back buffers of the transformation going away is
transformed data, but not yet read. As unstacking means that the
caller does not want to see transformed data any more we have to
discard these bytes. Information stored in buffers internal to the
transformation and not yet transformed should be saved for later
reads without the transformation in place, but we currently don't
have an API to do this. Consequence: No transformation should read
more information than it is willing to transform at once, or
unstacking will cause gaps in the data read from a channel. This may
change in the future as some of the required mechanisms are already
in place in the core, internally.
Whatever way was used to destroy a channel, the system will call the
CloseProc (Section 3.5.6) of the channel so that its driver may clean up
its data structures.
***3.4 Accessing the channel downstream***
This section is relevant to transformations, but nothing else.
To access the channel below itself a driver just has to call
Tcl_GetStackedChannel[24] with the token of its own channel (the
backlink we talked about in Section 3.1). The function will return
the token for the channel we want. A (Tcl_Channel) NULL indicates
that the channel used as argument was at the bottom of the stack.
***3.5 The driver in detail***
Now that the environment of the driver is a little more known we can
explain the operations of the various driver procedures in detail.
Every description will start with the general condition under which
the procedure is called by the generic I/O layer of the Tcl core and
then proceed to the specialities a transformation has to take care of.
But before we come to this we have to talk about driver versions as
well. The core currently supports version 1 and version 2 drivers.
The latter was introduced with Tcl 8.3.2, during the second rewrite
of the stacked channel code. It was required because new vectors
were needed to support some functionality correctly, and
the core also had to have a way to determine whether the new fields
are valid. Just accessing them is out of the question as
the older structures simply don't have them, i.e. we would get
random information and most likely crash later on.
The structure for a channel type looks like this:
'''Tcl_ChannelType definition'''
======
typedef struct Tcl_ChannelType {
char *typeName; /* The name of the channel type in Tcl
* commands. This storage is owned by
* the channel type. */
Tcl_ChannelTypeVersion version; /* Version of the channel type. */
Tcl_DriverCloseProc *closeProc; /* Procedure to call to close the
* channel, or TCL_CLOSE2PROC if the
* close2Proc should be used
* instead. */
Tcl_DriverInputProc *inputProc; /* Procedure to call for input
* on channel. */
Tcl_DriverOutputProc *outputProc; /* Procedure to call for output
* on channel. */
Tcl_DriverSeekProc *seekProc; /* Procedure to call to seek
* on the channel. May be NULL. */
Tcl_DriverSetOptionProc *setOptionProc;
/* Set an option on a channel. */
Tcl_DriverGetOptionProc *getOptionProc;
/* Get an option from a channel. */
Tcl_DriverWatchProc *watchProc; /* Set up the notifier to watch
* for events on this channel. */
Tcl_DriverGetHandleProc *getHandleProc;
/* Get an OS handle from the channel
* or NULL if not supported. */
Tcl_DriverClose2Proc *close2Proc; /* Procedure to call to close the
* channel if the device supports
* closing the read & write sides
* independently. */
Tcl_DriverBlockModeProc *blockModeProc;
/* Set blocking mode for the
* raw channel. May be NULL. */
/*
* Only valid in TCL_CHANNEL_VERSION_2 channels
*/
Tcl_DriverFlushProc *flushProc; /* Procedure to call to flush a
* channel. May be NULL. */
Tcl_DriverHandlerProc *handlerProc; /* Procedure to call to handle a
* channel event. This will be passed
* up the stacked channel chain. */
} Tcl_ChannelType;
======
Only the four fields 'version', 'blockModeProc', 'flushProc' and
'handlerProc' are relevant to this discussion.
All possible values are acceptable for 'version', but
'TCL_CHANNEL_VERSION_1' and 'TCL_CHANNEL_VERSION_2' are special as
they explicitly spell out the version of the driver. Any other value
will cause the system to assume that the structure describes a
version 1 driver and that the 'version' field actually contains the
reference to 'blockModeProc'. Older versions of tcl.h define the
structure in exactly this way. If the
special values are used the 'blockModeProc' must be found in the
field of that name, and for a version 2 driver 'flushProc' and
'handlerProc' are valid as well.
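The lookup rule just described can be modelled in a few lines of standalone C. This is a sketch only: the `Toy...` names are made up for the illustration and do not exist in Tcl, the structure is cut down to the fields that matter here, and unlike a real old-style structure it always carries the trailing fields. It also assumes that the version tags are small integers cast to a pointer type, so that they can never be mistaken for the address of a real procedure.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; none of these names exist in Tcl. */
typedef void (ToyProc)(void);                          /* generic slot type */
typedef int  (ToyBlockModeProc)(void *inst, int mode);

/* Assumption: version tags are tiny integers disguised as pointers. */
#define TOY_VERSION_1 ((ToyProc *) (uintptr_t) 0x1)
#define TOY_VERSION_2 ((ToyProc *) (uintptr_t) 0x2)

typedef struct ToyChannelType {
    const char       *typeName;
    ToyProc          *version;       /* tag, or blockModeProc (old layout) */
    ToyBlockModeProc *blockModeProc; /* only consulted if a tag is present */
    ToyProc          *flushProc;     /* only valid for version 2           */
} ToyChannelType;

/* Anything but the version 2 tag describes a version 1 driver. */
static int ToyVersion(const ToyChannelType *t)
{
    return (t->version == TOY_VERSION_2) ? 2 : 1;
}

/* Find blockModeProc wherever the driver author had to put it. */
static ToyBlockModeProc *ToyBlockModeProcOf(const ToyChannelType *t)
{
    if (t->version == TOY_VERSION_1 || t->version == TOY_VERSION_2) {
        return t->blockModeProc;             /* explicit tag: named field */
    }
    return (ToyBlockModeProc *) t->version;  /* implicit v1: old layout   */
}

/* flushProc may be touched only for version 2 structures. */
static ToyProc *ToyFlushProcOf(const ToyChannelType *t)
{
    return (ToyVersion(t) == 2) ? t->flushProc : NULL;
}

/* A do-nothing procedure, just to have something to store. */
static int ToyExampleBlockMode(void *inst, int mode)
{
    (void) inst;
    (void) mode;
    return 0;
}
```

A structure initialized as `{ "old", (ToyProc *) ToyExampleBlockMode, NULL, NULL }` is thus treated as an implicit version 1 driver, and `ToyBlockModeProcOf` still finds its procedure in the 'version' slot.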
Now that I have explained the complex logic I should also note that
code which only has to read the various fields in a
Tcl_ChannelType[25] has to use the accessor functions in the
following list. These functions have the logic above written into
them and will return the correct value independent of the driver
they access. This is especially of value for transformation
channels. Someone setting up a Tcl_ChannelType[26] structure for a
new driver still has to know the rules, though.
* Tcl_ChannelBlockModeProc[27]
* Tcl_ChannelCloseProc[28]
* Tcl_ChannelClose2Proc[29]
* Tcl_ChannelInputProc[30]
* Tcl_ChannelOutputProc[31]
* Tcl_ChannelSeekProc[32]
* Tcl_ChannelSetOptionProc[33]
* Tcl_ChannelGetOptionProc[34]
* Tcl_ChannelWatchProc[35]
* Tcl_ChannelGetHandleProc[36]
* Tcl_ChannelFlushProc[37]
* Tcl_ChannelHandlerProc[38]
Some questions of policy:
When should one write a version 1 driver, and when one for version 2?
And what is the 'best' way to write a version 1 driver?
When defining a transformation, writing a version 2 driver is
recommended as only that version fully supports
integrating transformations and event processing. The disadvantage
is that this will restrict the driver to Tcl 8.3.2, 8.4 and up. On
the other hand, someone dead set on writing a transformation
supporting all (or some) versions of the core with stacked channels
should take a good look at Trf[39] for the necessary voodoo to make
such a beast work.
When writing a base channel it is on the other hand recommended to
use a version 1 driver as that version will support all versions of
the core with the least hassle involved. Because of this reasoning
it is also recommended not to use 'TCL_CHANNEL_VERSION_1' but to
define the version implicitly, i.e. to store 'blockModeProc' in that
field. Usage of 'TCL_CHANNEL_VERSION_1' would again restrict the
driver to 8.3.2 and beyond.
****3.5.1 GetHandleProc****
This procedure is called by the C-API function
Tcl_GetChannelHandle[40] to retrieve the OS-specific handle
associated with the queried channel.
Channel types implementing communication paths independent of the
OS, like memchan[41], have to return a NULL handle (erroring out is
not possible).
Transformations don't need to bother with this function. The generic
layer will always query the bottom most channel in a stack as that
is the only one which can have OS specific handles. In other words,
transformations are never queried for this information.
****3.5.2 SetOptionProc****
This procedure is called by the generic I/O layer whenever
Tcl_SetChannelOption[42] is used (for example by 'fconfigure') and a
non-standard option was specified as argument.
For a base channel the handling is straightforward. If there are no
options to set, just set the associated field in
Tcl_ChannelType[43] to NULL. Otherwise compare the specified name against
the options supported by the driver and act accordingly. If the
option is not known use Tcl_BadChannelOption[44] to generate the
error message.
Transformation channels are basically the same, except for unknown
options. They have the additional choice of delegating the call to the
channel downstream. I personally recommend delegating the call.
Because of this I also recommend implementing this function even if
the transformation has no options of its own.
'''SetOptionProc skeleton for a transformation'''
===
static int SetOptionProc(clientData, interp, optionName, value) {
''... handle your own options''
/* delegate unknown options downstream */
Tcl_Channel parent = Tcl_GetStackedChannel(clientData->channel);
Tcl_DriverSetOptionProc *setOptionProc =
Tcl_ChannelSetOptionProc(Tcl_GetChannelType(parent));
if (setOptionProc == NULL) {
return TCL_ERROR;
}
return setOptionProc(Tcl_GetChannelInstanceData(parent),
interp, optionName, value);
}
===
****3.5.3 GetOptionProc****
This procedure is called by the generic I/O layer whenever
Tcl_GetChannelOption[45] is used (for example by 'fconfigure') to query the
value of a non-standard (or all) option(s).
The channel type has to implement everything from Section 3.5.2 and
some more. The latter not only because read-only options make sense
(write-only not so much) but also because there is a special case
which asks the channel for the values of all of its options.
For transformations it is again possible to delegate options unknown
to it to the underlying channel. In the case of a query for all
options such a delegation will generate a mighty long result.
Pruning the unnecessary option values from the result of the
underlying channel (-encoding, -buffering, -translation) is
possible, but tedious (we are working with DStrings, not
Tcl_Obj'ects, and especially not ListObj'ects).
'''GetOptionProc skeleton for a transformation'''
===
static int GetOptionProc(clientData, interp, optionName, dsPtr) {
if (query for all options) {
''... add our own options to the result''
/* add the options of the channel downstream
* to the result
*/
Tcl_Channel parent = Tcl_GetStackedChannel(clientData->channel);
Tcl_DriverGetOptionProc *getOptionProc =
Tcl_ChannelGetOptionProc(Tcl_GetChannelType(parent));
if (getOptionProc == NULL) {
return TCL_OK;
}
return getOptionProc(Tcl_GetChannelInstanceData(parent),
interp, optionName, dsPtr);
} else {
''... handle your own options.''
/* delegate queries for unknown options downstream */
Tcl_Channel parent = Tcl_GetStackedChannel(clientData->channel);
Tcl_DriverGetOptionProc *getOptionProc =
Tcl_ChannelGetOptionProc(Tcl_GetChannelType(parent));
if (getOptionProc == NULL) {
return TCL_ERROR;
}
return getOptionProc(Tcl_GetChannelInstanceData(parent),
interp, optionName, dsPtr);
}
}
===
P.S. Given the similarities in the way the delegation is handled by
the two branches of the if statement above it makes sense to factor
this code into a separate procedure.
****3.5.4 SeekProc****
This procedure is called by the generic I/O layer whenever the user
asks the channel to move or query the 'file access point'. The
respective public entries for these functions are Tcl_Seek[46] and
Tcl_Tell[47]. The tell functionality is requested by 'mode ==
SEEK_CUR' and 'offset == 0'.
With respect to seeking the core currently distinguishes between
seekable and unseekable channels. The latter are marked by setting
'seekProc' to NULL. This is currently true for "tty", "tcp" and
"pipe" types, i.e. serial lines, sockets and pipes. This distinction
as supported by the tcl core is actually a bit more restrictive than
necessary as all of the currently unseekable channels could support
limited seeking. I am speaking of forward seeking or 'skipping'. For
now we will have to live with this restriction.
For a new base channel the implementation of this procedure should
be straightforward. Unseekable channels (like [udp]) and forward
seekable channels just don't implement it, the other types either
forward the call to the OS or simply manipulate their internal
state. See memchan[48] for an example of the latter.
'''SeekProc skeleton for OS associated channels'''
===
static int SeekProc(clientdata, offset, mode, errorCodePtr) {
''... flush possibly waiting output''
''... discard possibly waiting input''
''... compute some value from offset, mode and current location, then forward this to the OS.''
*errorCodePtr = (result == -1) ? Tcl_GetErrno () : 0;
return result;
}
===
'''SeekProc skeleton for non-OS channels'''
===
static int SeekProc(clientdata, offset, mode, errorCodePtr) {
''... compute the new location X from offset, mode and current location''
if (''X out of bounds'') {
*errorCodePtr = EINVAL;
return -1;
}
clientdata->state = ''X'';
*errorCodePtr = 0;
return ''X'';
}
===
For transformations seeking is a hard problem. Should they seek
using their own notion of access point ? Or should they use the
notion of the underlying channel and then try to adapt their own
state for fine-positioning? Should they allow seeking at all ?
Depending on the transformation, both of the first two can be
impossible. Nice examples are compressors (like zlib) with their
completely non-linear and position-dependent relationship between
the number of bytes coming in from the downstream channel and going
out to its caller. Another reason could be that the transformation
state is not reversible, i.e. cannot be rolled back in a simple way,
without hogging memory. An example for this would be an encryption
transformation using a cryptographically strong hash-function to go
from the current state to the state for the encryption of the next
byte (or block). This is not reversible. We can go forward from
state to state, but not back to the old state, except for saving
them all.
So the simplest policy when dealing with seeking is to propagate the
request unchanged to the underlying channel, to discard all
information in the internal buffers of the transformation and then
hope for the best. Data waiting to be written is converted as if
it were the last block, in other words the special end of
information processing is applied, and then flushed. The current
state is abandoned too. The next call to InputProc (Section 3.5.7)
or OutputProc (Section 3.5.8) will be handled as if it were the
first call to the transformation.
The above is basically the strategy 'The user knows best, is able to
compute a place making sense and not creating garbage during
recovery'. It could also be named 'head-in-the-sand'. In the end
this simply means that the user of a certain transformation has to
understand its properties and whether a seek on it makes sense at
all before trying to seek.
Remark: It is possible to deal even with non-reversible state, by
recording all read/write calls and maintaining an exact image of the
information read/written so far, but this is, ah, memory-intensive,
to understate this a little. Also note that the notion of channels
with non-reversible state is equivalent to the notion of forward
seekable channels.
'''SeekProc skeleton with 1:1 pass'''
===
static int SeekProc(clientData, offset, mode, errorCodePtr) {
''... flush waiting output''
''... flush waiting input, if possible (for example into a configured variable!)''
/* Chain the call */
Tcl_Channel parent = Tcl_GetStackedChannel(clientData->channel);
Tcl_ChannelType *parentType = Tcl_GetChannelType(parent);
Tcl_DriverSeekProc *parentSeekProc = Tcl_ChannelSeekProc(parentType);
if (parentSeekProc == NULL) {
*errorCodePtr = EINVAL; /* the channel below cannot seek */
return -1;
}
/* pass our caller's errorCodePtr down so that errors reach it */
return parentSeekProc(Tcl_GetChannelInstanceData(parent),
offset, mode, errorCodePtr);
}
===
As a last note, Trf[49] implements a much more complex seeking model
for transformations, but describing it is beyond the scope of this
document. Go to its documentation instead.
****3.5.5 BlockModeProc****
This procedure is called by the generic I/O layer whenever
Tcl_SetChannelOption[50] is used (for example by 'fconfigure') and
'-blocking' was specified as the name of the option.
For a base channel this procedure has to take the necessary actions
at the OS level to switch the OS object underlying the managed
channel into (non-)blocking behaviour.
A transformation channel however just has to remember this
information in its instance data so that InputProc (Section 3.5.7)
is able to deal correctly with empty reads on the downstream
channel. The generic layer takes care of notifying all channels in a
stack so that all have the same information.
'''BlockModeProc skeleton'''
===
static int BlockModeProc(clientdata, mode) {
if (mode == TCL_MODE_NONBLOCKING) {
clientdata->flags |= ASYNC;
} else {
clientdata->flags &= ~ASYNC;
}
return 0; /* success; a non-zero POSIX error code signals failure */
}
===
****3.5.6 CloseProc****
This procedure is called by the generic I/O layer to tell a channel
that it is about to be destroyed. See Section 3.3 for the procedures
which can invoke it.
It is the responsibility of the procedure to clean up any data
structures held by the channel. Another task is the removal of all
event-related things, like ChannelHandlers and Timers, although this
could be filed under 'clean up of any data structures held by the
channel' too.
Transformations have the additional responsibility to complete the
conversion of all incomplete information sitting in their internal
write buffers and to write the result into the downstream channel to
ensure a clean closure.
'''CloseProc skeleton'''
===
int CloseProc(clientdata, interp) {
''... delete timer, if any. See 'WatchProc' too.''
'''|'''... do last minute conversions on r/w buffers and try to
'''|''' flush their results to the underlying channel.
'''|''' See below
'''|'''
'''|''' parent = Tcl_GetStackedChannel (clientdata->channel);
'''|''' Tcl_WriteRaw (parent, buffer, bufsize);
''... free data structures on the heap.''
return TCL_OK;
}
===
The part marked with '''|''' is specific to transformations and not
required for a base channel.
****3.5.7 InputProc****
This procedure is called by the generic I/O layer whenever some
input is required. Entry points which can cause this are
Tcl_Read[51], Tcl_ReadChars[52], Tcl_Gets[53] and Tcl_GetsObj[54].
A base channel just has to call the appropriate OS functionality to
get the information, or retrieve it from its internal buffers.
A transformation on the other hand has to ask the channel downstream
for the data to convert when reading, or write the converted data to
it when writing. Usage of Tcl_Read[55] is not allowed under any
circumstances. As said several times before, the generic layer
compensates for the existence of a stack when dealing with a
channel. So reading using Tcl_Read[56] will cause a read from the
topmost channel in the stack, this will try to get information from
downstream, jump again to the top, ad infinitum, or rather until the
stack blows up.
To get around this problem two special APIs were introduced, the
Raw-functions. They will always access the channel given as their
argument without compensation for stacking, thus enabling a
transformation to talk directly to the channel downstream. It is
important to note that these functions pose a risk too. Usage from
within the driver of a transformation is required, but nothing can
stop usage from outside of such a driver as well. This means that it
is possible to write tools which are able to bypass channels in a
stack and cause all sorts of (de)synchronisation and security
problems.
Here we need Tcl_ReadRaw[57].
Instead of a skeleton, which would be overwhelming even if trimmed
down, I list the rules on which the input procedures in my
transformations are based.
1. If the request can be satisfied by the information in the internal read buffers of the transformation, just use their contents.
1. If there is not enough data in the buffers to satisfy the request, ask the underlying channel for more.
* In blocking mode this will wait until we get some data or hit EOF.
+ After returning from the read we first have to convert the bytes read and then check whether the result is enough to satisfy the initial request. If not, we have to query the channel downstream again.
+ If we hit EOF instead, we have to convert any incomplete information in the internal buffers using any special handling defined for the transformation and then return a (possibly partial) result. The EOF condition must not be signaled upward to our caller until our internal buffers are empty.
* In non-blocking mode we either get nothing, some data or EOF. Getting EOF or data has to be handled as in the previous item. But if nothing was retrieved we simply return the partial (or even empty) result, and if there is nothing in the internal buffers we also have to signal the error EWOULDBLOCK.
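The rules above can be sketched in plain C. This is only an illustration under stated assumptions: `Downstream`, `DownstreamRead`, `Trans` and `TransInput` are hypothetical names, `DownstreamRead` stands in for the raw read on the channel downstream, and the "conversion" is a trivial upper-casing so that nothing is ever left incomplete at EOF. A real driver would use the Tcl APIs named in this section instead of the mock.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the channel downstream (not a Tcl API). */
typedef struct Downstream {
    const char *data;  /* bytes still available */
    size_t len;        /* number of bytes left in data */
    size_t chunk;      /* at most this many bytes per read */
    int eof;           /* once drained: 1 = EOF, 0 = would block */
} Downstream;

/* Returns >0 bytes read, 0 on EOF, -1 with errno set if nothing is there. */
static int DownstreamRead(Downstream *d, char *buf, size_t n) {
    if (d->len == 0) {
        if (d->eof) return 0;
        errno = EWOULDBLOCK;
        return -1;
    }
    if (n > d->chunk) n = d->chunk;
    if (n > d->len) n = d->len;
    memcpy(buf, d->data, n);
    d->data += n;
    d->len -= n;
    return (int)n;
}

/* Hypothetical instance data of the transformation. */
typedef struct Trans {
    Downstream *parent;
    char pending[64];   /* converted bytes not yet delivered upward */
    size_t pendingLen;
} Trans;

/* Shaped like an InputProc: buffer, request size, error code pointer. */
static int TransInput(Trans *t, char *buf, int toRead, int *errorCodePtr) {
    int got = 0;
    assert(t != NULL && buf != NULL && toRead >= 0);
    *errorCodePtr = 0;
    while (got < toRead) {
        char raw[64];
        int n, i;
        if (t->pendingLen > 0) {
            /* Rule 1: serve the request from the internal read buffer. */
            size_t take = (size_t)(toRead - got);
            if (take > t->pendingLen) take = t->pendingLen;
            memcpy(buf + got, t->pending, take);
            memmove(t->pending, t->pending + take, t->pendingLen - take);
            t->pendingLen -= take;
            got += (int)take;
            continue;
        }
        /* Rule 2: buffer is empty, ask the channel downstream. */
        n = DownstreamRead(t->parent, raw, sizeof raw);
        if (n > 0) {
            /* Convert what we got, then loop to check if it is enough. */
            for (i = 0; i < n; i++)
                t->pending[i] = (char)toupper((unsigned char)raw[i]);
            t->pendingLen = (size_t)n;
            continue;
        }
        if (n == 0) break;  /* EOF: a real driver finishes conversion here */
        if (got == 0) {     /* nothing at all: report the error */
            *errorCodePtr = errno;
            return -1;
        }
        break;              /* nothing more: return the partial result */
    }
    return got;
}
```

Note that the sketch reports EOF (a return value of 0) only once `pending` is empty, because the internal buffer is always drained before the downstream is queried again.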
Other things to consider:
* If a transformation makes use of an interpreter for the evaluation of scripts during its work it has to use Tcl_SaveResult[58] and Tcl_RestoreResult[59] to protect the result area of the interpreter. This is necessary as the I/O system, i.e. the calling procedure, may have an unannounced reference to the object. Not doing this may crash the interpreter with a corrupted list of free objects.
* Writing to the underlying channel is allowed! An example using this is the SSL/TLS transformation[60] created by Matt Newman[61]. Before going into a transparent encryption mode it handles the complete handshake between server and client required to set up the encryption. As long as the handshake is not complete nothing can be read from/written to the channel.
* The InputProc can have any type and number of side effects. Examples:
* Identity transformation collecting statistics (frequency of bytes, byte-pairs, triplets, etc.)
* Splitter: Identity transformation piping the information flowing through it to a second channel (different from the channel downstream).
* Recursively reading from/writing to the transformation itself (maybe indirectly, see splitter) is not a good idea; it may lead to infinite looping.
****3.5.8 OutputProc****
This procedure is called by the generic I/O layer whenever something
is written to the channel and an I/O buffer in the generic layer is
flushed. Entrypoints which can cause this are Tcl_Write[62],
Tcl_WriteChars[63] and Tcl_WriteObj[64].
A base channel just has to call the appropriate OS functionality to
write the information, or to store it into its internal buffers.
Transformations on the other hand have to convert as much as
possible of the data they got, and the result must be written to the
channel downstream (well, not really, but not writing it does not
make much sense). As in Section 3.5.7, usage of Tcl_Write[65] and
its relatives is forbidden and would crash the system.
Here we need the second of the two Raw APIs, namely Tcl_WriteRaw[66].
If there is data which cannot be converted at once it has to be
buffered internally, for conversion by future write requests,
together with the data written by these calls.
As with InputProc (Section 3.5.7) this procedure is free to read
from the underlying channel too, or from some other channel, or ...
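To illustrate the buffering of data which cannot be converted at once, here is a small self-contained sketch. It is not taken from any existing transformation: `Sink`, `SinkWrite`, `HexDecoder` and `HexOutput` are hypothetical names, and `SinkWrite` merely stands in for the raw write to the channel downstream. The assumed transformation decodes pairs of hex digits into bytes, so a single leftover digit has to wait for the next write request.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sink standing in for the channel downstream. */
typedef struct Sink {
    unsigned char data[128];
    size_t len;
} Sink;

static int SinkWrite(Sink *s, const unsigned char *buf, size_t n) {
    if (s->len + n > sizeof s->data) return -1;
    memcpy(s->data + s->len, buf, n);
    s->len += n;
    return (int)n;
}

/* Hypothetical instance data: at most one unpaired digit is buffered. */
typedef struct HexDecoder {
    Sink *parent;
    int haveNibble;  /* 1 if a digit is waiting for its partner */
    int nibble;      /* value of that digit, 0..15 */
} HexDecoder;

static int HexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Shaped like an OutputProc: buffer, size, error code pointer. */
static int HexOutput(HexDecoder *h, const char *buf, int toWrite,
                     int *errorCodePtr) {
    unsigned char out[64];
    size_t outLen = 0;
    int i;
    assert(h != NULL && buf != NULL && toWrite >= 0);
    *errorCodePtr = 0;
    /* Reject bad input before touching any state. */
    for (i = 0; i < toWrite; i++) {
        if (HexValue(buf[i]) < 0) {
            *errorCodePtr = EINVAL;
            return -1;
        }
    }
    for (i = 0; i < toWrite; i++) {
        int v = HexValue(buf[i]);
        if (!h->haveNibble) {
            /* Cannot convert yet: keep the digit for a later call. */
            h->nibble = v;
            h->haveNibble = 1;
            continue;
        }
        out[outLen++] = (unsigned char)((h->nibble << 4) | v);
        h->haveNibble = 0;
        if (outLen == sizeof out) {
            if (SinkWrite(h->parent, out, outLen) < 0) {
                *errorCodePtr = EIO;
                return -1;
            }
            outLen = 0;
        }
    }
    if (outLen > 0 && SinkWrite(h->parent, out, outLen) < 0) {
        *errorCodePtr = EIO;
        return -1;
    }
    return toWrite;  /* all input was consumed, converted or buffered */
}
```

The procedure reports all input bytes as written even when one digit is only buffered, since from the caller's point of view the data has been accepted. The leftover digit is exactly what a CloseProc would have to deal with in its last minute conversion.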
****3.5.9 WatchProc****
This procedure is called by the generic I/O layer whenever the user
(or the system) announces its (dis)interest in events on the
channel. It is called throughout the system, for example when
channel handlers are added to and removed from a channel, or after
the execution of channel handlers (as they may change the interest).
Base channels have to check the mask and then (un)register
themselves at the notifier. In most cases this will involve using
the existing APIs and some OS handle for the channel, but in more
complex cases it might be necessary to write an entirely new event
source and add it to the notifier. This is beyond the scope of this
HOWTO.
The most relevant entries in the API for this are
* Tcl_CreateFileHandler[67],
* Tcl_DeleteFileHandler[68],
* Tcl_CreateTimerHandler[69] and
* Tcl_DeleteTimerHandler[70]
The behaviour for transformations is much simpler, and fixed. In
other words, the following is what has to be done here for a smooth
interoperation with the notifier and for working fileevents. A
correct implementation of HandlerProc (Section 3.5.10) is also essential.
Two things have to be done.
1. Propagation of the mask information to the channel downstream. All transformations have to cooperate in this or else the base channel at the bottom won't register with the notifier for events. The contrasting example to this is Section 3.5.5, where the generic layer itself automatically notifies all channels in the stack. Here we decided against such an automatism because the way it is now allows the various transformations to modify the mask before passing it down, i.e. to add the events they are interested in beyond the interest of the script level, without any changes to the structure of Tcl_ChannelType[71] or the semantics of this vector. Such changes would have been necessary if we had switched to automatic chaining in the generic part of the I/O system.
1. Setting up and destroying the timer used to flush out the internal read buffers. This needs a bit more explanation, which will be given after the skeleton code for a WatchProc.
'''WatchProc skeleton'''
===
static void WatchProc(clientdata, mask) {
/* Pass the mask to the channel downstream, possibly
* modified. Remember the mask internally.
*/
Tcl_Channel parent = Tcl_GetStackedChannel(clientdata->channel);
Tcl_DriverWatchProc *watchProc = Tcl_ChannelWatchProc(Tcl_GetChannelType(parent));
clientdata->watchMask = mask;
watchProc(Tcl_GetChannelInstanceData(parent), mask);
/* Manage the timer */
if (!(mask & TCL_READABLE) || (''no pending converted information'')) {
/* A pending timer may exist, but either is
* there no (more) interest in the events it
* generates or nothing is available for
* reading. Remove it, if existing.
*/
''... kill the timer''
} else {
/* There might be no pending timer, but there
* is interest in readable events and we
* actually have data waiting, so generate a
* timer to flush that if it does not exist.
*/
''... create the timer''
}
}
===
The handler procedure for the timer managed above looks as follows.
It basically proclaims that the transformation channel is readable.
There is no need to recreate the timer here, because the generic
layer will call WatchProc (Section 3.5.9) after it has handled the
event, and that vector will then do the right thing (see above).
'''Timer handler skeleton'''
===
static void ChannelHandlerTimer(clientdata) {
clientdata->timer = NULL;
Tcl_NotifyChannel(clientdata->channel, TCL_READABLE);
}
===
Now the promised explanation about the necessity of the timer.
Consider this scenario:
1. A transformation is stacked upon a socket and its internal read buffer is empty. The transformation does not merge lines.
1. A fileevent script is set up and waiting for calls.
1. The socket has data available, say 400 bytes, in several lines (more than one). They are the last data on the channel, i.e. they are followed by EOF. The notifier generates the appropriate 'readable' event.
1. This event triggers the execution of the fileevent script in the top channel.
1. The executed script uses 'gets' to read a single line.
1. As the buffers are empty (see above) the transformation asks the socket for data to convert, using Tcl_ReadRaw[72] and a standard buffer of 4K size. Thus it gets all waiting bytes from the socket. These are converted, resulting in several lines (no merge). Some of them are delivered up into the generic layer and its buffering, but not all (small buffer size). At least one line remains in the read buffer(s) of the transformation itself.
1. The script processes the one line it got and then goes back to sleep.
1. Now what ?
1. The generic I/O layer finds that its buffers are not empty and uses a timer to generate additional 'readable' events to clear them.
1. In the end the generic I/O buffers are empty. Now what, again ?
1. Nothing. No events, and no processing of the remaining line(s) stored in the transformation.
1. Why ? Because the socket has an EOF pending and will not generate events anymore, and the generic layer has empty buffers and ceases to generate events too. It has no knowledge about the buffers inside the driver, i.e. the transformation. So the script will not wake up again, and will neither ask for the remaining line(s) nor detect the pending EOF. We are hung.
The solution to this lock is the same one used by the generic layer,
but from the inside of the transformation this time:
: The transformation has to check itself for data waiting to be read and then use a timer to generate the necessary 'readable' events. And that is what the timer shown in the WatchProc will do.
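The decision made in the WatchProc can be reduced to a single condition, sketched here without the Tcl headers. `TransState`, `UpdateFlushTimer` and `EV_READABLE` are hypothetical stand-ins (the latter for TCL_READABLE), and the `timerActive` flag only records where a real driver would call Tcl_CreateTimerHandler[69] or Tcl_DeleteTimerHandler[70].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for TCL_READABLE. */
enum { EV_READABLE = 1 << 1 };

/* Hypothetical instance data of a transformation. */
typedef struct TransState {
    int watchMask;        /* interest remembered from the last call */
    size_t pendingBytes;  /* converted data waiting in the read buffer */
    int timerActive;      /* 1 while the flush timer exists */
} TransState;

/*
 * A timer is needed exactly when the caller is interested in
 * 'readable' events AND converted data is waiting inside the
 * transformation. In every other case an existing timer is removed.
 */
static void UpdateFlushTimer(TransState *t, int mask) {
    assert(t != NULL);
    t->watchMask = mask;
    if ((mask & EV_READABLE) && t->pendingBytes > 0) {
        t->timerActive = 1;  /* real driver: create the timer if missing */
    } else {
        t->timerActive = 0;  /* real driver: delete the timer if present */
    }
}
```

Applied to the scenario above: after step 6 `pendingBytes` is non-zero and the script is still interested in 'readable' events, so the next call would set up the timer and the remaining line(s) would eventually be delivered.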
****3.5.10 HandlerProc****
This vector is the second part of the integration of transformations
with the notifier. It is called by the generic layer whenever an
event happens at the channel downstream. This also means that it is
never called for the channel at the bottom of a stack. In other
words, base channels don't have to implement this function.
Transformations on the other hand have to implement it, and the
minimally required implementation will pass the incoming event mask
through, unchanged.
To understand this an explanation is in order. The function is
called with a mask where bits are set for all events which happened
on the channel downstream. The caller then expects that the return
value of the function is the same mask, but with all the bits
cleared whose events were handled by the function itself.
Because of this a transformation is able to absorb and handle events
without the channel (or script) above being aware of them and the
associated processing. The TLS transformation[73] for example uses
this facility to handle the whole negotiation phase. Only after the
encryption is set up are events passed unchanged to the higher layers.
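A minimal sketch of such an absorbing handler follows. It does not reproduce the code of the TLS transformation; `Handshake`, `HandshakeHandler` and the `EV_*` constants are hypothetical, the latter standing in for TCL_READABLE and TCL_WRITABLE, and the negotiation is modelled as a simple countdown of 'readable' events.

```c
#include <assert.h>

/* Hypothetical stand-ins for TCL_READABLE and TCL_WRITABLE. */
enum { EV_READABLE = 1 << 1, EV_WRITABLE = 1 << 2 };

/* Hypothetical negotiation state: a countdown of expected messages. */
typedef struct Handshake {
    int done;       /* 1 once the negotiation is complete */
    int stepsLeft;  /* 'readable' events still needed to finish */
} Handshake;

/*
 * Returns the incoming mask with the bits cleared for all events
 * that were handled here. While the negotiation runs, 'readable'
 * events are consumed; afterwards the mask passes through unchanged.
 */
static int HandshakeHandler(Handshake *h, int mask) {
    assert(h);
    if (h->done) {
        return mask;            /* transparent mode */
    }
    if (mask & EV_READABLE) {
        if (--h->stepsLeft == 0) {
            h->done = 1;        /* negotiation finished */
        }
        mask &= ~EV_READABLE;   /* absorbed, hide it from above */
    }
    return mask;
}
```

Events the handler does not care about, here 'writable', are left in the mask and reach the channel or script above even during the negotiation.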
'''HandlerProc skeleton'''
===
static int HandlerProc(clientdata, mask) {
/*
* An event occurred in the underlying channel. This
* transformation doesn't process such events and thus
* returns the incoming mask unchanged.
*
* We do delete an existing timer. It was not fired,
* yet we are here, so the channel below generated
* such an event and we don't have to. The renewal of
* the interest after the execution of channel
* handlers will eventually cause us to recreate the
* timer
*/
''... kill timer''
return mask;
}
===
****3.5.11 FlushProc****
This vector is currently not used by the generic layer of the I/O
system. In the future it might be used to separate the actions of
flushing and writing data.
**References**
[1] http://www.purl.org/NET/akupries/soft/trf/
[2] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm#M6
[3] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[4] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[5] http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm
[6] http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm
[7] http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm
[8] http://www.sensus.org/tcl/
[9] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[10] http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm
[11] http://www.purl.org/NET/akupries/soft/memchan/
[12] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[13] http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm
[14] http://dev.ccriptics.com/man/tcl8.4/TclLib/CrtChannel.htm
[15] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[16] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[17] http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm
[18] http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm
[19] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M11
[20] http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm
[21] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M11
[22] http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm
[23] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M11
[24] http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm
[25] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm#M6
[26] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm#M6
[27] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[28] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[29] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[30] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[31] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[32] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[33] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[34] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[35] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[36] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[37] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[38] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[39] http://www.purl.org/NET/akupries/soft/trf/
[40] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm
[41] http://www.purl.org/NET/akupries/soft/memchan/
[42] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M21
[43] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm#M6
[44] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm#M17
[45] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M20
[46] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M18
[47] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M19
[48] http://www.purl.org/NET/akupries/soft/memchan/
[49] http://www.purl.org/NET/akupries/soft/trf/
[50] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M21
[51] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M13
[52] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M13
[53] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M14
[54] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M14
[55] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M13
[56] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M13
[57] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm
[58] http://www.tcl.tk/man/tcl8.4/TclLib/SaveResult.htm
[59] http://www.tcl.tk/man/tcl8.4/TclLib/SaveResult.htm
[60] http://www.sensus.org/tcl/
[61] mailto:matt@novadigm.com
[62] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M16
[63] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M16
[64] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M16
[65] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M16
[66] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm
[67] http://www.tcl.tk/man/tcl8.4/TclLib/CrtFileHdlr.htm
[68] http://www.tcl.tk/man/tcl8.4/TclLib/CrtFileHdlr.htm
[69] http://www.tcl.tk/man/tcl8.4/TclLib/CrtTimerHdlr.htm
[70] http://www.tcl.tk/man/tcl8.4/TclLib/CrtTimerHdlr.htm
[71] http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm#M6
[72] http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm
[73] http://www.sensus.org/tcl/
[74] mailto:mrose@dbc.mtview.ca.us
[75] http://www.rfc-editor.org/rfc/rfc2629.txt
[76] http://memory.palace.org/authoring/xml2rfc.tar.gz
**Author's Address**
Andreas Kupries
Andreas Computer Laboratories (Me, myself and I)
EMail: andreas_kupries@users.sourceforge.net
**Appendix A. Glossary**
A little glossary of terms used in the paper, but so far without
much of an explanation (or none).
Tcl_Channel: An opaque token for channels, and used by all interfaces accessing channels. Internally it is a pointer to the relevant data structures (Channel*).
stack: If one or more transformations are stacked upon an arbitrary other channel I use this word to refer to the whole group of channels.
(un)cover: Placing a transformation on a channel C "covers" C, removing the transformation "uncovers" it again.
**Appendix B. Acknowledgements**
This HOWTO was written in XML using the DTD developed by [Marshall T.
Rose] for writing RFC's and I-D's, see [RFC] 2629[75], and
converted to text and HTML with his tool, ''[xml2rfc]''.
----
!!!!!!
%| [Category Internals] |%
!!!!!!