Version 9 of Tcl IO Drivers

Updated 2007-12-07 11:31:18 by dkf

HOWTO

 A. Kupries
 TclIODriver                                Andreas Computer Laboratories
                                                      (Me, myself and I)
                                                       November 14, 2000

The Tcl I/O system as seen by a driver (channel type)

Abstract

This document describes the I/O system used in the Tcl core as it is seen from a driver implementing a channel type.

Table of Contents

   1.     Introduction
   2.     Main facilities in the core
   3.     Writing a channel driver
   3.1    InstanceData
   3.2    Creation of channels
   3.2.1  Creation of a base channel
   3.2.2  Creation of a transformation
   3.3    Destruction of channels
   3.4    Accessing the channel downstream
   3.5    The driver in detail
   3.5.1  GetHandleProc
   3.5.2  SetOptionProc
   3.5.3  GetOptionProc
   3.5.4  SeekProc
   3.5.5  BlockModeProc
   3.5.6  CloseProc
   3.5.7  InputProc
   3.5.8  OutputProc
   3.5.9  WatchProc
   3.5.10 HandlerProc
   3.5.11 FlushProc
          References
          Author's Address
   A.     Glossary
   B.     Acknowledgements





1. Introduction

The main concept of the I/O system used by the Tcl core is the abstract notion of channels unifying different paths for communication and the accompanying split of this subsystem into two layers, one generic in nature, the other handling the specialities of the various communication channels.

It is this second layer which is the home of the drivers implementing channel types and thus bridging the gap between the generic layer and the operating system providing the actual facilities for communication. Its interface to the generic layer is what we will describe here.

Before embarking on this task, a few other things should be noted. In the beginning the I/O system had only drivers for things like files, pipes and sockets. With the inclusion of the stacked channel patch into the core in 8.2, two different types of drivers can now be written: base (fundamental, bottom) drivers like the ones just mentioned, and transformations (also called filtering channels). Both types are described in this document. However, whereas the properties of base drivers are valid across the various versions of the core, the statements regarding transformations apply only to Tcl 8.4 and beyond. The reason for this restriction is that the interface to transformations (and their semantics) differs considerably between the various versions of the core, and while documenting the differences is not impossible (only tedious) I am currently not in the mood for this boring task. People who want to know how to support a transformation across versions are hereby directed to take a look at the Trf[1] extension and the various compile-time and run-time tricks it uses to do so.

2. Main facilities in the core

The main entrance to understanding a channel driver is the Tcl_ChannelType[2] structure, as it lists all the functionality a driver has to implement for a correct integration into the (I/O) core.

These function vectors are explained later in Section 3.5.

Of the many channel-related functions in the public API only some are of interest to a channel driver.

A principal API used by all drivers is

  • Tcl_NotifyChannel[3]

which allows a driver to communicate with the notification/event subsystem and to post events when it is readable or writable.

For the creation of new channels two different APIs are available, one for each type of drivers, base and transformation:

  • Tcl_CreateChannel[4] and
  • Tcl_StackChannel[5]

Whereas the first function creates an independent channel, the second pushes the new transformation onto an existing channel, thus forming a stack of transformations with a base driver at the bottom. This latter function is complemented by two more functions, one to remove the topmost transformation from its stack, the other to retrieve the channel below a transformation:

  • Tcl_UnstackChannel[6] and
  • Tcl_GetStackedChannel[7]

All these functions (and some more) will be explained later in more detail.

Right now it is also important to know that the following guarantees are made by the core with respect to channels and their possible stacking:

  1. Stacking a transformation on a channel given through a Tcl_Channel token (a reference) invalidates neither this token nor any other reference held by C code. When writing to a channel represented by such an older token the I/O system will automatically compensate for the stack above it.
  2. Only the topmost channel in a stack will do EOL-translation, UTF <-> XX encoding and buffering. All channels below will neither buffer, nor translate EOL, nor encode UTF.
  3. Events posted by a channel are always filtered through all the channels above before being handed to channel handlers, either in C or in Tcl. All channels the event is passing through are allowed to absorb it in the process of their own work. This means that transformations are allowed and able to talk to and negotiate in an asynchronous manner with their counterpart on the other side of their channel before allowing the higher layers their turn with respect to events. An example of a transformation requiring such a facility is the implementation of the TLS/SSL protocol[8] which has to set up the secure channel before allowing the normal communication.
  4. The above also implies that the topmost channel in a stack is always notified last.
  5. A transformation channel automatically starts out with the same blocking mode as the channel it replaces.
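
Guarantees 3 and 4 can be pictured with a small self-contained model. It deliberately does not use tcl.h: the Layer structure, the EV_ constants and all function names are hypothetical stand-ins, and the real core is considerably more involved. The sketch only shows the ordering: an event posted at the bottom is shown to every layer above in turn, any layer may absorb part of it, and the top is asked last.

```c
#include <assert.h>
#include <stddef.h>

#define EV_READABLE 1
#define EV_WRITABLE 2

typedef struct Layer Layer;
struct Layer {
    Layer *above;                          /* next layer toward the top; NULL at the top */
    int (*filter)(Layer *self, int mask);  /* returns the events to hand further up */
    int seen;                              /* events this layer was shown last */
};

/* A layer that lets every event pass unchanged. */
static int PassThrough(Layer *self, int mask)
{
    self->seen = mask;
    return mask;
}

/* A layer that consumes readable events, e.g. while it is still
 * negotiating with its counterpart. */
static int AbsorbReadable(Layer *self, int mask)
{
    self->seen = mask;
    return mask & ~EV_READABLE;
}

/* Post `mask` from `origin` and filter it through all layers above.
 * The result is what the channel handlers finally get to see. */
int PostEvent(Layer *origin, int mask)
{
    Layer *layer;

    assert(origin != NULL);
    for (layer = origin->above; layer != NULL && mask != 0; layer = layer->above) {
        mask = layer->filter(layer, mask);
    }
    return mask;
}

/* Build base <- negotiating layer <- top and post `mask` at the base.
 * Stores what the topmost layer saw in *topSeen. */
int DemoPost(int mask, int *topSeen)
{
    Layer top  = { NULL, PassThrough,    0 };
    Layer mid  = { &top, AbsorbReadable, 0 };
    Layer base = { &mid, PassThrough,    0 };
    int delivered = PostEvent(&base, mask);

    *topSeen = top.seen;
    return delivered;
}
```

In this model a purely readable event never reaches the top layer while the middle layer absorbs it, whereas the writable part of a combined event passes through untouched.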

3. Writing a channel driver

3.1 InstanceData

Whenever a channel is created the core will not only get a reference to the structure containing the references to the driver procedures, i.e. the channel type, but a reference to a structure allocated by the caller as well. This reference is given to all driver procedures when called for that particular channel. The internals of this structure are known only to the channel driver; the core will just pass the reference around. This allows us to associate the specific state of the driver with the channel.

The following information should be present in the instance data, not necessarily under the name I gave them, you are free to choose your own identifiers:

Tcl_Channel channel;
    A backlink from the instance data to the channel. Without this link the driver will be unable to access and manipulate its channel, as all driver procedures are called only with the instance data as argument. The token to store is the result value of Tcl_CreateChannel[9] (or Tcl_StackChannel[10]).
    All transformations need access to the channel below them so that they are able to read from and/or write to it. No additional item is required, just call Tcl_GetStackedChannel (Section 3.4) with this item as argument to obtain the necessary token.
Tcl_TimerToken timer;
    Such a timer is necessary if the driver is able to buffer processed data the generic I/O layer has no knowledge of. It will be used to flush out such data. See Section 3.5.9 for more explanations. Transformations usually do such buffering.
int flags;
    Transformations have to remember the current blocking mode to handle EOF on input correctly. See Section 3.5.7 for more explanations. Base channels on the other hand usually have no need for this as they can propagate this information to the operating system. Exceptions are channel types like memchan[11] which hold all their information in memory.
int mask;
    Transformations have to remember the current interest in events; see Section 3.5.9 for more explanations. Other channel types may use this too, to detect and skip calls which do not change the mask.
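
As a rough illustration, the four items could be collected in a structure like the one below. This is only a sketch that compiles without tcl.h: ModelChannel and ModelTimerToken are stand-ins for Tcl_Channel and Tcl_TimerToken, and FLAG_ASYNC, TransformInstance, InstanceInit and InstanceSetInterest are names invented for this example. The last function shows the "detect and skip calls which do not change the mask" idea.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ModelChannel_ *ModelChannel;     /* stand-in for Tcl_Channel */
typedef struct ModelTimer_   *ModelTimerToken;  /* stand-in for Tcl_TimerToken */

#define FLAG_ASYNC 1   /* hypothetical flag: channel is non-blocking */

typedef struct TransformInstance {
    ModelChannel    channel;  /* backlink, set from the creation call */
    ModelTimerToken timer;    /* pending flush timer, NULL if none */
    int             flags;    /* FLAG_ASYNC and friends */
    int             mask;     /* events the layer above is interested in */
} TransformInstance;

/* Bring a freshly allocated instance into a defined state. */
void InstanceInit(TransformInstance *inst)
{
    assert(inst != NULL);
    inst->channel = NULL;
    inst->timer   = NULL;
    inst->flags   = 0;
    inst->mask    = 0;
}

/* Remember the new interest mask. Returns 1 if it changed and the driver
 * has real work to do, 0 if the call can be skipped. */
int InstanceSetInterest(TransformInstance *inst, int newMask)
{
    assert(inst != NULL);
    if (inst->mask == newMask) {
        return 0;
    }
    inst->mask = newMask;
    return 1;
}
```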

3.2 Creation of channels

Every channel has to have a creation command at the Tcl level and an equivalent procedure at the C level. Depending on the type of the channel the procedure has to use either Tcl_CreateChannel[12] or Tcl_StackChannel[13] to create and configure the generic part of the new channel. These two cases are described in the next two subsections.

3.2.1 Creation of a base channel

The main call to create a new base channel is Tcl_CreateChannel[14]. Before doing it the channeltype-specific creation procedure has to

  • process its arguments for possible options,
  • allocate the specific channel structure (clientData),
  • create a name for the channel, like 'sockXX', 'fileN', etc. and
  • initialize the specific channel structure.

After the creation of the channel it might be necessary to configure it. For example it might be non-blocking by default. Note that the backlink to the channel has to be initialized with the result of the call to Tcl_CreateChannel[15].

Creation procedure skeleton for a base channel

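int XX_CreateChannel(interp, objc, objv, cd) {
	process arguments /* objc, objv */

	name = generate_name_for_new_channel();

	clientData = Tcl_Alloc(...);

	initialize clientData... /* name, state, config ... */

	/* the mask tells the core which directions the channel supports */
	clientData->channel = Tcl_CreateChannel(&chan_type, name, clientData,
			TCL_READABLE | TCL_WRITABLE);

	/* make the channel visible to scripts in this interpreter */
	Tcl_RegisterChannel(interp, clientData->channel);

	configure the channel according to arguments, if necessary

	Tcl_SetResult(interp, name, TCL_VOLATILE);
	return TCL_OK;
}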

The chan_type in the code above is the structure containing the references to the driver procedures for the base channel.

3.2.2 Creation of a transformation

In contrast to base channel types the creation procedure of a transformation must not use Tcl_CreateChannel[16], as that would create a new and separate channel, but has to use Tcl_StackChannel[17] instead. This procedure takes as one of its parameters a reference to an existing channel and creates a new channel structure holding the state of the transformation. A token for this new structure is returned. When later accessing the old channel, i.e. the one the transformation was stacked upon, via Tcl_Read/Write et al. the system will automatically redirect such calls to the top of the stack. In other words, all Tcl_Channel tokens stay valid, independent of where they are in a stack, yet no backdoors are opened. The latter is not completely true, but we will come to this later on.

Other things, like the creation and initialization of the necessary clientData for the transformation, have to be done as usual.

The backlink to the channel of the transformation has to be initialized with the result of the call to Tcl_StackChannel[18].

Creation procedure skeleton for a transformation

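int XX_CreateTransformation(interp, objc, objv, cd) {
	/* objv[1] holds the handle of the channel to stack upon */
	old_channel = Tcl_GetChannel(interp, Tcl_GetString(objv[1]), &mode);

	clientData = Tcl_Alloc (...);

	initialize clientData...

	/* mode is the read/write mask of the existing channel */
	clientData->channel = Tcl_StackChannel(interp, &trans_type, clientData,
			mode, old_channel);

	Tcl_SetResult(interp, (char *) Tcl_GetChannelName(old_channel), TCL_VOLATILE);
	return TCL_OK;
}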

The trans_type in the code above is the structure containing the references to the driver procedures for the transformation.

3.3 Destruction of channels

Destruction of channels is done with either Tcl_UnregisterChannel[19] or Tcl_UnstackChannel[20]. Both of them can be called with any channel and will always compensate if the channel is part of a stack. The first always destroys all channels in a stack, from top to bottom, whereas the second always destroys just the topmost channel of a stack. Both procedures are equivalent if there is only one channel in the stack.

As Tcl_UnregisterChannel[21] knows that the whole stack of channels is being destroyed it does not deal with events anymore, except for destroying the internal data structures supposed to deal with them. But it does ask the various channels in the stack to flush buffered information (on the write side) down the stack so that nothing which is stuck is lost. This is not possible for information in the upward/read buffers, as there is no ultimate receiver for them, so these bytes are lost.

Tcl_UnstackChannel[22] does mainly the same as Tcl_UnregisterChannel[23] above, except that it takes action to keep the event system up and running. Again information in the generic read buffers is lost, but for a reason: anything in the input queue and the push-back buffers of the transformation going away is transformed data which has not been read yet. As unstacking means that the caller does not want to see transformed data any more we have to discard these bytes. Information stored in buffers internal to the transformation and not yet transformed should be saved for later reads without the transformation in place, but we currently don't have an API to do this. Consequence: no transformation should read more information than it is willing to transform at once, or unstacking will cause gaps in the data read from a channel. This may change in the future as some of the required mechanisms are already in place in the core, internally.

Whichever way was used to destroy a channel, the system will call the CloseProc (Section 3.5.6) of its driver so that the driver may clean up its data structures.

3.4 Accessing the channel downstream

This section is relevant to transformations, but nothing else.

To access the channel below itself a driver just has to call Tcl_GetStackedChannel[24] with the token of its own channel (the backlink we talked about in Section 3.1). The function will return the token for the channel we want. A (Tcl_Channel) NULL indicates that the channel used as argument was at the bottom of the stack.

3.5 The driver in detail

   Now that the environment of the driver is a little better known we
   can explain the operations of the various driver procedures in
   detail. Every description will start with the general condition
   under which the procedure is called by the generic I/O layer of the
   Tcl core and proceed to the specialities a transformation has to
   take care of.

   But before we come to this we have to talk about driver versions as
   well. The core currently supports version 1 and version 2 drivers.
   The latter was introduced with Tcl 8.3.2, during the second rewrite
   of the stacked channel code. It was required because new vectors
   were needed to support some functionality correctly, and the core
   also had to have a way to determine whether the new fields are
   valid. Just accessing them is out of the question as the older
   structures simply don't have them, i.e. we would get random
   information and most likely crash later on.

   The structure for a channel type looks like this:

   Tcl_ChannelType definition

   typedef struct Tcl_ChannelType {
       char *typeName;                         /* The name of the channel type in Tcl
                                                * commands. This storage is owned by
                                                * the channel type. */
       Tcl_ChannelTypeVersion version;         /* Version of the channel type. */
       Tcl_DriverCloseProc *closeProc;         /* Procedure to call to close the
                                                * channel, or TCL_CLOSE2PROC if the
                                                * close2Proc should be used
                                                * instead. */
       Tcl_DriverInputProc *inputProc;         /* Procedure to call for input
                                                * on channel. */
       Tcl_DriverOutputProc *outputProc;       /* Procedure to call for output
                                                * on channel. */
       Tcl_DriverSeekProc *seekProc;           /* Procedure to call to seek
                                                * on the channel. May be NULL. */
       Tcl_DriverSetOptionProc *setOptionProc; /* Set an option on a channel. */
       Tcl_DriverGetOptionProc *getOptionProc; /* Get an option from a channel. */
       Tcl_DriverWatchProc *watchProc;         /* Set up the notifier to watch
                                                * for events on this channel. */
       Tcl_DriverGetHandleProc *getHandleProc; /* Get an OS handle from the channel
                                                * or NULL if not supported. */
       Tcl_DriverClose2Proc *close2Proc;       /* Procedure to call to close the
                                                * channel if the device supports
                                                * closing the read & write sides
                                                * independently. */
       Tcl_DriverBlockModeProc *blockModeProc; /* Set blocking mode for the
                                                * raw channel. May be NULL. */
       /*
        * Only valid in TCL_CHANNEL_VERSION_2 channels
        */
       Tcl_DriverFlushProc *flushProc;         /* Procedure to call to flush a
                                                * channel. May be NULL. */
       Tcl_DriverHandlerProc *handlerProc;     /* Procedure to call to handle a
                                                * channel event. This will be passed
                                                * up the stacked channel chain. */
   } Tcl_ChannelType;

   Only the four fields 'version', 'blockModeProc', 'flushProc' and
   'handlerProc' are relevant to this discussion.

   All possible values are acceptable for 'version', but
   'TCL_CHANNEL_VERSION_1' and 'TCL_CHANNEL_VERSION_2' are special as
   they explicitly spell out the version of the driver. Any other value
   will cause the system to assume that the structure describes a
   version 1 driver and that the 'version' field actually contains the
   reference to 'blockModeProc'. And when looking at older versions of
   tcl.h one will find exactly this definition of the structure. If the
   special values are used the 'blockModeProc' must be found in the
   field of that name, and for a version 2 driver 'flushProc' and
   'handlerProc' are valid as well.
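
   The rule can be sketched in a few lines of C. The model below is
   not the code of the core and does not use tcl.h; ModelChannelType,
   SlotProc, the MODEL_VERSION_ markers and the two accessor functions
   are hypothetical stand-ins chosen for this illustration. It only
   shows how one slot can hold either a version marker or, in an
   old-style structure, the blocking mode procedure itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int  (BlockModeProc)(void *instanceData, int mode);
typedef void (SlotProc)(void);   /* stand-in type for the overloaded slot */

#define MODEL_VERSION_1 ((SlotProc *)(intptr_t) 1)
#define MODEL_VERSION_2 ((SlotProc *)(intptr_t) 2)

typedef struct ModelChannelType {
    const char    *typeName;
    SlotProc      *version;        /* marker, or blockModeProc in an old-style driver */
    BlockModeProc *blockModeProc;  /* only meaningful when a marker is used */
} ModelChannelType;

/* Only the explicit marker makes a driver version 2. */
int ModelDriverVersion(const ModelChannelType *type)
{
    assert(type != NULL);
    return (type->version == MODEL_VERSION_2) ? 2 : 1;
}

/* Return the blocking mode procedure, wherever it is stored. */
BlockModeProc *ModelBlockModeProc(const ModelChannelType *type)
{
    assert(type != NULL);
    if (type->version == MODEL_VERSION_1 || type->version == MODEL_VERSION_2) {
        return type->blockModeProc;
    }
    /* implicit version 1: the slot itself holds the procedure */
    return (BlockModeProc *) type->version;
}

static int OldBlockMode(void *instanceData, int mode) { (void) instanceData; return mode; }
static int NewBlockMode(void *instanceData, int mode) { (void) instanceData; return -mode; }

/* One driver of each style, for demonstration. */
const ModelChannelType implicitV1 = { "old", (SlotProc *) OldBlockMode, NULL };
const ModelChannelType explicitV2 = { "new", MODEL_VERSION_2, NewBlockMode };
```

   Code that only reads a channel type should go through such an
   accessor and never look at the slot directly.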

   Now that I have explained the complex logic I should also note that
   code which has just to read the various fields in a
   Tcl_ChannelType[25] has to use the accessor functions in the
   following list. These functions have the logic above written into
   them and will return the correct value independent of the driver
   they access. This is especially of value for transformation
   channels. Someone setting up a Tcl_ChannelType[26] structure for a
   new driver still has to know the rules, though.

   o  Tcl_ChannelBlockModeProc[27]

   o  Tcl_ChannelCloseProc[28]

   o  Tcl_ChannelClose2Proc[29]

   o  Tcl_ChannelInputProc[30]

   o  Tcl_ChannelOutputProc[31]

   o  Tcl_ChannelSeekProc[32]

   o  Tcl_ChannelSetOptionProc[33]

   o  Tcl_ChannelGetOptionProc[34]

   o  Tcl_ChannelWatchProc[35]

   o  Tcl_ChannelGetHandleProc[36]

   o  Tcl_ChannelFlushProc[37]

   o  Tcl_ChannelHandlerProc[38]

   Some questions of policy:

   When should one write a version 1 driver, when one for version 2 ?
   And what is the 'best' way to write a version 1 driver ?

   When defining a transformation writing a version 2 driver is
   recommended as only that version has the best support for
   integrating transformations and event processing. The disadvantage
   is that this will restrict the driver to Tcl 8.3.2, 8.4 and up. On
   the other hand, someone dead set on writing a transformation
   supporting all (or some) versions of the core with stacked channels
   should take a good look at Trf[39] for the necessary voodoo to make
   such a beast work.

   When writing a base channel it is on the other hand recommended to
   use a version 1 driver as that version will support all versions of
   the core with the least hassle involved. Because of this reasoning
   it is also recommended not to use 'TCL_CHANNEL_VERSION_1' but to
   define the version implicitly, i.e. to store 'blockModeProc' in that
   field. Usage of 'TCL_CHANNEL_VERSION_1' would again restrict the
   driver to 8.3.2 and beyond.

3.5.1 GetHandleProc

   This procedure is called by the C-API function
   Tcl_GetChannelHandle[40] to retrieve the OS specific handle
   associated to the queried channel.

   Channel types implementing communication paths independent of the
   OS, like memchan[41], have to return a NULL handle (erroring out is
   not possible).

   Transformations don't need to bother with this function. The generic
   layer will always query the bottom most channel in a stack as that
   is the only one which can have OS specific handles. In other words,
   transformations are never queried for this information.

3.5.2 SetOptionProc

   This procedure is called by the generic I/O layer whenever
   Tcl_SetChannelOption[42] is used (for example by 'fconfigure') and a
   non-standard option was specified as argument.

   For a base channel the handling is straightforward. If there are no
   options to set, just set the associated field in
   Tcl_ChannelType[43] to NULL. Otherwise compare the specified name
   against the options supported by the driver and act accordingly. If
   the option is not known use Tcl_BadChannelOption[44] to generate the
   error message.

   Transformation channels are basically the same, except for unknown
   options. They have the additional option to delegate the call to the
   channel downstream. I personally recommend to delegate the call.
   Because of this I also recommend to implement this function even if
   the transformation has no options by itself.

   SetOptionProc skeleton for a transformation

           static int SetOptionProc (clientData, interp, optionName, value)
           {
                   ... handle your own options

                   /* delegate unknown options downstream */

                   Tcl_Channel parent = Tcl_GetStackedChannel (clientData->channel);

                   Tcl_DriverSetOptionProc *setOptionProc =
                   Tcl_ChannelSetOptionProc (Tcl_GetChannelType (parent));

                   if (setOptionProc != NULL) {
                         return (*setOptionProc) (Tcl_GetChannelInstanceData (parent),
                                  interp, optionName, value);
                   } else {
                         return TCL_ERROR;
                   }
           }

3.5.3 GetOptionProc

   This procedure is called by the generic I/O layer whenever
   Tcl_GetChannelOption[45] is used (for example by 'fconfigure') to query the
   value of a non-standard (or all) option(s).

   The channeltype has to implement everything from Section 3.5.2 and
   some more. The latter not only because read-only options make sense
   (write-only not so much) but also because there is a special case
   which asks the channel for the values of all of its options.

   For transformations it is again possible to delegate options unknown
   to it to the underlying channel. In the case of a query for all
   options such a delegation will generate a mighty long result.
   Pruning the unnecessary option values from the result of the
   underlying channel (-encoding, -buffering, -translation) is
   possible, but tedious (We are working with DStrings, not
   Tcl_Obj'ects, and especially no ListObj'ects).

   GetOptionProc skeleton for a transformation

           static int GetOptionProc (clientData, interp, optionName, dsPtr)
           {
                   if (query for all options) {
                           ... add our own options to the result

                           /* add the options of the channel downstream
                            * to the result
                            */

                            Tcl_Channel parent = Tcl_GetStackedChannel (clientData->channel);

                            Tcl_DriverGetOptionProc *getOptionProc =
                            Tcl_ChannelGetOptionProc (Tcl_GetChannelType (parent));

                            if (getOptionProc != NULL) {
                                    return (*getOptionProc) (Tcl_GetChannelInstanceData (parent),
                                                           interp, optionName, dsPtr);
                            }
                            return TCL_OK;
                   } else {
                           ... handle your own options.

                           /* delegate queries for unknown options downstream */

                           Tcl_Channel parent = Tcl_GetStackedChannel (clientData->channel);

                           Tcl_DriverGetOptionProc *getOptionProc =
                           Tcl_ChannelGetOptionProc (Tcl_GetChannelType (parent));

                           if (getOptionProc != NULL) {
                                   return (*getOptionProc) (Tcl_GetChannelInstanceData (parent),
                                           interp, optionName, dsPtr);
                           } else {
                                   return TCL_ERROR;
                           }
                   }
           }

   P.S. Given the similarities in the way the delegation is handled by
   the two branches of the if statement above it makes sense to factor
   this code into a separate procedure.
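
   A possible shape of such a factored-out helper is sketched below on
   a simplified model, not on the real API: OptLayer, GetOptProc,
   DelegateGetOption, LayerGetOption and DemoQuery are hypothetical
   names, every layer knows exactly one option, and values are plain
   strings instead of a DString. The point is only that both branches
   can share one procedure which looks up the channel downstream,
   checks that it has a procedure at all, and chains the call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct OptLayer OptLayer;
typedef int (GetOptProc)(OptLayer *self, const char *name, const char **valuePtr);

struct OptLayer {
    GetOptProc *getOptProc;  /* may be NULL: layer has no options */
    OptLayer   *below;       /* channel downstream, NULL at the bottom */
    const char *ownName;     /* the single option this layer knows */
    const char *ownValue;
};

/* The factored-out part: hand the query to the channel downstream.
 * Returns 0 on success, -1 if nobody downstream can answer. */
int DelegateGetOption(OptLayer *self, const char *name, const char **valuePtr)
{
    OptLayer *parent;

    assert(self != NULL);
    parent = self->below;
    if (parent == NULL || parent->getOptProc == NULL) {
        return -1;
    }
    return parent->getOptProc(parent, name, valuePtr);
}

/* Answer our own option, delegate everything else. */
int LayerGetOption(OptLayer *self, const char *name, const char **valuePtr)
{
    if (self->ownName != NULL && strcmp(name, self->ownName) == 0) {
        *valuePtr = self->ownValue;
        return 0;
    }
    return DelegateGetOption(self, name, valuePtr);
}

/* Query a two-layer stack (transformation over base) from the top. */
const char *DemoQuery(const char *name)
{
    OptLayer base = { LayerGetOption, NULL,  "-peer", "example" };
    OptLayer top  = { LayerGetOption, &base, "-mode", "encode" };
    const char *value = NULL;

    return (LayerGetOption(&top, name, &value) == 0) ? value : NULL;
}
```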


Kupries Page 13

HOWTO Tcl I/O and channeltypes November 2000

3.5.4 SeekProc

   This procedure is called by the generic I/O layer whenever the user
   asks the channel to move or query the 'file access point'. The
   respective public entries for these functions are Tcl_Seek[46] and
   Tcl_Tell[47]. The tell functionality is requested by 'mode ==
   SEEK_CUR' and 'offset == 0'.

   With respect to seeking the core currently distinguishes between
   seekable and unseekable channels. The latter are marked by setting
   'seekProc' to NULL. This is currently true for "tty", "tcp" and
   "pipe" types, i.e. serial lines, sockets and pipes. This distinction
   as supported by the tcl core is actually a bit more restrictive than
   necessary as all of the currently unseekable channels could support
   limited seeking. I am speaking of forward seeking or 'skipping'. For
   now we will have to live with this restriction.
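
   To make clear what 'skipping' means here: on an unseekable source
   the access point can still be moved forward by reading bytes and
   throwing them away. The sketch below does this on a plain POSIX
   file descriptor. It is not part of the Tcl core or its API;
   SkipForward and DemoSkip are names made up for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Skip `count` bytes of input on `fd` by reading and discarding them.
 * Returns the number of bytes skipped, which is smaller than `count`
 * if end-of-file was reached, or -1 on error (errno is set by read). */
long SkipForward(int fd, long count)
{
    char scratch[256];
    long skipped = 0;

    assert(count >= 0);
    while (skipped < count) {
        long want = count - skipped;
        ssize_t got;

        if (want > (long) sizeof(scratch)) {
            want = (long) sizeof(scratch);
        }
        got = read(fd, scratch, (size_t) want);
        if (got < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted, try again */
            }
            return -1;
        }
        if (got == 0) {
            break;              /* end of file */
        }
        skipped += got;
    }
    return skipped;
}

/* Write "abcdefgh" into a pipe, skip three bytes, return the next one. */
int DemoSkip(void)
{
    int fds[2];
    char next = 0;

    if (pipe(fds) != 0 || write(fds[1], "abcdefgh", 8) != 8) {
        return -1;
    }
    if (SkipForward(fds[0], 3) != 3 || read(fds[0], &next, 1) != 1) {
        next = 0;
    }
    close(fds[0]);
    close(fds[1]);
    return next;
}
```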

   For a new base channel the implementation of this procedure should
   be straightforward. Unseekable channels (like udp) and forward
   seekable channels just don't implement it; the other types either
   forward the call to the OS or simply manipulate their internal
   state. See memchan[48] for an example of the latter.

   SeekProc skeleton for OS associated channels

           static int SeekProc (clientdata, offset, mode, errorCodePtr)
           {
                   ...        flush possibly waiting output
                   ...        discard possibly waiting input

                   ...        compute some value from offset, mode and
                           current location, then forward this to the OS.

                   *errorCodePtr = (result == -1) ? Tcl_GetErrno () : 0;
                   return result;
           }

   SeekProc skeleton for non-OS channels

           static int SeekProc (clientdata, offset, mode, errorCodePtr)
           {
                   ...        compute the new location X from offset, mode and
                           current location

                   if (X out of bounds) {
                           *errorCodePtr = EINVAL;
                           return -1;
                   }

                   clientdata->state = X;
                   *errorCodePtr = 0;
                   return X;
           }
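
   Filled in for a channel that keeps its data in memory, the skeleton
   might look like the following sketch. MemState and MemSeek are
   hypothetical names, the code is independent of tcl.h, and the
   bounds policy (no seeking before the start or beyond the stored
   data) is an assumption of this example rather than something the
   core prescribes. Note how the 'tell' case falls out of SEEK_CUR
   with an offset of 0.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

typedef struct MemState {
    long used;   /* number of bytes stored in the channel */
    long pos;    /* current access point, 0 <= pos <= used */
} MemState;

/* Compute the new access point from offset and mode. Returns the new
 * location, or -1 with *errorCodePtr set to EINVAL. */
long MemSeek(MemState *st, long offset, int mode, int *errorCodePtr)
{
    long target;

    assert(st != NULL && errorCodePtr != NULL);
    switch (mode) {
    case SEEK_SET: target = offset;            break;
    case SEEK_CUR: target = st->pos + offset;  break;
    case SEEK_END: target = st->used + offset; break;
    default:
        *errorCodePtr = EINVAL;
        return -1;
    }
    if (target < 0 || target > st->used) {
        /* out of bounds: leave the access point where it is */
        *errorCodePtr = EINVAL;
        return -1;
    }
    st->pos = target;
    *errorCodePtr = 0;
    return target;
}
```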

   For transformations seeking is a hard problem. Should they seek
   using their own notion of access point ? Or should they use the
   notion of the underlying channel and then try to adapt their own
   state for fine-positioning? Should they allow seeking at all ?

   Depending on the transformation both the first two can be
   impossible. Nice examples are compressors (like zlib) with their
   completely non-linear and position-dependent relationship between
   the number of bytes coming in from the downstream channel and going
   out to its caller. Another reason could be that the transformation
   state is not reversible, i.e. cannot be rolled back in a simple way,
   without hogging memory. An example for this would be an encryption
   transformation using a cryptographically strong hash-function to go
   from the current state to the state for the encryption of the next
   byte (or block). This is not reversible. We can go forward from
   state to state, but not back to the old state, except for saving
   them all.

   So the simplest policy when dealing with seeking is to propagate the
   request unchanged to the underlying channel, to discard all
   information in the internal buffers of the transformation and then
   hope for the best. Data waiting to be written is converted as if it
   were the last block, in other words the special end-of-information
   processing is applied, and then flushed. The current
   state is abandoned too. The next call to InputProc (Section 3.5.7)
   or OutputProc (Section 3.5.8) will be handled as if it were the
   first call to the transformation.

   The above is basically the strategy 'The user knows best and is able
   to compute a place that makes sense and does not create garbage
   during recovery'. It could also be named 'head-in-the-sand'. In the end
   this simply means that the user of a certain transformation has to
   understand its properties and whether a seek on it makes sense at
   all before trying to seek.

   Remark: It is possible to deal even with non-reversible state, by
   recording all read/write calls and maintaining an exact image of the
   information read/written so far, but this is, ah, memory-intensive,
   to understate this a little. Also note that the notion of channels
   with non-reversible state is equivalent to the notion of forward
   seekable channels.

   SeekProc skeleton with 1:1 pass

           static int SeekProc (clientData, offset, mode, errorCodePtr)
           {
                   ...        flush waiting output
                   ...        flush waiting input, if possible
                           (e.g. into a configured variable!)

                   /* Chain the call */

                   Tcl_Channel parent = Tcl_GetStackedChannel (clientData->channel);

                   Tcl_ChannelType*    parentType     = Tcl_GetChannelType  (parent);
                   Tcl_DriverSeekProc* parentSeekProc = Tcl_ChannelSeekProc (parentType);

                   if (parentSeekProc == (Tcl_DriverSeekProc*) NULL) {
                           /* the channel downstream is unseekable */
                           *errorCodePtr = EINVAL;
                           return -1;
                   }

                   /* pass our caller's errorCodePtr down so that the error
                    * code of the parent reaches the generic layer */
                   return (*parentSeekProc) (Tcl_GetChannelInstanceData (parent),
                                           offset, mode, errorCodePtr);
           }

   As a last note, Trf[49] implements a much more complex seeking model
   for transformations, but describing it is beyond the scope of this
   document. Go to its documentation instead.

3.5.5 BlockModeProc

   This procedure is called by the generic I/O layer whenever
   Tcl_SetChannelOption[50] is used (for example by 'fconfigure') and
   '-blocking' was specified as the name of the option.

   For a base channel this procedure has to take the necessary actions
   at the OS level to switch the OS object underlying the managed
   channel into (non-)blocking behaviour.

   A transformation channel however just has to remember this
   information in its instance data so that InputProc (Section 3.5.7)
   is able to deal correctly with empty reads on the downstream
   channel. The generic layer takes care of notifying all channels in a
   stack so that all have the same information.

   BlockModeProc skeleton

           static int BlockModeProc (clientdata, mode)
           {
                   if (mode == TCL_MODE_NONBLOCKING) {
                           clientdata->flags |= ASYNC;
                   } else {
                           clientdata->flags &= ~ASYNC;
                   }
                   /* 0 signals success, anything else is a POSIX error code */
                   return 0;
           }

3.5.6 CloseProc

   This procedure is called by the generic I/O layer to tell a channel
   that it is about to be destroyed. See Section 3.3 for the procedures
   which can invoke it.

   It is the responsibility of the procedure to clean up any data
   structures held by the channel. Another task is the removal of all
   event related things, like ChannelHandlers and Timers, although this
   could be filed under 'clean up of any data structures held by the
   channel' too.

   Transformations have the additional responsibility to complete the
   conversion of all incomplete information sitting in their internal
   write buffers and to write the result into the downstream channel to
   ensure a clean closure.

   CloseProc skeleton

           int CloseProc (clientdata, interp)
           {
                   ...        delete timer, if any. See 'WatchProc' too.

                   |...        do last minute conversions on r/w buffers and try to
                   |        flush their results to the underlying channel.
                   |        See below
                   |
                   |        parent = Tcl_GetStackedChannel (clientdata->channel);
                   |        Tcl_WriteRaw (parent, buffer, bufsize);

                   ...        free data structures on the heap.
                   return TCL_OK;
           }

   The part marked with '|' is specific to transformations and not
   required for a base channel.
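
   To make the '|' part more concrete, here is a minimal sketch of a
   last minute conversion, assuming a transformation which encodes
   everything written to it as base64. Such an encoder consumes groups
   of three bytes, so up to two bytes may still sit in its write buffer
   when the channel is closed. EncState and EncFinish are hypothetical
   names. In a real driver the four characters produced would be handed
   to Tcl_WriteRaw as shown in the skeleton.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical write-side state of a base64 encoding transformation. */
typedef struct EncState {
    unsigned char pending[2];   /* bytes not yet forming a full group */
    int           npending;     /* 0, 1 or 2                          */
} EncState;

static const char b64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Convert the incomplete group left in the write buffer, padding it
 * with '='. Returns the number of characters stored in 'out' (0 or 4).
 */
static size_t EncFinish(EncState *s, char out[4])
{
    unsigned long v;

    assert(s != NULL && s->npending >= 0 && s->npending <= 2);

    if (s->npending == 0) {
        return 0;                       /* nothing left to convert */
    }
    v = (unsigned long) s->pending[0] << 16;
    if (s->npending == 2) {
        v |= (unsigned long) s->pending[1] << 8;
    }
    out[0] = b64[(v >> 18) & 63];
    out[1] = b64[(v >> 12) & 63];
    out[2] = (s->npending == 2) ? b64[(v >> 6) & 63] : '=';
    out[3] = '=';
    s->npending = 0;                    /* buffer is empty now */
    return 4;
}
```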

3.5.7 InputProc

   This procedure is called by the generic I/O layer whenever some
   input is required. Entrypoints which can cause this are
   Tcl_Read[51], Tcl_ReadChars[52], Tcl_Gets[53] and Tcl_GetsObj[54]. 

   A base channel just has to call the appropriate OS functionality to
   get the information, or retrieve it from its internal buffers.

   A transformation on the other hand has to ask the channel downstream
   for the data to convert when reading, or write the converted data to
   it when writing. Usage of Tcl_Read[55] is not allowed under any
   circumstances. As said several times before, the generic layer
   compensates for the existence of a stack when dealing with a
   channel. So reading using Tcl_Read[56] will cause a read from the
   topmost channel in the stack, this will try to get information from
   downstream, jump again to the top, ad infinitum, or rather until the
   stack blows up.

   To get around this problem two special APIs were introduced, the
   Raw-functions. They will always access the channel given as their
   argument without compensation for stacking, thus enabling a
   transformation to talk directly to the channel downstream. It is
   important to note that these functions pose a risk too. Usage from
   within the driver of a transformation is required, but nothing can
   stop usage from outside of such a driver as well. This means that it
   is possible to write tools which are able to bypass channels in a
   stack and cause all sorts of (de)synchronisation and security
   problems.

   Here we need Tcl_ReadRaw[57].

   Instead of a skeleton which would be overwhelming even if trimmed
   down I list the rules the input procedures in my transformations are
   based upon.
   1.  If the request can be satisfied by the information in the
       internal read buffers of the transformation, just use their
       contents.

   2.  If there was not enough data in the buffers to satisfy the
       request ask the underlying channel for more.

       *  In blocking mode this will wait until we get some data or hit
          EOF.

          +  After returning from the read we first have to convert the
             read bytes and second check whether the result is enough
             to satisfy the initial request. If not we have to repeat
             querying the channel downstream.

          +  If we hit EOF instead we have to convert any incomplete
             information in the internal buffers using any special
             handling defined for the transformation and then return (a
             possibly partial result).  The EOF condition must not be
             signaled upward to our caller until our internal buffers
             are empty.

       *  In non-blocking mode we either get nothing, some data or EOF.
          Getting EOF or data has to be handled as in the previous
          item. But if nothing was retrieved we simply return the
          partial (or even empty) result. And if there is nothing in
          the internal buffers we have to signal the error EWOULDBLOCK
          too.
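
   The rules above can be sketched in a few lines if the Tcl specifics
   are stripped away. The fragment below is such a model, not code from
   one of my transformations. ReadDownProc stands in for Tcl_ReadRaw,
   TransState for the instance data, and FakeDown / FakeRead form an
   in-memory channel downstream which exists only to exercise the
   logic. All of these names are hypothetical. The conversion is a
   bytewise upper-casing, so there is never incomplete information to
   finish at EOF. A real transformation would do that where eofSeen is
   set.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <string.h>

/* Stand-in for Tcl_ReadRaw: >0 bytes read, 0 on EOF, -1 plus *errPtr. */
typedef int (ReadDownProc)(void *down, char *buf, int toRead, int *errPtr);

typedef struct TransState {
    char          result[256];  /* converted data not yet delivered */
    int           used;         /* number of bytes in 'result'      */
    int           eofSeen;      /* downstream reported EOF          */
    ReadDownProc *readDown;
    void         *down;
} TransState;

static int TransInput(TransState *s, char *buf, int toRead, int *errPtr)
{
    char raw[256];
    int  got = 0, n, i;

    assert(s != NULL && s->readDown != NULL);
    while (got < toRead) {
        if (s->used > 0) {              /* rule 1: use internal buffer */
            n = (s->used < toRead - got) ? s->used : toRead - got;
            memcpy(buf + got, s->result, (size_t) n);
            memmove(s->result, s->result + n, (size_t) (s->used - n));
            s->used -= n;
            got += n;
            continue;
        }
        if (s->eofSeen) {
            break;                      /* buffers empty, EOF may surface */
        }
        /* rule 2: ask the channel downstream; used == 0, so it all fits */
        n = s->readDown(s->down, raw, (int) sizeof raw, errPtr);
        if (n < 0) {
            if (*errPtr == EWOULDBLOCK && got > 0) {
                return got;             /* non-blocking: partial result */
            }
            return -1;                  /* nothing at all, or real error */
        }
        if (n == 0) {
            s->eofSeen = 1;
            continue;
        }
        for (i = 0; i < n; i++) {       /* convert what was read */
            s->result[s->used++] = (char) toupper((unsigned char) raw[i]);
        }
    }
    return got;
}

/* In-memory downstream used only to try the sketch out. */
typedef struct FakeDown {
    const char *data;   /* bytes still to deliver                  */
    int         left;   /* how many of them                        */
    int         stall;  /* reads answered with EWOULDBLOCK first   */
} FakeDown;

static int FakeRead(void *cd, char *buf, int toRead, int *errPtr)
{
    FakeDown *d = cd;
    int       n;

    if (d->stall > 0) {
        d->stall--;
        *errPtr = EWOULDBLOCK;
        return -1;
    }
    n = (d->left < toRead) ? d->left : toRead;
    memcpy(buf, d->data, (size_t) n);
    d->data += n;
    d->left -= n;
    return n;                           /* 0 once everything is delivered */
}
```

   Note that the return value 0 (EOF) can only be reached with an empty
   result buffer, which is exactly the last requirement of the blocking
   case.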

   Other things to consider:

   o  If a transformation makes use of an interpreter for the
      evaluation of scripts during its work it has to use
      Tcl_SaveResult[58] and Tcl_RestoreResult[59] to protect the
      result area of the interpreter. This is necessary as the I/O
      system, i.e. the calling procedure, may have an unannounced
      reference to the object. Not doing this may crash the interpreter
      with a corrupted list of free objects.

   o  Writing to the underlying channel is allowed!  An example using
      this is the SSL/TLS transformation[60] created by Matt
      Newman[61]. Before going into a transparent encryption mode it
      handles the complete handshake between server and client required
      to set up the encryption. As long as the handshake is not complete
      nothing can be read from/written to the channel.

   o  The InputProc can have any type and number of side effects.
      Examples:

      *  Identity transformation collecting statistics (frequency of
         bytes, byte-pairs, triplets, etc.)

      *  Splitter: Identity transformation piping the information
         flowing through it to a second channel (different from the
         channel downstream).

   o  Recursively reading from/writing to the transformation itself
      (maybe indirectly, see splitter) is not a good idea, it may lead
      to infinite looping.
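
   As an illustration of the first kind of side effect mentioned
   above, the statistics part of an identity transformation can be as
   small as this. StatState and StatObserve are hypothetical names. An
   InputProc would call StatObserve on every block it passes upward and
   leave the data itself untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instance data: one counter per possible byte value. */
typedef struct StatState {
    unsigned long freq[256];
    unsigned long total;
} StatState;

/* Record the bytes flowing through; the buffer is not modified. */
static void StatObserve(StatState *s, const char *buf, size_t n)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; i < n; i++) {
        s->freq[(unsigned char) buf[i]]++;
    }
    s->total += n;
}
```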

3.5.8 OutputProc

   This procedure is called by the generic I/O layer whenever something
   is written to the channel and an I/O buffer in the generic layer is
   flushed. Entrypoints which can cause this are Tcl_Write[62],
   Tcl_WriteChars[63] and Tcl_WriteObj[64]. 

   A base channel just has to call the appropriate OS functionality to
   write the information, or to store it into its internal buffers.

   Transformations on the other hand have to convert as much as
   possible of the data they got, and the result must be written to the
   channel downstream (well, not really, but not writing it does not
   make much sense). As in Section 3.5.7, usage of Tcl_Write[65] and
   its relatives is forbidden and would crash the system.

   Here we need the second of the two Raw APIs, namely Tcl_WriteRaw[66].

   If there is data which cannot be converted at once it has to be
   buffered internally, for conversion by future write requests,
   together with the data written by these calls.

   As with InputProc (Section 3.5.7) this procedure is free to read
   from the underlying channel too, or from some other channel, or ...
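
   A minimal sketch of the internal buffering described above,
   assuming a transformation which decodes hexadecimal text on its way
   down. Two digits make one byte, so a write with an odd number of
   digits leaves one digit which has to wait for the next call.
   HexState, HexValue and HexWrite are hypothetical names. The bytes
   placed into 'out' are what a real OutputProc would pass to
   Tcl_WriteRaw.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical write-side state: at most one digit can be waiting. */
typedef struct HexState {
    int           haveNibble;   /* 1 if a high nibble waits for its partner */
    unsigned char nibble;
} HexState;

static int HexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Convert as much as possible of 'buf'. 'out' needs room for
 * (len + 1) / 2 bytes. Returns the number of bytes produced, or -1
 * if a character is not a hexadecimal digit.
 */
static int HexWrite(HexState *s, const char *buf, size_t len,
                    unsigned char *out)
{
    int    produced = 0;
    size_t i;

    assert(s != NULL);
    for (i = 0; i < len; i++) {
        int v = HexValue(buf[i]);
        if (v < 0) {
            return -1;
        }
        if (s->haveNibble) {
            /* Completes the digit kept from this or an earlier call. */
            out[produced++] = (unsigned char) ((s->nibble << 4) | v);
            s->haveNibble = 0;
        } else {
            /* Cannot be converted yet, keep it for later. */
            s->nibble = (unsigned char) v;
            s->haveNibble = 1;
        }
    }
    return produced;
}
```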

3.5.9 WatchProc

   This procedure is called by the generic I/O layer whenever the user
   (or the system) announces its (dis)interest in events on the
   channel. It is called throughout the system, for example when
   channelhandlers are added to and removed from a channel, or after
   the execution of channelhandlers (as they may change the interest).

   Base channels have to check the mask and then (un)register
   themselves at the notifier. In most cases this will involve using
   the existing APIs and some OS handle for the channel, but in more
   complex cases it might be necessary to write an entirely new event
   source and add it to the notifier. This is beyond the scope of this
   HOWTO.

   The most relevant entries in the API for this are

   o  Tcl_CreateFileHandler[67],

   o  Tcl_DeleteFileHandler[68],

   o  Tcl_CreateTimerHandler[69] and

   o  Tcl_DeleteTimerHandler[70].

   The behaviour for transformations is much simpler, and fixed. In
   other words, the following is what has to be done here for a smooth
   interoperation with the notifier and for working fileevents. The
   correct implementation of Section 3.5.10 is also essential.

   Two things have to be done.

   1.  Propagation of the mask information to the channel downstream.
       All transformations have to cooperate in this or else the base
       channel at the bottom won't register with the notifier for
       events. The contrasting example to this is Section 3.5.5 where
       the generic layer automatically notifies all channels in the
       stack itself. Here we decided against such an automatism because
       the way it is now allows the various transformations to modify
       the mask before passing it down, i.e. to add the events they are
       interested in beyond the interest of the script level. This
       works without any changes to the structure of
       Tcl_ChannelType[71] or the semantics of this vector. Such
       changes would have been necessary had we moved to automatic
       chaining in the generic part of the I/O system.

   2.  Setting up and destroying the timer used to flush out the
       internal read buffers. This needs a bit more explanation, which
       is given after the skeleton code for a WatchProc.

   WatchProc skeleton

           static void WatchProc (clientdata, mask)
           {
                   /* Pass the mask to the channel downstream, possibly
                    * modified. Remember the mask internally.
                    */

                   Tcl_Channel          parent    = Tcl_GetStackedChannel (clientdata->channel);
                   Tcl_DriverWatchProc* watchProc = Tcl_ChannelWatchProc (Tcl_GetChannelType (parent));

                   clientdata->watchMask = mask;

                   (*watchProc) (Tcl_GetChannelInstanceData (parent), mask);

                   /* Manage the timer */

                   if (!(mask & TCL_READABLE) || (no pending converted information)) {

                           /* A pending timer may exist, but either there
                            * is no (more) interest in the events it
                            * generates or nothing is available for
                            * reading. Remove it, if existing.
                            */

                           ... kill the timer
                   } else {
                           /* There might be no pending timer, but there
                            * is interest in readable events and we
                            * actually have data waiting, so generate a
                            * timer to flush that if it does not exist.
                            */

                           ... create the timer.
                   }
           }

   The handler procedure for the timer managed above is shown below.
   It basically proclaims that the transformation channel is readable.
   There is no need to recreate the timer here, because the generic
   layer will call WatchProc (Section 3.5.9) after it has handled the
   event and that vector will then do the right thing (see above).

   Timer handler skeleton

           static void ChannelHandlerTimer (clientdata)
           {
                   clientdata->timer = (Tcl_TimerToken) NULL;
                   Tcl_NotifyChannel (clientdata->channel, TCL_READABLE);
           }

   Now the promised explanation about the necessity of the timer.

   Consider this scenario:
   1.  A transformation is stacked upon a socket and its internal read
       buffer is empty. The transformation does not merge lines.
   2.  A fileevent script is set up and waiting for calls.
   3.  The socket has data available, say 400 bytes, in several lines
       (more than one), they are the last on the channel, i.e.
       followed by EOF. The notifier generates the appropriate
       'readable' event.
   4.  This event triggers the execution of the fileevent script in the
       top channel.
   5.  The executed script uses 'gets' to read a single line.
   6.  As the buffers are empty (see above) the transformation asks the
       socket for data to convert, using Tcl_ReadRaw[72] and a
       standard buffer of 4K size. Thus it gets all waiting bytes from
       the socket. These are converted, resulting in several lines (no
       merge). Some of them are delivered up into the generic layer
       and its buffering, but not all (small buffersize). At least one
       line remains in the read buffer(s) of the transformation itself.
   7.  The script processes the one line it got and then goes back to
       sleep.
   8.  Now what?
   9.  The generic I/O layer finds that its buffers are not empty and
       uses a timer to generate additional 'readable' events to clear
       them.
   10. In the end the generic I/O buffers are empty. Now what, again?
   11. Nothing. No events, and no processing of the remaining line(s)
       stored in the transformation.
   12. Why? Because the socket has an EOF pending and will not
       generate events anymore, and the generic layer has empty
       buffers and ceases to generate events too. It has no knowledge
       about the buffers inside the driver, i.e. the transformation.
       So the script will not wake up again, neither ask for the line,
       nor detect the pending EOF. We are hung.

   The solution to this lock is the same one used by the generic layer,
   but from the inside of the transformation this time:

      The transformation has to check itself for data waiting to be
      read and then use a timer to generate the necessary 'readable'
      events. And that is what the timer shown in the WatchProc will do.
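
   The effect of the timer can be seen in a deliberately simplified
   model of the scenario. The function below is hypothetical and leaves
   out a lot: the buffers of the generic layer, the delivery of EOF and
   the real notifier are all ignored. It only counts how many lines the
   fileevent script gets to see when the socket delivers everything at
   once, fires a single 'readable' event and then falls silent.

```c
#include <assert.h>

/*
 * linesFromSocket: lines the socket delivers with its one and only
 *                  'readable' event (EOF follows, no further events).
 * useTimer:        whether the transformation sets up its own timer
 *                  while converted data is waiting.
 * Returns the number of lines the script has read when the event
 * loop runs out of events.
 */
static int SimulateLinesDelivered(int linesFromSocket, int useTimer)
{
    int inTransformation = 0;   /* lines waiting in the read buffer */
    int delivered        = 0;
    int socketEvent      = 1;
    int timer            = 0;

    assert(linesFromSocket >= 0);
    while (socketEvent || timer) {
        /* An event is dispatched and the fileevent script runs. */
        if (socketEvent) {
            /* The raw read drains the socket completely. */
            inTransformation = linesFromSocket;
            socketEvent = 0;
        }
        timer = 0;

        /* The script reads exactly one line with 'gets'. */
        if (inTransformation > 0) {
            inTransformation--;
            delivered++;
        }

        /* The interest is renewed: this is where WatchProc decides. */
        if (useTimer && inTransformation > 0) {
            timer = 1;
        }
    }
    return delivered;
}
```

   With three lines and no timer the model stops after the first line.
   With the timer all three are delivered.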

3.5.10 HandlerProc

   This vector is the second part of the integration of transformations
   with the notifier. It is called by the generic layer whenever an
   event happens at the channel downstream. This also means that it is
   never called for the channel at the bottom of a stack. In other
   words, base channels don't have to implement this function.

   Transformations on the other hand have to implement it, and the
   minimally required implementation will pass the incoming event mask
   through, unchanged.

   To understand this an explanation is in order. The function is
   called with a mask where bits are set for all events which happened
   on the channel downstream. The caller then expects that the return
   value of the function is the same mask, but with all the bits
   cleared whose events were handled by the function itself.

   Because of this a transformation is able to absorb and handle events
   without the channel (or script) above being aware of them and the
   associated processing. The TLS transformation[73] for example uses
   this facility to handle the whole negotiation phase. Only after the
   encryption is set up are events passed unchanged to the higher
   layers.

   HandlerProc skeleton

           static int HandlerProc (clientdata, mask)
           {
                   /*
                    * An event occurred in the underlying channel. This
                    * transformation doesn't process such events thus
                    * returns the incoming mask unchanged.
                    *
                    * We do delete an existing timer. It was not fired,
                    * yet we are here, so the channel below generated
                    * such an event and we don't have to. The renewal of
                    * the interest after the execution of channel
                    * handlers will eventually cause us to recreate the
                    * timer
                    */

                   ... kill timer
                   return mask;
           }

3.5.11 FlushProc

   This vector is currently not used by the generic layer of the I/O
   system. In the future it might be used to separate the actions of
   flushing and writing data.

References

   [1]  http://www.purl.org/NET/akupries/soft/trf/

   [2]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm#M6

   [3]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [4]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [5]  http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm

   [6]  http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm

   [7]  http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm

   [8]  http://www.sensus.org/tcl/

   [9]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [10]  http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm

   [11]  http://www.purl.org/NET/akupries/soft/memchan/

   [12]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [13]  http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm

   [14]  http://dev.ccriptics.com/man/tcl8.4/TclLib/CrtChannel.htm

   [15]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [16]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [17]  http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm

   [18]  http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm

   [19]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M11

   [20]  http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm

   [21]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M11

   [22]  http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm

   [23]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M11

   [24]  http://www.tcl.tk/man/tcl8.4/TclLib/ChnlStack.htm

   [25]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm#M6

   [26]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm#M6

   [27]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [28]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [29]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [30]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [31]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [32]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [33]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [34]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [35]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [36]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [37]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [38]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [39]  http://www.purl.org/NET/akupries/soft/trf/

   [40]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm

   [41]  http://www.purl.org/NET/akupries/soft/memchan/

   [42]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M21

   [43]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm#M6

   [44]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm#M17

   [45]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M20

   [46]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M18

   [47]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M19

   [48]  http://www.purl.org/NET/akupries/soft/memchan/

   [49]  http://www.purl.org/NET/akupries/soft/trf/

   [50]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M21

   [51]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M13

   [52]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M13

   [53]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M14

   [54]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M14

   [55]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M13

   [56]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M13

   [57]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm

   [58]  http://www.tcl.tk/man/tcl8.4/TclLib/SaveResult.htm

   [59]  http://www.tcl.tk/man/tcl8.4/TclLib/SaveResult.htm

   [60]  http://www.sensus.org/tcl/

   [61]  mailto:[email protected]

   [62]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M16

   [63]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M16

   [64]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M16

   [65]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm#M16

   [66]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm

   [67]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtFileHdlr.htm

   [68]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtFileHdlr.htm

   [69]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtTimerHdlr.htm

   [70]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtTimerHdlr.htm

   [71]  http://www.tcl.tk/man/tcl8.4/TclLib/CrtChannel.htm#M6

   [72]  http://www.tcl.tk/man/tcl8.4/TclLib/OpenFileChnl.htm

   [73]  http://www.sensus.org/tcl/

   [74]  mailto:[email protected]

   [75]  http://www.rfc-editor.org/rfc/rfc2629.txt

   [76]  http://memory.palace.org/authoring/xml2rfc.tar.gz

Author's Address

   Andreas Kupries
   Andreas Computer Laboratories (Me, myself and I)
   Kongress-Str. 23/15
   Aachen, NRW  52070
   DE

   Phone: +49 241 514 998
   EMail: [email protected]

Appendix A. Glossary

A little glossary of terms used in the paper which have so far received little or no explanation.

Tcl_Channel
   An opaque token for channels, used by all interfaces accessing channels. Internally it is a pointer to the relevant data structures (Channel*).

stack
   If one or more transformations are stacked upon an arbitrary other channel I use this word to refer to the whole group of channels.

(un)cover
   Placing a transformation on a channel C "covers" C, removing the transformation "uncovers" it again.

Appendix B. Acknowledgements

This HOWTO was written in XML using the DTD developed by Marshall T. Rose[74] for writing RFC's and I-D's, see RFC 2629[75], and converted to text and HTML with his tool, 'xml2rfc'[76].