Three major computing quantizers 1: Sockets

Page initiated by Theo Verelst

The idea is that there are three major characterization parameters for just about any computer design and setup:

  • Computation: the actual things being computed, the number crunching, the bit fiddling, the float stuff, the graphics blits; some data goes in, a piece of machinery does something to it, and the result comes out of the computation.
  • Storage: without it, we'd only have functions, which have no state, and nowhere to prepare the input data, store results, cut and paste, or build a database.
  • Communication: the connections between units with either computation or storage behaviour, transparent in the sense that data passes through unchanged; no actual computations take place.

I coined these while working at a university, in discussions with people who were supposed to think deeply and theoretically about parallel computing, on a good day when I decided to correct their lack of perspective instead of just venting my annoyance at their blatant lack of practical computer knowledge.

And in fact they are well chosen: probably trivial for some, but thinking along their lines is not. Let's see what this means for using Tcl and Tk.

Quantizing is in:

  • Computation power: MIPS/Flops, specmarks, etc
  • Storage capacity: bits/bytes
  • Communication speed, called bandwidth, in bits/s or bytes/s

Today I thought I'd sit down and hand-benchmark sockets on Red Hat 9.0, and maybe on XP on a modern PC, to see what they can do. Sockets are the essential link in X, and the only decent real-time communication method between applications, apart maybe from shared memory. The internet uses them (the socket API sits roughly at the transport layer, OSI level 4, just below the application), and at least all OSes have some form of them.

Starting a fresh wish or tclsh, or wish-with-console (I prefer to work that way and save my history, for finding back the real nitty-gritty stuff I tried), and then another, a connection can be set up:

 wish1: socket -server sp 777
 wish1: proc sp {chan addr port} {}

 wish2: set s [socket localhost 777]
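Filling in that skeleton a bit: the accept handler on the server side is called with three arguments (the new channel, the client address, and the client port). A minimal sketch of a line-based echo over such a connection, following the setup above (the port 777 is just the example number used here), might look like:

```tcl
# wish1 (server side): accept connections and answer each incoming line
proc sp {chan addr port} {
    fconfigure $chan -buffering line -blocking 0
    fileevent $chan readable [list serve $chan]
}
proc serve {chan} {
    if {[gets $chan line] >= 0} {
        puts $chan "echo: $line"     ;# send the line back
    } elseif {[eof $chan]} {
        close $chan                  ;# client went away
    }
}
socket -server sp 777

# wish2 (client side): connect and send a line
set s [socket localhost 777]
fconfigure $s -buffering line
puts $s "hello"
gets $s answer                       ;# the echoed line arrives here
```

Line buffering on both ends makes each puts flush a complete line, which keeps this interactive test simple; for bulk throughput tests you'd want full buffering and binary translation instead.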

DKF: TCP/IP sockets can be made to go much faster on high capacity links if you use a non-standard networking stack that sends the packets in a slightly different order. There have been some papers on this somewhere. [ref?]

TV Definitely an interesting subject. First, it depends on what is measured and how the communication is set up. To start with, I let the sockets run on the same machine, so that IP essentially contributes nothing and all TCP has to do is, preferably, a blazing fast data copy, with percentage-wise small stream management overhead.

The data does need to get transferred, because we go from one process address space to another, but optimally it would get copied only once. Using streams, the stream mechanism and OS interference get in the way (unless you use unix sockets with in-buffer access, which is not so fail-safe): an OS security-level function is required to copy data from one process to another, and usually the stream access functions copy data into and out of the OS buffer, so at least three data copies happen. And that is on top of the process switch overhead to transfer control from one process to another as buffers fill up and are emptied. Unix stuff: the reason for certain buffer sizes and pointer-based stream access functions.

I remember sitting down with various unix systems (mainly HP-UX at the time) and doing nothing but checking data throughput (locally and over a net) and granularity for various buffer settings. That OS could do with at least one of the user buffer lengths set to 0.
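In Tcl the corresponding knobs sit on the channel itself, via fconfigure. A minimal sketch of the settings worth sweeping in such an experiment (the port number 777 again just follows the example above):

```tcl
# Tune a Tcl socket channel for raw throughput experiments.
set s [socket localhost 777]
fconfigure $s -translation binary \
              -buffering full \
              -buffersize 65536    ;# sweep e.g. 4096 ... 1048576 and measure
# -buffering none would flush on every puts: fine for small interactive
# messages, terrible for bulk transfer.
```

The -buffersize here is Tcl's user-space buffer, on top of whatever the OS socket buffers do, so the effective granularity is the interaction of the two, much like the user/OS buffer split described above.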

IP is more involved, because various interfaces, network paths and routing algorithms can come into play, and on top of that TCP adds its own dynamics, the main one being slow start, where the bandwidth used grows exponentially until the link's capacity is found.

Wanting a fast graphics computing link long before AGP, and in a lab situation between existing machines, I considered a HiPPI link (also some time ago), which, being point to point, would leave you with little IP overhead. Another high-end approach would be for instance ATM links, which could be routed.

In a consumer or business network, where you have 10 and currently 100 Mb/s ethernet, you'd want to see data transfers of up to 10 megabytes per second actually happen on reasonably capable machines, which in my experience is possible within reason, though Windows is not exactly a pleasant beast in that game; Linux I don't know yet.

It's fun enough to have a remote X link, for instance with a canvas with pictures, and watch the network load as it is moved around.

Anyhow, the idea of communicating between processes in the first place is that inter-application networks, a very valid application field for Tcl/Tk, could be pretty fast. I got tens of megabytes per second effectively without much effort ten years ago on HP 720s, and was wondering what current giga+ Pentiums would do, and what Tcl's stream copy is up to as it stands.
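Such a hand benchmark can be sketched in a single tclsh, using one local server and one client channel and deriving a rate from [time]; the port 7777, the 16 MB total and the 64 KB chunk size below are arbitrary choices to vary:

```tcl
# Rough local socket throughput test: one process, two channel ends.
set total [expr {16 * 1024 * 1024}]          ;# bytes to push through
set chunk [string repeat A 65536]            ;# 64 KB write unit

proc accept {chan addr port} {
    global srv
    fconfigure $chan -translation binary -buffering full
    set srv $chan
}
socket -server accept 7777
set cli [socket localhost 7777]
fconfigure $cli -translation binary -buffering full
vwait srv                                    ;# run event loop until accepted

set usec [lindex [time {
    set sent 0
    while {$sent < $total} {
        puts -nonewline $cli $chunk
        flush $cli
        read $srv [string length $chunk]     ;# drain on the receiving end
        incr sent [string length $chunk]
    }
}] 0]
# bytes per microsecond happens to equal (decimal) megabytes per second
puts [format "%.1f MB/s" [expr {$total / double($usec)}]]
```

The explicit write/read loop keeps the copies visible; Tcl's own stream copy, [fcopy], does the equivalent loop in C, so replacing the loop with an fcopy between a file channel and the socket would test that path instead.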

While I'm at it: a lot of programs do not respond within anything like the timeframe you might expect on modern machines. There are DOS systems which effectively work faster than the full-blown applications of these days, with their message exchanging, script control, and use of things like COM and other 'objects'; it just doesn't make all too much sense, like moving a football with a shovel, and still not getting it right.

DKF: Just noting that 100Mb ether is not exactly what I think of as a high capacity link. Start thinking dedicated fibre optic... ;)