TV Anyone did any (preparatory) work for this? (CUDA is the fairly heavy-duty parallel processing facility available on recent NVidia graphics cards, see )

(Remark about the category) Uh well, the idea is that the CUDA part of the graphics card is like a little supercomputer with hundreds of gigaflops and fine-grained threads, which could run list commands. Also, I don't mean a CUDA compiler or language port, such as the Python interface to CUDA processing, but more along the lines of a Tcl working as a large-scale parallel compiled language, such that it runs really fast and can do things like inter-thread scheduling very quickly.

TV Jan 3 '09 Since no one else seems to be working on this particular kind of desktop supercomputing, I took a few hours to do a little testing myself.

First I compiled Tcl 8.6b1 from source (it seemed 8.5 didn't download; the newer version could be a problem, since BWise has had backward-compatibility issues with canvas saving since some version). Then I added a CUDA example (from the NVidia CUDA SDK) to the source code of tclAppInit.c in the unix directory (I run this on Fedora 10/64, the processor is a Pentium D, the graphics card a FX9500GT), in principle by adding the source of the example to the file and calling the result [L1 ]. Then I added a Tcl command 'hello' (possibly breaking some style rules, I'm sure) which calls the CUDA test with -bench, so that it doesn't run the graphics but tests an actual CUDA function call with data transport to the graphics card (using cudaMemcpy).

To compile the result I used this command (you need the CUDA development environment for that, see [L2 ], and of course a driver that can handle CUDA, in my case the latest CUDA 2.1 with OpenGL 3 and such), which can be compared with the Cinepaint plugin approach I described here [L3 ]:


after setting:

   export LD_LIBRARY_PATH=`pwd`:/usr/local/cuda/lib

and making a local data directory with the lena.ppm example available.

The result is an a.out that acts as a tclsh with an extra command which executes the CUDA test code. With the file as above, that is a graphics-less test, after which one is returned to the tclsh prompt. When "-bench" is removed from the argv list, the CUDA graphics window closes after pressing <escape> (as in the CUDA SDK example), but then the interpreter is terminated as well, because the GLUT graphics loop cannot be broken.


 [theo@medion2 unix]$ ./a.out 
 % hello
 Loaded 'data/lena.ppm', 512 x 512 pixels
 Processing time: 350.947998 (ms)
 74.70 Mpixels/sec
 Hello, World!
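
A note on the numbers in the session above: one pass over a 512 x 512 image in about 351 ms would be only about 0.75 Mpixels/sec, so the reported 74.70 Mpixels/sec is consistent with the benchmark timing 100 passes in total. That iteration count is an assumption inferred from the figures, not something the output states. A small sketch of the arithmetic (the function names are made up for illustration):

```c
#include <stdio.h>

/* Throughput in megapixels per second for `iterations` passes over a
 * width x height image that together took `elapsed_ms` milliseconds. */
double mpixels_per_sec(int width, int height, int iterations,
                       double elapsed_ms)
{
    double pixels = (double)width * height * iterations;
    return pixels / (elapsed_ms / 1000.0) / 1.0e6;
}

/* Prints the figure in the same format as the benchmark output. */
void print_throughput(int width, int height, int iterations,
                      double elapsed_ms)
{
    printf("%.2f Mpixels/sec\n",
           mpixels_per_sec(width, height, iterations, elapsed_ms));
}
```

With the assumed 100 iterations, print_throughput(512, 512, 100, 350.947998) prints 74.70 Mpixels/sec, matching the session.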

Oh, in this case the cuda function linked to the hello command can be run only once.

It is possible to install the Tcl from above, compile Tk too, then run a.out (the nvcc-compiled main Tcl app) and do:

   package require Tk

to use Tk (and possibly BWise) in conjunction with the compiled CUDA .cu code.

More later.