There is an easy way to parallelize complex computations in C++ using OpenMP. OpenMP is a set of compiler directives for parallelizing loops, which allows efficient parallelization over multiple processors and vectorization over SIMD units with few changes to the code. It can be combined with SWIG and Tcl to speed up number crunching while using Tcl to control the process.
Since Tcl_Obj is not thread-safe, all data must be stored on the C side, most easily in a C++ object. The interface can be generated with SWIG, which creates such "objects" with little to no effort. Here is an example that computes a dot product in parallel:
 #ifdef SWIG
 %module dotpro
 %include exception.i
 %include typemaps.i
 %include "std_vector.i"
 namespace std {
     %template(fvec) vector<float>;
 }
 %{
 #include "dotpro.hpp"
 %}
 #else
 #include <vector>
 #endif

 // here comes the C++ header file
 typedef std::vector<float> fvec;

 class dotpro {
     fvec a;
     fvec b;
 public:
     dotpro(const fvec& a_, const fvec& b_) {
         a = a_;
         b = b_;
     }

     double dotproduct() {
         size_t l = a.size();
         double result = 0;
         // this code runs in parallel via OpenMP
         #pragma omp parallel for reduction(+:result)
         for (size_t i = 0; i < l; i++) {
             result += a[i]*b[i];
         }
         return result;
     }
 };
Save as dotpro.hpp and compile like this on macOS:
 swig -c++ -tcl8 dotpro.hpp
 clang-omp++ -DUSE_TCL_STUBS -dynamiclib -fopenmp dotpro_wrap.cxx -o dotpro.dylib -ltclstub8.5
On Linux:
 swig -c++ -tcl8 dotpro.hpp
 g++ -DUSE_TCL_STUBS -shared -fopenmp dotpro_wrap.cxx -o dotpro.so -ltclstub8.5
Then in Tcl:
 (Programmieren) 49 % load dotpro.dylib
 (Programmieren) 50 % dotpro d {1.0 2.0 3.0} {4.0 5.0 6.0}
 _80196bfcb97f0000_p_dotpro
 (Programmieren) 51 % d dotproduct
 32.0
The final call to dotproduct then runs in parallel across all your CPU cores.
The key is to store all data in C structures (the std::vector in this case), because Tcl_Obj is not thread-safe, and to convert it on the way in and out, which SWIG does for us.