[Arjen Markus] (15 April 2003) Inspired by a remark in the Tcl'ers
chatroom I started thinking about a smallish extension that can handle
blocks of data or gridded data or whatever you want to call it. Such
data arise in geographical information systems as the spatial
distribution of, say, population density or terrain level. They arise in
meteorology as the result of computer models and they can be found in
the field of partial differential equations, which is my professional
field of interest. Another application is digital image processing.
(I want to stress that I am not talking about ''matrices'' and ''linear algebra'', though both enter the picture. I want to manipulate data in a uniform way, without having to explicitly deal with for-loops and the like.)
So how can we represent such blocks of data? What operations do we need?
To answer the first question: I think we need an extension in C or
Fortran for this. At the script level we manipulate these blocks of data
via handles, much like files. The actual storage is handled by the
system programming language of choice (or by Tcl, if we use binary arrays as
an opaque data type). We may need some special arrangements to prevent
memory leaks, though.
With regard to the second question: I am biased towards
differential equations, so the list of possible operations that I
come up with may not be complete from your point of view, but I
would say (for esthetic reasons, because the words look "nicer", I use the
suffix ''mat'' to indicate this type of variable):
* ''initmat'': set the sizes of the matrix variables (a global operation) and perhaps set up variables such as the X and Y coordinates
* ''setmat'': copy values into a new or existing matrix variable (the second argument is a matrix variable or a scalar)
* ''addmat'', ''subtractmat'', ''multiplymat'', ''dividemat'': elementwise arithmetic operations on two matrix variables or a matrix and a scalar
* ''maxmat'', ''minmat'': elementwise maximum and minimum
* ''backwardDiffX'', ''forwardDiffX'', ''centralDiffX'': first-order difference in X direction (various flavours). Note that these are ''not'' estimates of the derivatives (to keep the operation general)
* ''backwardDiffY'', ''forwardDiffY'', ''centralDiffY'': ditto in Y direction
* ''secondDiffX'', ''secondDiffY'': second-order difference in X and Y direction
* ''exprmat'': evaluate an expression in matrix variables
* ''setelem'': set a particular element to a new value
* ''getelem'': get the value of a particular element
* ''matToImage'': create a gray-scale image for visualisation
* shift operations?
* overall properties, such as the maximum over the whole matrix
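To make the elementwise operations in the list above concrete, here is a minimal pure-Tcl sketch. It is only an illustration under assumptions that are not part of the proposal: a block is represented as a plain list of rows instead of a handle, the helper ''elementwise'' is a hypothetical name, and Tcl 8.5 or later is assumed (for `{*}` and the `::tcl::mathop` commands). A real extension would do this work in compiled code.

```tcl
# Hypothetical pure-Tcl stand-ins: a block is a list of rows, each row a list
# of numbers. The proposal itself uses handles to compiled storage instead.

# Apply a two-argument command to corresponding elements of a and b.
# b may be a block of the same shape as a, or a single scalar.
proc elementwise {cmd a b} {
    set scalar [string is double -strict $b]
    set result {}
    set r 0
    foreach rowA $a {
        set row {}
        set c 0
        foreach x $rowA {
            set y [expr {$scalar ? $b : [lindex $b $r $c]}]
            lappend row [{*}$cmd $x $y]
            incr c
        }
        lappend result $row
        incr r
    }
    return $result
}

proc addmat      {a b} {elementwise ::tcl::mathop::+ $a $b}
proc subtractmat {a b} {elementwise ::tcl::mathop::- $a $b}
proc multiplymat {a b} {elementwise ::tcl::mathop::* $a $b}
proc dividemat   {a b} {elementwise ::tcl::mathop::/ $a $b}
proc maxmat      {a b} {elementwise ::tcl::mathfunc::max $a $b}
proc minmat      {a b} {elementwise ::tcl::mathfunc::min $a $b}

set a {{1 2} {3 4}}
set b {{10 20} {30 40}}
puts [addmat $a $b]        ;# {11 22} {33 44}
puts [multiplymat $a 2]    ;# {2 4} {6 8}
```

Note that ''dividemat'' inherits Tcl's integer division when both operands are integers, and that no shape checking is done here; both would need attention in a serious implementation.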
What is also required is a way to handle the boundary conditions, since any
difference operation will need to deal with them:
* The value on the boundary is a prescribed value
* The value on the boundary is the same as the value just inside
* The value on the boundary differs by a known amount from that just inside
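One possible way to combine a difference operation with the three boundary types listed above is to compute a "ghost" value just outside the grid and difference against it. The sketch below assumes blocks stored as plain Tcl lists of rows, with X running along a row. The extra ''bc'' argument and the type names `value`, `copy` and `offset` are made up for illustration (the proposal does not say how boundary conditions are passed), as is the convention that the offset is added to the inside value.

```tcl
# Hypothetical sketch: blocks are lists of rows; bc describes the boundary
# on the low-X side as {value v}, {copy} or {offset d}.

# Value assumed to lie just outside the grid, next to the element "inside".
proc ghostValue {bc inside} {
    lassign $bc type amount
    switch -- $type {
        value   { return $amount }
        copy    { return $inside }
        offset  { return [expr {$inside + $amount}] }
        default { error "unknown boundary type \"$type\"" }
    }
}

# Backward difference in X: each element minus its left-hand neighbour.
# This is a plain difference, not an estimate of the derivative.
proc backwardDiffX {block bc} {
    set result {}
    foreach row $block {
        set prev [ghostValue $bc [lindex $row 0]]
        set diffs {}
        foreach x $row {
            lappend diffs [expr {$x - $prev}]
            set prev $x
        }
        lappend result $diffs
    }
    return $result
}

set level {{1 4 9} {2 2 5}}
puts [backwardDiffX $level {copy}]       ;# {0 3 5} {0 0 3}
puts [backwardDiffX $level {value 0}]    ;# {1 3 5} {2 0 3}
```

The forward and central differences, and the Y direction, would follow the same pattern with the ghost value placed on the appropriate side.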
As a simple example, the mean slope of a terrain could be determined
like this:
 setmat slopeX [backwardDiffX $level]
 setmat slopeY [backwardDiffY $level]
 set meanSlopeX [overallMean $slopeX]
 set meanSlopeY [overallMean $slopeY]
Or, a differential equation like the diffusion equation in two
dimensions (forgive me the clumsy notation):
 dC      d2C     d2C
 -- = D --- + D ---
 dt      dx2     dy2
The derivative in time of the concentration C can be calculated as:
 setmat derivc [scalemat \
     [addmat [secondDiffX $c] [secondDiffY $c]] $factor]
(where for simplicity, the grid sizes in X and Y direction are taken the
same, and the variable ''factor'' absorbs all the scalar parameters
involved).
Solving this equation for the stationary solution:
 ... Set up the computational grid
 #
 # Initial condition
 #
 setmat c 0.0
 ... Define boundary conditions
 #
 # Continue until convergence is achieved
 # (checked in the proc "convergence?")
 #
 set converged 0
 while { ! $converged } {
     setmat derivc [scalemat ...]
     setmat cnew [addmat $c $derivc]
     set converged [convergence? $c $cnew]
     setmat c $cnew
 }
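For readers who want to see the iteration run, the following is a small self-contained sketch of the same loop in pure Tcl. Everything specific in it is an assumption made for illustration: blocks are plain lists of rows, the grid is 5 by 5, the boundary value is prescribed as 1.0 everywhere, ''factor'' is 0.25 (the largest value for which this explicit scheme is stable with equal grid sizes), and convergence means that the largest change in one step is below 1.0e-6. The procs ''diffusionStep'' and ''maxChange'' are hypothetical names.

```tcl
# Hypothetical pure-Tcl stand-in: a block is a list of rows of numbers.

# One explicit step, cnew = c + factor*(secondDiffX + secondDiffY), applied to
# interior points only; the boundary values are prescribed and stay fixed.
proc diffusionStep {c factor} {
    set ny [llength $c]
    set nx [llength [lindex $c 0]]
    set cnew $c
    for {set j 1} {$j < $ny - 1} {incr j} {
        for {set i 1} {$i < $nx - 1} {incr i} {
            set v     [lindex $c $j $i]
            set west  [lindex $c $j [expr {$i - 1}]]
            set east  [lindex $c $j [expr {$i + 1}]]
            set south [lindex $c [expr {$j - 1}] $i]
            set north [lindex $c [expr {$j + 1}] $i]
            set d2x [expr {$east - 2.0*$v + $west}]
            set d2y [expr {$north - 2.0*$v + $south}]
            lset cnew $j $i [expr {$v + $factor*($d2x + $d2y)}]
        }
    }
    return $cnew
}

# Largest absolute difference between two blocks of the same shape.
proc maxChange {a b} {
    set m 0.0
    foreach rowA $a rowB $b {
        foreach x $rowA y $rowB {
            set d [expr {abs($x - $y)}]
            if {$d > $m} {set m $d}
        }
    }
    return $m
}

proc convergence? {c cnew} {
    expr {[maxChange $c $cnew] < 1.0e-6}
}

# Computational grid: 1.0 on the boundary, initial condition 0.0 inside.
set n 5
set c {}
for {set j 0} {$j < $n} {incr j} {
    set row {}
    for {set i 0} {$i < $n} {incr i} {
        set onEdge [expr {$i == 0 || $j == 0 || $i == $n - 1 || $j == $n - 1}]
        lappend row [expr {$onEdge ? 1.0 : 0.0}]
    }
    lappend c $row
}

# Continue until convergence is achieved (with a cap on the iterations).
set converged 0
set iterations 0
while {!$converged && $iterations < 10000} {
    set cnew [diffusionStep $c 0.25]
    set converged [convergence? $c $cnew]
    set c $cnew
    incr iterations
}
puts "converged: $converged after $iterations iterations"
```

With the boundary held at 1.0 all around, the stationary solution is simply 1.0 everywhere, which makes the result easy to check. The per-element `lindex` and `lset` calls are exactly the explicit looping that the proposed block operations are meant to hide.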
'''Implementation notes'''
Of course, the extension needs to deal with memory management: if a
temporary Tcl variable holding a handle goes out of scope because the proc
returns, the memory behind that handle is leaked, unless we can cleverly
reclaim it or some other means of accessing the variable's contents still exists.
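One way the reclaiming could be approached at the script level is an unset trace on the variable that holds the handle: Tcl fires such traces when a proc returns and its local variables disappear. The sketch below only illustrates that idea. The namespace ''::blocks'', its procs and the handle format are all hypothetical, and a Tcl array stands in for memory that a C or Fortran extension would own. Tcl 8.4 or later is assumed for `trace add`.

```tcl
# Hypothetical sketch: ::blocks::store stands in for storage owned by a
# compiled extension; handles are strings such as "block1".
namespace eval ::blocks {
    variable store
    array set store {}
    variable counter 0
}

# Create a block, put its handle in the caller's variable varName, and
# arrange for the storage to be released when that variable is unset.
proc ::blocks::create {varName data} {
    variable store
    variable counter
    set handle "block[incr counter]"
    set store($handle) $data
    upvar 1 $varName var
    set var $handle
    trace add variable var unset [list ::blocks::release $handle]
    return $handle
}

# Trace callback: Tcl appends the variable name, index and operation,
# which are not needed here and are collected in args.
proc ::blocks::release {handle args} {
    variable store
    unset -nocomplain store($handle)
}

proc demo {} {
    ::blocks::create tmp {{1 2} {3 4}}
    # While tmp is alive the block exists.
    return [array size ::blocks::store]
}

set during [demo]
set after  [array size ::blocks::store]
puts "blocks during demo: $during, after demo: $after"   ;# 1, then 0
```

The weak spot is the one hinted at above: if the handle has been copied into another variable or returned from the proc, the trace still releases the storage and the copy is left dangling. Some form of reference counting would be needed to cover that case.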
I can see a number of reasonable ways for implementing this:
Via a compiled extension:
* Use a specific extension written in C or Fortran (my personal idea is to do it in Fortran, as this has excellent array operation facilities, making the implementation so much more elegant).
* Use Harvey Davies' NAP extension. I do not know at the moment if this extension has all the facilities we need for this, but it might be a nice application. (From the documentation, I think that differences are going to be a difficulty)
* By letting Tcl deal with it as much as it can - store the actual matrix in a binary array and let the extension do the calculations, not the storage.
Via a pure Tcl solution (thanks to [Andreas Kupries] and [Kevin Kenny]):
* Use the storage facilities of [struct]::[matrix] in [Tcllib].
* Use [Ed Hume]'s [LA] package.
Both provide the basic storage facilities we would need for this.
And, why not, solutions abound - [Vkit] (see remark below) should not be left out.
----
[TV] Just a humorously tainted side remark:
in serious profi software land I would have known of no one,
really, no one who would even consider using Fortran in favour
of a more modern language except for compatibility reasons....
Why not use C? Maybe you'd want to try finding the sources for Khoros, which should be somewhere on the web for Unix for sure (for free, was a project at a Mexico university years ago); it even does nice enough graphical block diagrams to combine lots of field filters. Then again, never hurts to take a fresh angle, and tcl is nicer and in ways nicer than unix shells.
[AM] Why indeed? Consider this:
 real, dimension(:,:) :: matrix1, matrix2, result
 result = matrix1 + matrix2  ! Assuming shape compatibility, but let the compiler
                             ! make sure via runtime array bounds checking
versus:
 float *matrix1, *matrix2, *result ;
 int i,j,k,n1,n2 ;
 k = 0 ;
 for ( j = 0 ; j < n2 ; j ++ ) {
     for ( i = 0 ; i < n1 ; i ++ ) {
         result[k] = matrix1[k] + matrix2[k] ; /* Trusting the programmer to supply
                                                  sufficiently large arrays */
         k ++ ;
     }
 }
Arguments as to "Fortran is not a modern language, whereas C is" (or C++ or Java or C# or ....) remind me of similar statements regarding Tcl :D.
Seen [Vkit is a vector engine]?
----
''[escargo] 17 Apr 2003'' - See [Playing APL] for a look at matrix operations in Tcl.
Other matrix operations that I could imagine wanting are:
* Matrix inverse
* Identity matrices of size set by a parameter
* Solutions to simultaneous equations
<> Mathematics