Version 16 of Ideas for a numerical analysis package

Updated 2004-04-01 06:47:31

Arjen Markus This page is meant for collecting ideas on how to deal with numerical analysis in Tcl.

The rationale

There are quite a few attempts to implement numerical analysis methods in Tcl, but so far there is no framework (conceptual or otherwise) that you can readily use. Everybody tries to do it in his or her own way.

If we look at our Perl and Python colleagues, they have PDL (Perl Data Language) and NumPy (or Numarray, as it seems to be called nowadays). To my knowledge there is no equivalent in Tcl, though there are quite extensive packages like la and NAP that might qualify as such.

The basic problem

Numerical methods often deal with collections of data:

  • Arrays (in the C or Fortran sense) of numbers
  • Vectors in an N-dimensional space
  • Matrices (square or rectangular)
  • Higher-dimensional structures (but these tend to be used mainly in specialised areas)

One could think of nested lists (see Playing APL for an example) to represent them, but the la package uses plain lists for good reasons:

  • More efficient
  • The possibility to represent row and column vectors, important for a linear algebra package

Standard libraries exist in both C and Fortran for many problems (think of LAPACK and FFT libraries, for instance). We can access these via small wrappers generated with SWIG or Critcl, and in fact a few such wrappers already exist.

The solution?

We should decide what the best methods are for dealing with numerical data and build an easy-to-use framework around them. The design issues are:

  • Comfortable use from within Tcl
  • Acceptable performance, even with large sets of data
  • Easy to pass to and from binary extensions

My first guess is that we need a hybrid solution:

  • For small sets of data, a nested list may be the easiest way
  • For large sets of data, the la approach (plain lists) or even an approach with binary strings (these are opaque to the scripting side but easy to pass to binary extensions) can be used
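The binary-string option can be sketched with Tcl's own binary command. This is only an illustration: the proc names packDoubles and unpackDoubles are made up for this page, and the layout (native-order 8-byte doubles, no header) is an assumption, not something any existing package prescribes.

```tcl
# Hypothetical helpers: pack a Tcl list of numbers into a binary string
# of native-order 8-byte doubles, and unpack it again.
proc packDoubles {values} {
    # "d*" formats every element of the list as a double
    return [binary format d* $values]
}

proc unpackDoubles {bytes} {
    # "d*" scans as many doubles as the byte string holds
    binary scan $bytes d* values
    return $values
}

set bytes [packDoubles {1.0 0.0 0.0 1.0}]
puts "packed length: [string length $bytes] bytes"
puts "round trip:    [unpackDoubles $bytes]"
```

A binary extension could take such a byte string and treat it as a plain array of double, which is what makes this form easy to pass along, at the price of being unreadable from the script side.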

Lars H: I'd suggest using a flat list to hold the numerical data, with a separate "shape specification" that tells commands (that bother about the shape of data) how to treat it. I recall Fortran has some operation which e.g. allows you to say that the thing declared as a 6x6 matrix should be treated as a 36 element vector, or even 4x9 matrix. Saying "it's all basically vectors, but with a shape specification" makes that kind of thing easy.

As for how to handle the Tcl <-> binary conversions, I'd suggest making the string representation of the thing like a vector of numbers, or like a list containing such a thing. E.g. a 2x2 identity matrix might be

  {2 2} {1.0 0.0 0.0 1.0}

where the {2 2} part specifies the shape and the rest is the data. Note that this wouldn't have to be a list-of-lists-of-numbers as Tcl_Objs, but could be a single object of some new type whose string representation just happens to be possible to parse as lists. (In practice one shouldn't apply list operations on it, because that would discard the internal representation as "numerical array", but being able to do this anyway can be very useful when debugging.)
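A pure-Tcl sketch of this "flat data plus shape" idea might look as follows. The proc names arrayGet and arrayReshape are hypothetical, row-major storage is an assumption, and lassign requires Tcl 8.5 or later. A real implementation would more likely be a dedicated Tcl_Obj type, as described above.

```tcl
# Hypothetical helpers: an "array" is a two-element list {shape data},
# with the data stored row-major in one flat list.

# Return the element at the given zero-based indices.
proc arrayGet {array args} {
    lassign $array shape data
    if {[llength $args] != [llength $shape]} {
        error "expected [llength $shape] indices"
    }
    set offset 0
    foreach idx $args dim $shape {
        if {$idx < 0 || $idx >= $dim} {
            error "index $idx out of range"
        }
        # Row-major: the last index varies fastest
        set offset [expr {$offset * $dim + $idx}]
    }
    return [lindex $data $offset]
}

# Reinterpret the same data under a new shape; only the shape changes.
proc arrayReshape {array newShape} {
    lassign $array shape data
    set count 1
    foreach dim $newShape {
        set count [expr {$count * $dim}]
    }
    if {$count != [llength $data]} {
        error "shape {$newShape} does not fit [llength $data] elements"
    }
    return [list $newShape $data]
}

set identity {{2 2} {1.0 0.0 0.0 1.0}}
set asVector [arrayReshape $identity {4}]
puts [arrayGet $identity 1 1]
puts [arrayGet $asVector 3]
```

Both calls read the same underlying element; only the shape specification decides how the indices are interpreted.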

AM Your solution comes very close to what several packages are doing (I wanted to keep the discussion open by not proposing a "definite" solution :). But yes, it is the sort of solution you find a lot.

DKF: Actually, for an N-dimensional matrix, your printed representation only needs to state N-1 dimensions, since you have the overall number of items "for free".
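As a sketch of that remark: given N-1 dimensions and the flat data, the remaining dimension follows from the element count. The proc name fullShape is hypothetical, and the choice that it is the first (slowest-varying) dimension that is left out is an assumption made for this example only.

```tcl
# Hypothetical helper: recover the full shape from the N-1 trailing
# dimensions and the flat data list.
proc fullShape {partialShape data} {
    set known 1
    foreach dim $partialShape {
        set known [expr {$known * $dim}]
    }
    set total [llength $data]
    if {$known == 0 || $total % $known != 0} {
        error "data length $total is not a multiple of $known"
    }
    # The missing leading dimension is whatever makes the product fit
    return [linsert $partialShape 0 [expr {$total / $known}]]
}

puts [fullShape {2} {1.0 0.0 0.0 1.0}]
```

For the 2x2 identity matrix this means the printed form could carry just {2} as its shape part.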


AM In Clustering data I use an approach with a list of lists, where each sublist represents the "coordinates" of the data point. As the algorithm only needs the data point by point and does not do anything "across" points, this works very well.
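To illustrate the list-of-lists layout (this is not the code from Clustering data, just a minimal sketch with made-up proc names distance and nearest): each sublist holds the coordinates of one point, and the script visits the points one at a time.

```tcl
# Each data point is a sublist of coordinates.
set points {{0.0 0.0} {3.0 4.0} {6.0 8.0}}

# Euclidean distance between two points of equal dimension.
proc distance {p q} {
    set sum 0.0
    foreach a $p b $q {
        set sum [expr {$sum + ($a - $b) * ($a - $b)}]
    }
    return [expr {sqrt($sum)}]
}

# Index of the point closest to a reference point (-1 for an empty list).
proc nearest {points ref} {
    set best -1
    set bestDist 0.0
    set index 0
    foreach p $points {
        set d [distance $p $ref]
        if {$best < 0 || $d < $bestDist} {
            set best $index
            set bestDist $d
        }
        incr index
    }
    return $best
}

puts [nearest $points {5.0 7.0}]
```

Nothing here needs to look "across" points, so the nested list costs little and keeps the script readable.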


disneylogic I would not look to APL [L1 ] as a model for doing numerical analysis, or even J [L2 ], its successor. There are aspects of it which fall short and are, frankly, rather old compared to the state of knowledge of numerical methods, particularly numerical linear algebra. Instead, MATLAB [L3 ] should be the model. It is very well done, although the scope of the language is, in my opinion, now overextended.

One very nice thing about Tcl/Tk is that you have what is basically a simple language to which you can add packages of arbitrary complexity, but you don't have to have them there.

The corresponding book references are G.H.Golub, C.F.Van Loan, MATRIX COMPUTATIONS, ISBN 0-8018-3010-9 [L4 ], J.Dongarra, J.R.Bunch, C.B.Moler, G.W.Stewart, LINPACK USERS GUIDE, ISBN 089871172X [L5 ], and J.E.Dennis, Jr, R.B.Schnabel, NUMERICAL METHODS FOR UNCONSTRAINED OPTIMIZATION AND NONLINEAR EQUATIONS, ISBN 0-13-627216-9 [L6 ].

Although it's not well known, Didier Besset, OBJECT-ORIENTED IMPLEMENTATION OF NUMERICAL METHODS: AN INTRODUCTION WITH SMALLTALK & JAVA, ISBN 1558606793 [L7 ] is really a nice layout of this kind of project in the context of specific languages.

Also, I would start out small.

The major topics are:

  • preliminaries of machine precision, roundoff, and error and their effects on things like convergence
  • how to represent complex numbers and shut off the representation when you don't want them
  • interpolation: you can never have too many kinds available
  • Householder transforms
  • polynomials and polynomial fits
  • least squares
  • numerical integration: not as hard as it seems
  • basic linear algebra fashioned in a manner which does not rely on determinants
  • solving various systems of linear equations having specific structures
  • numerical differentiation, a delicate subject
  • various powerful but essential operators for linear systems, like the Singular Value Decomposition, QR decomposition
  • eigenvalues in the general case

So, one part could be "solved" at a time and, then, as pieces are provided, they could be cobbled together into a larger structure. Tclers are good at cobbling!

It's important, in my opinion, to keep an eye on the eventual goal and not to be too seduced by APLish syntactic sugar. Sure, you want a good notation for, say, matrices, but lists of lists are already there and are similar to MATLAB's notation, apart from MATLAB's use of commas and semicolons. Indeed, I would argue such a package should be able to import and export datasets of matrices in MATLAB's notation.
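A rough sketch of such a conversion, assuming a matrix is a Tcl list of rows and handling only simple one-line numeric literals (no nested brackets, continuation lines or expressions). The proc names toMatlab and fromMatlab are invented for this example.

```tcl
# Hypothetical helper: format a list-of-lists matrix as a MATLAB literal.
proc toMatlab {matrix} {
    set rows {}
    foreach row $matrix {
        lappend rows [join $row " "]
    }
    return "\[[join $rows {; }]\]"
}

# Hypothetical helper: parse a simple one-line MATLAB matrix literal.
proc fromMatlab {text} {
    # Strip surrounding whitespace, then the enclosing brackets
    set body [string trim [string trim $text] "\[\]"]
    set matrix {}
    foreach rowText [split $body ";"] {
        # Commas and whitespace both separate elements within a row
        set row {}
        foreach item [split [string map {, " "} $rowText]] {
            if {$item ne ""} {
                lappend row $item
            }
        }
        lappend matrix $row
    }
    return $matrix
}

puts [toMatlab {{1 2} {3 4}}]
puts [fromMatlab {[1, 2; 3, 4]}]
```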

I would go on and list specialized functions and polynomial systems, linear differential equations and higher order equations, but the project given above is already quite large. If available, it would also go a long way towards providing the Tcl/Tk world with a very competitive numerical capability.

I have begun to make a living again using numerical methods, am wedded to Tcl/Tk, and so am interested in this project.


AM (30 March 2004) With respect to the topics above:

preliminaries of machine precision, roundoff, and error and their effects on things like convergence

how to represent complex numbers and shut off the representation when you don't want them

  • Several pages on the Wiki work with complex numbers or rational numbers
  • My "mathematical workbench" (see starkit archive: tclmath) uses a tagged list to represent them (and in fact implements a whole infrastructure to deal with things other than plain numbers)
  • Martin Russell has a package that deals with complex numbers and quaternions

interpolation: you can never have too many kinds available

  • I have little to add to that, except that I am interested in that sort of thing myself

Householder transforms

  • The la package?

polynomials and polynomial fits

least squares

  • Again the la package

numerical integration: not as hard as it seems

  • See math::calculus in Tcllib
  • This includes: definite integrals, solving ordinary differential equations
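To give an impression of how little code a definite integral needs, here is a self-contained composite Simpson rule. It is a sketch written for this page, not the math::calculus implementation, and the proc name simpson and its argument order are assumptions.

```tcl
# Composite Simpson rule for the integral of a one-argument command
# over [a, b]; "steps" must be a positive even integer.
proc simpson {func a b steps} {
    if {$steps <= 0 || $steps % 2 != 0} {
        error "steps must be a positive even integer"
    }
    set h [expr {($b - $a) / double($steps)}]
    # End points get weight 1
    set sum [expr {[$func $a] + [$func $b]}]
    for {set i 1} {$i < $steps} {incr i} {
        set x [expr {$a + $i * $h}]
        # Interior points alternate between weights 4 and 2
        set weight [expr {$i % 2 ? 4.0 : 2.0}]
        set sum [expr {$sum + $weight * [$func $x]}]
    }
    return [expr {$sum * $h / 3.0}]
}

proc square {x} { expr {$x * $x} }
puts [simpson square 0.0 1.0 10]
```

Simpson's rule is exact for polynomials up to degree three, so for x*x on [0, 1] the result should agree with 1/3 up to roundoff.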

basic linear algebra fashioned in a manner which does not rely on determinants, solving various systems of linear equations having specific structures

numerical differentiation, a delicate subject

  • Not as difficult as many people would have you believe, at least from an engineer's point of view :)
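A minimal central-difference sketch shows both sides of the argument: the formula is trivial, but the step size matters. The proc name centralDiff and the default step of 1.0e-5 are assumptions chosen for illustration, not recommendations.

```tcl
# Central-difference estimate of the derivative of a one-argument
# command at x, with step size h.
proc centralDiff {func x {h 1.0e-5}} {
    set forward  [$func [expr {$x + $h}]]
    set backward [$func [expr {$x - $h}]]
    return [expr {($forward - $backward) / (2.0 * $h)}]
}

proc sine {x} { expr {sin($x)} }

# Compare against the exact derivative cos(1) for several step sizes.
foreach h {1.0e-1 1.0e-5 1.0e-13} {
    set estimate [centralDiff sine 1.0 $h]
    set err [expr {abs($estimate - cos(1.0))}]
    puts [format "h = %-8s error = %.3e" $h $err]
}
```

With a large step the truncation error dominates, and with a very small step the subtraction of nearly equal values lets roundoff dominate, so the error typically shrinks and then grows again as h decreases. That trade-off is the delicate part.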

various powerful but essential operators for linear systems, like the Singular Value Decomposition, QR decomposition, eigenvalues in the general case

  • The la package again ...

Other stuff that may be mentioned:


[ Category Numerical Analysis ]