Evaluating polynomial functions

Arjen Markus (13 july 2004) I have created a small package for dealing with polynomial functions: you can add, subtract, and multiply polynomial functions. You can divide a polynomial by another polynomial (with a remainder) and you can, of course, evaluate them at a particular coordinate.

The latter is not as simple as it may look at first. A naive computation like this:

   f(x) = 1.0 + 2.0 * x + 3.0 * x**2 + 4.0 * x**3 + ... + 10.0 * x**9

may lead to very inaccurate results, and it uses far more multiplications than necessary. Horner's rule can be used to improve the results:

   f(x) = 1.0 + x * (2.0 + x * (3.0 + x * (4.0 + ... x * 10.0)))))))))
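
In Tcl the two strategies might be sketched as follows (a minimal sketch: the proc names evalNaive and evalHorner are made up for this page and are not the Tcllib routines, which presumably work along similar lines):

 proc evalNaive {coeffs x} {
     # Sum c_k * x**k term by term, computing each power from scratch
     set sum   0.0
     set power 0
     foreach c $coeffs {
         set sum [expr {$sum + $c*pow($x,$power)}]
         incr power
     }
     return $sum
 }

 proc evalHorner {coeffs x} {
     # Fold the coefficients in from the highest one downwards:
     # ((c9*x + c8)*x + ...)*x + c0
     set sum 0.0
     for {set i [expr {[llength $coeffs]-1}]} {$i >= 0} {incr i -1} {
         set sum [expr {$sum*$x + [lindex $coeffs $i]}]
     }
     return $sum
 }

 # f(x) = 1 + 2x + 3x**2 + ... + 10x**9, coefficients in ascending order
 set coeffs {1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0}
 puts [evalNaive  $coeffs 1.1]
 puts [evalHorner $coeffs 1.1]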

This is, however, no guarantee of success: the various families of (classical) orthogonal polynomials, such as the Chebyshev and Legendre polynomials, cause significant trouble even with this rule, as kbk pointed out to me.

To appreciate how much trouble, let us do some experiments:

 #
 # Load the scripts for general and orthogonal polynomials
 #
 source polynomials.tcl
 source classic_polyns.tcl

 #
 # A simple polynomial: g(x) = (x-1.3)**20
 # A second polynomial: h(x) = (x-0.1)*(x-1.1)*(x-2.1)*...*(x-19.1)
 #
 set f [::math::polynomials::polynomial {-1.3 1}]
 set g 1
 set h 1
 for {set i 0 } {$i < 20} {incr i} {
     set g [::math::polynomials::multPolyn $f $g]
     set h [::math::polynomials::multPolyn $h \
                [::math::polynomials::polynomial [list [expr {-$i-0.1}] 1]]]
 }

 #
 # Some checks:
 #
 puts "Degree:       [::math::polynomials::degreePolyn $g]"
 puts "Coefficients: [::math::polynomials::allCoeffsPolyn $g]"

 #
 # Evaluate g at a series of x values in two ways - naively and via Horner
 #
 puts "g:"
 puts "x, \"exact\", naive, Horner -- relative errors"
 foreach x {0.1 0.2 0.3 0.5 1.0 1.2 1.3 1.4 1.6 2.0 5.0} {
    set result1 0.0
    set power   0
    foreach c [::math::polynomials::allCoeffsPolyn $g] {
       set result1 [expr {$result1+$c*pow($x,$power)}]
       incr power
    }

    set result2 [::math::polynomials::evalPolyn $g $x]
    set exact   [expr {pow($x-1.3,20)}]
    if { $exact != 0.0 } {
       set error1  [expr {($result1-$exact)/$exact}]
       set error2  [expr {($result2-$exact)/$exact}]
       puts [format "%4.2f %20.12e %20.12e %20.12e -- %10e %10e" \
                $x $exact $result1 $result2 $error1 $error2]
    } else {
       set error1  -
       set error2  -
       puts [format "%4.2f %20.12e %20.12e %20.12e -- %10s %10s" \
                $x $exact $result1 $result2 $error1 $error2]
    }
 }
 puts "h:"
 puts "x, \"exact\", naive, Horner -- relative errors"
 foreach x {0.1 0.2 0.3 0.5 1.01 1.2 1.3 1.4 1.6 2.1 5.1} {
    set result1 0.0
    set power   0
    foreach c [::math::polynomials::allCoeffsPolyn $h] {
       set result1 [expr {$result1+$c*pow($x,$power)}]
       incr power
    }

    set result2 [::math::polynomials::evalPolyn $h $x]
    set exact   1.0
    for { set i 0 } { $i < 20 } { incr i } {
       set exact   [expr {$exact*($x-$i-0.1)}]
    }
    if { $exact != 0.0 } {
       set error1  [expr {($result1-$exact)/$exact}]
       set error2  [expr {($result2-$exact)/$exact}]
       puts [format "%4.2f %20.12e %20.12e %20.12e -- %10e %10e" \
                $x $exact $result1 $result2 $error1 $error2]
    } else {
       set error1  -
       set error2  -
       puts [format "%4.2f %20.12e %20.12e %20.12e -- %10s %10s" \
                $x $exact $result1 $result2 $error1 $error2]
    }
 }

 #
 # Check Legendre 2, 4, ..., 50
 #
 puts "order (n), Pn(1), Pn(-1)"
 puts "expected: 1 and 1"
 for { set i 2 } { $i <= 50 } { incr i 2 } {
    set leg [::math::special::legendre $i]
    set p1 [::math::polynomials::evalPolyn $leg 1.0]
    set p2 [::math::polynomials::evalPolyn $leg -1.0]

    puts [format "%5d %20.12g %20.12g" $i $p1 $p2]
 }

The result is:

 Degree:       20
 Coefficients: 190.049637749 -2923.84058075 21366.5273209 -98614.7414812 322394.347149 -793586.085291 1526127.0871 -2347887.82631 2934859.78288 -3010112.59783 2547018.352 -1781131.71469 1027575.98924 -486426.503784 187087.11684 -57565.26672 13837.8045 -2504.58 321.1 -26.0 1.0
 g:
 x, "exact", naive, Horner -- relative errors
 0.10   3.833759992447e+01   3.833759992476e+01   3.833759992476e+01 -- 7.338657e-12 7.316973e-12
 0.20   6.727499949326e+00   6.727499948326e+00   6.727499947888e+00 -- -1.486001e-10 -2.137647e-10
 0.30   1.000000000000e+00   9.999999934125e-01   9.999999919215e-01 -- -6.587529e-09 -8.078473e-09
 0.50   1.152921504607e-02   1.152913820723e-02   1.152913493922e-02 -- -6.664706e-06 -6.948161e-06
 1.00   3.486784401000e-11  -1.930119685767e-05  -1.431931048046e-05 -- -5.535539e+05 -4.106748e+05
 1.20   1.000000000000e-20  -1.147058508124e-04  -7.339982499843e-05 -- -1.147059e+16 -7.339982e+15
 1.30   0.000000000000e+00  -2.580087965214e-04  -1.517559886679e-04 --          -          -
 1.40   1.000000000000e-20  -5.542161987933e-04  -2.977163204889e-04 -- -5.542162e+16 -2.977163e+16
 1.60   3.486784401000e-11  -2.267025187393e-03  -9.986696039732e-04 -- -6.501765e+07 -2.864157e+07
 2.00   7.979226629761e-04  -2.498599886894e-02  -6.362375005551e-03 -- -3.231381e+01 -8.973674e+00
 5.00   2.312248366666e+11   2.312248356551e+11   2.312248368273e+11 -- -4.374473e-09 6.947283e-10
 h:
 x, "exact", naive, Horner -- relative errors
 0.10  -0.000000000000e+00   3.739352465198e+04   4.481800000000e+04 --          -          -
 0.20  -8.460014212631e+15  -8.460014212595e+15  -8.460014212587e+15 -- -4.224343e-12 -5.158266e-12
 0.30  -1.154825498147e+16  -1.154825498143e+16  -1.154825498142e+16 -- -3.410039e-12 -3.675014e-12
 0.50  -9.999185058306e+15  -9.999185058245e+15  -9.999185058266e+15 -- -6.050493e-12 -4.009927e-12
 1.01  -7.137691193017e+14  -7.137691188577e+14  -7.137691192674e+14 -- -6.220714e-10 -4.806876e-11
 1.20   4.923817795711e+14   4.923817806832e+14   4.923817796032e+14 -- 2.258555e-09 6.511969e-11
 1.30   7.371226583915e+14   7.371226601609e+14   7.371226584225e+14 -- 2.400446e-09 4.204643e-11
 1.40   8.035389574478e+14   8.035389602065e+14   8.035389574776e+14 -- 3.433108e-09 3.705436e-11
 1.60   6.341259826947e+14   6.341259889764e+14   6.341259827225e+14 -- 9.906112e-09 4.386116e-11
 2.10  -5.923385583628e-02   3.500333996906e+07   2.383600000000e+04 -- -5.909347e+08 -4.024060e+05
 5.10  -3.774706499371e-03   8.557366072250e+09   3.909520000000e+05 -- -2.267028e+12 -1.035715e+08
 order (n), Pn(1), Pn(-1)
 expected: 1 and 1
     2                    1                    1
     4        1.00000000002        1.00000000002
     6        0.99999999994        0.99999999994
     8        0.99999999992        0.99999999992
    10        0.99999999907        0.99999999907
    12        1.00000000349        1.00000000349
    14        1.00000000299        1.00000000299
    16       0.999999939378       0.999999939378
    18         1.0000000112         1.0000000112
    20        1.00000501373        1.00000501373
    22        1.00001560093        1.00001560093
    24       0.999974806438       0.999974806438
    26        1.00033043498        1.00033043498
    28        1.00687428185        1.00687428185
    30        1.04755829826        1.04755829826
    32        1.13065619275        1.13065619275
    34        1.29242497941        1.29242497941
    36        1.53122404724        1.53122404724
    38        10.4985706725        10.4985706725
    40        50.6674061331        50.6674061331
    42       -504.307776538       -504.307776538
    44       -971.900395524       -971.900395524
    46        7574.82722195        7574.82722195
    48        111470.061522        111470.061522
    50          500314.6215          500314.6215

The overall conclusions:

  • Horner's rule is in many cases (slightly) more accurate, but can be way off the correct value (see for instance h(2.1), which should have been zero - note that even the "exact" value, computed in floating point, is definitely not zero).
  • Evaluating Legendre polynomials should not be done with Horner's rule when the order gets over, say, 10 or 20.

TV (13 jul 04) Em, urh, well, the remarks at the start of this page suggest things I don't really understand.

What's wrong with filling in a simple (for engineers)

  N
 ----
 \          i
   >   a . x
 /      i
 ----
 i=0

For any moderate N? That is usually a matter of floating-point precision, or maybe rationals or in some cases integers, and for those such a formula is normally conditioned, nothing special, unless maybe N is large, or x is large, or the sheer number of terms makes the mantissa not accurate enough. I don't see the problem.

With respect to approximation by polynomials, I don't see your point, but I do know what an important point can be: when using a polynomial or power basis to approximate functions, a direct polynomial approximation, as in a spline approach, can start oscillating and become unstable when the number of terms grows. High-degree, point-evaluation based polynomials are not a good idea, and of course, as always, it will depend on your problem: most natural problems do not by nature *have* higher orders in them, so trying to approximate with higher powers is doomed to be inaccurate by nature, or leaves you with an approximation in which you are trying to find higher-order terms that you would basically find to be zero.

As an engineer one would of course know that one of the by far most preferable approximation methods for many practical problems, of great theoretical value, and which in a sense can be seen to produce polynomial results, is the Taylor expansion.

I've mentioned this expansion being available in Maxima on the page about that package.

With respect to numerical accuracy in general: according to normal error-propagation rules (as I remember vividly from a first-year physics practicum), one can compute what the maximum error is for a certain evaluation of a given formula, and then figure out whether the number of bits in the mantissa of the floating-point operations is sufficient. Doing that analysis to some depth would seem interesting, but then one would probably start with integers, and include a decent rounding analysis.

AM One thing that is "wrong" with the direct evaluation of a polynomial is the number of multiplications: O(N**2) if every power of x is computed from scratch, whereas Horner's rule gives you O(N). Furthermore, Taylor series have the somewhat unfortunate property of converging slowly for many functions. (Instead of Taylor series one therefore often finds series of Chebyshev polynomials.)

TV Then that is, analysis-wise, still wrong, because that is a matter of efficiency, which in e.g. computer graphics is the essential reason to use Horner for certain purposes. For the same accuracy a refactoring can be interesting, and possibly the use of algebraic rules (distributivity, commutativity, associativity and substitutions) can greatly simplify a formula, or prevent inaccurate intermediate approximations. Horner's rule is also efficient in terms of reusing previous iterations, but isn't more accurate, to my knowledge.

I am not saying I don't appreciate or understand the value of approximations other than Taylor; the main reason for its importance is that it guarantees a function's approximation will even be exact when the degree of the approximated function is reached, it is intuitive enough for advanced mathematicians, and even absolute error bounds can be found.

Often polynomials are used as an approximation; then you shouldn't confuse the error of the approximation (and the ways to reduce that error between the approximated function and the approximator) with the error in evaluating the approximation formula.

The source of errors in simply evaluating or computing the value of a polynomial is of course the rounding which takes place at each step of the computation. For instance, one reason for inaccuracies when doing spline-like approximations with Lagrange polynomials based on inaccurate sampling points is that intermediate results tend to grow out of bounds, for instance because the data points are chosen such that it appears as if higher orders are present, which in some intermediate sum or product may lead to large error magnification in the approximator.

Basic (for engineers) error analysis can tell you beforehand how the above examples would run, so the problem can be properly analysed. That is maybe a bit more work, but more fundamental.

In the computer/digital domain, the digital error analysis shouldn't be confused with the inherent inaccuracies of digital computations, such as the unavoidable rounding when a floating- or fixed-point multiplication is fitted in a word smaller than double the width of the input words to the operation. And of course in the digital domain such error analysis is also completely possible.

Mathematically, you'd want to use symbolic fractions, and possibly rationals, to prevent most of the damage. The functions which you propose seem to have a fixed structure, so it should be possible to do the analysis and solution synthesis once, and reuse it for different data relatively efficiently.


KBK - The problem with evaluating the orthogonal polynomials as polynomials in limited-precision arithmetic is that they are extraordinarily ill-conditioned. The x**n coefficient of the nth Chebyshev polynomial is 2**(n-1) - and yet the Chebyshev polynomials are known always to stay between -1 and 1 on the interval [-1,1], owing to delicate cancellation. Floating-point arithmetic is prone to trouble in such cases, where numbers opposite in sign but nearly equal in magnitude are added. The numerical analysts call the phenomenon "catastrophic loss of significance."
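
A tiny stand-alone illustration of that loss of significance in Tcl (the numbers are made up for the example and have nothing to do with the polynomial code above):

 # 1.0e-15 cannot be represented exactly relative to 1.0; subtracting
 # two nearly equal numbers then exposes the rounding error that was
 # made in the addition.
 set x [expr {1.0 + 1.0e-15}]
 puts [expr {$x - 1.0}]    ;# roughly 1.11e-15 instead of 1.0e-15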

Horner's rule occasionally helps, because it tends to confine all arithmetic to operating on the same scales of magnitude. But for cases like Chebyshev series, you really need a custom evaluation procedure with rigorous error control. Using rationals in such a case doesn't help all that much, because you will usually want to evaluate

   ___
   \    a  T (x)
   /     n  n
   ---
   n>=0

for some unknown x - and the usual reason for representing a function as Chebyshev series is runtime efficiency. If you have to maintain the a's and evaluate the sum in unlimited precision, you've just lost that efficiency.

Fortunately, the orthogonal polynomial series are defined by simple recurrences, and Clenshaw's formula [L1 ] does help. There's an (oversimplified) treatment in Numerical Recipes [L2 ], and a Google for 'Clenshaw recurrence formula' turns up a lot, including programs that use Clenshaw for the discrete cosine transformation, the spherical harmonics, the Lagrange polynomials, and so on.
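
A minimal sketch of Clenshaw's recurrence for a Chebyshev series in Tcl, assuming the coefficients a0 .. aN are given in ascending order (the proc name chebEval is invented for this page and is not kbk's code mentioned below):

 # Evaluate f(x) = sum( a_n T_n(x), n = 0..N ) with Clenshaw's recurrence:
 #    b_k  = 2*x*b_{k+1} - b_{k+2} + a_k    for k = N .. 1
 #    f(x) = x*b_1 - b_2 + a_0
 proc chebEval {coeffs x} {
     set b1 0.0   ;# b_{k+1}
     set b2 0.0   ;# b_{k+2}
     for {set k [expr {[llength $coeffs]-1}]} {$k >= 1} {incr k -1} {
         set bk [expr {2.0*$x*$b1 - $b2 + [lindex $coeffs $k]}]
         set b2 $b1
         set b1 $bk
     }
     return [expr {$x*$b1 - $b2 + [lindex $coeffs 0]}]
 }

 # Example: 1.0*T0 + 0.5*T1 + 0.25*T2 at x = 0.3 gives 0.945
 puts [chebEval {1.0 0.5 0.25} 0.3]

The point is that the huge power-basis coefficients are never formed at all, which is what keeps the intermediate results on a reasonable scale for x in [-1,1].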

AM (15 july 2004) Well, kbk was kind enough to actually program Chebyshev approximation in Tcl - I post the code here for everyone to see and look at - pay attention to the very nice demo! (This deserves to be added to Tcllib, but it is holiday time ...).


TV (15 jul 04) About Taylor expansions: one of the reasons for their importance is that it is a good idea to look at the function which is to be approximated by, for instance, a polynomial, determine what degree the to-be-approximated function has, and make sure that the approximator has no higher degree than the approximatee, for very clear mathematical reasons.

Lars H: For the above to make any sense, you have to supply a definition of "the degree of the to-be-approximated function" and furthermore indicate how it fits in with Taylor expansions. I very much doubt that there is such a definition that is not merely restricted to polynomials, in which case the above reduces to the triviality that a polynomial can be approximated by itself. It thus says nothing about the general usefulness of Taylor approximations, which IMO is often greatly overestimated. (TV It's an engineering approach, probably. Taylor needs the derivatives in a point until they are zero, and then of course the result is a polynomial. But it can also be a series, which can be used for important mathematical proofs.)

And on top of that: the historic reasons for certain forms of approximation often lie in the fact that either mathematical proofs require certain outcomes (like the Chebyshevs have to do with the number of possible factorings of a polynomial), or in existence or other general proofs like 'is it possible or reasonable to take a certain order polynomial and approximate this particular function or class of functions in that way'.

When an approximation is essentially of a certain degree, no matter what refactoring is taken, somewhere in the solution or evaluation path of that approximation you will encounter *that* degree to which you approximate. That of course doesn't make refactoring redundant or unimportant, nor does it legitimate terms like ... +x^99 -x^99 or equivalent, or rounding errors with a similar effect.

A Taylor expansion of a mathematically known function around a point is of the form:

  N
 ----
 \               i
   >   a . (x-x )
 /      i      0
 ----
 i=0

where one may expect x-x0 not to be all too extreme, because the approximation is supposed to be around x0. So the ill-conditioning of that, also for higher powers of x (who has practical problems of degree 20 or higher, let's say?), lies only in the fact that there is a number of multiplications involved equal to the degree of the term.
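
As a concrete (and again made-up) example, a truncated Taylor expansion of exp(x) around a point x0, evaluated with Horner's rule on the shifted variable x - x0:

 # Truncated Taylor series of exp around x0: sum( exp(x0)/i! * (x-x0)**i )
 proc expTaylor {x x0 terms} {
     # Build the coefficients a_i = exp(x0)/i!
     set coeffs {}
     set a [expr {exp($x0)}]
     for {set i 0} {$i < $terms} {incr i} {
         lappend coeffs $a
         set a [expr {$a/($i + 1)}]
     }
     # Horner on the (small) shifted variable
     set dx  [expr {$x - $x0}]
     set sum 0.0
     for {set i [expr {$terms - 1}]} {$i >= 0} {incr i -1} {
         set sum [expr {$sum*$dx + [lindex $coeffs $i]}]
     }
     return $sum
 }

 puts [expTaylor 1.2 1.0 10]   ;# close to exp(1.2) = 3.3201169...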

And at each multiplication step, even when the exponent range isn't exceeded, the number of bits needed to store the result of multiplying the mantissas would be 2n when there are n bits in the mantissa. Throwing away the least significant n of those 2n bits is always an error, except when they are zero. That happens when multiplying with numbers which have a small integer mantissa in floating-point representation, but in average computations it is not so likely.
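
A small stand-alone demonstration of that rounding in Tcl (the numbers are chosen purely so that the effect is exactly visible):

 # x needs 31 significant bits; its exact square, 1 + 2**-29 + 2**-60,
 # needs 61, so the 2**-60 part is rounded away in the 53-bit mantissa
 # of an IEEE double.
 set x [expr {1.0 + pow(2,-30)}]
 puts [expr {$x*$x - (1.0 + pow(2,-29))}]   ;# prints 0.0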

But with any approximation scheme, like Horner's, which in the end delivers the chosen degree of approximation, a similar 'effective' number of multiplications takes place, with the same consequences.

Lars H: Horner's scheme is not an approximation scheme, but merely a more efficient way of organising the calculations involved in computing the value of a polynomial. (TV Indeed that is true, it is a refactoring, with an eye for reuse of results, and therefore different rounding errors.)

In terms of digital signal path analysis, the rounding errors in the course of insufficiently accurate computations can be seen as noise, generated at an equivalent level of one half of the least significant bit at each point of the computation where rounding takes place, such as computations where the exact result has twice as many bits as the mantissa of a stored floating-point number, or additions, where usually one bit of the mantissa addition's result will be lost.

As an approximation, one can take the noise to be uniformly distributed and uncorrelated, and, because of the law of large numbers, over a large number of computations we can take the noise buildup to behave according to a Gaussian distribution. This is the reason the average outcome of computations involving a lot of theoretical noise buildup can still be fairly ok.
It should be noted, just like the assumption that the product of two floating-point numbers is another floating-point number, that those statistical assumptions are all essentially incorrect, but on average, depending on the type of computations, they can be a nice starting point. The error introduced by throwing away bits to bring the resolution back to a fixed number of mantissa bits is certainly not generally uniformly distributed, it is highly correlated with the numbers used, and therefore will in the limit also not ideally converge to Gaussians with perfect statistical properties. But the hope that over a long sequence of computations the rounding errors will not be very far from the expectation value is not unreasonable, depending on the conditioning of the problem.
It should be noted, just like the assumption that the product of two floating poitn numbers is another floating point number, that those statistical assumptions are all essentially incorrect, except on average, depending on the type of computations, they can be a nice starting point. The error introduced by throwing away bits to bring back resolution to a fixed number of mantissa bits is certainly no generally uniformly distributed, very correlated with the numbers used, and therefore will in infinite limit also not ideally converge to gaussians with perfect statistical properties. But the reasonable hope that over a long number of computations the rounding errors will not be very far away from the expectation value is not unreasonable, depending on the conditioning of the problem.