Version 12 of Polynomial fitting

Arjen Markus (5 april 2004) Knowing orthogonal polynomials like the Legendre family and others, and having played a bit with discrete Fourier transforms (see Discrete Fourier Transform) I thought I would try an analoguous technique polynomials.

There is one practical difference with discrete Fourier transforms: it is difficult (but not impossible) to write down closed formulae for the polynomials to be used. I started with:

The nodes or auxiliary points are x = k/N (where k = -m, -m+1, ..., m-1, m, N=2m+1, and similar - but different for even N)
The poynomials must be orthogonal in the sense that the sum

       m
      Sum   Pi(x)*Pj(x) = 0,   if i != j
      k=-m

The first two are simple:

    P0(x) = 1
    P1(x) = x

The third becomes troublesome already because you have to distinguish between even and odd N.

So, I decided to give up on this analytic approach and instead use a practical (numerical) one:

I try to fit N data points
The nodes for the polynomials are x = k, k=1,N
The polynomials that I want to fit with are determined "on the fly", using a simple Gram-Schmidt orthogonalisation process and subsequent normalisation.
Due to the orthogonality the fitting of the data becomes almost trivial, just as with the discrete Fourier transform.

(Improvements:

Reduce the overhead of double computations - certain intermediate results are computed more than once
The toString procedure does not take care of minus signs - I overlooked that possibility and was too lazy to repair that. Probably the best solution is to modify the string afterwards.)

 # discrpol.tcl --
 #    A discrete polynomial transform:
 #    Just like with a discrete Fourier transform a series of data
 #    is fitted with orthogonal polynomials, the orthogonality being
 #    defined via sums over discrete (equidistant) points.
 #    This leads to sums of the type:
 #         N
 #        Sum  k**p
 #        k=1
 #    which can be expressed in a closed form for any integer p,
 #    but the formulae get complicated and there is little system
 #    in them (well, there is a very complicated system in them).
 #
 #    So, we compute the orthogonal polynomials "on the fly"
 #
 #    Note:
 #    The polynomials are represented as a tagged list of
 #    coefficients. The independent coordinates are taken to be
 #    from 1 to N (the number of data)

 namespace eval ::dpt {
    # Make it exist
 }

 # EvalPolynomial --
 #    Evalute a polynomial at a given ordinate
 #
 # Arguments:
 #    coeffs      The coefficients of the polynomial
 #    x           The ordinate
 # Result:
 #    Value of the polynomial at x
 #
 proc ::dpt::EvalPolynomial { coeffs x } {
    set result 0.0
    set px     1.0
    foreach c $coeffs {
       set result [expr {$result+$c*$px}]
       set px     [expr {$px*$x}]
    }
    return $result
 }

 # auxiliaryPoints --
 #    Return a list of auxiliary points (for evaluting the polynomials)
 #
 # Arguments:
 #    nodata      Number of data
 # Result:
 #    List of ordinates to evaluate the polynomials at
 # Note:
 #    This procedure defines in fact the details of the polynomial
 #    fitting.
 #
 proc ::dpt::auxiliaryPoints { nodata } {
    set xlist  {}
    for { set x 1 } { $x <= $nodata } { incr x } {
       lappend xlist $x
    }
    return $xlist
 }

 # valuesPolynomial --
 #    Evalute a polynomial for a list of ordinates
 #
 # Arguments:
 #    polyn       The polynomial
 #    xlist       List of ordinates
 # Result:
 #    Values of the polynomial at each ordinate as a list
 #
 proc ::dpt::valuesPolynomial { polyn xlist } {
    set result {}
    set coeffs [lrange $polyn 2 end]
    foreach x $xlist {
       lappend result [EvalPolynomial $coeffs $x]
    }
    return $result
 }

 # discreteNorm --
 #    Determine the (discrete) norm of a given polynomial
 #
 # Arguments:
 #    poly      The coefficients of the polynomial
 # Result:
 #    Norm of the polynomial (sqrt(sum(Poly(x)**2))
 #
 proc ::dpt::discreteNorm { poly } {
    set nodata [lindex $poly 1]
    set xlist  [auxiliaryPoints $nodata]

    set values [valuesPolynomial $poly $xlist]
    set result [inproduct $values $values]
    expr {sqrt($result)}
 }

 # inproduct --
 #    Determine the inproduct of two data vectors
 #
 # Arguments:
 #    data1     The values in the first vector
 #    data2     The values in the second vector
 # Result:
 #    Inproduct
 #
 proc ::dpt::inproduct { data1 data2 } {
    set result 0.0

    foreach d1 $data1 d2 $data2 {
       set result [expr {$result+$d1*$d2}]
    }
    return $result
 }

 # scalePolynomial --
 #    Scale a polynomial
 #
 # Arguments:
 #    poly      The polynomial to be scaled
 #    factor    The factor to be applied
 # Result:
 #    New polynomial
 #
 proc ::dpt::scalePolynomial { poly factor } {

    set newpoly [lrange $poly 0 1]
    foreach c [lrange $poly 2 end] {
       lappend newpoly [expr {$c*$factor}]
    }
    return $newpoly
 }

 # addPolynomials --
 #    Add two polynomials
 #
 # Arguments:
 #    poly1     The first polynomial
 #    poly2     The second polynomial
 # Result:
 #    New polynomial
 # Note:
 #    The two polynomials need not be of the same degree!
 #
 proc ::dpt::addPolynomials { poly1 poly2 } {

    set newpoly [lrange $poly1 0 1]
    foreach c1 [lrange $poly1 2 end] c2 [lrange $poly2 2 end] {
       if { $c1 == {} } { set c1 0.0 }
       if { $c2 == {} } { set c2 0.0 }
       lappend newpoly [expr {$c1+$c2}]
    }
    return $newpoly
 }

 # axpyPolynomials --
 #    Add two polynomials and scale the first at the same time
 #
 # Arguments:
 #    factor    Factor for scaling the first
 #    poly1     The first polynomial
 #    poly2     The second polynomial
 # Result:
 #    New polynomial
 # Note:
 #    The two polynomials need not be of the same degree!
 #
 proc ::dpt::axpyPolynomials { factor poly1 poly2 } {

    set newpoly [lrange $poly1 0 1]
    foreach c1 [lrange $poly1 2 end] c2 [lrange $poly2 2 end] {
       if { $c1 == {} } { set c1 0.0 }
       if { $c2 == {} } { set c2 0.0 }
       lappend newpoly [expr {$factor*$c1+$c2}]
    }
    return $newpoly
 }

 # toString --
 #    Turn a polynomial into a readable string
 #
 # Arguments:
 #    poly     The polynomial to be converted
 # Result:
 #    String of the form 1.0 + 1.2*x + 3.0*x^2 ...
 #
 # Note:
 #    Not quite right yet - I forgot about minus signs ...
 #
 proc ::dpt::toString { poly } {

    set string ""
    set power  -1
    foreach c [lrange $poly 2 end] {
       incr power
       if { $c == 0 } { continue }

       #
       # Trick: leave out 1.00000
       #
       set cs "$c"
       if { $cs != 1.0 || $power == 0 } {
          append string "$cs"
          if { $power != 0 } {
             append string "*"
          }
       }
       if { $power > 0 } {
          append string "x"
       }
       if { $power > 1 } {
          append string "^$power"
       }
       append string " + "
    }

    return [string range $string 0 end-3]
 }

 # orthogonalPolynomials --
 #    Build a list of N discretely orthogonal polynomials
 #
 # Arguments:
 #    nodata      Number of data
 #    maxdegree   Maximum degree
 # Result:
 #    List of polynomials that are "discretely" orthogonal
 # Note:
 #    This procedure uses an ordinary Gram-Schmidt algorithm
 #    to achieve the orthogonal (orthonormal) polynomials.
 #    It may not be ideal, but it is fairly straightforward.
 #    Improvements in terms of computational speed:
 #    Some things are calculated more than once, for instance
 #    the values of the polynomial at the auxiliary point.
 #
 proc ::dpt::orthogonalPolynomials { nodata maxdegree } {

    set result   {}
    set v_points {}

    #
    # The first polynomial is simple: P(x) = c, where c = 1/sqrt(nodata)
    #
    set polyn [list DISCRETE-POLYNOMIAL $nodata 1.0]
    set norm  [expr {1.0/[discreteNorm $polyn]}]

    lappend result [scalePolynomial $polyn $norm] ;# To get a norm 1

    #
    # The auxiliary points for which to evalue the polynomials
    #
    set xlist [auxiliaryPoints $nodata]

    lappend v_points [valuesPolynomial [lindex $result 0] $xlist]

    #
    # From there we build up our polynomials
    #
    for { set degree 1 } { $degree <= $maxdegree } { incr degree } {
       set polyn    [linsert $polyn 1 0.0]
       set newpolyn $polyn

       foreach orthop $result orthov $v_points {
          set coeff   [inproduct $orthov [valuesPolynomial $newpolyn $xlist]]
          set newpolyn [axpyPolynomials [expr {-$coeff}] $orthop $newpolyn]
       }

       #
       # Normalise
       #
       set newpolyn [scalePolynomial $newpolyn [expr {1.0/[discreteNorm $newpolyn]}]]
       lappend result   $newpolyn
       lappend v_points [valuesPolynomial $newpolyn $xlist]
    }
    return $result
 }

 # fittingPolynomial --
 #    Compute the best fitting (approximating) polynomial of given degree
 #
 # Arguments:
 #    data        The data to be fitted
 #    maxdegree   The maximum degree for the polynomial
 # Result:
 #    New polynomial that fits the data
 # Note:
 #    The procedure is fairly straightforward:
 #    - Compute the orthonormal polynomials first
 #    - Determine the coefficients for each
 #    - Sum the result
 #
 proc ::dpt::fittingPolynomial { data maxdegree } {
    set nodata    [llength $data]
    set orthopols [orthogonalPolynomials $nodata $maxdegree]

    #
    # The auxiliary points for which to evalue the polynomials
    #
    set xlist [auxiliaryPoints $nodata]

    set result [list DISCRETE-POLYNOMIAL 0.0]
    foreach p $orthopols {
       set coeff  [inproduct [valuesPolynomial $p $xlist] $data]
       set result [axpyPolynomials $coeff $p $result]
    }
    return $result
 }

 # testOrthonormality --
 #    Compute how well the list of polynomials fits orthonormality
 #
 # Arguments:
 #    nodata      The number of data in the fits
 #    maxdegree   The maximum degree for the polynomial
 # Result:
 #    None
 # Side effects:
 #    Printed report
 #
 proc ::dpt::testOrthonormality { nodata maxdegree } {
    set orthopols [orthogonalPolynomials $nodata $maxdegree]

    puts "Number of data points: $nodata"
    puts "Discrete norm of the polynomials (should be 1)"
    set degree 0
    foreach p $orthopols {
       puts "degree $degree: norm = [discreteNorm $p]"
       incr degree
    }

    puts "Non-orthogonality between polynomials (inproduct should be zero)"
    set degree 0
    set xlist  [auxiliaryPoints $nodata]
    foreach p $orthopols {
       set degree2 [expr {$degree+1}]
       foreach q [lrange $orthopols $degree2 end] {
          set inp [inproduct [valuesPolynomial $p $xlist] [valuesPolynomial $q $xlist]]
          puts "degrees $degree and $degree2: inproduct = $inp"
          incr degree2
       }
       incr degree
    }
 }

 # a few tests ---
 #

 set polyn [list DISCRETE-POLYNOMIAL 10 1.0 0.0 2.0 1.0]
 puts [::dpt::toString $polyn]

 set polyn1 [list DISCRETE-POLYNOMIAL 10 1.0 2.0]
 set polyn2 [list DISCRETE-POLYNOMIAL 10 0.0 0.0 2.0 1.0]
 puts [::dpt::toString [::dpt::addPolynomials $polyn1 $polyn2]]

 set orthopols [::dpt::orthogonalPolynomials 10 3]
 foreach p $orthopols {
    puts [::dpt::toString $p]
 }

 puts "Fit trivial data:"
 puts "-1+x expected"
 set data   [list 0 1 2 3 4 5 6 7 8 9 10]
 set result [::dpt::fittingPolynomial $data 3]
 puts [::dpt::toString $result]

 puts "1-2*x+x^2 expected"
 set data   [list 0 1 4 9 16 25 36 49 64 81 100]
 set result [::dpt::fittingPolynomial $data 3]
 puts [::dpt::toString $result]

 puts "Test for quality of orthogonal polynomials"
 ::dpt::testOrthonormality 10 4
 ::dpt::testOrthonormality 100 10

The results:

 1.0 + 2.0*x^2 + x^3
 1.0 + 2.0*x + 2.0*x^2 + x^3
 0.316227766017
 -0.605530070819 + 0.110096376513*x
 0.957427107756 + -0.478713553878*x + 0.0435194139889*x^2
 -1.54380482359 + 1.36927211043*x + -0.296885542998*x^2 + 0.017993063212*x^3
 Fit trivial data:
 -1+x expected
 -1.0 + 1.0*x + 3.90981074709e-014*x^2 + -2.11315780976e-015*x^3
 1-2*x+x^2 expected
 1.0 + -2.0*x + 1.0*x^2 + -3.79690387208e-015*x^3
 Test for quality of orthogonal polynomials
 Number of data points: 10
 Discrete norm of the polynomials (should be 1)
 degree 0: norm = 1.0
 degree 1: norm = 1.0
 degree 2: norm = 1.0
 degree 3: norm = 1.0
 degree 4: norm = 1.0
 Non-orthogonality between polynomials (inproduct should be zero)
 degrees 0 and 1: inproduct = -5.55111512313e-017
 degrees 0 and 2: inproduct = -1.36002320517e-015
 degrees 0 and 3: inproduct = 1.20181642416e-014
 degrees 0 and 4: inproduct = -2.02504679692e-013
 degrees 1 and 2: inproduct = -2.10942374679e-015
 degrees 1 and 3: inproduct = 1.93178806285e-014
 degrees 1 and 4: inproduct = -2.31370478332e-013
 degrees 2 and 3: inproduct = 9.90874049478e-015
 degrees 2 and 4: inproduct = -8.87345752432e-014
 degrees 3 and 4: inproduct = -2.43693953905e-014
 Number of data points: 100
 Discrete norm of the polynomials (should be 1)
 degree 0: norm = 1.0
 degree 1: norm = 1.0
 degree 2: norm = 1.0
 degree 3: norm = 1.0
 degree 4: norm = 1.0
 degree 5: norm = 1.0
 degree 6: norm = 1.0
 degree 7: norm = 1.0
 degree 8: norm = 1.0
 degree 9: norm = 1.00000000001
 degree 10: norm = 1.00000000001
 Non-orthogonality between polynomials (inproduct should be zero)
 degrees 0 and 1: inproduct = 1.2490009027e-016
 degrees 0 and 2: inproduct = 1.21430643318e-016
 degrees 0 and 3: inproduct = -6.71337985203e-015
 degrees 0 and 4: inproduct = -1.0762224445e-014
 degrees 0 and 5: inproduct = 7.17013254325e-013
 degrees 0 and 6: inproduct = -9.62304888552e-012
 degrees 0 and 7: inproduct = 9.26707738325e-011
 degrees 0 and 8: inproduct = -7.20321007702e-010
 degrees 0 and 9: inproduct = 4.39475776257e-009
 degrees 0 and 10: inproduct = -1.81058440948e-008
 degrees 1 and 2: inproduct = -2.77555756156e-017
 degrees 1 and 3: inproduct = -4.71150896075e-015
 degrees 1 and 4: inproduct = -1.44675937896e-014
 degrees 1 and 5: inproduct = 4.85986251242e-013
 degrees 1 and 6: inproduct = -6.81009415526e-012
 degrees 1 and 7: inproduct = 7.36949667957e-011
 degrees 1 and 8: inproduct = -6.32568483705e-010
 degrees 1 and 9: inproduct = 4.16724826868e-009
 degrees 1 and 10: inproduct = -1.84985766982e-008
 degrees 2 and 3: inproduct = 9.36750677027e-016
 ...
 degrees 8 and 9: inproduct = -3.20738435811e-011
 degrees 8 and 10: inproduct = 8.70094343797e-010
 degrees 9 and 10: inproduct = -3.22448748258e-010

TV Isn't it so that except for special cases, N points span exactly one Nth degree polynomial Aix^i [No. You need N+1 points, since there are N+1 coefficients in an Nth degree polynomial. - Lars H] by solving for the Ai's, filling the points in, and that that solution arrives at the lagrange interpolation (I've looked it up, that's what it's called) polynomial solution? [Lagrange interpolation is one way to arrive at the unique polynomial passing through N+1 points. The formula is straightforward, but the coefficients with respect to the standard basis x^i of the interpolation polynomial are not readily apparent from this formula. An alternative is Newton interpolation, where the points are added one at a time. LH again.] In fact the question mark isn't needed: for any N+1 data points there is exactly 1 polynomial of degree <=N which passes through those points, so when the polynomial is to pass through N+1 general (yi,xi) with x,y el. |R , 0<i<N , there is one and only one solution according to the legendre polynomial, which is a fraction of N linear factors by N normalisation factors.

Polynomials as basis either to span a solution or interpolation space is of course interesting. Maybe look at what's sort of known in the computer graphics community, I found a nice overview (starts at page 3) here: [L1 ]

An orthogonal/orthonomal power base in Hilbert or Fock space (in mathematica/physics) usually involves an integral measure to determine correlation/orthogonality, taking the whole (continuous) polynomial into account over some interval, not just the control or data points.

N-th degree polynomials usually start to get mighty unstable and annoying for N like 20 or 100, and many problems are of lower degree. So usually one wants a fitting polynomial, of which of course well known ones are splines, partial least squares approximations, beziers (see above link).

For CAGD applications, there are power basises which maintain what is called the 'partition of unity', which means the basis functions add up to one everywhere on the curve, and are used to weigh the vectors spanning the curve or surface (I think it is also in the above link).

I'm sure when polynomials and interpolation are being discussed, each engineer (and of course everybody with interest) should know about Taylor Expansion as phenomena/method.

Lars H: Something probably should be said about the theory of Generalized Fourier series here, because both this page and Discrete Fourier Transform touches upon the subject.

The basic idea is to define an inner product on some set of functions, usually the set of all real-valued (but complex-valued works fine too) funtions on some interval [a,b] that satisfy some "smoothness" condition (such as continuity, Riemann integrability, or having everywhere continuous first derivative). If f and g are two such functions, then their inner product <f,g> is given by an integral on the form

   \int_a^b f(x) g(x) W(x) dx

where W is a "weight function". For the inner products that give rise to trigonometric Fourier series or Legrende-Fourier series this function is constant, but in many standard families of orthogonal polynomials the orthogonality is with respect to a non-constant weight. If the functions are complex-valued then one needs to complex conjugate one of the functions, so that the inner product formula becomes

   \int_a^b f(x) \overline{g(x)} W(x) dx

Finally, it is possible to integrate with respect to a general measure instead of using the Riemann integral as shown above. This makes it possible to treat inner products defined using sums as special cases of inner products defined using integrals, thus unifying the theoretical machinery. The discreteNorm procedure above is the norm corresponding to the inner product <f,g> defined by the integral

   \int_1^N f(x) g(x) d\mu(x)

where \mu is the discrete measure with weight 1 at points 1, 2, ..., N and weight 0 everywhere else.

Once the inner product is defined, it is all Linear Algebra. [To be continued we I have more time.]

jmp If the issue is data fitting with polynomials (*not* data interpolation), what about B-splines? People seldom know that splines can also be used to fit data, not only to draw smooth curves for graphical applications. Moreover there is an explicit solution for the fitting problem which relies on linear problem solving (provided there is enough data to invert matrices). I can not detail what this consists in because I prefer using the Matlab spline toolbox! What are the spline advantages? You can choose the polynomial order and the knot position that best correspond to your data. My experience is that this is a very interesting replacement for global polynomial fitting and linear (lowpass) filtering: knots positions usually match the data "waves" you want to keep so that algorithm tuning is easier. Just tune the spline (= polynomial) order to make fitted data more or less smooth. Unfortunately I don't have a web link dealing with this issue, but I know there is a reference book on spline: A practical guide to splines (de Boor). This is full of formulas...

AM (20 april 2004) Sure, splines do have significant advantages when it comes to interpolation and fitting. The above merely decribes one possible way. Besides B-splines, cubic splines seem useful too and a lot easier as far as the maths is concerned.

However, they do lack the nice physical interpretation of the various components in a Fourier series (sines/cosines, orthogonal polynomials ....) - that of "energy"

TV Just to repeat: for a number of points, there is exactly one fitting lagramnge polynomial, for which there exists a closed form formula (a fraction with terms for the zeros in the enumerator and normalizing factors in the denominator), so no real need to do a lot of complicated approximation stuff. Probably with that one could to least squares for curve fitting where the degree of the curve is lower than lagrange polynomial corresponding directly to the data points.

AM Hm, if monsieur Lagrange's interpolation method had been perfect, nobody would have bothered inventing another one :). The main use of that method, AFAIK, is theoretical - the practical use is severely limited by the fact that there is no useful bound on the error you make. There is no pointwise convergence. So, other methods - many other methods - have been invented, all with their own limitations and awkwardnesses.

[ Category Mathematics

Category Numerical Analysis

]