The Problem with Polynomials -- Sage

To view this worksheet actively, first download it to your computer using the link above. Then, create an account for yourself at https://cloud.sagemath.com/ . Upload this worksheet into that site and enjoy!

"The Problem with Polynomials"

John Travis

Mississippi College

January, 2015

Presented at Tougaloo College Forum...January 23, 2015

Ever done one of those Facebook IQ tests?

from http://www.themarysue.com/iq-tests-how-do-they-work/

A typical question in these exams is something on the order of:

Determine the next term in the sequence given by

1, 4, 7, 11, 15, ...

So, what's the answer?

Choice 1:

1 + 3 = 4

4 + 3 = 7

7 + 4 = 11

11 + 4 = 15

15 + 5 = 20

20 + 5 = 25

26 + 6 = 32

...

Choice 2:

1 = one, the first integer with three letters

4 = four, is the first integer after 1 with four letters

7 = seven, is the first integer after 4 with five letters

11 = eleven, is the first integer after 7 with six letters

15 = fifteen, is the first integer after 11 with seven letters

18 = eighteen, is the first integer after 15 with eight letters

21 = twentyone, is the first integer after 21 with nine letters

24 = twentyfour, ...

So, which is the correct answer?

It can be anything you want...

%hide %auto # Complete the sequence with whatever you want for the next terms... # R = PolynomialRing(QQ, "x") @interact def _(pts=input_box([(1,1),(2,4),(3,7),(4,11),(5,15)],label='Enter list of points',width=50)): n = len(pts) P(x) = R.lagrange_polynomial(pts) p = expand(P) G = points(pts,size=20,color='green') for k in range(n): v = k+1 G += text("$%s$"%str(pts[k][1]),(v,P(x=v)-1)) G += plot(P,x,1,n,color='lightblue') G += text("$y = %s$"%str(latex(p(x))),(n+1,P(x=n)+1)) G.show(title='Polynomial passing through a given sequence')

Enter list of points

Click to the left again to hide and once more to show the dynamic interactive window

How does one determine what comes next? Is there a "formula" to determine things.

Ever since a student is first introduced to the idea of functions, most examples have centered around "nice" functions and especially polynomials. For example, consider the homework set from Stewart's Calculus 7th edition:

html("<IMG SRC='Stewart.png'>")

Polynomials are often used since they are relatively easy to evaluate and use.

Are they helpful though as mathematical models for making predictions and estimations?

From http://www.epa.gov/climatechange/science/future.html , observed and projected changes in global average temperature

Global Warming

The process of determining exact values for past dates is actually not exact but it is even more reasonable to question the projections made for dates in the future.

What about world population?

Consider the following data representing world population (in millions) collected from http://www.ecology.com/population-estimates-year-2050/ . Notice the projections indicated for 2020, 2030, 2040, and 2050.

%hide %auto # http://www.ecology.com/population-estimates-year-2050/ pops = [(1910, 1750), (1920, 1860), (1930, 2070), (1940, 2300), (1950, 2557), (1960, 3042), (1970, 3712), (1980, 4453), (1990, 5291), (2000, 6094), (2010, 6868), (2020, 7656), (2030, 8321), (2040, 8874), (2050, 9306)] X = [] Y = [] for pop in pops: X.append(pop[0]) Y.append(pop[1]) Gpts = points(pops[0:11],color="green",size=30)+points(pops[11:15],color="orange",size=50) Gpts.show(title='World Population Estimates 1910-2050')

For now, let's presume that the population values for 1910-2010 are reasonably close. If so, then we can look back and estimate the population for years between the data points using a process known as interpolation.

Interpolation presumes that the user can utilize some (continuous) function which meets all of the given data values precisely but which can be evaluated at any time to give an estimate between the data values. That is, $f(x)$ interpolates the data points

$(x_0,y_0), (x_1,y_1), (x_2,y_2), ... (x_n,y_n) $

provided

$\forall k \in \left \{ 0, 1, ... , n \right \}, f(x_k) = y_k $.

On the other hand, once a successful interpolating function $f(x)$ has been generated, one often utilizes the formula which was designed to estimate values between data points to estimate values beyond the given data points. This process is known as extrapolation. This is similar to using a formula for the IQ sequence to predict the next terms in the sequence.

%hide %auto # Interpolating at one point var('x,y') @interact def _(x0 = slider(-1,3,1,1,label="given x ="),y0 = slider(-1,5,1,3,label="given y ="), g1 = checkbox(default=true,label="$f_1$"), g2 = checkbox(default=false,label="$f_2$"), g3 = checkbox(default=false,label="$f_3$"), g4 = checkbox(default=false,label="$f_4$"), ): pt = (x0,y0) x1 = x0 - 2 x2 = x0 + 2 f1 = y0+0*x # create function which also interpolates to one at x1 and zero at x2 f2 = (x-x1)*(x-x2)*y0/((x0-x1)*(x0-x2))+(x-x0)*(x-x2)*1/((x1-x0)*(x1-x2)) # create functions which are non-polynomial and interpolate f3 = y0*3/(x-x0+3) f4 = y0*(6/pi)*asin(x-x0+1/2) G = point(pt,color='red',size=60) if g1: G += text("$y = %s$"%str(latex(f1)),(x1,f1(x=x1)-0.2)) G += plot(f1,x,x1,x2,color="lightblue",zorder=3) if g2: G += text("$y = %s$"%str(latex(f2.expand())),(x1,f2(x=x1)-0.2)) G += plot(f2,x,x1,x2,color="lightgreen") if g3: G += text("$y = %s$"%str(latex(f3)),(x1,f3(x=x1)-0.2)) G += plot(f3,x,x1,x2,color="orange") if g4: G += text("$y = %s$"%str(latex(f4)),(x0-1,f4(x=x0-1)-0.2)) G += plot(f4,x,x0-3/2,x0+1/2,color="purple") T = "Interpolating at the data point ("+str(x0)+","+str(y0)+")" G.show(title=T)

given x =

given y =

$f_1$

$f_2$

$f_3$

$f_4$

Click to the left again to hide and once more to show the dynamic interactive window

Notice, each of these do interpolate the single given data point $(1,3)$:

$f_1(1) = 3$

$f_2(1) = - \frac{5}{8}(1)^2 + (1) + \frac{21}{8} = 3$

$f_3(1) = \frac{9}{(1)+2} = 3$

$f_4(1) = \frac{18 sin^{-1}((1)-\frac{1}{2})}{\pi} = \frac{18*\pi/6}{\pi} = 3$

What about more than one data point? For our population data above we are given 11 population data points from 1910-2010 so finding a formula in general which passes through them all in a reasonable fashion can be somewhat messy if we proceed as above. To make this easier, we will look for the foumula to be as simple as possible. Our experience in the past has been that polynomials are the easiest functions to deal with so let's look for a polynomial of smallest degree possible. Such a polynomial $p(x)$ is called the Lagrange Polynomial.

%hide %auto # Given a set of points with unique x values, this function generates # the Lagrange polynomial and graphs it. # # Usage: lagraph(pts,zoom_in) # pts = list containing data points [(x0,y0), ... , (xn,yn)] # zoom_in = Boolean flag indicating whether to zoom graph at (xn,yn) def lagraph(pts,zoom_in): # if nargs == 1: # zoom_in = false global G xs = [] ys = [] for pt in pts: xs.append(pt[0]) ys.append(pt[1]) R = PolynomialRing(QQ, "x") P(x) = (R.lagrange_polynomial(pts)) p = expand(P) if zoom_in: n = len(xs) G = points(pts[n-1],size=20,color='green') delx = (max(xs)-min(xs))/n G += plot(P,x,max(xs)-delx,max(xs),color='lightblue') G += text("$y = %s$"%str(latex(p(x))),(max(xs),P(x=max(xs))-0.2)) else: G = points(pts,size=20,color='green') G += plot(P,x,min(xs),max(xs),color='lightblue') G += text("$y = %s$"%str(latex(p(x))),(max(xs),P(x=max(xs))-0.2)) G.show() return G,p

For the interactive cell below, type in a list of data points. Since we want a function to interpolate these points then all of the x-values must be distinct.

%hide %auto # "Simplest" function is Lagrange @interact def _(pts = input_box([(1,3),(2,1),(4,0)],label="Enter data points to interpolate: ")): lagraph(pts,false) poly = 0 for k in range(len(pts)): p = 1 for j in range(len(pts)): p=p*pts[k][1] if k<>j: p=p*(x-pts[j][0])/(pts[k][0]-pts[j][0]) poly = poly+p html('Lagrange Polynomial (Basis Form):') show(poly)

Enter data points to interpolate:

Click to the left again to hide and once more to show the dynamic interactive window

General basis formula for Lagrange:

p(x) = $\sum_{k=0}^{n} \prod_{k \ne j} {\frac{x-x_j}{x_k-x_j}}$

For example, to interpolate the data points originally given above (1, 3), (2, 1), (4, 0), simply write:

$p(x) = 3 \frac{(x-2)(x-4)}{(1-2)(1-4)}+1\frac{(x-1)(x-4)}{(2-1)(2-4)}+0\frac{(x-1)(x-2)}{(4-1)(4-2)}$

Student look up:

Vandermonde Matrix Interpolation
Newton's Divided Differences

%auto %hide # Creating a Lagrange Interpolating Polynomial using the "Basis approach" html('Lagrange Interpolating Polynomial for Population Data:') @interact def _(num_pts = slider(2,15,1,2,label="Number of years")): poly = 0 for k in range(num_pts): p = 1 for j in range(num_pts): p=p*pops[k][1] if k<>j: p=p*(x-pops[j][0])/(pops[k][0]-pops[j][0]) poly = poly+p show(poly)

Lagrange Interpolating Polynomial for Population Data:

Number of years

Click to the left again to hide and once more to show the dynamic interactive window

Notice, as the number of data points increases so does the degree of the resulting polynomial and therefore so does the number of changes of inflection. In general, the Lagrange polynomial interpolating n+1 data points with distinct x-values will be of degree n (or less.)

So, let's use the Lagrange Interpolating Polynomial to address the world population data. We get different formulas dependent upon how many years we want to take into account.

%hide %auto @interact def _(a = slider(1910,2050,10,1920,label="Start Decade"), b = slider(1910,2050,10,1940,label="End Decade"), zoom_in = checkbox(default=false,label="Zoom into last point")): a1 = (a-1910)/10 b1 = (b-1900)/10 if a==b: print "For this demonstration, please cover at least two time periods." else: pops_past = pops[a1:b1] [G,p] = lagraph(pops_past,zoom_in) print "The Lagrange Interpolating Polynomial: ", p

Start Decade

End Decade

Zoom into last point

Click to the left again to hide and once more to show the dynamic interactive window

Notice that as we interpolate more points not only does the degree of the resulting polynomial increase but also the coefficients become increasingly large and alternating in sign. This yields a corresponding decrease in computational accuracy due to round-off issues and ultimately yields a formula which, while technically precise, is practically useless.

Student look up: What are the benefits of "Horner's Method"?

So, can we expect these formulas to be useful at all for extrapolation as well? Using the cell above we get the following interpolants:

1920-1940:

$ p_1(x) = \frac{1}{10}x^2 - 364x +332100 $

1950-2010:

$p_2(x) = \frac{33}{40000000}x^6 - \frac{117613}{12000000}x^5 + \frac{11643529}{240000}x^4 - \frac{3073782307}{24000}x^3 + \frac{570538028629}{3000}x^2 - \frac{7530504229893}{50}x + 49696300029814$

1970-2010:

$p_3(x) = \frac{23}{40000}x^4 - \frac{367}{80}x^3 + \frac{5489919}{400}x^2 - \frac{91245562}{5}x + 9099040894$

How well do these work for predicting the future (forecasting)? Or recreating the past (hindcasting)?

%hide %auto # extrapolation p1 = 1/10*x^2 - 364*x +332100 p2 = 33/40000000*x^6 - 117613/12000000*x^5 + 11643529/240000*x^4 - 3073782307/24000*x^3 + 570538028629/3000*x^2 - 7530504229893/50*x + 49696300029814 p3 = 23/40000*x^4 - 367/80*x^3 + 5489919/400*x^2 - 91245562/5*x + 9099040894 @interact def _(g1 = checkbox(default=true,label="1920-1940 data"), g2 = checkbox(default=false,label="1950-2010 data"), g3 = checkbox(default=false,label="1970-2010 data")): G = Gpts if g1: G += plot(p1,x,1910,2050,color='green') if g2: G += plot(p2,x,1910,2050,color='orange') if g3: G += plot(p3,x,1910,2050,color='brown') G.show()

1920-1940 data
1950-2010 data
1970-2010 data

Click to the left again to hide and once more to show the dynamic interactive window

Notice how bad these Lagrange polynomials are for extrapolation.

The problem is even worse actually in that polynomials are VERY sensitive to the values of their higher order coefficients. Indeed, the basis approach takes "factored" terms such as described below:

Consider the following polynomial which has easy to determine roots at $1, 2, ..., 20$:

$p(x) = (x-1)(x-2)(x-3) ... (x-20)$

Below, we look at variants of this and note how just determining the roots works out when we change of value of the (n-1)st order term's coefficient by 0.000001.

%hide %auto # The sensitivity of requiring strict interpolation @interact def _(n = slider(1,20,1,3,label="Number of zeros to interpolate"), eps = input_box(default=1/10000000,label="Epsilon",width=12), do_p2 = checkbox(default=false,label="Slightly change one coefficient"), zoom_in = checkbox(default=false,label="Zoom onto the x-axis only")): global p1 p = 1 pts = [] for k in range(n): p = p*(x-k) pts.append((k,0)) p1 = p.expand() rootsP = p.roots() rootsPn = [] for root in rootsP: rootsPn.append(root[0].n()) G = points(pts,color='red',size=50,zorder=3) G += plot(p,x,0,n-1,color='blue',thickness = 5) show(p1) if do_p2: p2 = p1 + eps*x^(n-1) G += plot(p2,x,0,n-1,color='orange',thickness=3) show(p2) rootsP2 = p2.roots() rootsP2n = [] for root in rootsP2: rootsP2n.append(root[0].n()) print "Roots of original polynomial:" show(rootsPn) if do_p2: print "Roots of perturbed polynomial:" show(rootsP2n) if zoom_in: show(G,ymin=-10^14,ymax=10^14) else: show(G)

Number of zeros to interpolate

Epsilon

Slightly change one coefficient

Zoom onto the x-axis only

Click to the left again to hide and once more to show the dynamic interactive window

Certainly there is some function which will precisely measure the world population for any time in the future...if we only knew it then it would be of great interest!

We see that the use of Lagrange polynomials is possible for interplolation and extrapolation but they might not be actually helpful in practice since drastic changes occur when in the presence of roundoff or when the data might be imprecise.

What if we remove the restriction that we precisely interpolate data points but only require us to be close.

Weierstrass Approximation Theorem

If $f$ is a continuous real-valued function on $[a,b]$, then $\forall \epsilon > 0, \exists $ a polynomial $p$ on $[a,b]$ such that

$|f(x) - p(x)| < \epsilon, \forall x \in [a,b]$

This means, in other words, that there is a polynomial of perhaps high degree which approximates the actual world population to within any desired degree of accuracy.

The problem is that this result is an existence result and not a constructive one so we have no idea how to obtain what is guaranteed.

# Problem #1 # Requiring strict adherence to data points can lead one to undesirable behavior in the real world. # Problem #2 # What about extrapolation...using the formula to predict values outside the range of the given data.

Rather than interpolating at a large collection of data points, why not just interpolate "really well" at just one. Indeed, this might work well if you have more information...such as slope and curvature...at a point or collection of points.

Taylor Polynomials - satisfying positional data as well a given number of derivative values at one point

%hide %auto # Taylor's Polynomials # From the Sage wiki var('x') @interact def _(f = input_box(default=sin(x)*e^(-x),label='$f(x) = $',width = 30),x0 = slider(-2,3,1,1,label='$x_0 =$'),order=(1..12), boundsx = input_grid(1,2,default=[-1,5],label='Horizontal Boundaries'), boundsy = input_grid(1,2,default=[-1/2,1],label='Vertical Boundaries')): lowx = boundsx[0][0] highx = boundsx[0][1] lowy = boundsy[0][0] highy = boundsy[0][1] p = plot(f,(x,lowx,highx), thickness=2) dot = point((x0,f(x=x0)),pointsize=80,rgbcolor=(1,0,0)) ft = f.taylor(x,x0,order).factor(x) pt = plot(ft,(x,lowx, highx), color='green', thickness=2) html('$f(x)\;=\;%s$'%latex(f)) html('$\hat{f}(x;%s)\;=\;%s+\mathcal{O}(x^{%s})$'%(x0,latex(ft),order+1)) show(dot + p + pt, ymin = lowy, ymax = highy,title="Taylor's polynomial")

$f(x) = $

$x_0 =$

order

Horizontal Boundaries

Vertical Boundaries

Click to the left again to hide and once more to show the dynamic interactive window

Let's apply this single point approach to the population data...

%hide %auto # Taylor's with our population data # Taylor's Polynomials # From the Sage wiki var('x') @interact def _(x0 = slider(1910,2050,10,1980,label='$x_0 =$'),order=[1..3], boundsx = input_grid(1,2,default=[1950,2010],label='Horizontal Boundaries'), boundsy = input_grid(1,2,default=[2000,10000],label='Vertical Boundaries')): lowx = boundsx[0][0] highx = boundsx[0][1] lowy = boundsy[0][0] highy = boundsy[0][1] # NEED TO DETERMINE APPROXIMATION TO DERIVATIVES AND CONTRUCT TAYLOR BY HAND. # m1 = decade = (x0-1910)/10 m=[0] mm=[0,0] mmm=[0,0,0] for k in range(1,14): m.append((pops[k+1][1]-pops[k-1][1])/20) for k in range(2,13): mm.append((m[k+1]-m[k-1])/20) for k in range(3,12): mmm.append((mm[k+1]-mm[k-1])/20) if order==1: ft = pops[decade][1] + m[decade]*(x-x0) if order==2: ft = pops[decade][1] + m[decade]*(x-x0) + mm[decade]*(x-x0)^2/2 if order==3: ft = pops[decade][1] + m[decade]*(x-x0) + mm[decade]*(x-x0)^2/2 + mmm[decade]*(x-x0)^3/6 p = plot(ft,(x,1910,2050), thickness=2) dot = point((x0,ft(x=x0)),pointsize=80,rgbcolor=(1,0,0)) html('$\hat{f}(x;%s)\;=\;%s\$'%(x0,latex(ft))) show(dot + p + Gpts, ymin = lowy, ymax = highy,title="Taylor's polynomial with Population Data")

Click to the left again to hide and once more to show the dynamic interactive window

It turns out that even interpolation is sometimes messy. For example, let's use a Lagrange Polynomial to approximate the rational function

$f(x) = \frac{1}{1+x^2}$

on $[-4,4]$.

%hide %auto fun = 1/(1+x^2) @interact def _(n=slider(3,40,2,3,label='Number of Points'),zoom_in=checkbox(default=false)): G = plot(fun,(x,-5,5),color='yellow',thickness=1) pts = [] delx = 10/(n-1) for k in range(n): u = -5 + k*delx pts.append((u,fun(x=u))) G += points(pts,color='red',size=30) R = PolynomialRing(QQ, "x") P(x) = (R.lagrange_polynomial(pts)) G += plot(P,(x,-5,5),color='green') if zoom_in: G.show(ymax=5,ymin=-5) else: G.show()

Click to the left again to hide and once more to show the dynamic interactive window

Ok then. If interpolation doesn't yield a good model for extrapolation, let's try to get something "close" to the data points (ala Weierstrass Theorem).

Least Squares

Best Fit

Regression

The next best thing might be to break up the data into pieces and approximate using several low degree polynomials. This will help with interpolation but not with extrapolation.

%hide %auto # Best Fit Line var('m,b,a,b,c,d,r,s,x') n = len(X) @interact def _(g1 = checkbox(default=true,label="Best-Fit Line"), g2 = checkbox(default=false,label="Best-Fit Parabola"), g3 = checkbox(default=false,label="Best-Fit Cubic"), g4 = checkbox(default=false,label="Best-Fit Quartic"), g5 = checkbox(default=false,label="Best-Fit Quintic"), usefuture = checkbox(default=true,label="Use Predicted Data"), goto2070=checkbox(default=false,label="Predict 2070"), goto2100=checkbox(default=false,label="Predict 2100")): G = points(pops,color='red',pointsize=20) if usefuture: data = pops else: data = pops[0:11] if goto2100: M = 2100 elif goto2070: M = 2070 else: M = 2050 if g1: y(x) = m*x + b bestfit = find_fit(data,y,solution_dict=True) f = y(m=bestfit[m],b=bestfit[b]) G += plot(f,(x,min(X)-0.5,M+0.5),color='green') G += text('Linear',(M,f(x=M)),horizontal_alignment='left',color='green') html('$y_{Linear} = %s$'%str(latex(f))) if g2: y(x) = a*x^2 + b*x + c bestfit = find_fit(data,y,solution_dict=True) f = y(a=bestfit[a],b=bestfit[b],c=bestfit[c]) G += plot(f,(x,min(X)-0.5,M+0.5),color='orange') G += text('Quadratic',(M,f(x=M)),horizontal_alignment='left',color='orange') html('$y_{Quadratic} = %s$'%str(latex(f))) if g3: y(x) = a*x^3 + b*x^2 + c*x + d bestfit = find_fit(data,y,solution_dict=True) f = y(a=bestfit[a],b=bestfit[b],c=bestfit[c],d=bestfit[d]) G += plot(f,(x,min(X)-0.5,M+0.5),color='purple') G += text('Cubic',(M,f(x=M)),horizontal_alignment='left',color='purple') html('$y_{Cubic} = %s$'%str(latex(f))) if g4: y(x) = a*x^4 + b*x^3 + c*x^2 + d*x + r bestfit = find_fit(data,y,solution_dict=True) f = y(a=bestfit[a],b=bestfit[b],c=bestfit[c],d=bestfit[d],r=bestfit[r]) G += plot(f,(x,min(X)-0.5,M+0.5),color='purple') G += text('Quartic',(M,f(x=M)),horizontal_alignment='left',color='red') html('$y_{Quartic} = %s$'%str(latex(f))) if g5: y(x) = a*x^5 + b*x^4 + c*x^3 + d*x^2 + r*x + s bestfit = find_fit(data,y,solution_dict=True) f = y(a=bestfit[a],b=bestfit[b],c=bestfit[c],d=bestfit[d],r=bestfit[r],s=bestfit[s]) G += plot(f,(x,min(X)-0.5,M+0.5),color='purple') G += text('Quintic',(M,f(x=M)),horizontal_alignment='left',color='gray') html('$y_{Quintic} = %s$'%str(latex(f))) G.show(title="Least-Squares Approximations")

Click to the left again to hide and once more to show the dynamic interactive window

The next best thing might be to break up the data into pieces and approximate using several low degree polynomials. This will help with interpolation but not with extrapolation.

%hide %auto # Piecewise linear interpolation G = Gpts+line(pops) G.show(figsize=(8,4),title="Piecewise linear Interpolation")

What about using piecewise cubics...with slopes (i.e. derivatives) provided at each data point as well.

Hermite Interpolation

var('u,t') H0 = (1-u)^2*(1+2*u) H1 = (1-u)^2*u H2 = (u-1)*u^2 H3 = (3-2*u)*u^2 X = (2,5,6,8,10) Y = (5,3,4,2,6) slopes = (0,1,2,1,1) G = Graphics() n = len(X) Hermite = 0 for k in range(n-1): x0 = X[k] x1 = X[k+1] y0 = Y[k] y1 = Y[k+1] m0 = slopes[k] m1 = slopes[k+1] delx = x1 - x0 p = y0*H0(u=(x-x0)/delx)+m0*delx*H1(u=(x-x0)/delx)+m1*delx*H2(u=(x-x0)/delx)+y1*H3(u=(x-x0)/delx) Hermite += unit_step(x-x0)*unit_step(x1-x)*p # The step functions appear to not stay p.expand().show() G += plot(p,x,x0,x1) G += points([(x0,y0),(x1,y1)],pointsize=20,color='red') G += parametric_plot((x0+t,m0*t+y0),(t,-0.3,0.3),color='green') G += parametric_plot((x1+t,m1*t+y1),(t,-0.3,0.3),color='green') G.show(title='Piecewise Cubic Hermite Interpolant')

$\newcommand{\Bold}[1]{\mathbf{#1}}\frac{7}{27} \, x^{3} - \frac{23}{9} \, x^{2} + \frac{64}{9} \, x - \frac{29}{27}$

$\newcommand{\Bold}[1]{\mathbf{#1}}x^{3} - 16 \, x^{2} + 86 \, x - 152$

$\newcommand{\Bold}[1]{\mathbf{#1}}\frac{5}{4} \, x^{3} - \frac{53}{2} \, x^{2} + 185 \, x - 422$

$\newcommand{\Bold}[1]{\mathbf{#1}}-\frac{1}{2} \, x^{3} + \frac{27}{2} \, x^{2} - 119 \, x + 346$

So, we can use polynomials to interpolate so long as they don't get of too high degree or provided the numerical coefficients are too large. But neither regular polynomials or piecewise versions are very helpful for extrapolation.

What to do?

Rather than trying to predict the future population data using the past only, perhaps it would be better to model the underlying dynamics to obtain a formula for extrapolation.

Model 1: Population growth is proportional to the size of the existing population.

In mathematical terms, if P(t) is the population at time t:

$P'(t) = k P(t)$

for some parameter k, which has solution

$P(t) = P_0 e^{kt}$

%hide %auto # Exponential population growth model @interact def _(k=slider(0.005,0.025,0.001,0.001,label="Growth Rate Parameter"),start=slider(1910,2000,10,1910,label="Starting Date"),goto2070=checkbox(default=false,label="Predict 2070")): if goto2070: M = 2070 else: M = 2050 decade = (start-1910)/10 P0=float(pops[decade][1]) print P0 model = P0*e^(k*(x-start)) G = points(pops) G += plot(model,x,1910,M) G.show(title="Exponential Growth Model")

Click to the left again to hide and once more to show the dynamic interactive window

%hide %auto # Logistic population growth model @interact def _(k=slider(0.000001,0.000005,0.0000002,0.000001,label="Growth Rate Parameter"), C = input_box(default=12000,label="Carrying Capacity"), start=slider(1910,2000,10,1910,label="Starting Date"), goto2070=checkbox(default=false,label="Predict 2070")): if goto2070: M = 2070 else: M = 2050 decade = (start-1910)/10 P0=float(pops[decade][1]) print P0 model = C/(1+(C/P0-1)*e^(-C*k*(x-start))) G = points(pops) G += points((start,P0),size=30,color='red') G += plot(model,x,1910,M) G += plot(C,x,1910,M,color="lightgrey") G.show(title="Logistic Growth Model") show(model.simplify())

Click to the left again to hide and once more to show the dynamic interactive window

Conclusion: We love polynomials because they are easy to evaluate and manipulate. However, using them in practical situations must be done with great care and especially so if we are using them for making predication that matter.

The Problem with Polynomials

2850 days ago by travis

Click to the left again to hide and once more to show the dynamic interactive window

Click to the left again to hide and once more to show the dynamic interactive window

Click to the left again to hide and once more to show the dynamic interactive window

Click to the left again to hide and once more to show the dynamic interactive window

Click to the left again to hide and once more to show the dynamic interactive window

Click to the left again to hide and once more to show the dynamic interactive window

Click to the left again to hide and once more to show the dynamic interactive window

Click to the left again to hide and once more to show the dynamic interactive window

Click to the left again to hide and once more to show the dynamic interactive window

Click to the left again to hide and once more to show the dynamic interactive window

Click to the left again to hide and once more to show the dynamic interactive window

Click to the left again to hide and once more to show the dynamic interactive window

Click to the left again to hide and once more to show the dynamic interactive window