Approximation function f(x) with a polynomial

I have a number of x,y value pairs. I want to approximate a function y= a + bx + cx^2 + dx^3 + ex^4 + fx^5 + gx^6 ...

How do I get this function with the use of the known x1, ... xn and y1, ... yn?
Who is Participating?
nullsquidConnect With a Mentor Commented:
A simple but cpu intensive way is to substitute the x and y into the function
D= y - a + bx + cx^2 + dx^3 + ex^4 + fx^5 +..
where you have guessed the coefficients abcd..
This would give a 'distance', D, from the line for each point; add the absolute values of these together to get a score for the set of guessed coefficients. you then guess different sets of coefficients until you minimise D - such that the points are as close to the line as you can get.

I think Excel has a function that does it, you will get better results with more powers in the equation, because, I believe, you can always fit n points exactly to a nth - order polynomial.

further reading:
thehagmanConnect With a Mentor Commented:
Actually, you can always fit a degree n-1 polynomial to n points.
One can also look for lower degree polynomials and fit them using least square methods.

One way to find an exact interpolation is to look at polynomials of the form
f_k(x) = (x - x_1)*(x - x_2)*...*(x - x_{k-1})*(x - x_{k+1})* ... * (x-x_n)
These are 0 at all sample points except at x_k.
Hence one can easily combine them to a polynomlia with f(x_i)=y_i for all i.

However, it is often the case that polynomial interpolation simply "does not look good" (shows large oscillation and other artefacts). You may want to have a look at spline interpolation (smoothly joined piecewise polynomials).
gatorvipConnect With a Mentor Commented:
It really depends on what you are trying to achieve. As has already been pointed out, if you have n points, you will always have at least one polynomial of degree at most n that fits your points exactly.

However, as thehagman points out, this type of polynomial is very dependent on the exact number of points. If you so happen to add the (n+1) point, your initial polynomial may not look at all like the new one. If you're dealing with a real-world application, this can have unintended results.

This is why, if your number of input points may vary, you don't always want to deal with max-degree polys. Instead, you choose something like a cubic spline where the addition or removal of a single point will generally not have any influence on the behavior of the curve outside of the neighborhood of the new point.

Take a look at
A proven path to a career in data science

At Springboard, we know how to get you a job in data science. With Springboard’s Data Science Career Track, you’ll master data science  with a curriculum built by industry experts. You’ll work on real projects, and get 1-on-1 mentorship from a data scientist.

>>at least one polynomial of degree at most n
That should read "at most n-1"
Nullsquid has given  you an intuitive way of doing what you wish. If the errors in your data are random (the usual assumption) you want to minimize the square of the distance, not the absolute value. The result will be a least square approximation as thehagman pointed out. Almost all math programs have a built in function which will fit a polynomial of any dergee you wish to your data. As thehagman pointed out any amount of data can be exactly fitted to a polynomial of n-1 degree but the utility of the exact fit is very limited.  Rarely does more than a third or fourth degree polynomial do any good. There are other functions which might be better depending upon the source of your data. Log curves and exponentials come to mind.
Excell, mathcad, matlab, mathematica, and most statistical programs will have a least square function which will do what you want.
ozoConnect With a Mentor Commented:
y1 = a + b x1 + c x1^2 + d x1^3 + e x1^4 + f x1^5 + g x1^6 .
 y2 = a + b x2 + c x2^2 + d x2^3 + e x2^4 + f x2^5 + g x2^6 .
is a system of linear equations which you can solve for a,b,c,d,e,f,g
if you have more points than a...g coefficients, there may not be a simultaneous solution to all the equations,
If you want to find coefficients that minimize the sum of the squares of the errors, that turns out to be equivalent to solving another system of linear equations, this time with coefficients like:

sum(Yi) =          a  n                + b sum(Xi)     + c sum(Xi^2) + d sum(Xi^3 ) + ...
sum(Yi*Xi) =     a sum(Xi)       + b sum(Xi^2) + c sum(Xi^3) + d sum(Xi^3 ) + ...
sum(Yi*Xi^2) = a sum(Xi^2)  + b sum(Xi^3) + c sum(Xi^4) + d sum(Xi^5 ) + ...
where sums are over i=1..n for n points

    Fitting a high order polynomial to data for purposes of interpolation is a BAD idea. This is because in general polynomials have many turning points. Although your polynomial may pass through or close to  the points it could osicillate wildly in between.

    There is no magic wand for data fitting. Giving more details of the problem you are trying to solve would help.
CaptainCyrilFounder, Software Engineer, Data ScientistCommented:
The points could be approximated by a line, a simple curve like a parabola or a hyperbola with some errors or any other curve. If you are trying to find a trend or approximation is one thing and if you are trying to find an exact match is another thing.
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.