Solved

Regression for a second degree polynomium

Posted on 2005-04-29
Medium Priority
446 Views
Hi!

If I have a number of coordinates fx

(1,3)
(2,9)
(3, 29)
(4, 69)

and I would like to draw a second degree polynomium regression graph: y=ax^2+bx+c

How do I find a, b and c?

Regards,

Nicolai
0
Question by:JNic
• 9
• 3
• 3
• +8

LVL 2

Expert Comment

ID: 13893601
Not that it matters, but aren't they polynomials?
0

LVL 1

Author Comment

ID: 13893669
The coordinates I wrote are completely random. But anyway I need the formula for the "best" approximated second degree polynomium for these pairs.
0

LVL 9

Accepted Solution

jhshukla earned 600 total points
ID: 13893970
[ 1 1 1]    [a]    [3]
[ 4 2 1] * [b] = [9]
[ 9 3 1]    [c]    [29]
[16 4 1]           [69]

M*a = b

now since you can't get an inverse of non-square matrix
M'M*a = M'*b
(M'M)*a = M'*b
a = inv(M'M)*M'*b
0

LVL 9

Expert Comment

ID: 13894033
to get a sine regresion instead y = a*sin(x)+b
M will have two columns: col1 = sin(x), col2 = 1. solve the matrix eqn and voila! you have it. note that it can't be used to find phase in this form. probably you can get phase by y = a*sin + b*cos + c. not sure. try it you. once you have relative amplitudes of sin and cos, you can represent it as sin(x+phase) pretty easily.
0

LVL 6

Assisted Solution

Sergio_Hdez earned 400 total points
ID: 13894194
Here you have a Delphi code that do exactly that.

Note that Pto is an array of points, being Pto.Count the number of points and Pto[i].x the x-coordinate of the point number i. The delphi function Power(a,b) computes a^b. I hope all the other things are not delphi related so you all understand it.

//Calc coefs of the 2nd degree polinomium that better fits an array of points (minimun cuadratic difference on Y coordinates)
procedure TGParabola.Calcula;
var
i,j,k: integer;
s: array[1..3] of array[1..4] of double;
s40,s30,s20,s10,s21,s11,s01: double;
begin
s40:=0;s30:=0;s20:=0;s10:=0;s21:=0;s11:=0;s01:=0;
for i:=1 to Pto.count do begin
s40:= s40 + power(Pto[i].x, 4);
s30:= s30 + power(Pto[i].x, 3);
s20:= s20 + power(Pto[i].x,2);
s10:= s10 + Pto[i].x;
s21:= s21 + power(Pto[i].x, 2) * Pto[i].y;
s11:= s11 + Pto[i].x * Pto[i].y;
s01:= s01 + Pto[i].y;
end;
s[1,1]:= s40; s[1,2]:= s30; s[1,3]:= s20; s[1,4]:= s21;
s[2,1]:= s30; s[2,2]:= s20; s[2,3]:= s10; s[2,4]:= s11;
s[3,1]:= s20; s[3,2]:= s10; s[3,3]:= Pto.count; s[3,4]:= s01;
for i:=1 to 3 do begin
for k:=i+1 to 4 do
if s[i,i]<>0 then s[i,k]:= s[i,k]/s[i,i];
s[i,i]:=1;
for k:=i+1 to 3 do begin
for j:=i+1 to 4 do s[k,j]:= s[k,j]-s[i,j]*s[k,i];
s[k,i]:=0;
end;
end;
c:=s[3,4];
b:=s[2,4]-s[2,3]*c;
a:=s[1,4]-s[1,3]*c-s[1,2]*b;
// Polinomium is: a*x^2 + b*x + c = y
end;
0

LVL 1

Author Comment

ID: 13894658
H! Thanks a lot for this!

But: I need to calculate this without looping! In the environment where I must enter this formula, there is no looping-possibility, so I need a flat mathematical way to express this.

Is there a (simple) way?
0

LVL 1

Author Comment

ID: 13894671
BTW: I will boost the points further, since you actually answered the original question very well! :-)
0

LVL 10

Expert Comment

ID: 13895696
0

LVL 22

Expert Comment

ID: 13896112
For your (x,y) pairs, are the x's always 1,2,3,4 or can they be arbitrary?

btw, your original progression looks cubic, not quadratic.  Paybe a third-order polynomial would be more appropriate.
0

LVL 27

Assisted Solution

d-glitch earned 400 total points
ID: 13896494
You can always convert a polynomial regression problem into a linear regression:

Your equation will be of the form:       y   =   Ax^2  +  Bx  +  C

Your input data are:                      (x, y)  ==>  (1, 3)   (2, 9)   (3, 29)   (4, 69)

Your linear equations are:                  3   =   A       +  B     +  C
9   =   A*4   +  B*2  +  C
29  =   A*9   +  B*3  +  C
69   =  A*16 +  B*4  +  C

This gives quite a good fit.

0

LVL 6

Expert Comment

ID: 13897060
How would you do it without any loops?

Imagine you have 20 points, not only 3, or even you could have 2000 points... you can not make a simple formula to fit all the 2000 points without a loop!

You could get this if the problem was not so general: May be you will always have 4 points, or something similar that could simplify the question.
0

LVL 27

Expert Comment

ID: 13897541
>> Apologies to jhshukla

We said essentially the same thing, except you were earlier, more complete, and more concise.
0

LVL 9

Expert Comment

ID: 13898348
d-glitch>> You can always convert a polynomial regression problem into a linear regression:
thanks for the terminology. i knew that this could be done but did not think about it that way.
0

LVL 1

Author Comment

ID: 13900146
Hi all!

Thank you for your suggestions! :-)

I need to stress (I said it earlier) that the coordinates I wrote are completely random - they could actually be anything. What I DO need, is the "closest" second degree polynomium to the points. I guess there are many different methods for this, I would be happy with just one.

You can consider the number of coordinate pairs to be fixed. - There will always be four (in my case). The values could be ANYTHING. (so the x's are not neccesarily 1,2,3,4) However it would be nice with a "method" for handling any number of points.

d-glitch: This looks nice, - but I dont understand completely how to continue. Your four equations will of course not "fit together" since the coordinates are not in fact belonging to a second degree polynomium. How do I go on?

BoBSiemens: Which formula(s) exactly do you think solves my problem?

jhshukla: Looks very nice. Could this be "converted" to some sort of formula?  - Maybe instead of using loops like

for i:=1 to Pto.count do begin
s40:= s40 + power(Pto[i].x, 4);
end

we could write:   SUM(Xi^4)   <-- This text-editor has its limitations for math, so I wrote it excel-style :-/

And for all of you: If I ask silly questions or write silly things it is possibly because I am not a mathematician. Please bare with me.... :-)
0

LVL 85

Assisted Solution

ozo earned 200 total points
ID: 13900350
For a least squares fit of a second degree polynomial to 4 points, we have

[      y1+       y2+      y3+       y4]
[ x1*y1+  x2*y2+ x3*y3+  x4*y4]
[x1Â²*y1+x2Â²*y2+x3Â²*y3+x4Â²*y4]
=
[c]
X * [b]
[a]

where X =
[1   +1   +1    +1     x1 +x2  +x3 +x4   x1Â²   +x2Â²  +x3Â²  +x4Â²  ]
[x1  +x2 +x3  +x4   x1Â²+x2Â²+x3Â²+x4Â²  x1Â³   +x2Â³  +x3Â³  +x4Â³   ]
[x1Â²+x2Â²+x3Â²+x4Â²  x1Â³+x2Â³+x3Â³+x4Â³  x1^4+x2^4+x3^4+x4^4]
so
[c]
[b]
[a]
=
[      y1+       y2+      y3+       y4]
[ x1*y1+  x2*y2+ x3*y3+  x4*y4]   * XÂ¯Â¹
[x1Â²*y1+x2Â²*y2+x3Â²*y3+x4Â²*y4]

for a 3Ã—3 matrix
XÂ¯Â¹
=
1/|X|
*
[|x22 x23|  |x13 x12|  |x12 x13|]
[|x32 x33|  |x33 x32|  |x22 x23|]
[                                            ]
[|x23 x21|  |x11 x13|  |x13 x11|]
[|x33 x31|  |x31 x33|  |x23 x21|]
[                                            ]
[|x21 x22|  |x12 x11|  |x11 x12|]
[|x31 x32|  |x32 x31|  |x21 x22|]

0

LVL 37

Expert Comment

ID: 13901299
Hi JNic,
Newton's General Interpolation Formula can be used for this purpose quite easily...

First you will have to create a "Difference Table" like this...

First column is x and second is y, then you have differences per unit change in x...

(9-6)/(2-1), (29-9)/(3-2), (69-29)/(4-3) in first column
(20-6)/(3-1), (40-20)/(4-2) in second column
(10-7)/(4-1) in third column ...

x    f(a)  f(a,b) f(a,b,c) f(a,b,c,d)
_____________________________________
1       3
6
2       9               7
20               1
3      29              10
40
4      69
_____________________________________

Now,
f(x) = f(x0) + (x-x0) f(x0,x1) + (x-x0)(x-x1) f(x0,x1,x2) + (x-x0)(x-x1)(x-x2) f(x0,x1,x2,x3) + ...

ie.,

f(x) = 3 + (x-1) 6 + (x-1)(x-2) 7 + (x-1)(x-2)(x-3) 1

= 3 + 6 (x-1) + 7 (x^2 - 3x + 2)  + 1 (x^3 - 6x^2 + 11x - 6)

= 3 + 6x - 6 + 7x^2 - 21x + 14 + x^3 - 6x^2 + 11x - 6

= x^3 + x^2 - 4x + 5
___________________________

Now you can cross check the results...

f(1) = 1 + 1 - 4 + 5 = 3
f(2) = 8 + 4 - 8 + 5 = 9
f(3) = 27 + 9 - 12 + 5  = 29
f(4) = 64 + 16 - 16 + 5 = 69

Bye
---
Harish
0

LVL 37

Expert Comment

ID: 13901323
JNic,
Note that it is a third degree polynomial...

In general, (with some exceptions), any function will have (n-1)th degree when there are n points given.

This (NGIF) can be applied to any number of points and also when the points are not equidistant, or even when they are in any order.

Here are the links to Newton's General Interpolation Formula (Newton's Divided Difference Formula)

Or Lagrange's Interpolation Formula...

Or Newton's Forward/Backward Interpolation Formulae...
0

LVL 37

Assisted Solution

Harisha M G earned 200 total points
ID: 13901333
(9-6)/(2-1), (29-9)/(3-2), (69-29)/(4-3) in first column

Should have been
(9-3)/(2-1), (29-9)/(3-2), (69-29)/(4-3) in first column
___
0

LVL 1

Author Comment

ID: 13901919
mgh mgharish: Thanks a lot for that. But as far as I can see, you found a third degree polynomial that fits my points excactly. What I needed was a second degree regression graph = The second degree polynomium that "looks most like" my four points.
This is possible to do in different ways, since there is no perfectly correct answer.

Now for all who provided matrix-solutions, I must (shameful) admit that I have never calcutated anything using Matrixes, so I cant even tell if your equations have solved my problems. :-(    :-(    :-(

I have a feeling that Sergio Hdez was on the right track with his Delphi-code, it would be interesting to see if this could be written as a formula.

No matter what, I will hand out some points tomorrow, - I really appreciate all your comments!
0

LVL 6

Expert Comment

ID: 13904005
great comments posted until now, none are realy using engineerical assumptions/approximation:

I assume we can omit the 4th point, get ANALYTICAL solution for the remaining 3 points, calculate "error" for the 4th point, do that 4 times, and check for the best solution with the minimal "error":

just solve this (replace numbers with your random numbers and use solutuion from jhshukla)
[ 1 1 1]    [a4]    [3]
[ 4 2 1] * [b4] = [9]
[ 9 3 1]    [c4]    [29]
and get (a4,b4,c4) , e4 = sqrt((69-16a4-4b4-c4)^2)
repeat for other 3 cases:
[ 1 1 1]    [a]    [3]
[ 4 2 1] * [b] = [9]
[16 4 1]    [c]    [69]
and get (a3,b3,c3) , e3 = sqrt((29-9a3-3b3-c3)^2)
...
choose the solution with the minimal e
this method is commonly used in engineering !

* you can choose a combined solution from the 4 solutions: let e1,e2,e3,e4 be the "errors" found from each solution (represent the distance from the omitted 4th point each time respectively)
(A,B,C) = E1*(a1,b1,c1)+E2*(a2,b2,c2)+E3*(a3,b3,c3)+E4*(a4,b4,c4)

there are some methods to set (E1,E2,E3,E4) coefficients:
1) E1+E2+E3+E4 = 1
2) E1 = (1-e1)/3 , ... E4=(1-e4)/3

this approximation will give you (A,B,C) that are (a5,b5,c5) with e5, check e5 compared with the minimal from e1,..,e4

this is the best solution I can "react spontaneously", I'll check for better apprixamating analytic solution.
tal

0

LVL 22

Expert Comment

ID: 13904621
Talmash,
Are you aware that the solution given by jhshukla gives an answer for a,b,c with the least-squares difference between the regression value and the sample points?  His answer is guaranteed to give a better root-mean-square error metric than yours.
0

LVL 1

Author Comment

ID: 13905600
Further points has been added to this thread, that you can soon buy in the bookstores.... ;-)

Now will somebody have mercy on me, and tell me who deserves how many points?? - I must shamefully admit that a lot of this goes over my head.... :-/

Anyway it would be nice to see where it goes with the Talmash solution compared to jhskukla's.

I beg you all: Can somebody show me how to use jhskuklas formula, because I do not understand it. - As I mentioned I am not comfortable with matrixes.
0

LVL 8

Assisted Solution

tonsofpcs earned 200 total points
ID: 13905882
0

LVL 1

Author Comment

ID: 13907587
Hi, guys!

On

I found this:

Parabola models: Parabola regressions have three coefficients with a general form:

Y = a + bX + cX2,
where

c = { S (xi - xbar)2Ã—yi - n[S(xi - xbar) 2Ã— Syi]} / {n S(xi - xbar) 4 - [S(xi - xbar)2] 2}

b = [S(xi- xbar) yi]/[ S(xi - xbar)2] - 2Ã—cÃ—xbar

a = {Syi - [cÃ— S(x i - xbar) 2)}/n - (cÃ—xbarÃ—xbar + bÃ—xbar),

where xbar is the mean of xi's.

What do you think?
0

LVL 1

Author Comment

ID: 13909083
AHHHHHHHHHHHHHHHHHHHHHHHHH!!!!!!!!!!!!!!!!!!!! :-D

FINALLY! This works:

Approximated second degree polynomium f(x)=a*x^2 + b*x + c    for i coordinate pairs:

XMEAN=MEAN(Xi)
SUMY=SUM(Yi)
D2=SUM((Xi-XMEAN)^2)
D4=SUM((Xi-XMEAN)^4)
DU=SUM((Xi-XMEAN)*Yi)
D2U=SUM((Xi-XMEAN)*Yi)

a =    D2 * SUMY - N * D2U
___________________
D2^2 - N * D4

b =    DU/D2 - 2 * A * XMEAN

c=    (SUMY - A * D2)/N - A*XMEAN^2 - B * XMEAN

Perhaps this somehow expresses your matrixes. I for sure dont know! :-)

Thanks for all the help.
0

LVL 1

Author Comment

ID: 13909093
Sorry > for i coordinate pairs

should have been   > for N coordinate pairs
0

Featured Post

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article provides a brief introduction to tissue engineering, the process by which organs can be grown artificially. It covers the problems with organ transplants, the tissue engineering process, and the current successes and problems of the tecâ€¦
Aerodynamic noise is the cause of the majority of the noise produced by helicopters. The inordinate amount of noise helicopters produce is a major problem in the both a military and civilian setting. To remedy this problem the use of an aerogel coatâ€¦
This is a video describing the growing solar energy use in Utah. This is a topic that greatly interests me and so I decided to produce a video about it.
Although Jacob Bernoulli (1654-1705) has been credited as the creator of "Binomial Distribution Table", Gottfried Leibniz (1646-1716) did his dissertation on the subject in 1666; Leibniz you may recall is the co-inventor of "Calculus" and beat Isaacâ€¦
Suggested Courses
Course of the Month13 days, 17 hours left to enroll