?
Solved

Regression for a second degree polynomium

Posted on 2005-04-29
26
Medium Priority
?
446 Views
Last Modified: 2008-08-02
Hi!

If I have a number of coordinates fx

(1,3)
(2,9)
(3, 29)
(4, 69)

and I would like to draw a second degree polynomium regression graph: y=ax^2+bx+c

How do I find a, b and c?

Regards,

Nicolai
0
Comment
Question by:JNic
  • 9
  • 3
  • 3
  • +8
26 Comments
 
LVL 2

Expert Comment

by:SWOne
ID: 13893601
Not that it matters, but aren't they polynomials?
0
 
LVL 1

Author Comment

by:JNic
ID: 13893669
The coordinates I wrote are completely random. But anyway I need the formula for the "best" approximated second degree polynomium for these pairs.
0
 
LVL 9

Accepted Solution

by:
jhshukla earned 600 total points
ID: 13893970
[ 1 1 1]    [a]    [3]
[ 4 2 1] * [b] = [9]
[ 9 3 1]    [c]    [29]
[16 4 1]           [69]

M*a = b

now since you can't get an inverse of non-square matrix
M'M*a = M'*b
(M'M)*a = M'*b
a = inv(M'M)*M'*b
0
VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

 
LVL 9

Expert Comment

by:jhshukla
ID: 13894033
to get a sine regresion instead y = a*sin(x)+b
M will have two columns: col1 = sin(x), col2 = 1. solve the matrix eqn and voila! you have it. note that it can't be used to find phase in this form. probably you can get phase by y = a*sin + b*cos + c. not sure. try it you. once you have relative amplitudes of sin and cos, you can represent it as sin(x+phase) pretty easily.
0
 
LVL 6

Assisted Solution

by:Sergio_Hdez
Sergio_Hdez earned 400 total points
ID: 13894194
Here you have a Delphi code that do exactly that.

Note that Pto is an array of points, being Pto.Count the number of points and Pto[i].x the x-coordinate of the point number i. The delphi function Power(a,b) computes a^b. I hope all the other things are not delphi related so you all understand it.


//Calc coefs of the 2nd degree polinomium that better fits an array of points (minimun cuadratic difference on Y coordinates)
procedure TGParabola.Calcula;
var
  i,j,k: integer;
  s: array[1..3] of array[1..4] of double;
  s40,s30,s20,s10,s21,s11,s01: double;
begin
  s40:=0;s30:=0;s20:=0;s10:=0;s21:=0;s11:=0;s01:=0;
  for i:=1 to Pto.count do begin
    s40:= s40 + power(Pto[i].x, 4);
    s30:= s30 + power(Pto[i].x, 3);
    s20:= s20 + power(Pto[i].x,2);
    s10:= s10 + Pto[i].x;
    s21:= s21 + power(Pto[i].x, 2) * Pto[i].y;
    s11:= s11 + Pto[i].x * Pto[i].y;
    s01:= s01 + Pto[i].y;
  end;
  s[1,1]:= s40; s[1,2]:= s30; s[1,3]:= s20; s[1,4]:= s21;
  s[2,1]:= s30; s[2,2]:= s20; s[2,3]:= s10; s[2,4]:= s11;
  s[3,1]:= s20; s[3,2]:= s10; s[3,3]:= Pto.count; s[3,4]:= s01;
  for i:=1 to 3 do begin
    for k:=i+1 to 4 do
      if s[i,i]<>0 then s[i,k]:= s[i,k]/s[i,i];
    s[i,i]:=1;
    for k:=i+1 to 3 do begin
      for j:=i+1 to 4 do s[k,j]:= s[k,j]-s[i,j]*s[k,i];
      s[k,i]:=0;
    end;
  end;
  c:=s[3,4];
  b:=s[2,4]-s[2,3]*c;
  a:=s[1,4]-s[1,3]*c-s[1,2]*b;
  // Polinomium is: a*x^2 + b*x + c = y
end;
0
 
LVL 1

Author Comment

by:JNic
ID: 13894658
H! Thanks a lot for this!

But: I need to calculate this without looping! In the environment where I must enter this formula, there is no looping-possibility, so I need a flat mathematical way to express this.

Is there a (simple) way?
0
 
LVL 1

Author Comment

by:JNic
ID: 13894671
BTW: I will boost the points further, since you actually answered the original question very well! :-)
0
 
LVL 22

Expert Comment

by:NovaDenizen
ID: 13896112
For your (x,y) pairs, are the x's always 1,2,3,4 or can they be arbitrary?

btw, your original progression looks cubic, not quadratic.  Paybe a third-order polynomial would be more appropriate.
0
 
LVL 27

Assisted Solution

by:d-glitch
d-glitch earned 400 total points
ID: 13896494
You can always convert a polynomial regression problem into a linear regression:

Your equation will be of the form:       y   =   Ax^2  +  Bx  +  C

Your input data are:                      (x, y)  ==>  (1, 3)   (2, 9)   (3, 29)   (4, 69)


Your linear equations are:                  3   =   A       +  B     +  C
                                                       9   =   A*4   +  B*2  +  C
                                                      29  =   A*9   +  B*3  +  C
                                                      69   =  A*16 +  B*4  +  C

This gives quite a good fit.


0
 
LVL 6

Expert Comment

by:Sergio_Hdez
ID: 13897060
How would you do it without any loops?

Imagine you have 20 points, not only 3, or even you could have 2000 points... you can not make a simple formula to fit all the 2000 points without a loop!

You could get this if the problem was not so general: May be you will always have 4 points, or something similar that could simplify the question.
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 13897541
>> Apologies to jhshukla

We said essentially the same thing, except you were earlier, more complete, and more concise.
0
 
LVL 9

Expert Comment

by:jhshukla
ID: 13898348
d-glitch>> You can always convert a polynomial regression problem into a linear regression:
thanks for the terminology. i knew that this could be done but did not think about it that way.
0
 
LVL 1

Author Comment

by:JNic
ID: 13900146
Hi all!

Thank you for your suggestions! :-)

I need to stress (I said it earlier) that the coordinates I wrote are completely random - they could actually be anything. What I DO need, is the "closest" second degree polynomium to the points. I guess there are many different methods for this, I would be happy with just one.

You can consider the number of coordinate pairs to be fixed. - There will always be four (in my case). The values could be ANYTHING. (so the x's are not neccesarily 1,2,3,4) However it would be nice with a "method" for handling any number of points.

d-glitch: This looks nice, - but I dont understand completely how to continue. Your four equations will of course not "fit together" since the coordinates are not in fact belonging to a second degree polynomium. How do I go on?

BoBSiemens: Which formula(s) exactly do you think solves my problem?

jhshukla: Looks very nice. Could this be "converted" to some sort of formula?  - Maybe instead of using loops like

for i:=1 to Pto.count do begin
    s40:= s40 + power(Pto[i].x, 4);
end

we could write:   SUM(Xi^4)   <-- This text-editor has its limitations for math, so I wrote it excel-style :-/


And for all of you: If I ask silly questions or write silly things it is possibly because I am not a mathematician. Please bare with me.... :-)
0
 
LVL 85

Assisted Solution

by:ozo
ozo earned 200 total points
ID: 13900350
For a least squares fit of a second degree polynomial to 4 points, we have

[      y1+       y2+      y3+       y4]      
[ x1*y1+  x2*y2+ x3*y3+  x4*y4]    
[x1²*y1+x2²*y2+x3²*y3+x4²*y4]      
=
      [c]
X * [b]
      [a]

where X =
[1   +1   +1    +1     x1 +x2  +x3 +x4   x1²   +x2²  +x3²  +x4²  ]
[x1  +x2 +x3  +x4   x1²+x2²+x3²+x4²  x1³   +x2³  +x3³  +x4³   ]
[x1²+x2²+x3²+x4²  x1³+x2³+x3³+x4³  x1^4+x2^4+x3^4+x4^4]
so
[c]
[b]
[a]
=
[      y1+       y2+      y3+       y4]      
[ x1*y1+  x2*y2+ x3*y3+  x4*y4]   * X¯¹
[x1²*y1+x2²*y2+x3²*y3+x4²*y4]

for a 3×3 matrix
 X¯¹
=
1/|X|
*
[|x22 x23|  |x13 x12|  |x12 x13|]
[|x32 x33|  |x33 x32|  |x22 x23|]
[                                            ]
[|x23 x21|  |x11 x13|  |x13 x11|]
[|x33 x31|  |x31 x33|  |x23 x21|]
[                                            ]
[|x21 x22|  |x12 x11|  |x11 x12|]
[|x31 x32|  |x32 x31|  |x21 x22|]




0
 
LVL 37

Expert Comment

by:Harisha M G
ID: 13901299
Hi JNic,
    Newton's General Interpolation Formula can be used for this purpose quite easily...

First you will have to create a "Difference Table" like this...

First column is x and second is y, then you have differences per unit change in x...

(9-6)/(2-1), (29-9)/(3-2), (69-29)/(4-3) in first column
(20-6)/(3-1), (40-20)/(4-2) in second column
(10-7)/(4-1) in third column ...

x    f(a)  f(a,b) f(a,b,c) f(a,b,c,d)
_____________________________________
1       3  
                6
2       9               7
               20               1
3      29              10
               40
4      69
_____________________________________

Now,
f(x) = f(x0) + (x-x0) f(x0,x1) + (x-x0)(x-x1) f(x0,x1,x2) + (x-x0)(x-x1)(x-x2) f(x0,x1,x2,x3) + ...

ie.,

f(x) = 3 + (x-1) 6 + (x-1)(x-2) 7 + (x-1)(x-2)(x-3) 1

= 3 + 6 (x-1) + 7 (x^2 - 3x + 2)  + 1 (x^3 - 6x^2 + 11x - 6)

= 3 + 6x - 6 + 7x^2 - 21x + 14 + x^3 - 6x^2 + 11x - 6

= x^3 + x^2 - 4x + 5
___________________________

Now you can cross check the results...

f(1) = 1 + 1 - 4 + 5 = 3
f(2) = 8 + 4 - 8 + 5 = 9
f(3) = 27 + 9 - 12 + 5  = 29
f(4) = 64 + 16 - 16 + 5 = 69

Bye
---
Harish
0
 
LVL 37

Expert Comment

by:Harisha M G
ID: 13901323
JNic,
    Note that it is a third degree polynomial...

    In general, (with some exceptions), any function will have (n-1)th degree when there are n points given.

    This (NGIF) can be applied to any number of points and also when the points are not equidistant, or even when they are in any order.

    Ask if you don't understand...

Here are the links to Newton's General Interpolation Formula (Newton's Divided Difference Formula)
http://kr.cs.ait.ac.th/~radok/math/mat7/step24.htm

Or Lagrange's Interpolation Formula...
http://kr.cs.ait.ac.th/~radok/math/mat7/step23.htm

Or Newton's Forward/Backward Interpolation Formulae...
http://kr.cs.ait.ac.th/~radok/math/mat7/step22.htm
0
 
LVL 37

Assisted Solution

by:Harisha M G
Harisha M G earned 200 total points
ID: 13901333
(9-6)/(2-1), (29-9)/(3-2), (69-29)/(4-3) in first column

Should have been
(9-3)/(2-1), (29-9)/(3-2), (69-29)/(4-3) in first column
   ___
0
 
LVL 1

Author Comment

by:JNic
ID: 13901919
mgh mgharish: Thanks a lot for that. But as far as I can see, you found a third degree polynomial that fits my points excactly. What I needed was a second degree regression graph = The second degree polynomium that "looks most like" my four points.
This is possible to do in different ways, since there is no perfectly correct answer.

Now for all who provided matrix-solutions, I must (shameful) admit that I have never calcutated anything using Matrixes, so I cant even tell if your equations have solved my problems. :-(    :-(    :-(

I have a feeling that Sergio Hdez was on the right track with his Delphi-code, it would be interesting to see if this could be written as a formula.

No matter what, I will hand out some points tomorrow, - I really appreciate all your comments!
0
 
LVL 6

Expert Comment

by:Talmash
ID: 13904005
great comments posted until now, none are realy using engineerical assumptions/approximation:

I assume we can omit the 4th point, get ANALYTICAL solution for the remaining 3 points, calculate "error" for the 4th point, do that 4 times, and check for the best solution with the minimal "error":

just solve this (replace numbers with your random numbers and use solutuion from jhshukla)
[ 1 1 1]    [a4]    [3]
[ 4 2 1] * [b4] = [9]
[ 9 3 1]    [c4]    [29]
and get (a4,b4,c4) , e4 = sqrt((69-16a4-4b4-c4)^2)
repeat for other 3 cases:
[ 1 1 1]    [a]    [3]
[ 4 2 1] * [b] = [9]
[16 4 1]    [c]    [69]
and get (a3,b3,c3) , e3 = sqrt((29-9a3-3b3-c3)^2)
...
choose the solution with the minimal e
this method is commonly used in engineering !

* you can choose a combined solution from the 4 solutions: let e1,e2,e3,e4 be the "errors" found from each solution (represent the distance from the omitted 4th point each time respectively)
(A,B,C) = E1*(a1,b1,c1)+E2*(a2,b2,c2)+E3*(a3,b3,c3)+E4*(a4,b4,c4)

there are some methods to set (E1,E2,E3,E4) coefficients:
1) E1+E2+E3+E4 = 1
2) E1 = (1-e1)/3 , ... E4=(1-e4)/3

this approximation will give you (A,B,C) that are (a5,b5,c5) with e5, check e5 compared with the minimal from e1,..,e4

this is the best solution I can "react spontaneously", I'll check for better apprixamating analytic solution.
tal

0
 
LVL 22

Expert Comment

by:NovaDenizen
ID: 13904621
Talmash,
Are you aware that the solution given by jhshukla gives an answer for a,b,c with the least-squares difference between the regression value and the sample points?  His answer is guaranteed to give a better root-mean-square error metric than yours.
0
 
LVL 1

Author Comment

by:JNic
ID: 13905600
Further points has been added to this thread, that you can soon buy in the bookstores.... ;-)

Thank you for all your skilled comments!

Now will somebody have mercy on me, and tell me who deserves how many points?? - I must shamefully admit that a lot of this goes over my head.... :-/

Anyway it would be nice to see where it goes with the Talmash solution compared to jhskukla's.

I beg you all: Can somebody show me how to use jhskuklas formula, because I do not understand it. - As I mentioned I am not comfortable with matrixes.
0
 
LVL 8

Assisted Solution

by:tonsofpcs
tonsofpcs earned 200 total points
ID: 13905882
0
 
LVL 1

Author Comment

by:JNic
ID: 13907587
Hi, guys!

On

http://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rwlls

I found this:

Parabola models: Parabola regressions have three coefficients with a general form:


Y = a + bX + cX2,
where


c = { S (xi - xbar)2×yi - n[S(xi - xbar) 2× Syi]} / {n S(xi - xbar) 4 - [S(xi - xbar)2] 2}

b = [S(xi- xbar) yi]/[ S(xi - xbar)2] - 2×c×xbar

a = {Syi - [c× S(x i - xbar) 2)}/n - (c×xbar×xbar + b×xbar),

where xbar is the mean of xi's.



What do you think?
0
 
LVL 1

Author Comment

by:JNic
ID: 13909083
AHHHHHHHHHHHHHHHHHHHHHHHHH!!!!!!!!!!!!!!!!!!!! :-D

FINALLY! This works:

Approximated second degree polynomium f(x)=a*x^2 + b*x + c    for i coordinate pairs:

XMEAN=MEAN(Xi)
SUMY=SUM(Yi)
D2=SUM((Xi-XMEAN)^2)
D4=SUM((Xi-XMEAN)^4)
DU=SUM((Xi-XMEAN)*Yi)
D2U=SUM((Xi-XMEAN)*Yi)

a =    D2 * SUMY - N * D2U
      ___________________
          D2^2 - N * D4

b =    DU/D2 - 2 * A * XMEAN

c=    (SUMY - A * D2)/N - A*XMEAN^2 - B * XMEAN


Perhaps this somehow expresses your matrixes. I for sure dont know! :-)

Thanks for all the help.
0
 
LVL 1

Author Comment

by:JNic
ID: 13909093
Sorry > for i coordinate pairs

should have been   > for N coordinate pairs
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article provides a brief introduction to tissue engineering, the process by which organs can be grown artificially. It covers the problems with organ transplants, the tissue engineering process, and the current successes and problems of the tec…
Aerodynamic noise is the cause of the majority of the noise produced by helicopters. The inordinate amount of noise helicopters produce is a major problem in the both a military and civilian setting. To remedy this problem the use of an aerogel coat…
This is a video describing the growing solar energy use in Utah. This is a topic that greatly interests me and so I decided to produce a video about it.
Although Jacob Bernoulli (1654-1705) has been credited as the creator of "Binomial Distribution Table", Gottfried Leibniz (1646-1716) did his dissertation on the subject in 1666; Leibniz you may recall is the co-inventor of "Calculus" and beat Isaac…
Suggested Courses

807 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question