Least Squares Data Fit Implementation...

Posted on 2004-04-24
Medium Priority
Last Modified: 2007-12-19
Hello Experts,

I am developing a mathematical module for my application and need to implement a least squares data fit algorithm in C#. Can anyone direct me to online resources that have an implementation, or can anyone with a strong mathematical background guide me to the solution?

Basically, I have a set of coordinates that define a curve, and I would like to use these points to derive an equation of nth order (could be 2nd, 3rd, 4th, etc.) that represents these points.

Can anyone please help?

Thanking you in advance.

Question by:imran89

Accepted Solution

NTAC earned 600 total points
ID: 10908914
Hi Imran

Do you want to implement it yourself?  There are many free packages out there that can do this for you, such as:
Mapack for .NET:  http://www.aisto.com/roeder/dotnet/  (scroll down)

Or if you have to implement it, it isn't too bad.

Here is a great example in C++: http://home.wxs.nl/~ammeraal/stlcpp.html
Download the source code and you can follow it very easily.  
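
For readers who do want to implement it themselves, here is a minimal C# sketch of a polynomial least squares fit. It is not taken from Mapack or from the C++ example linked above; the `PolyFit` class and `Fit` method names are hypothetical. It assumes the usual approach of building the normal equations and solving them by Gaussian elimination with partial pivoting, which should be adequate for low degrees (2nd to 4th order) and a few dozen points.

```csharp
using System;

public static class PolyFit
{
    // Hypothetical helper: fits y ~ c[0] + c[1]*x + ... + c[degree]*x^degree
    // by least squares and returns the coefficients in ascending order.
    public static double[] Fit(double[] xs, double[] ys, int degree)
    {
        if (xs.Length != ys.Length)
            throw new ArgumentException("xs and ys must have the same length.");
        int n = degree + 1;
        if (xs.Length < n)
            throw new ArgumentException("Need at least degree + 1 points.");

        // Augmented normal-equation matrix [A | b], where
        // A[r, c] = sum of x^(r + c) and b[r] = sum of y * x^r.
        double[,] m = new double[n, n + 1];
        for (int i = 0; i < xs.Length; i++)
        {
            double[] pow = new double[2 * n - 1];
            pow[0] = 1.0;
            for (int k = 1; k < pow.Length; k++) pow[k] = pow[k - 1] * xs[i];
            for (int r = 0; r < n; r++)
            {
                for (int c = 0; c < n; c++) m[r, c] += pow[r + c];
                m[r, n] += ys[i] * pow[r];
            }
        }

        // Forward elimination with partial pivoting.
        for (int col = 0; col < n; col++)
        {
            int pivot = col;
            for (int r = col + 1; r < n; r++)
                if (Math.Abs(m[r, col]) > Math.Abs(m[pivot, col])) pivot = r;
            if (Math.Abs(m[pivot, col]) < 1e-12)
                throw new InvalidOperationException("Singular system; try a lower degree.");
            if (pivot != col)
            {
                for (int c = 0; c <= n; c++)
                {
                    double t = m[col, c];
                    m[col, c] = m[pivot, c];
                    m[pivot, c] = t;
                }
            }
            for (int r = col + 1; r < n; r++)
            {
                double f = m[r, col] / m[col, col];
                for (int c = col; c <= n; c++) m[r, c] -= f * m[col, c];
            }
        }

        // Back substitution.
        double[] coeffs = new double[n];
        for (int r = n - 1; r >= 0; r--)
        {
            double sum = m[r, n];
            for (int c = r + 1; c < n; c++) sum -= m[r, c] * coeffs[c];
            coeffs[r] = sum / m[r, r];
        }
        return coeffs;
    }
}
```

Calling `PolyFit.Fit(xs, ys, 3)` would give a cubic and `PolyFit.Fit(xs, ys, 1)` a straight line, so the order is just a parameter. The normal equations become ill-conditioned as the degree grows, so for higher orders a QR-based solver is usually the more robust choice.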

Author Comment

ID: 10913272
Thank you for responding, NTAC.

Ideally I would like to implement it myself but I assume the task is too daunting. Do you have any experience with this type of algorithm?

Is it easy to integrate Mapack with an application?

regards Imran

Author Comment

ID: 10913464

Another point I forgot to mention: how much more difficult would the algorithm become if I needed the facility to select the order of the derived equation, and to choose between a straight-line fit and a curve fit?

regards Imran

Expert Comment

ID: 10914757

Sorry for the delay in responding

It is very easy to use Mapack with your application. If you are using Visual Studio, you can add the Mapack DLL to your project as a reference and then put the line "using Mapack;" at the top of your file. After that you can call any of the functions the DLL implements.

As for your second question, I'm not sure whether Mapack has the flexibility to do that. You can check the readme. I did, but I don't understand the math (it's been a few years since I was in school).

Good luck
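
On the straight-line versus curve question: a straight-line fit is the degree-1 case of a polynomial fit and has a simple closed form, so it does not need a matrix library at all. The following is a minimal sketch that is independent of Mapack; the `LineFit` class name and its signature are hypothetical.

```csharp
using System;

public static class LineFit
{
    // Hypothetical helper: least squares fit of y ~ intercept + slope * x.
    public static void Fit(double[] xs, double[] ys,
                           out double slope, out double intercept)
    {
        if (xs.Length != ys.Length || xs.Length < 2)
            throw new ArgumentException("Need two or more matching x/y values.");

        int n = xs.Length;
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        for (int i = 0; i < n; i++)
        {
            sumX += xs[i];
            sumY += ys[i];
            sumXY += xs[i] * ys[i];
            sumXX += xs[i] * xs[i];
        }

        // The denominator is zero only when every x value is identical.
        double denom = n * sumXX - sumX * sumX;
        if (Math.Abs(denom) < 1e-12)
            throw new InvalidOperationException("All x values are the same.");

        slope = (n * sumXY - sumX * sumY) / denom;
        intercept = (sumY - slope * sumX) / n;
    }
}
```

A curve fit of higher order follows the same idea with more unknowns, which is where a linear solver (your own or a library's) comes in.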

Author Comment

ID: 10916015
Thanks for responding NTAC.

Do you have any experience using the least squares functionality in Mapack as it stands? If there are a few mathematical commands to run, are you aware of them?


regards Imran

Assisted Solution

_TAD_ earned 400 total points
ID: 10921531
Based on your description (derive an equation of nth order), I am going to assume that you are looking for something more complex than a simple linear regression analysis.

Visually, it is exceptionally easy to view a scatter plot and draw a best fit curve.  To mathematically generate this curve requires multi-variable calculus and partial derivatives.
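
For a polynomial model the partial derivatives reduce to a linear system, which is what makes the problem tractable in code. A sketch of the standard derivation follows; the symbols $a_j$ (coefficients), $m$ (order) and $N$ (number of points) are introduced here for illustration and do not appear elsewhere in this thread.

```latex
% Sum of squared errors for a polynomial of order m through N points
S(a_0,\dots,a_m) = \sum_{i=1}^{N} \Bigl( y_i - \sum_{j=0}^{m} a_j x_i^{j} \Bigr)^{2}

% Set each partial derivative to zero
\frac{\partial S}{\partial a_k}
  = -2 \sum_{i=1}^{N} x_i^{k} \Bigl( y_i - \sum_{j=0}^{m} a_j x_i^{j} \Bigr) = 0,
  \qquad k = 0,\dots,m

% Rearranged: m + 1 linear equations (the normal equations) in a_0 .. a_m
\sum_{j=0}^{m} a_j \sum_{i=1}^{N} x_i^{j+k} = \sum_{i=1}^{N} y_i \, x_i^{k},
  \qquad k = 0,\dots,m
```

Solving these $m + 1$ equations for $a_0,\dots,a_m$ gives the best-fit polynomial in the least squares sense.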

Here's a complete description:

However... if you are just looking for a rough estimate, there are some simple calculations that will get you close.

Here's something to get you started ({R} represents a real number: 1.0, 1.5, 12.723, etc.)

Gather all of your data points (x's and y's).

{we'll use these points as samples:  (.5, 2); (.25, 3); (.75,5); (1,11) }

Now take a look at all of your Y values where 0<X<1

Y Calc1:  first find the mean of all of the Y values in this range; the final value will then be recalculated using the least squares method.

[(2 + 3 + 5 + 11)/4]  = 5.25

Now we are going to recalculate using the least squares method, but we are going to exclude points that are too extreme.  To determine extreme points we will take our mean and multiply it by 0.66 (a rule of thumb standing in for two standard deviations from the mean; it is not an exact figure).

5.25 * 0.66 = 3.465   --->  Recalculate using least squares, but only include values between (5.25 - 3.465) and (5.25 + 3.465), i.e. between 1.785 and 8.715

Since Y = 11 is out of scope, we will call this an extreme point and ignore it.

SqrRt[(2*2 + 3*3 + 5*5)/3]  ≈  3.56

This is now our new Y value that we can place at the X = 0.5 position

Following the same procedure for each set of x values you will calculate sets of coordinates that follow a best fit curve.
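
The arithmetic of this rough estimate, for one range of x values, can be sketched in C# as below. This only mirrors the steps described in this comment (mean, cutoff band, square root of the mean of squares over the kept values); it is not a true least squares fit, and the `RoughEstimate` class name and `cutoffFactor` parameter are hypothetical.

```csharp
using System;

public static class RoughEstimate
{
    // Hypothetical helper: returns the smoothed y value for one x-range.
    // cutoffFactor is the 0.66 multiplier from the description above.
    public static double Estimate(double[] ys, double cutoffFactor)
    {
        if (ys.Length == 0)
            throw new ArgumentException("Need at least one y value.");

        // Step 1: mean of all y values in the range.
        double mean = 0.0;
        foreach (double y in ys) mean += y;
        mean /= ys.Length;

        // Step 2: keep only values within mean +/- (mean * cutoffFactor).
        double band = mean * cutoffFactor;
        double sumSquares = 0.0;
        int kept = 0;
        foreach (double y in ys)
        {
            if (y >= mean - band && y <= mean + band)
            {
                sumSquares += y * y;
                kept++;
            }
        }
        if (kept == 0)
            throw new InvalidOperationException("No values inside the band.");

        // Step 3: square root of the mean of squares of the kept values.
        return Math.Sqrt(sumSquares / kept);
    }
}
```

For the sample values 2, 3, 5 and 11 with a factor of 0.66, the value 11 falls outside the band and the result is the square root of 38/3, roughly 3.56.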

The caveat is that this really only works well with a lot of points.  With few points the estimates become very inaccurate compared to the calculus version.

Also, for good estimates you may want to find groupings of approximately 10 points.  If you only have 6 points between 0<X<1 then you may want to increase the range to 2 and do the same calculations for 0<X<2.  Then you would place your point at the middle of the range (X=1).

This comment does not answer your question in the way you want, but I'm afraid that a better explanation requires you to be well versed in advanced calculus and also requires a better forum in which to discuss this (ASCII characters and HTML are not adequate).


Author Comment

ID: 10922226
Thank you for responding, TAD, and thanks for the theory.

A pure solution for least squares seems a very involved task, and my math is too weak to tackle such a thing. I was hoping the algorithm would be less involved.

I will therefore need to focus on finding an existing implementation of least squares in C#. I know libraries like Mapack exist (thanks to NTAC), but I don't know whether it will do the job.

I am dealing with around 50 coordinates at a time and need to find an equation, most probably at cubic or quartic level, hence the need for the option to choose the nth degree.

Are you aware of a C# implementation of least squares up to quartic level?

Thanking you in advance

regards Imran

Expert Comment

ID: 10922700

I am not aware of any .Net package that does what you are asking, although they may exist.

Something that may help you is to look at some applications geared toward mathematical computation (like MATLAB), see if you can find what you are looking for there, and borrow some of the logic of their code.

Sorry I can't be of more help.
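
As one example of borrowing that logic: MATLAB's `polyval` evaluates a polynomial from its coefficients, which is what you need to check how well a fitted cubic or quartic matches your points. A rough C# analogue is sketched below. The `PolyEval` class is hypothetical, and it assumes coefficients in ascending order (constant term first), which is the reverse of `polyval`'s convention.

```csharp
using System;

public static class PolyEval
{
    // Evaluates c[0] + c[1]*x + c[2]*x^2 + ... using Horner's rule.
    public static double Evaluate(double[] coeffs, double x)
    {
        double result = 0.0;
        for (int i = coeffs.Length - 1; i >= 0; i--)
            result = result * x + coeffs[i];
        return result;
    }

    // Sum of squared differences between the data and the polynomial.
    // A smaller value means the polynomial follows the points more closely.
    public static double ResidualSumOfSquares(double[] coeffs,
                                              double[] xs, double[] ys)
    {
        if (xs.Length != ys.Length)
            throw new ArgumentException("xs and ys must have the same length.");

        double rss = 0.0;
        for (int i = 0; i < xs.Length; i++)
        {
            double r = ys[i] - Evaluate(coeffs, xs[i]);
            rss += r * r;
        }
        return rss;
    }
}
```

Comparing the residual sum of squares for a cubic and a quartic fitted to the same 50 points is one simple way to decide whether the extra order is worth it, bearing in mind that a higher order never increases this value on the fitted data.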

Author Comment

ID: 11071023
Thank you both very much for your assistance.

NTAC and _TAD_, both of you have provided me with pointers to start research on.

regards Imran

Question has a verified solution.