Solved

# Best Fit Equation for a Line

Posted on 2013-10-23

I am working on a module in our application that will calculate the volume of a tank given a height. Let us say X = height in inches of Tank 001, the user will specify a formula to calculate volume Y (note that computed volume would vary depending on the density of what is being stored in the tank, but kind of irrelevant to this question). As an example:

Y = C * Power(X,3) + B * Power(X,2) + A * X + k

A, B, C are coefficients in this 3rd order polynomial equation, X is the height in inches, and k is a constant.

The users used Excel to come up with this equation by having Excel plot a line for a series of X, Y coordinates and then it gave them a best fit equation to interpolate the line.

I already have an algorithm (in SQL) that will calculate Y based on the above user supplied formula (for given A, B, C, and k).

Now what they are asking for is the ability to enter in a series of sample points, let us say between 15 and 40 samples of X and Y, and have our routine spit back out a formula. In other words, it would determine what the A, B, C, and k are for the polynomial equation.

Note that not all of their example formulas are third order polynomials, some are just linear equations, Y = [A] * X + [k], and possibly other stuff is theoretically possible.

I did find a routine that can calculate a very rough (VERY rough) 3rd order polynomial for a sample set but it does it by splitting the data set (ordered) into 4 regions and solves a system of 4 equations using the averages of X for each region. This actually didn't turn out to be that bad, but on the lower and upper ends of the data set it's just too much error.

Has anyone already solved this before and if so, any advice or an algorithm?

This would not have to be an algoritm in TSQL, I can convert it from whatever. Or suppose I could make this a CLR if it was .NET.