Link to home
Start Free TrialLog in
Avatar of Michael Hamacher
Michael Hamacher

asked on

Least Squares optimization in excel

I would like to perform a least squares optimization in excel using solver.
I tried the following set-up:
- Given is a vector of original exposure across a range of seven nodes (C8:I2)
- The aim is to replicate this exposure at each point as close as possible from a set of 5 instruments. The contribution of each instrument on any of the 7 nodes is respresented in the matrix C11:I15
- I try to solve for the number of units of each instrument (K11:K15) such that the sum of the squared differences (cell C24) is minimized , using solver

The issue is that the solution returned by solver is fairly poor in terms of fit/node (C20:I20). As a check, I have used Excel's linest functions, and the fit appears much better (S9:S14, which indicate a much better fit per node than the solver-based solution as displayed in row B20-H20).

Any idea as to how I may improve the set-up of the Lsq-optimization using excel would be appreciated
Avatar of hypercube
hypercube
Flag of United States of America image

How about an Excel file of the data?  There is much needed to understand.

C8:I2 is a matrix, not a vector .... ????

There are "nodes", "points", "instruments", etc.

I've built a trial spreadsheet but don't understand what the values in rows 2-8 represent.  Measurements over time? or?......
Avatar of Michael Hamacher
Michael Hamacher

ASKER

Hi

pls find a sample spreadsheet attached

Thank you
Lsq-concept-v2.xlsx
OK.  Thank you! I'm looking at it now....
OK.  I worked on it some and got what appear to be good results.

I couldn't tell what constraints you put on the variables so I bounded them between -1,000,000,000 and 1,000,000,000.
This is necessary for Solver to run.

I ran one case as you did with the same weight applied for each Instrument.  This gives the same results as the straight line fit.

I ran another case using different weights for each instrument for each node.  It's overdetermined so it gets zero error at each node and the weights can have any number of outcomes.  So perhaps not too useful.

I ran another case which may be realistic for you by using the first case and applying a node weight.  That looks pretty good.  So, the instruments are all weighted the same but then they are weighted according to the node - so they are *relatively* weighed the same.
Thanks vm, well appreciated. Can you pls share you spreadsheet solution?

Cheers!
Sorry, I'm sure that I had included it.  I don't know what happened.
Least-Squares-Question-Answer.xlsx
First of all, thank you very much indeed for undertaking the effort to construct and post your own spreadsheet solution, appreciated really.

I went through the sheet and several questions/issues emerged (it almost appears that some information was 'lost' in the process of attaching the sheet/re-opening it on my end)

(1)      I am unable to replicate your results even just for the base case (equal weights for nodes and instruments). It appears that my original formulae have been retained, therefore it would be very helpful to see what caused the difference. Would you be able to  share how you set the solver up?

(2)      I am looking at variant 3, i.e. any one instrument retains the same weight across nodes (as this is a necessary condition), but each node may be assigned a different weighting.
-      Where exactly are the node weights specified? I do noticed you have a row (64) ‘Node Weight’, but these numbers are in the order of magnitude of millions? (for weighted least squares, my understanding is that the weights are specified in the format of 1/(y^2). Moreover, the instrument weights appear to differ across each node, at least when I open the file (e.g. C59:C63 = weights for 1st node shows different weights for Instruments 1-5 than E59:E63 = weights for 2nd node, and the same applies for all consecutive nodes)
-      How do you set up the solver in this case please? When I open your spreadsheet & the solver set up, the solver input mask (I have attached the screenshot for your reference) shows as objective to minimize cell O59 (which appears to ), and by minimizing any of the instrument weights in each node (plus the node weights), i.e. C59:C64, E59:E64 etc), in contrast to the idea that the instrument weights are to be kept constant
Would it be possible to maybe highlight the decision variable cells, and the optimization target, and explain which solver parameters you use?

3)      It appears that you use the ‘Evolutionary Solver’ Method. Is there any particular reason why? It seems that this particular method takes longer than the other two (GRG nonlinear & simplex LP)

Thank you very much!
SolverSet-Up.png
(1)      I am unable to replicate your results even just for the base case (equal weights for nodes and instruments). It appears that my original formulae have been retained, therefore it would be very helpful to see what caused the difference. Would you be able to  share how you set the solver up?
The first case used I11:I15 as the variable cells.  
Then each of those cells was set with constraints:
>=-1,000,000,000 and <=1,000,000,000
User generated image
(2)  
  I am looking at variant 3, i.e. any one instrument retains the same weight across nodes (as this is a necessary condition), but each node may be assigned a different weighting.
Correct.

Where exactly are the node weights specified? I do noticed you have a row (64) ‘Node Weight’, but these numbers are in the order of magnitude of millions? (for weighted least squares, my understanding is that the weights are specified in the format of 1/(y^2).
They are specified in Row 64.  This really has nothing to do with least squares, it's just a scaling that goes along with each node.  Then these node weights are also "variables" in the Solver and are given a wide range (actually the same range as the other variables just because I didn't know any better).

I believe that the variables were the parameters in P59:P63 and the node weights in row 64.  This would keep the relative instrument weights the same as scaled by the node weights.  Somehow the cell expressions have been lost so I'll have to redo it.

So, each Instrument/Node weight was the (node weight) X P59:P63 respectively.

More to come....
Thanks a lot! and yes, if you can repost it with the re-instated cell expressions that would be much appreciated indeed (I tried fooling around with it myself, but don't quite trust the results).
One more q: I noticed you added Sqrt(Sum(Sq)): do you minimize this expression (instead of Sum(SqD)?
Lastly, It seems that GLG nonlinear seems to be sufficient (and faster than the evolutionary approach)....
ASKER CERTIFIED SOLUTION
Avatar of hypercube
hypercube
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Thanks a lot, that's perfect really.

Fyi: I do have one follow up question which I however decided to post as a separate question as I feel you have already dedicated so much effort to this one (ID: 28543341, Weighted least Squares Excel). Feel free to have a look (after another coffee that is)....
Very comprehensive solution; re-designed the spreadsheet and included additional case which was very insightful