Be seen. Boost your questionâ€™s priority for more expert views and faster solutions

I would like to perform a least squares optimization in excel using solver.

I tried the following set-up:

- Given is a vector of original exposure across a range of seven nodes (C8:I2)

- The aim is to replicate this exposure at each point as close as possible from a set of 5 instruments. The contribution of each instrument on any of the 7 nodes is respresented in the matrix C11:I15

- I try to solve for the number of units of each instrument (K11:K15) such that the sum of the squared differences (cell C24) is minimized , using solver

The issue is that the solution returned by solver is fairly poor in terms of fit/node (C20:I20). As a check, I have used Excel's linest functions, and the fit appears much better (S9:S14, which indicate a much better fit per node than the solver-based solution as displayed in row B20-H20).

Any idea as to how I may improve the set-up of the Lsq-optimization using excel would be appreciated

I tried the following set-up:

- Given is a vector of original exposure across a range of seven nodes (C8:I2)

- The aim is to replicate this exposure at each point as close as possible from a set of 5 instruments. The contribution of each instrument on any of the 7 nodes is respresented in the matrix C11:I15

- I try to solve for the number of units of each instrument (K11:K15) such that the sum of the squared differences (cell C24) is minimized , using solver

The issue is that the solution returned by solver is fairly poor in terms of fit/node (C20:I20). As a check, I have used Excel's linest functions, and the fit appears much better (S9:S14, which indicate a much better fit per node than the solver-based solution as displayed in row B20-H20).

Any idea as to how I may improve the set-up of the Lsq-optimization using excel would be appreciated

C8:I2 is a matrix, not a vector .... ????

There are "nodes", "points", "instruments", etc.

I've built a trial spreadsheet but don't understand what the values in rows 2-8 represent. Measurements over time? or?......

I couldn't tell what constraints you put on the variables so I bounded them between -1,000,000,000 and 1,000,000,000.

This is necessary for Solver to run.

I ran one case as you did with the same weight applied for each Instrument. This gives the same results as the straight line fit.

I ran another case using different weights for each instrument for each node. It's overdetermined so it gets zero error at each node and the weights can have any number of outcomes. So perhaps not too useful.

I ran another case which may be realistic for you by using the first case and applying a node weight. That looks pretty good. So, the instruments are all weighted the same but then they are weighted according to the node - so they are *relatively* weighed the same.

Least-Squares-Question-Answer.xlsx

I went through the sheet and several questions/issues emerged (it almost appears that some information was 'lost' in the process of attaching the sheet/re-opening it on my end)

(1) I am unable to replicate your results even just for the base case (equal weights for nodes and instruments). It appears that my original formulae have been retained, therefore it would be very helpful to see what caused the difference. Would you be able to share how you set the solver up?

(2) I am looking at variant 3, i.e. any one instrument retains the same weight across nodes (as this is a necessary condition), but each node may be assigned a different weighting.

- Where exactly are the node weights specified? I do noticed you have a row (64) â€˜Node Weightâ€™, but these numbers are in the order of magnitude of millions? (for weighted least squares, my understanding is that the weights are specified in the format of 1/(y^2). Moreover, the instrument weights appear to differ across each node, at least when I open the file (e.g. C59:C63 = weights for 1st node shows different weights for Instruments 1-5 than E59:E63 = weights for 2nd node, and the same applies for all consecutive nodes)

- How do you set up the solver in this case please? When I open your spreadsheet & the solver set up, the solver input mask (I have attached the screenshot for your reference) shows as objective to minimize cell O59 (which appears to ), and by minimizing any of the instrument weights in each node (plus the node weights), i.e. C59:C64, E59:E64 etc), in contrast to the idea that the instrument weights are to be kept constant

Would it be possible to maybe highlight the decision variable cells, and the optimization target, and explain which solver parameters you use?

3) It appears that you use the â€˜Evolutionary Solverâ€™ Method. Is there any particular reason why? It seems that this particular method takes longer than the other two (GRG nonlinear & simplex LP)

Thank you very much!

SolverSet-Up.png

(1) I am unable to replicate your results even just for the base case (equal weights for nodes and instruments). It appears that my original formulae have been retained, therefore it would be very helpful to see what caused the difference. Would you be able to share how you set the solver up?The first case used I11:I15 as the variable cells.

Then each of those cells was set with constraints:

>=-1,000,000,000 and <=1,000,000,000

I am looking at variant 3, i.e. any one instrument retains the same weight across nodes (as this is a necessary condition), but each node may be assigned a different weighting.Correct.

Where exactly are the node weights specified? I do noticed you have a row (64) â€˜Node Weightâ€™, but these numbers are in the order of magnitude of millions? (for weighted least squares, my understanding is that the weights are specified in the format of 1/(y^2).They are specified in Row 64. This really has nothing to do with least squares, it's just a scaling that goes along with each node. Then these node weights are also "variables" in the Solver and are given a wide range (actually the same range as the other variables just because I didn't know any better).

I believe that the variables were the parameters in P59:P63 and the node weights in row 64. This would keep the relative instrument weights the same as scaled by the node weights. Somehow the cell expressions have been lost so I'll have to redo it.

So, each Instrument/Node weight was the (node weight) X P59:P63 respectively.

More to come....

One more q: I noticed you added Sqrt(Sum(Sq)): do you minimize this expression (instead of Sum(SqD)?

Lastly, It seems that GLG nonlinear seems to be sufficient (and faster than the evolutionary approach)....

All Courses

From novice to tech pro — start learning today.

I Saved each Solver case in column R. This is better and less work than pictures. All you have to do is open Solver and Load from the cells below the "Solver Case X" title and down until it's empty. e.g. R6:R11 and so on.... Then you can see and examine how Solver was set up and you can run it.

Case 1 is the original case and the results are the same as those you had originally using the straight line fit.

Case 2 uses separate weights for each instrument/node pair. As I mentioned before, this is an overdetermined case and there will be many solutions. You can always get zero error doing it this way but it's not very satisfying or valid. For example, you could use all zero instrument weights except for one at each node at least for five of the instruments.

Case 3 uses the same weights for each instrument (shown in column P). And, it adds a weight for each node (shown in row 64). This time, I limited the node weights to be positive so the instrument weights would always "look the same" (sign) relative to the column P numbers.

Case 3 could be solved in many ways also but not pathologically as in Case 2. For example, divide the node weights by 2 and multiply the instrument weights in column P by 2 and the node/instrument weights will remain the same.

The node instrument weights are all:

[instrument weight Px]*[node weight x64]*1,000,000.

One could just as easily multiply the values in col P by 1,000,000 and leave them at that.

The limits for the variables in Solver are simply big numbers in order to make sure they aren't actually limiting the solution. They are there just to satisfy the rules for Solver variables. Well, except for the node weights which I did want to keep >=0 as I mention above.

Case 3 gives you the opportunity to play around with the values.

For example, you can take the weights from Case 1 and put them in column P59:P63.

Then set the node weights all to 1.0.

The result will be the same as Case 1. A good check.

Then you can run the Case 3 Solver to get the result that's shown here.

I also set the node weight in column C to be 1.0 and not a variable. So, you might say it's the "reference" node. The other node weights and all of the instrument weights then change to accommodate this.

Yes, you can use GRG Nonlinear and it is faster. My Solver was already set to Evolutionary because I've been running some large, nonlinear solutions lately.

Least-Squares-Question-Answer3.xlsx