Link to home
Start Free TrialLog in
Avatar of ingo_klein
ingo_klein

asked on

Excel: Analysis Toolpack -> regression

I have read that it is possible to regress one y variable on, say, two x variables, and the tool to do this would be the analysis toolpack.
When I go to regression, and select the input range for y, there is no problem. But how do I select the input range for the x variables? I have one sheet with x1 variables and another one with x2 variables, all in columns.
Thank you for your help!
Avatar of cri
cri
Flag of Switzerland image

Simply click on the worksheet tab to switch between them, then just select the area _in_ the worksheet. If you do not see the tabs: Tools|Options|View

If the above does not help: Which Excel version, service release and language do you have ?

Avatar of ingo_klein
ingo_klein

ASKER

I use Excel 2002 (XP) in English. I have no problem in running a simple regression, for example:

input y range:  $A$1:$A$6
input x range:  $B$1:$B$6

The issue is also not being able to slect sheets. The problem is that as soon as I try to add another x variable, so that, for example:

input y range:  $A$1:$A$6
input x range:  $B$1:$B$6,$C$1:$C$6

it will not work. the error message is as follows:

Regression - Input Range must be a contiguous reference.


maybe the comma is not the right seperator?


thanks again
I use
Must have misunderstood you, thought you were refering to having your data on different sheets.

Still not sure I understand you correctly this time: You have 2 x values per y value ? If you want all in one regression: Duplicate the y range, i.e copy A1:A6 to A7:A12 and move the second x series to from C1:C6 to B7:B12. If not, i.e. you want to make two regressions, you will have to run the wizard twice. If none of the above: Sorry. You must state what you are trying to do, because I still do not get it.

Remark: Use the same list separator as you do for a function, like
=IF(1=2,"really?","ok")
ok, consider the following regression:

y(t) = A*x1(t)+B*X2(t)+e

hence, a bi-variate regression, which needs to be run for a total of 1265 datasets. Each dataset contains about 2000 observations.

to save space ;) consider the following example:

  y  x1 x2
  7  2  3
  6  2  4
  2  6  3
  4  1  5
  6  2  6
  7  7  6
  9  5  8


i want to know how well y is explained by x1 and x2 TOGETHER.
What I said above, The problem is that as soon as I try to add another x variable, so that, for example:

input y range:  $A$1:$A$6
input x range:  $B$1:$B$6,$C$1:$C$6

it will not work. the error message is as follows:

Regression - Input Range must be a contiguous reference.
ok, consider the following regression:

y(t) = A*x1(t)+B*X2(t)+e

hence, a bi-variate regression, which needs to be run for a total of 1265 datasets. Each dataset contains about 2000 observations.

to save space ;) consider the following example:

  y  x1 x2
  7  2  3
  6  2  4
  2  6  3
  4  1  5
  6  2  6
  7  7  6
  9  5  8


i want to know how well y is explained by x1 and x2 TOGETHER.
What I said above, The problem is that as soon as I try to add another x variable, so that, for example:

input y range:  $A$1:$A$6
input x range:  $B$1:$B$6,$C$1:$C$6

it will not work. the error message is as follows:

Regression - Input Range must be a contiguous reference.
ASKER CERTIFIED SOLUTION
Avatar of TigerMan
TigerMan
Flag of Australia image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
ok that's good news! the problem is now, however, that I have a total of 49 sheets, each with 49 series (columns). Since Excel will not allow me to put everything in one sheet, what do you suggest?
The x1 and x2 series are all in one sheet, but the y's are in 49 different sheets and within these sheets in 49 different columns, if you see what i mean?
Apparently you have to pass the X input 'en-bloc' i.e. $B$1:$C$6

Remarks:

a) When duplicating the regression in http://www.mathsci.appstate.edu/~hph/3310/linreg/reg5.pdf , Data|DataAnalysis|Regression ran fine BUT I obtained other (!?) results than those in the link. Go figure.

b) A nice link, loads slow as mud, but worth to remember: http://www.stat.uiowa.edu/~jcryer/JSMTalk2001.pdf 

Therefore, consider using a professional package. As you already found out, I am no statistics expert, so consult overviews _like_ http://condor.bcm.tmc.edu/Statistics/links.html
Did not see Tigerman's post.

Re "...but the y's are in 49 different sheets and within these sheets in 49 different columns, if you see what i mean? ..." Sorry, I am not. Unless you asking how to speed up the input for 49 bi-variate regressions. In this case IMO you will have to record a macro for 1-2 operations and then tweaking it to make it generalized.
Dear Cri, Dear Tigerman!

Indeed, the problem is that I cannot pass the x input 'en bloc', because the file has too many x's and y's; it is just to big (>100mb) for rearranging all the columns in the sheets to be next to each other.
 I have several professional stats packages, but the problem here is of a manual labour one: i would have to run over a thousand manual regressions, although i am only interested in the distributions of the r's or r^2s (correlation coefficients or coefficient of determinations), but the correl function I cannot use because it assumes just one explanetory variable (one x).

should this really not be possible with Excel?
hi ingo_klein,

Just FWIW, i lecture in statistics at uni - so sort of know what you are on about.

Best I can offer is you get me a copy of your file and I can play.

I am only dialup at home - but can pick it up via DSL at work tomorrow [I'm in Australia, so time is different].

Give me a chance and I can flick through and at least determine "Yes" or "No" and/or recommend a course of action.

Sounds like either a Macro OR some tinkering to pull into sensible format.

If you choose to do this, please

: ZIP to smallest file size [or even pull just a couple of sheets that explain the magnitude of problem] AND

: provide text box hints to what you want done - in the workbook itself AND

: email to d64471023a@hotmail.com

Dave
and I haven't received any files, so you have solved your problem?

dave
yes, i have, actually. It took me one night, one pack of cigarettes and two jugs of coffee: I have created a sheet, which dynamically pulls in the 3 series (or columns) from the other sheets. By dynamically, i mean that all you need to do is to change the location of a number (i have chosen 1) relative to a list of possible combinations.
The Regression Tool Pack does not work with a normal Visual Basic macro, therefore the toolpack had to be loaded 1177 times manually. It would then create the regression in a seperate sheet. From this sheet, a macro is run to copy all the relevant results into a sheet. Then the regression sheet is deleted automatically, and the number 1 is moved to the next combination field. After mouse click on alalysis toolpack, return, return, ctrl+z for macro three houndered times, i got really annoyed. i found the shareware program Mouse and Key Recorder v4.5a, which then did the rest for me (thank god!). After this, i discovered that it would be nice to modify parameters and run the  1177 regression another three times. I was very happy that i had taken the night for the sheet and found the program.

Since I do appreciate both of your help, how would you suggest should i distribute the points?

best,

IKlein
Oh, before I forget: TigerMan, are you fammiliar with GARCH and EGARCH (conditional variance modelling)?
hi ingo_klein,

Garch/Egarch - only knowledge is by virtue of reading not using these models - so I do not know enough to talk sensibly about them.  I have seen application in riskmetrics and forecasting Var - but not a thorough working knowledge - so sorry, cannot help you there.

Dave
Oh yeah, glad to see you found a workaround - notwithstanding the time/effort - but its worth it to 'beat Excel' I reckon?
Re "...The Regression Tool Pack does not work with a normal Visual Basic macro..."

I am quite sure it does. Will go into this.
Option Explicit

Sub RegressExpressDemo()
  'Requires Tools|Addins|AnalysisToolPak_VBA to be loaded
  Dim rngX12 As Range, rngY As Range
  Dim i As Integer
  With ActiveSheet.UsedRange 'Quick and dirty, better hardcode the actual range
    For i = 1 To 5 'same
      Set rngY = .Columns(1)
      Set rngX12 = Application.Union(.Columns(i + 1), .Columns(i + 2))
      MsgBox "Run " & i & ": rngX12=" & rngX12.Address
      Application.Run "ATPVBAEN.XLA!Regress", rngY, rngX12, False, False, , "", False, False, False, False, , False
    Next i
  End With
End Sub
ingo_klein: Award the Q to Tigerman, as he was first with the answer to your question i.e. pass range 'en-bloc'. I am simply brushing up my (feeble) stats knowledge.
BTW, if you get the macro to run: This does not make the results more trustworthy. See my first link.
For the record, should this Q ever be PAQued.

Useful link regarding Using Analysis ToolPak Functions In VBA
http://www.rb-ad.dircon.co.uk/rob/excelvba/tips/index.htm
Any luck ?
Sighhh...
Force Accepted Tigerman's answer.

SpideyMod
Community Support Moderator @Experts Exchange