Multiple Linear Regression

DColin
Hi Experts,

I have just entered some data into a website MLR calculator. What are the results telling me?

With standard linear regression I would know how to graph the relationship between X1 and Y or X2 and Y but how do I graph the relationship of X1, X2 and Y?
MLR.jpg

Can't tell what your numbers mean, or what you are trying to do.

But what the MLR is trying to do is find a plane that fits all your points the best.

But it seems your numbers are not linear.  All of your Y values are either 0 or 1.

In that case, if you round the calculated y value and get your original y value, the MLR has succeeded.  If it doesn't, then it has failed.
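To make the plane fit and the rounding check concrete, here is a minimal sketch in plain Python. The data are made up for illustration (they are not the poster's 199 points), and the helper names fit_plane and solve_linear_system are hypothetical. It fits y = b0 + b1*x1 + b2*x2 by ordinary least squares, then counts how many rounded fitted values match the original 0/1 outcome.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_plane(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


def predict(coef, a, b):
    """Height of the fitted plane at the point (a, b)."""
    return coef[0] + coef[1] * a + coef[2] * b


# Made-up illustration data, NOT the data from the question.
rating = [-8, -5, -2, 0, 3, 6, 9, 12]
distance = [120, 40, 90, 150, 30, 80, 20, 60]
sale = [0, 0, 0, 0, 1, 0, 1, 1]

coef = fit_plane(rating, distance, sale)
fitted = [predict(coef, a, b) for a, b in zip(rating, distance)]
# Round at 0.5 explicitly; Python's round() would send an exact 0.5 to 0.
rounded = [1 if f >= 0.5 else 0 for f in fitted]
hits = sum(r == s for r, s in zip(rounded, sale))
print(f"{hits} of {len(sale)} rounded fitted values match the original Y")
```

The same predict call, given a rating and a distance, returns the plane's height at that point, which is the number an MLR calculator reports as the calculated y.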

I can only see 8 of 199 points, so there's no way for me to tell if it makes any sense.
But it looks like it gets all the 0's right, and all the 1's wrong.  
And some of the 0's have calc values higher than some of the 1's.

What do the numbers mean, and what are you trying to do?
Use a 3D plotting program and plot x, y, and z.

Author

Commented:
See attached image.

The figures are for sales team performance.

X1 is a customer rating. A negative rating means they have refused sales team offers in the past. A positive rating means they have accepted sales team offers in the past.

X2 is the distance the sales team travel to the customer.

Y is whether a sale was made or not.

If I want to know the chance of a sale being made to a customer with a rating of 10, linear regression tells me it is about 55%.

If I want to know the chance of a sale being made to a customer 100 miles away, linear regression tells me it is about 45%.

What I want to know is what is the chance of a sale being made to a 10 rated customer who is 100 miles away. Can multiple linear regression tell me this?
MLR2.jpg

Top Expert 2010
Commented:
What I want to know is what is the chance of a sale being made to a 10 rated customer who is 100 miles away. Can multiple linear regression tell me this?

Maybe, maybe not.  Like d-glitch above, I suspect that your data are non-linear, although perhaps with a transformation there can be a linear approximation.  (For example, if the true relationship between two variables is logarithmic, then regressing on the logs rather than the actual values will show a linear relationship.)
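A small sketch of the log-transformation point, using synthetic data that follow y = 2 + 3*ln(x) exactly (an assumption made purely for illustration; nothing here comes from the poster's spreadsheet). A straight line fitted to the raw x values leaves large residuals, while the same fit on ln(x) recovers the relationship.

```python
import math


def fit_line(x, y):
    """Simple least-squares line; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def max_abs_residual(x, y):
    """Largest distance between a point and the fitted line."""
    intercept, slope = fit_line(x, y)
    return max(abs(b - (intercept + slope * a)) for a, b in zip(x, y))


# Synthetic data: y is exactly logarithmic in x.
x = [1, 2, 5, 10, 50, 100, 500]
y = [2 + 3 * math.log(v) for v in x]
log_x = [math.log(v) for v in x]

raw_error = max_abs_residual(x, y)      # line through (x, y)
log_error = max_abs_residual(log_x, y)  # line through (ln x, y)
intercept, slope = fit_line(log_x, y)

print(f"worst residual on raw x: {raw_error:.3f}")
print(f"worst residual on ln(x): {log_error:.3g}")
print(f"fit on ln(x): y = {intercept:.3f} + {slope:.3f} * ln(x)")
```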

It would be a heck of a lot easier to answer, though, if you uploaded an actual spreadsheet and not image files :)

Author

Commented:
Please find data as excel file attached.
MLR2.xls
Top Expert 2010

Commented:
You might be better off using the magnitude of a sale rather than a binary as your dependent variable.

As it is, your model has practically no explanatory power:
Using rating + distance has an r squared of 0.11
Using rating only: 0.10
Using distance only: 0.002
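
For reference, R squared is 1 minus the ratio of the residual sum of squares to the total sum of squares around the mean. The sketch below uses a made-up 0/1 outcome and made-up fitted values (not the figures above) to show why fitted values that stay close to the overall sale rate score low.

```python
def r_squared(actual, fitted):
    """R^2 = 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up binary outcome and fitted values, for illustration only.
sale = [0, 0, 1, 0, 1, 1, 0, 1]
mean_only = [0.5] * len(sale)  # always predict the overall sale rate
weak = [0.4, 0.45, 0.55, 0.5, 0.6, 0.55, 0.45, 0.5]  # leans the right way, barely

print(r_squared(sale, mean_only))  # baseline: no explanatory power
print(r_squared(sale, weak))       # slightly better than the baseline
print(r_squared(sale, sale))       # perfect fit
```

With a 0/1 outcome, fitted values that hover near the mean can point the right way and still leave most of the variance unexplained.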

Author

Commented:
matthewspatrick

I am aware of the low R squared values due to the use of binary data. The models have proved accurate in the field.

What I need to know is how to combine the two sets of data, because we are using a messy workaround at the moment and I was thinking multiple regression was the way to go. Is it not?
You are trying to characterize your customers.  Their sales history is very likely meaningful in predicting their future behavior.

But the distance the sales team has to travel may not be a predictor of their behavior.  It is clearly a cost on your side, but how and why would it affect the customer?  Do you think they feel guilty if you have traveled a long way to make a sales call?

If distance doesn't affect customer behavior, it shouldn't be in the model.  

It probably should be a model that helps you allocate sales resources.  But that would be a different calculation.

Author

Commented:
d-glitch

"But the distance the sales team has to travel may not be a predictor of their behavior."

You are wrong. This is not an academic exercise; this is the real world. We have been acting on the distance model since September and it is working as predicted.

All I want to know is how do I predict a sale based on customer rating and distance.
"If I want to know the chance of a sale being made to a customer with a rating of 10, linear regression tells me it is about 55%.

If I want to know the chance of a sale being made to a customer 100 miles away, linear regression tells me it is about 45%.

What I want to know is what is the chance of a sale being made to a 10 rated customer who is 100 miles away. Can multiple linear regression tell me this?"

-
"Can multiple linear regression tell me this?"
With the data you have, probably not
BUT
there is a better way.

You say
rating of 10 = 55%
100 miles = 45%
combine the two probabilities
Combined = rating 10 * 100 miles
0.55 * 0.45 = 0.2475, or about a 0.25 probability of getting a sale from a 10 at 100 miles.
Do that for all.
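
A sketch of this product rule in Python. Only the 55% and 45% figures come from the thread; every other entry in the two lookup tables is a hypothetical placeholder, and combined_estimate is an illustrative name. The rule treats the two single-variable estimates as probabilities that can simply be multiplied, which is the poster's suggestion rather than a fitted model.

```python
def combined_estimate(p_rating, p_distance):
    """Product rule from the post: multiply the two single-variable estimates."""
    return p_rating * p_distance


# 10 -> 0.55 and 100 -> 0.45 are from the thread; the rest are HYPOTHETICAL.
by_rating = {0: 0.40, 10: 0.55, 20: 0.70}
by_distance = {50: 0.50, 100: 0.45, 150: 0.40}

print(round(combined_estimate(by_rating[10], by_distance[100]), 4))  # 0.2475

# "Do that for all": one combined figure per (rating, distance) pair.
table = {
    (r, d): combined_estimate(pr, pd)
    for r, pr in by_rating.items()
    for d, pd in by_distance.items()
}
for (r, d), p in sorted(table.items()):
    print(f"rating {r:>3}, {d:>3} miles: {p:.4f}")
```

Note that a product of two probabilities can never exceed either factor, so the combined figure always comes out at or below both single-variable estimates.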
ozo
Most Valuable Expert 2014
Top Expert 2015

Commented:
That's like using a plane to go through the log of the probabilities.
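
One way to read that remark, written out as a sketch. The symbols p_r, p_d, a_1, b_1, a_2 and b_2 are introduced here for illustration, and the linear forms for the two logs are an assumption: multiplying the two estimates is the same as adding their logs, so if each log is roughly linear in its own variable, the log of the combined figure is a plane in X1 and X2.

```latex
\begin{align*}
P(x_1, x_2) &= p_r(x_1)\, p_d(x_2) \\
\log P(x_1, x_2) &= \log p_r(x_1) + \log p_d(x_2) \\
\intertext{Assuming $\log p_r(x_1) \approx a_1 + b_1 x_1$ and $\log p_d(x_2) \approx a_2 + b_2 x_2$:}
\log P(x_1, x_2) &\approx (a_1 + a_2) + b_1 x_1 + b_2 x_2 \\
\intertext{Check with the figures from the thread (natural logs):}
\ln 0.55 + \ln 0.45 &\approx -0.598 - 0.799 \approx -1.396 \approx \ln 0.2475
\end{align*}
```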
