DColin asked:

Multiple Linear Regression

Hi Experts,

I have just entered some data into a website MLR calculator. What are the results telling me?

With standard linear regression I would know how to graph the relationship between X1 and Y, or between X2 and Y, but how do I graph the relationship of X1, X2 and Y together?
MLR.jpg
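
With two predictors the fitted model Y = b0 + b1*X1 + b2*X2 is a plane rather than a line. One simple way to graph it in 2D is to hold X2 at a few fixed values and draw one line of Y against X1 for each. The sketch below only illustrates that idea: the coefficients are hypothetical and are not taken from MLR.jpg.

```python
# Hypothetical coefficients for Y = b0 + b1*X1 + b2*X2 (NOT read from MLR.jpg).
b0, b1, b2 = 0.40, 0.015, -0.001

def predict(x1, x2):
    """Height of the regression plane at the point (x1, x2)."""
    return b0 + b1 * x1 + b2 * x2

# One line of Y against X1 for each fixed value of X2.
x1_values = [-10, -5, 0, 5, 10]
lines = {x2: [predict(x1, x2) for x1 in x1_values] for x2 in (0, 50, 100)}

for x2, ys in lines.items():
    print(f"X2 = {x2:>3}: " + ", ".join(f"{y:.3f}" for y in ys))
```

Each printed row is one line on the chart. The lines are parallel (same slope b1) and shifted up or down by b2 times the chosen X2, which is what a plane looks like when sliced this way.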
ASKER CERTIFIED SOLUTION
d-glitch
DColin (ASKER)

See attached image.

The figures are for sales team performance.

X1 is a customer rating. A negative rating means they have refused sales team offers in the past. A positive rating means they have accepted sales team offers in the past.

X2 is the distance the sales team travel to the customer.

Y is whether a sale was made or not.

If I want to know the chance of a sale being made to a customer with a rating of 10, I can use linear regression, which tells me about 55%.

If I want to know the chance of a sale being made to a customer 100 miles away, I can use linear regression, which tells me about 45%.

What I want to know is the chance of a sale being made to a 10-rated customer who is 100 miles away. Can multiple linear regression tell me this?
MLR2.jpg
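
For reference, a multiple linear regression on rating and distance produces a single equation that can be evaluated at both values at once. The sketch below is a minimal pure-Python least-squares fit; the data rows are made up for illustration (they are not the contents of MLR2.xls), and `fit_plane` is a hypothetical helper name. Because Y is 0/1, the fitted value is only a rough estimate and is not guaranteed to stay between 0 and 1.

```python
def fit_plane(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    # Build the augmented 3x4 system [X'X | X'y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(3):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [a - f * b for a, b in zip(m[r], m[c])]
    return [m[i][3] / m[i][i] for i in range(3)]

# Made-up example data: customer rating, miles travelled, sale made (1) or not (0).
rating = [-10, -5, 0, 5, 10, 10, -10, 5]
miles = [20, 150, 60, 100, 30, 200, 90, 10]
sale = [0, 0, 1, 0, 1, 1, 0, 1]

b0, b1, b2 = fit_plane(rating, miles, sale)

# Estimated chance of a sale for a 10-rated customer 100 miles away.
estimate = b0 + b1 * 10 + b2 * 100
print(f"b0={b0:.4f}, b1={b1:.4f}, b2={b2:.6f}, estimate={estimate:.3f}")
```

Swapping in the real columns from the spreadsheet gives the coefficients an MLR calculator would report for the same data.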
DColin (ASKER)

Please find data as excel file attached.
MLR2.xls
You might be better off using the magnitude of a sale rather than a binary as your dependent variable.

As it is, your model has practically no explanatory power. R squared values:
Using rating + distance: 0.11
Using rating only: 0.10
Using distance only: 0.002
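
For anyone checking these figures, R squared is 1 minus the ratio of the residual sum of squares to the total sum of squares around the mean of Y. A minimal sketch, using toy values rather than the attached data (`r_squared` is just an illustrative helper name):

```python
def r_squared(y, y_hat):
    """R squared = 1 - SS_res / SS_tot for observed y and fitted y_hat."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot

# Toy 0/1 outcomes, not taken from MLR2.xls.
y = [0, 1, 0, 1]

print(r_squared(y, [0.5, 0.5, 0.5, 0.5]))  # predicting the mean everywhere: 0.0
print(r_squared(y, [0, 1, 0, 1]))          # perfect predictions: 1.0
```

A model that does no better than always predicting the average of Y scores 0, so values near 0.1 mean the fitted values explain only a small share of the variation in the 0/1 outcome.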
DColin (ASKER)

matthewspatrick

I am aware of the low R squared values due to the use of binary data. The models have proved accurate in the field.

What I need to know is how to combine the two sets of data, because we are using a messy workaround at the moment and I was thinking multiple regression was the way to go. Is it not?
You are trying to characterize your customers.  Their sales history is very likely meaningful in predicting their future behavior.

But the distance the sales team has to travel may not be a predictor of their behavior.  It is clearly a cost on your side, but how and why would it affect the customer?  Do you think they feel guilty if you have traveled a long way to make a sales call?

If distance doesn't affect customer behavior, it shouldn't be in the model.  

It probably should be in a model that helps you allocate sales resources.  But that would be a different calculation.
DColin (ASKER)

d-glitch

"But the distance the sales team has to travel may not be a predictor of their behavior."

You are wrong. This is not an academic exercise; this is the real world. We have been acting on the distance model since September and it is working as predicted.

All I want to know is how do I predict a sale based on customer rating and distance.
"If I want to know what the chance of a sale being made to a customer with a rating of 10 I can use linear regression to tell me about 55%

If I want to know what is the chance of a sale being made to a customer 100 miles away I can use linear regression to tell me about 45%

What I want to know is what is the chance of a sale being made to a 10 rated customer who is 100 miles away. Can multiple linear regression tell me this? "

-
"Can multiple linear regression tell me this?"
With the data you have, probably not.
But there is a better way.

You say:
rating of 10 = 55%
100 miles = 45%
Combine the two probabilities by multiplying them:
Combined = P(rating 10) * P(100 miles)
0.55 * 0.45 = 0.2475, or about a 0.25 probability of getting a sale from a 10 at 100 miles.
Do that for every rating and distance combination.
That's like fitting a plane through the logs of the probabilities.
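
The arithmetic in this suggestion, and the link to logs, can be checked in a few lines. This is only a sketch of the multiplication described above; it treats the two percentages as factors that can simply be multiplied, which is this reply's assumption and not something the data has been shown to support.

```python
import math

p_rating = 0.55    # chance of a sale at rating 10, from the question
p_distance = 0.45  # chance of a sale at 100 miles, from the question

# Multiply the two figures, as suggested in the reply.
combined = p_rating * p_distance

# Same number via logs: a product of probabilities is a sum of log-probabilities,
# which is why the reply compares it to a plane through the logs.
via_logs = math.exp(math.log(p_rating) + math.log(p_distance))

print(round(combined, 4))  # 0.2475
print(round(via_logs, 4))  # 0.2475
```

In log terms, log(combined) = log(p_rating) + log(p_distance), so each input contributes an additive term, just as X1 and X2 do in a regression plane.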