DColin asked:

Which regression is more accurate?

Hi experts,

Please take a look at the attached Excel worksheet.

The two data sets and accompanying graphs represent the same findings. The y values are encoded as Yes/No (1/0) in the first data set and as observed probabilities in the second data set.

The intercept and slope are displayed on each graph, and you will notice that they differ.

Why are they different, and which graph is more accurate?

Thanks.

 EasyLogit.xls
ASKER CERTIFIED SOLUTION
d-glitch

If you weight each probability by the number of data points it represents, the two fitted lines should come out the same (same intercept and slope).
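To see the weighting point concretely, here is a minimal sketch in plain Python. The 0/1 observations are made up for illustration (they are not the values in EasyLogit.xls), and `weighted_line_fit` is a hypothetical helper name. It fits a straight line to the raw 0/1 data, to the per-x observed probabilities without weights, and to the same probabilities weighted by group size.

```python
def weighted_line_fit(xs, ys, ws=None):
    """Return (intercept, slope) of the weighted least-squares line y = a + b*x."""
    if ws is None:
        ws = [1.0] * len(xs)  # unweighted fit: every point counts once
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw  # weighted mean of x
    my = sum(w * y for w, y in zip(ws, ys)) / sw  # weighted mean of y
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    return my - slope * mx, slope


# Made-up (x, y) observations with y coded as 1/0; group sizes are unequal
# on purpose. These are NOT the numbers from the attached worksheet.
raw = [(1, 0), (1, 0), (1, 1),
       (2, 0), (2, 1),
       (3, 0), (3, 1), (3, 1), (3, 1),
       (4, 1), (4, 1)]

# Collapse to one observed probability per distinct x value.
groups = {}
for x, y in raw:
    groups.setdefault(x, []).append(y)
gx = sorted(groups)
gp = [sum(groups[x]) / len(groups[x]) for x in gx]  # observed probability
gn = [len(groups[x]) for x in gx]                   # points behind each probability

fit_raw = weighted_line_fit([x for x, _ in raw], [y for _, y in raw])
fit_unweighted = weighted_line_fit(gx, gp)
fit_weighted = weighted_line_fit(gx, gp, gn)

print("raw 0/1 data:          ", fit_raw)
print("averages, unweighted:  ", fit_unweighted)
print("averages, weighted by n:", fit_weighted)
```

With unequal group sizes the unweighted fit to the averages drifts away from the raw-data fit, while the fit weighted by group size matches it up to rounding.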

But averaging the data and then weighting the averages would actually be the wrong thing to do, since it masks the significant scatter in the individual measurements.

The correlation coefficient (goodness of fit) in the averaged case will be unjustifiably high.
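The effect on goodness of fit can be sketched the same way. This is only an illustration on made-up 0/1 observations (not the worksheet's values), and `r_squared` is a hypothetical helper name. It computes the squared correlation coefficient once for the raw 0/1 points and once for the per-x averages.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between xs and ys (R^2 of a straight-line fit)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Made-up (x, y) observations with y coded as 1/0 (illustrative only).
points = [(1, 0), (1, 0), (1, 1),
          (2, 0), (2, 1),
          (3, 0), (3, 1), (3, 1), (3, 1),
          (4, 1), (4, 1)]

# Average the 0/1 outcomes at each x to get observed probabilities.
by_x = {}
for x, y in points:
    by_x.setdefault(x, []).append(y)
avg_x = sorted(by_x)
avg_p = [sum(by_x[x]) / len(by_x[x]) for x in avg_x]

r2_raw = r_squared([x for x, _ in points], [y for _, y in points])
r2_avg = r_squared(avg_x, avg_p)

print("R^2 on raw 0/1 points:", round(r2_raw, 3))
print("R^2 on averaged points:", round(r2_avg, 3))
```

On this toy data the averaged points fall almost exactly on a line, so their R^2 is close to 1, while the raw points give a much lower value. The averaging has hidden the scatter rather than removed it.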