Which regression is more accurate?

Hi experts,

Please take a look at the attached Excel worksheet.

The two data sets and accompanying graphs represent the same findings. The y values are encoded as Yes/No (1/0) in the first data set and as observed probabilities in the second data set.

The intercept and slope are displayed on each graph, and you will notice they are different.

Why are they different, and which graph is more accurate?


I think the first case is more accurate.  
The number of observations goes up as the index goes from 28 to 33.

The first plot captures this correctly.
In the second plot, averaging the results for each index value gives the point at 28 the same weight as the point at 33, even though it is based on fewer observations.

This difference in weighting is also why the fitted slopes and intercepts differ.

If you weight each probability by the number of data points it represents, the two fits should give the same slope and intercept.
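
Here is a minimal sketch of the weighting point in Python with NumPy. The counts and Yes totals are made-up illustrative numbers, not the values from the attached worksheet; the only thing carried over is that the index runs from 28 to 33 with more observations at the higher values. Note that np.polyfit applies its weights to the unsquared residuals, so the square root of the counts is passed in.

```python
import numpy as np

# Hypothetical data (NOT the worksheet values): index 28..33,
# with the number of observations rising along with the index.
index = np.array([28, 29, 30, 31, 32, 33])
counts = np.array([2, 4, 6, 8, 10, 12])      # observations per index value
successes = np.array([0, 1, 3, 5, 8, 11])    # number of Yes (1) results

# First data set: one row per observation, y encoded as 1/0.
x_raw = np.repeat(index, counts)
y_raw = np.concatenate(
    [np.repeat([1.0, 0.0], [s, n - s]) for n, s in zip(counts, successes)]
)

# Second data set: one row per index value, y = observed probability.
p_obs = successes / counts

# Ordinary least squares on the raw 1/0 data.
slope_raw, icpt_raw = np.polyfit(x_raw, y_raw, 1)

# Unweighted fit on the averages: every index value counts equally.
slope_avg, icpt_avg = np.polyfit(index, p_obs, 1)

# Weighted fit on the averages: each point weighted by its count.
# polyfit weights multiply the residuals, hence sqrt(counts).
slope_wtd, icpt_wtd = np.polyfit(index, p_obs, 1, w=np.sqrt(counts))

print("raw 1/0 data      :", slope_raw, icpt_raw)
print("averages, equal w :", slope_avg, icpt_avg)
print("averages, count w :", slope_wtd, icpt_wtd)
```

With these numbers the count-weighted fit on the averages matches the fit on the raw 1/0 data, while the unweighted fit on the averages gives a different slope and intercept.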

But averaging the data and adjusting the weights would actually be the wrong thing to do, since it masks the significant scatter in the measurements.

The correlation coefficient (goodness of fit) in the averaging case will be unjustifiably high.
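
To illustrate how averaging inflates the apparent goodness of fit, here is a rough sketch in Python with NumPy. The data are hypothetical illustrative numbers, not the worksheet values. It computes the squared correlation coefficient once on the individual 1/0 observations and once on the per-index averages.

```python
import numpy as np

# Hypothetical data (NOT the worksheet values).
index = np.array([28, 29, 30, 31, 32, 33])
counts = np.array([2, 4, 6, 8, 10, 12])      # observations per index value
successes = np.array([0, 1, 3, 5, 8, 11])    # number of Yes (1) results

# Individual observations, y encoded as 1/0.
x_raw = np.repeat(index, counts)
y_raw = np.concatenate(
    [np.repeat([1.0, 0.0], [s, n - s]) for n, s in zip(counts, successes)]
)

# Averaged data: observed probability per index value.
p_obs = successes / counts

# Squared Pearson correlation for each representation.
r2_raw = np.corrcoef(x_raw, y_raw)[0, 1] ** 2
r2_avg = np.corrcoef(index, p_obs)[0, 1] ** 2

print("r^2 on raw 1/0 data:", r2_raw)
print("r^2 on averages    :", r2_avg)
```

The averaged data give a much higher r^2 than the raw data because the scatter within each index value has been averaged away before the fit.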
