Which regression is more accurate?

Posted on 2014-08-10
Last Modified: 2014-08-11
Hi experts,

Please take a look at the attached Excel worksheet.

The two data sets and accompanying graphs represent the same findings. The y values are encoded as Yes/No (1/0) in the first data set and as observed probabilities in the second data set.

The intercept and slope are displayed on each graph, and you will notice that they differ.

Why are they different, and which graph is more accurate?


Question by: DColin

    Accepted Solution

    I think the first case is more accurate.
    The number of observations goes up as the index goes from 28 to 33.

    The first plot captures this correctly.
    In the second plot, averaging the results for each index value gives the point at 28 the same weight as the point at 33, even though it is based on fewer observations.

    This difference in weighting is also why the two fitted lines differ.
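
    As a rough illustration of the weighting effect, here is a minimal Python/NumPy sketch. The counts below are invented for illustration and are not taken from the attached worksheet; the only property carried over is that the number of observations grows from index 28 to index 33. It fits one line to the raw 1/0 data and another to the unweighted per-index averages.

```python
import numpy as np

# Hypothetical data (NOT the worksheet's values): index 28..33,
# with more observations at the higher index values.
idx = np.array([28, 29, 30, 31, 32, 33], dtype=float)
n = np.array([2, 3, 4, 6, 8, 10])   # observations per index value
k = np.array([0, 1, 2, 4, 6, 9])    # number of "Yes" (1) results per index value

# First data set: one row per observation, y encoded as 1/0.
x_raw = np.repeat(idx, n)
y_raw = np.concatenate([np.r_[np.ones(ki), np.zeros(ni - ki)]
                        for ki, ni in zip(k, n)])

# Second data set: one row per index value, y = observed probability.
p_avg = k / n

# Ordinary least-squares line through each data set.
slope_raw, intercept_raw = np.polyfit(x_raw, y_raw, 1)
slope_avg, intercept_avg = np.polyfit(idx, p_avg, 1)

print("raw 1/0 data:     slope", slope_raw, "intercept", intercept_raw)
print("averaged, equal:  slope", slope_avg, "intercept", intercept_avg)
```

    With these made-up counts the two slopes come out different, because the averaged fit treats the 2 observations at index 28 as equal in importance to the 10 observations at index 33.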

    Expert Comment

    If you weight each probability by the number of data points it represents, the two fitted lines should be the same.

    But averaging the data and adjusting the weights would still be the wrong thing to do, since it masks the significant scatter in the individual measurements.

    The correlation coefficient (goodness of fit) in the averaging case will be unjustifiably high.
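
    Both points can be checked with a short Python/NumPy sketch. The data are hypothetical (not the worksheet's values), and `r_squared` is a helper written for this example. Note that `np.polyfit` applies its `w` argument to the unsquared residuals, so weighting each averaged point by its count means passing the square root of the count.

```python
import numpy as np

# Hypothetical data (NOT the worksheet's values).
idx = np.array([28, 29, 30, 31, 32, 33], dtype=float)
n = np.array([2, 3, 4, 6, 8, 10])   # observations per index value
k = np.array([0, 1, 2, 4, 6, 9])    # number of "Yes" (1) results per index value

x_raw = np.repeat(idx, n)
y_raw = np.concatenate([np.r_[np.ones(ki), np.zeros(ni - ki)]
                        for ki, ni in zip(k, n)])
p_avg = k / n

def r_squared(x, y, slope, intercept, w=None):
    """Weighted coefficient of determination for a straight-line fit."""
    w = np.ones_like(y) if w is None else w
    resid = y - (slope * x + intercept)
    y_mean = np.average(y, weights=w)
    return 1.0 - np.sum(w * resid**2) / np.sum(w * (y - y_mean)**2)

# Fit to the raw 1/0 observations.
slope_raw, intercept_raw = np.polyfit(x_raw, y_raw, 1)

# Fit to the averages, each weighted by its count.
# polyfit minimises sum((w * residual)**2), hence w = sqrt(n).
slope_w, intercept_w = np.polyfit(idx, p_avg, 1, w=np.sqrt(n))

r2_raw = r_squared(x_raw, y_raw, slope_raw, intercept_raw)
r2_grp = r_squared(idx, p_avg, slope_w, intercept_w, w=n.astype(float))

print("same line:", np.allclose([slope_raw, intercept_raw],
                                [slope_w, intercept_w]))
print("R^2 on raw data:", r2_raw)
print("R^2 on averages:", r2_grp)
```

    The count-weighted fit reproduces the raw-data line, but its R^2 is much higher because the scatter of the individual 1/0 results within each index value has been averaged away.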
