Outer joins and summing values

Posted on 2006-04-11
Last Modified: 2010-03-19

I have the following tables and show the relevant columns
RefCaseId (PK)

OrganId (PK)
TransplantCount (integer value... generally 1 or 2)

RefCaseId (FK)
OrganId (FK)

what I want is a sum of all the TransplantCount's for every referralcase where the organ outcome code is 'TRAN' (transplanted).

Resultset like this, for example
RefCaseId Transplants
1001              5
1002              1
1003              4

My problem occurs (I think) when a ReferralCase has no entry in the ReferralCaseOrgan table AND/OR when there IS a record but the outcome is not 'TRAN'
1004              0

I thought I could use left joins but no go....
,  sum(O.TransplantCount) [Transplants]
from ReferralCase RC
left join ReferralCaseOrgan RCO on RC.RefCaseId = RCO.RefCaseId
left join Organ O on RCO.OrganId = O.OrganId

where RCO.OrganOutcomeCd = 'TRAN'

group by RC.RefCaseId

I get 31 records back.  I should get 36 records, of which 5 will have [Transplants] = 0 because either (a)  there was no record in the ReferralCaseOrgan table or more likely (b) none of the records for that refcaseId have an outcome of 'TRAN'.
Question by:ibost
    LVL 68

    Expert Comment

    SELECT rfc.RefCaseId, SUM(org.TransplantCount) AS [Transplants]
    FROM referralCase rfc
    LEFT OUTER JOIN referralCaseOrgan rfco ON rfco.RefCaseId = rfc.RefCaseId AND
        rfco.OrganOutcomeCd = 'TRAN'
    LEFT OUTER JOIN organ org ON org.OrganId = rfco.OrganId
    LVL 10

    Author Comment

    OMG I ask this question then I delete because I thought I had found the answer.... Then I have re-ask because I didn't... Now I figure it out.

    I'll still give points if anyone will explain why this makes a difference:

    instead of "where RCO.OrganOutcomeCd = 'TRAN'" in the where clause, if I move it to the join itself everything works:

    left join ReferralCaseOrgan RCO on RC.RefCaseId = RCO.RefCaseId AND RCO.ORGANOUTCOMECD = 'TRAN'

    This gives me Transplants of NULL which is easy to fix in the select clause:
    isnull(sum(O.TransplantCount), 0) [Transplants]
    LVL 10

    Author Comment

    hehe Hi Scott

    We crisscrossed posts :)

    Can you tell me why it has to be in the join?  I think I know the answer but would rather just hear it from a pro.

    LVL 19

    Assisted Solution

    Change the where to  Coalesce(RCO.ORGANOUTCOMECD,'TRAN') = 'TRAN'

    will produce the same results as putting the criteria into the LEFT JOIN.  It will allow rows that don't match to be selected, the way you proposed the where is the same as an INNER JOIN.  

    LVL 19

    Expert Comment

    By the way, since my post doesn't address the original question, no points to me.
    LVL 68

    Accepted Solution

    As we know, the LEFT JOIN will insure that the RC table is kept, even if RCO and O do not match.  If SQL can't find a match for them, it automatically sets all column values from that table(s) to NULL.  So, when SQL gets to the WHERE:

    where RCO.OrganOutcomeCd = 'TRAN'

    OrganOutcomeCd is NULL, and NULL is never equal to anything, so the test fails and SQL doesn't include the row.

    WHERE (RCO.OrganOutcomeCd IS NULL OR RCO.OrganOutcomeCd = 'TRAN')

    is, to me, the clearest way of testing for that.  But putting it in the WHERE may help SQL include fewer rows from the join in the initial result, so I've started checking it there :-) .
    LVL 10

    Author Comment

    Ok that is sort of what I thought...  the WHERE is evaluated after.  

    For some reason I initially was thinking the WHERE was evaluated first, so all the base tables were filtered down and THEN joined together (filling in NULLs where appropriate).  Instead it goes the other way around.

    Folderol - actually I tried your way and it does fix the original issue.  It's probably not how I'll implement but it's a neat trick I never thought of.  I like learning tricks :)


    Write Comment

    Please enter a first name

    Please enter a last name

    We will never share this with anyone.

    Featured Post

    PRTG Network Monitor: Intuitive Network Monitoring

    Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

    The steps for moving the system databases to a new location are documented in the following technical article: However sometimes after the moving process is finished, though SQL i…
    Everyone has problem when going to load data into Data warehouse (EDW). They all need to confirm that data quality is good but they don't no how to proceed. Microsoft has provided new task within SSIS 2008 called "Data Profiler Task". It solve th…
    Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
    Using examples as well as descriptions, and references to Books Online, show the different Recovery Models available in SQL Server and explain, as well as show how full, differential and transaction log backups are performed

    779 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    9 Experts available now in Live!

    Get 1:1 Help Now