SQL Query Help

I am a little puzzled by the results of the query below. When I look up the invoice number in our application, it correctly shows the following results:

InvNo         Amount   TransactionDate
393848       $256.07       10/15/2012
393848       $374.32       2/13/2012
393848       $381.00       3/12/2012
393848       $381.00       4/16/2012
393848       $381.00       5/14/2012
393848       $381.00       6/11/2012
393848       $381.00       7/17/2012
393848       $381.00       8/13/2012
393848       $381.00       9/17/2012


Since the transaction date is of no importance to me, I tried running a query without TransDate in the SELECT clause, and am receiving these results:

393848       $256.07
393848       $374.32
393848       $381.00

However, when I do include the TransDate, it shows everything correctly, just like the application shows (all 9 records).

How do I get it to show me the same thing the application is showing without using the TransDate?

Select  DISTINCT I.InvNo AS 'InvNo', PD.Credit AS 'Credit', P.TransDate as 'TransDate'
From Invoices I 
LEFT OUTER JOIN Jobs J					 ON J.JobNo = I.JobNo
LEFT OUTER JOIN Earnings E				 ON E.InvNo = I.InvNo
LEFT OUTER JOIN PaymentsDetail PD		 ON PD.InvNo = I.InvNo
LEFT OUTER JOIN Payments P				 ON P.PmtNo = PD.PmtNo
LEFT OUTER JOIN JobsLoc JL				 ON JL.JobNo = J.JobNo
LEFT OUTER JOIN Firms F					 ON JL.FirmNo = F.FirmNo
LEFT OUTER JOIN Locations L				 ON L.LocNo = JL.LocNo
LEFT OUTER JOIN Resources R				 ON R.RsrcNo = E.RsrcNo
Where datepart(year, P.TransDate) = '2012'
AND I.VoidDate is NULL
AND I.InvNo = 393848
AND
  LEFT(CASE
  WHEN JL.SameLocOrdering = 1						 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 0 THEN JL.LocPostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 1 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 2 THEN L.PostCode
  END,5) IN (
'90001',
'90002',
'90003',
'90004',
'90005',
'90005',
'90006',
'90007',
'90008',
'90010',
'90011',
'90012',
'90013',
'90014',
'90015',
'90016',
'90017',
'90018',
'90019',
'90020',
'90021',
'90023',
'90024',
'90025',
'90026',
'90027',
'90028',
'90029',
'90031',
'90032',
'90033',
'90034',
'90035',
'90036',
'90037',
'90038',
'90039',
'90041',
'90042',
'90043',
'90044',
'90045',
'90046',
'90047',
'90048',
'90049',
'90056',
'90057',
'90058',
'90059',
'90061',
'90062',
'90063',
'90064',
'90065',
'90066',
'90067',
'90068',
'90069',
'90071',
'90077',
'90089',
'90094',
'90095',
'90210',
'90211',
'90212',
'90230',
'90232',
'90245',
'90247',
'90248',
'90272',
'90290',
'90291',
'90292',
'90293',
'90302',
'90402',
'90501',
'90502',
'90710',
'90717',
'90731',
'90732',
'90732',
'90744',
'90810',
'91040',
'91042',
'91214',
'91303',
'91304',
'91306',
'91307',
'91311',
'91316',
'91324',
'91325',
'91326',
'91330',
'91331',
'91335',
'91340',
'91342',
'91343',
'91344',
'91345',
'91352',
'91356',
'91364',
'91367',
'91401',
'91402',
'91403',
'91405',
'91406',
'91411',
'91423',
'91436',
'91504',
'91505',
'91601',
'91602',
'91604',
'91605',
'91606',
'91607',
'91608')

Open in new window

LVL 8
pzozulkaAsked:
Who is Participating?
 
PortletPaulConnect With a Mentor freelancerCommented:
it is "oh so tempting" to simply toss in "distinct" when you see results like this:
InvNo      Credit      TransDate
393848      374.32      2012-02-13 00:00:00
393848      374.32      2012-02-13 00:00:00
393848      381.00      2012-03-12 00:00:00
393848      381.00      2012-03-12 00:00:00
...
but as you look deeper the cause of the "unwanted repetition" comes from the joining of tables

>> if I include R.DisplayName in the SELECT clause, it shows there are two resources on this invoice:
it's likely that (at least) one of your joins is wrong

You could start be removing any non-essential joins (i.e. if not used in where clause or select clause).

The you need to consider the many-to-one relationships, some tables may need to be changed to nested subqueries that are grouped by invoice.

Some Simple SQL Rules to Live By "DISTINCT is *usually* bad"
A good rule of thumb -- if you need a distinct list of values in a single column or two, DISTINCT is the way to go.  But if your results "don't look right" or you see some duplicate rows and can't figure out why, do NOT just add DISTINCT to your SELECT to "fix it" !!  Step back, look at your joins, and re-write your query properly.  Even worse, I've seen people simply add DISTINCT to all their SELECT's right from the start, to preemptively "avoid duplicates".  Not good.  (Side note: If you are a DISTINCT abuser, try adding meaningful primary keys to your tables).
also see
Why I Hate DISTINCT
The effects of DISTINCT in a SQL query
0
 
Brian CroweDatabase AdministratorCommented:
Although you are left-joining to the transaction table you are specifying a value for Payments.TransDate in the WHERE clause which effectively turns it into an INNER JOIN since any records with a NULL TransDate are excluded.

Honestly I'm a little confused by your objective.  Do you want TransDate in your result set or not?  What does "without using the TransDate" mean?
0
 
pzozulkaAuthor Commented:
To clarify, I do not want TransDate in the result set.

My goal is to get the following in the result set:

393848       $256.07
393848       $374.32
393848       $381.00
393848       $381.00
393848       $381.00
393848       $381.00
393848       $381.00
393848       $381.00
393848       $381.00
0
Cloud Class® Course: MCSA MCSE Windows Server 2012

This course teaches how to install and configure Windows Server 2012 R2.  It is the first step on your path to becoming a Microsoft Certified Solutions Expert (MCSE).

 
Brian CroweDatabase AdministratorCommented:
What else are you changing in the query besides removing it from the SELECT clause?  contents of the SELECT clause have no effect on rowcount.
0
 
duttcomCommented:
You are using Select DISTINCT, so it sees only three records as being distinctively different. You have 7 amounts of $381 so it is only returned once. Try removing the word DISTINCT from line 1.
0
 
Brian CroweDatabase AdministratorCommented:
Thanks duttcom...i completely missed the DISTINCT
0
 
pzozulkaAuthor Commented:
That's exactly what I thought too, but for some reason this:

Select  DISTINCT I.InvNo AS 'InvNo', PD.Credit AS 'Credit', P.TransDate as 'TransDate'
From Invoices I 
LEFT OUTER JOIN Jobs J					 ON J.JobNo = I.JobNo
LEFT OUTER JOIN Earnings E				 ON E.InvNo = I.InvNo
LEFT OUTER JOIN PaymentsDetail PD		 ON PD.InvNo = I.InvNo
LEFT OUTER JOIN Payments P				 ON P.PmtNo = PD.PmtNo
LEFT OUTER JOIN JobsLoc JL				 ON JL.JobNo = J.JobNo
LEFT OUTER JOIN Firms F					 ON JL.FirmNo = F.FirmNo
LEFT OUTER JOIN Locations L				 ON L.LocNo = JL.LocNo
LEFT OUTER JOIN Resources R				 ON R.RsrcNo = E.RsrcNo
Where datepart(year, P.TransDate) = '2012'
AND I.VoidDate is NULL
AND I.InvNo = 393848
AND
  LEFT(CASE
  WHEN JL.SameLocOrdering = 1						 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 0 THEN JL.LocPostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 1 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 2 THEN L.PostCode
  END,5) IN (
'90001',
'90002',
'90003',
'90004',
'90005',
'90005',
'90006',
'90007',
'90008',
'90010',
'90011',
'90012',
'90013',
'90014',
'90015',
'90016',
'90017',
'90018',
'90019',
'90020',
'90021',
'90023',
'90024',
'90025',
'90026',
'90027',
'90028',
'90029',
'90031',
'90032',
'90033',
'90034',
'90035',
'90036',
'90037',
'90038',
'90039',
'90041',
'90042',
'90043',
'90044',
'90045',
'90046',
'90047',
'90048',
'90049',
'90056',
'90057',
'90058',
'90059',
'90061',
'90062',
'90063',
'90064',
'90065',
'90066',
'90067',
'90068',
'90069',
'90071',
'90077',
'90089',
'90094',
'90095',
'90210',
'90211',
'90212',
'90230',
'90232',
'90245',
'90247',
'90248',
'90272',
'90290',
'90291',
'90292',
'90293',
'90302',
'90402',
'90501',
'90502',
'90710',
'90717',
'90731',
'90732',
'90732',
'90744',
'90810',
'91040',
'91042',
'91214',
'91303',
'91304',
'91306',
'91307',
'91311',
'91316',
'91324',
'91325',
'91326',
'91330',
'91331',
'91335',
'91340',
'91342',
'91343',
'91344',
'91345',
'91352',
'91356',
'91364',
'91367',
'91401',
'91402',
'91403',
'91405',
'91406',
'91411',
'91423',
'91436',
'91504',
'91505',
'91601',
'91602',
'91604',
'91605',
'91606',
'91607',
'91608')

Open in new window


gives me this result:

393848       $256.07       10/15/2012
393848       $374.32       2/13/2012
393848       $381.00       3/12/2012
393848       $381.00       4/16/2012
393848       $381.00       5/14/2012
393848       $381.00       6/11/2012
393848       $381.00       7/17/2012
393848       $381.00       8/13/2012
393848       $381.00       9/17/2012

while this:

Select  DISTINCT I.InvNo AS 'InvNo', PD.Credit AS 'Credit'--, P.TransDate as 'TransDate'
From Invoices I 
LEFT OUTER JOIN Jobs J					 ON J.JobNo = I.JobNo
LEFT OUTER JOIN Earnings E				 ON E.InvNo = I.InvNo
LEFT OUTER JOIN PaymentsDetail PD		 ON PD.InvNo = I.InvNo
LEFT OUTER JOIN Payments P				 ON P.PmtNo = PD.PmtNo
LEFT OUTER JOIN JobsLoc JL				 ON JL.JobNo = J.JobNo
LEFT OUTER JOIN Firms F					 ON JL.FirmNo = F.FirmNo
LEFT OUTER JOIN Locations L				 ON L.LocNo = JL.LocNo
LEFT OUTER JOIN Resources R				 ON R.RsrcNo = E.RsrcNo
Where datepart(year, P.TransDate) = '2012'
AND I.VoidDate is NULL
AND I.InvNo = 393848
AND
  LEFT(CASE
  WHEN JL.SameLocOrdering = 1						 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 0 THEN JL.LocPostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 1 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 2 THEN L.PostCode
  END,5) IN (
'90001',
'90002',
'90003',
'90004',
'90005',
'90005',
'90006',
'90007',
'90008',
'90010',
'90011',
'90012',
'90013',
'90014',
'90015',
'90016',
'90017',
'90018',
'90019',
'90020',
'90021',
'90023',
'90024',
'90025',
'90026',
'90027',
'90028',
'90029',
'90031',
'90032',
'90033',
'90034',
'90035',
'90036',
'90037',
'90038',
'90039',
'90041',
'90042',
'90043',
'90044',
'90045',
'90046',
'90047',
'90048',
'90049',
'90056',
'90057',
'90058',
'90059',
'90061',
'90062',
'90063',
'90064',
'90065',
'90066',
'90067',
'90068',
'90069',
'90071',
'90077',
'90089',
'90094',
'90095',
'90210',
'90211',
'90212',
'90230',
'90232',
'90245',
'90247',
'90248',
'90272',
'90290',
'90291',
'90292',
'90293',
'90302',
'90402',
'90501',
'90502',
'90710',
'90717',
'90731',
'90732',
'90732',
'90744',
'90810',
'91040',
'91042',
'91214',
'91303',
'91304',
'91306',
'91307',
'91311',
'91316',
'91324',
'91325',
'91326',
'91330',
'91331',
'91335',
'91340',
'91342',
'91343',
'91344',
'91345',
'91352',
'91356',
'91364',
'91367',
'91401',
'91402',
'91403',
'91405',
'91406',
'91411',
'91423',
'91436',
'91504',
'91505',
'91601',
'91602',
'91604',
'91605',
'91606',
'91607',
'91608')

Open in new window


gives me this result:

393848       $256.07
393848       $374.32
393848       $381.00
0
 
duttcomConnect With a Mentor Commented:
Both your code samples above still hve Select DISTINCT there. I would expect the two results you have shown there if you leave the word DISTINCT in the query.
0
 
pzozulkaAuthor Commented:
Here are the results when removing DISTINCT (18 rows, with or without the P.TransDate -- so that's a good start):

InvNo      Credit      TransDate
393848      374.32      2012-02-13 00:00:00
393848      374.32      2012-02-13 00:00:00
393848      381.00      2012-03-12 00:00:00
393848      381.00      2012-03-12 00:00:00
393848      381.00      2012-04-16 00:00:00
393848      381.00      2012-04-16 00:00:00
393848      381.00      2012-05-14 00:00:00
393848      381.00      2012-05-14 00:00:00
393848      381.00      2012-06-11 00:00:00
393848      381.00      2012-06-11 00:00:00
393848      381.00      2012-07-17 00:00:00
393848      381.00      2012-07-17 00:00:00
393848      381.00      2012-08-13 00:00:00
393848      381.00      2012-08-13 00:00:00
393848      381.00      2012-09-17 00:00:00
393848      381.00      2012-09-17 00:00:00
393848      256.07      2012-10-15 00:00:00
393848      256.07      2012-10-15 00:00:00

I'm not sure why 18 rows, but if I include R.DisplayName in the SELECT clause, it shows there are two resources on this invoice:

393848      374.32      2012-02-13 00:00:00      Vivian
393848      374.32      2012-02-13 00:00:00      Alexian
393848      381.00      2012-03-12 00:00:00      Vivian
393848      381.00      2012-03-12 00:00:00      Alexian
393848      381.00      2012-04-16 00:00:00      Vivian
393848      381.00      2012-04-16 00:00:00      Alexian
393848      381.00      2012-05-14 00:00:00      Vivian
393848      381.00      2012-05-14 00:00:00      Alexian
393848      381.00      2012-06-11 00:00:00      Vivian
393848      381.00      2012-06-11 00:00:00      Alexian
393848      381.00      2012-07-17 00:00:00      Vivian
393848      381.00      2012-07-17 00:00:00      Alexian
393848      381.00      2012-08-13 00:00:00      Vivian
393848      381.00      2012-08-13 00:00:00      Alexian
393848      381.00      2012-09-17 00:00:00      Vivian
393848      381.00      2012-09-17 00:00:00      Alexian
393848      256.07      2012-10-15 00:00:00      Vivian
393848      256.07      2012-10-15 00:00:00      Alexian
0
 
duttcomCommented:
If you don't have a lot of records, you could always just use the code you have (that correctly generates the list you want but with the date column included) to make a view. The you can just select the two columns you want from the view instead. Just a thought.
0
 
PortletPaulfreelancerCommented:
DISTINCT

this is the cause of the different results

do it without using "distinct"
0
 
duttcomCommented:
See above for the OP's results when taking out the DISTINCT as previously suggested.
0
 
PortletPaulConnect With a Mentor freelancerCommented:
there are several suggestions in the query below
a. if you reference a left joined table in the where clause you create the impact of an inner join - so use inner join
b. I've altered the sequence of join conditions (so the 'left' is the one on left of the joining condition)
d. to remove the possibility the job related info is causing "unwanted repetition" this is joined as a nested subquery grouped by job_no
d. all that job related stuff boils down to an elaborate set of where conditions, and again this has the effect of an inner join
e. there are no clues given on behaviour of E & R, so I've commented them out for testing
f. I recommend you don't use datepart(year, P.TransDate) as this is applying a function to every row of data - and this is avoidable simply by comparing to a date range.

perhaps try this?
SELECT 
     I.InvNo     AS 'InvNo'
   , PD.Credit   AS 'Credit'
   , P.TransDate AS 'TransDate'
FROM Invoices I
INNER JOIN PaymentsDetail PD ON I.InvNo = PD.InvNo -- as P is used in where clause, use an inner join here too
INNER JOIN Payments P        ON PD.PmtNo = P.PmtNo -- used in where clause hence an inner join

--LEFT JOIN Earnings E         ON I.InvNo = E.InvNo
--LEFT JOIN Resources R        ON E.RsrcNo = R.RsrcNo

INNER JOIN (              
                SELECT J1.JobNo      
                FROM Jobs J1 
                INNER JOIN JobsLoc JL        ON J1.JobNo = JL.JobNo   -- used in where clause hence an inner join
                LEFT JOIN Firms F            ON JL.FirmNo = F.FirmNo
                LEFT JOIN Locations L        ON JL.LocNo = L.LocNo
                WHERE LEFT(
                               CASE
                                  WHEN JL.SameLocOrdering = 1
                                  THEN F.PostCode
                                  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 0
                                  THEN JL.LocPostCode
                                  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 1
                                  THEN F.PostCode
                                  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 2
                                  THEN L.PostCode
                               END
                           , 5) IN ( 
                                    '90001', '90002', '90003', '90004', '90005', '90005', '90006', '90007', '90008', '90010',
                                    '90011', '90012', '90013', '90014', '90015', '90016', '90017', '90018', '90019', '90020',
                                    '90021', '90023', '90024', '90025', '90026', '90027', '90028', '90029', '90031', '90032',
                                    '90033', '90034', '90035', '90036', '90037', '90038', '90039', '90041', '90042', '90043',
                                    '90044', '90045', '90046', '90047', '90048', '90049', '90056', '90057', '90058', '90059',
                                    '90061', '90062', '90063', '90064', '90065', '90066', '90067', '90068', '90069', '90071',
                                    '90077', '90089', '90094', '90095', '90210', '90211', '90212', '90230', '90232', '90245', 
                                    '90247', '90248', '90272', '90290', '90291', '90292', '90293', '90302', '90402', '90501', 
                                    '90502', '90710', '90717', '90731', '90732', '90732', '90744', '90810', '91040', '91042', 
                                    '91214', '91303', '91304', '91306', '91307', '91311', '91316', '91324', '91325', '91326',
                                    '91330', '91331', '91335', '91340', '91342', '91343', '91344', '91345', '91352', '91356',
                                    '91364', '91367', '91401', '91402', '91403', '91405', '91406', '91411', '91423', '91436',
                                    '91504', '91505', '91601', '91602', '91604', '91605', '91606', '91607', '91608'
                                    )
                GROUP BY  J1.JobNo      
            ) as J ON I.JobNo = J.JobNo -- as JL was used in where clause, use an inner join here too

WHERE P.TransDate >= '20120101' and P.TransDate < '20130101' -- don't apply functions to data if avoidable, adjust the filter criteria instead
AND I.VoidDate IS NULL
AND I.InvNo = 393848
 

Open in new window

{+ edit} by the way: if you can avoid performing LEFT(....,5) on all those postcode fields then please do avoid doing it. Again this is applying functions to data to suit filtering criteria which can be inefficient.
0
 
PortletPaulfreelancerCommented:
@duttcom
hopefully my intent is now clear

the original dilemma (query row differences) is caused purely because of 'distinct'
but the solution is not to simply remove that, the query needs much more
(and hope I've suggested some useful ideas for that)
0
 
pzozulkaAuthor Commented:
if you reference a left joined table in the where clause you create the impact of an inner join - so use inner join

Can you please explain why this is, or provide a link where I can read/learn more about this? This is the first time I've heard about this.
0
 
pzozulkaAuthor Commented:
Thank you all for your help.

PortletPaul: I have a working query now that uses almost all of your suggestions mentioned above, and I removed 'distinct'. I can't get rid of the LEFT function because some zip codes are listed as '90046-2574'.

Having said that, I went back to try to figure out what specifically caused my query to fail. Using trial and error, after removing 'distinct', I commented out two table joins that were not being used in any part of the query -- Earnings, and Resources, and it started working. I tried commenting out just the Resources join, but that had no effect, so it has to be something to do with the Earnings table, but I'm not sure why. Any ideas? Also, what if I needed the Earnings table? I'm assuming I would have to then go the route you took using a subquery, right?

Also, why is referencing P.TransDate make that join an inner join all of a sudden?


Here's the version that works:
Select I.InvNo AS 'InvNo', PD.Credit AS 'Credit', P.TransDate as 'TransDate'

From Invoices I
LEFT OUTER JOIN Jobs J					 ON J.JobNo = I.JobNo
--LEFT OUTER JOIN Earnings E				 ON E.InvNo = I.InvNo
LEFT OUTER JOIN PaymentsDetail PD		 ON PD.InvNo = I.InvNo
LEFT OUTER JOIN Payments P				 ON P.PmtNo = PD.PmtNo
LEFT OUTER JOIN JobsLoc JL				 ON JL.JobNo = J.JobNo
LEFT OUTER JOIN Firms F					 ON JL.FirmNo = F.FirmNo
LEFT OUTER JOIN Locations L				 ON L.LocNo = JL.LocNo
--LEFT OUTER JOIN Resources R				 ON R.RsrcNo = E.RsrcNo

Where P.TransDate >= '20120101' and P.TransDate < '20130101'
AND I.VoidDate is NULL
AND I.InvNo = 393848
AND
  LEFT(CASE
  WHEN JL.SameLocOrdering = 1						 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 0 THEN JL.LocPostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 1 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 2 THEN L.PostCode
  END,5) IN ( 
        '90001', '90002', '90003', '90004', '90005', '90005', '90006', '90007', '90008', '90010',
        '90011', '90012', '90013', '90014', '90015', '90016', '90017', '90018', '90019', '90020',
        '90021', '90023', '90024', '90025', '90026', '90027', '90028', '90029', '90031', '90032',
        '90033', '90034', '90035', '90036', '90037', '90038', '90039', '90041', '90042', '90043',
        '90044', '90045', '90046', '90047', '90048', '90049', '90056', '90057', '90058', '90059',
        '90061', '90062', '90063', '90064', '90065', '90066', '90067', '90068', '90069', '90071',
        '90077', '90089', '90094', '90095', '90210', '90211', '90212', '90230', '90232', '90245', 
        '90247', '90248', '90272', '90290', '90291', '90292', '90293', '90302', '90402', '90501', 
        '90502', '90710', '90717', '90731', '90732', '90732', '90744', '90810', '91040', '91042', 
        '91214', '91303', '91304', '91306', '91307', '91311', '91316', '91324', '91325', '91326',
        '91330', '91331', '91335', '91340', '91342', '91343', '91344', '91345', '91352', '91356',
        '91364', '91367', '91401', '91402', '91403', '91405', '91406', '91411', '91423', '91436',
        '91504', '91505', '91601', '91602', '91604', '91605', '91606', '91607', '91608'
        )

Open in new window

0
 
PortletPaulfreelancerCommented:
removing non-essential joins is a good thing to do in any case :)
You could start be removing any non-essential joins (i.e. if not used in where clause or select clause).

e. there are no clues given on behaviour of E & R, so I've commented them out for testing

>>it has to be something to do with the Earnings table, but I'm not sure why. Any ideas?
yes, that is causing a multiplication of rows probalay a symptom of a many-to-one relations and/or maybe that the join conditions aren't sufficient (could it need more than just invoice no.?)

also note that there is little point declaring a LEFT join then using that table in the where clause (unless you also permit null rows from that table).

>> I can't get rid of the LEFT function because some zip codes are listed as '90046-2574'.
pity, so be it

this is a really good start by the way - well done.
0
 
pzozulkaAuthor Commented:
Still perplexed by: a. if you reference a left joined table in the where clause you create the impact of an inner join - so use inner join

Here's a test:

Table 1
ID     Age
1       17
2       50
3       55

Table 2
ID    Name
1      John
3      Bill

Select t1.id, t1.age, t2.name
From Table1 t1 left join table2 t2 on t1.id = t2.id
Where t1.age > 5

Open in new window


Results:
1 17 John
2 50 Null
3 55 Bill

If I used inner join instead of left join, the results would have been:
1 17 John
3 55 Bill
0
 
PortletPaulfreelancerCommented:
>>Also, what if I needed the Earnings table? I'm assuming I would have to then go the route you took using a subquery, right?
that would depend on the expected outcome, but assuming you still wanted to restrict the rows like the current query then - yes, a nested subquery using group by (or perhaps row_number() ) could be used.

nb: row_umber() is useful for locating "the most recent record" which a group by isn't so good at.

in discussing SQL "it depends" is used at lot :)
0
 
PortletPaulfreelancerCommented:
Select t1.id, t1.age, t2.name
From Table1 t1
left join table2 t2 on t1.id = t2.id -- T2 is not referenced in the where clause
Where t1.age > 5

Results:
1 17 John
2 50 Null
3 55 Bill

IF however you did this:

Select t1.id, t1.age, t2.name
From Table1 t1
left join table2 t2 on t1.id = t2.id -- T2 IS referenced in the where clause now
Where t1.age > 5
and t2,name like '%ill' or t2.name like '%ohn'

Results:
1 17 John
3 55 Bill

which is the same as the inner join

that was the point

In your query you left join J, JL, F & L
then you use JL, F & L in the where clause (J gets involved due to the linking to JL)
so, for JL, F, L & J
you are getting the effect of inner joins

try it
0
 
PortletPaulfreelancerCommented:
just wondering if this is better:

Table 1
ID     Age
1       17
2       50
3       55

Table 2
ID    Name
1      Benny
3      Bill

----------- IN1
Select t1.id, t1.age, t2.name
From Table1 t1
INNER join table2 t2 on t1.id = t2.id
Where t1.age > 5
and t2.name like 'B%'    -- T2 now referenced in where clause

Result: IN1
1 17 Benny
3 55 Bill


----------- OUT1
Select t1.id, t1.age, t2.name
From Table1 t1
LEFT join table2 t2 on t1.id = t2.id -- here T2 is not referenced in the where clause
Where t1.age > 5

Result: OUT1
1 17 Benny
2 50 Null
3 55 Bill


----------- OUT2
Select t1.id, t1.age, t2.name
From Table1 t1
LEFT join table2 t2 on t1.id = t2.id -- T2 IS referenced in the where clause now
Where t1.age > 5
and t2.name like 'B%' -- T2 now referenced

Result: OUT2
1 17 Benny
3 55 Bill


----------- OUT3
Select t1.id, t1.age, t2.name
From Table1 t1
LEFT join table2 t2 on t1.id = t2.id
Where t1.age > 5
and (t2.name like 'B%'   -- T2 now referenced in where clause
     or t2.name is null) -- T2 left join is preseved
Result: OUT3
1 17 Benny
2 50 Null
3 55 Bill


OUT1 = OUT3 (permitting T2 nulls in where clause)

otherwise

IN1 = OUT2 (by ignoring T2 nulls in where clause

Message:
Beware of "where"
if you filter a query in the where clause on an outer joined table
you can create the same conditions as an inner join
thus making the outer join a redundant cost on the query
either:
use an inner join
or
allow nulls via the where clause
or
use join conditions instead of the where clause
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.