Avatar of pzozulka
pzozulka asked on

SQL Query Help

I am a little puzzled by the results of the query below. When I look up the invoice number in our application, it correctly shows the following results:

InvNo         Amount   TransactionDate
393848       $256.07       10/15/2012
393848       $374.32       2/13/2012
393848       $381.00       3/12/2012
393848       $381.00       4/16/2012
393848       $381.00       5/14/2012
393848       $381.00       6/11/2012
393848       $381.00       7/17/2012
393848       $381.00       8/13/2012
393848       $381.00       9/17/2012


Since the transaction date is of no importance to me, I tried running a query without TransDate in the SELECT clause, and am receiving these results:

393848       $256.07
393848       $374.32
393848       $381.00

However, when I do include the TransDate, it shows everything correctly, just like the application shows (all 9 records).

How do I get it to show me the same thing the application is showing without using the TransDate?

Select  DISTINCT I.InvNo AS 'InvNo', PD.Credit AS 'Credit', P.TransDate as 'TransDate'
From Invoices I 
LEFT OUTER JOIN Jobs J					 ON J.JobNo = I.JobNo
LEFT OUTER JOIN Earnings E				 ON E.InvNo = I.InvNo
LEFT OUTER JOIN PaymentsDetail PD		 ON PD.InvNo = I.InvNo
LEFT OUTER JOIN Payments P				 ON P.PmtNo = PD.PmtNo
LEFT OUTER JOIN JobsLoc JL				 ON JL.JobNo = J.JobNo
LEFT OUTER JOIN Firms F					 ON JL.FirmNo = F.FirmNo
LEFT OUTER JOIN Locations L				 ON L.LocNo = JL.LocNo
LEFT OUTER JOIN Resources R				 ON R.RsrcNo = E.RsrcNo
Where datepart(year, P.TransDate) = '2012'
AND I.VoidDate is NULL
AND I.InvNo = 393848
AND
  LEFT(CASE
  WHEN JL.SameLocOrdering = 1						 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 0 THEN JL.LocPostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 1 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 2 THEN L.PostCode
  END,5) IN (
'90001',
'90002',
'90003',
'90004',
'90005',
'90005',
'90006',
'90007',
'90008',
'90010',
'90011',
'90012',
'90013',
'90014',
'90015',
'90016',
'90017',
'90018',
'90019',
'90020',
'90021',
'90023',
'90024',
'90025',
'90026',
'90027',
'90028',
'90029',
'90031',
'90032',
'90033',
'90034',
'90035',
'90036',
'90037',
'90038',
'90039',
'90041',
'90042',
'90043',
'90044',
'90045',
'90046',
'90047',
'90048',
'90049',
'90056',
'90057',
'90058',
'90059',
'90061',
'90062',
'90063',
'90064',
'90065',
'90066',
'90067',
'90068',
'90069',
'90071',
'90077',
'90089',
'90094',
'90095',
'90210',
'90211',
'90212',
'90230',
'90232',
'90245',
'90247',
'90248',
'90272',
'90290',
'90291',
'90292',
'90293',
'90302',
'90402',
'90501',
'90502',
'90710',
'90717',
'90731',
'90732',
'90732',
'90744',
'90810',
'91040',
'91042',
'91214',
'91303',
'91304',
'91306',
'91307',
'91311',
'91316',
'91324',
'91325',
'91326',
'91330',
'91331',
'91335',
'91340',
'91342',
'91343',
'91344',
'91345',
'91352',
'91356',
'91364',
'91367',
'91401',
'91402',
'91403',
'91405',
'91406',
'91411',
'91423',
'91436',
'91504',
'91505',
'91601',
'91602',
'91604',
'91605',
'91606',
'91607',
'91608')

Open in new window

Microsoft SQL Server

Avatar of undefined
Last Comment
PortletPaul

8/22/2022 - Mon
Brian Crowe

Although you are left-joining to the transaction table you are specifying a value for Payments.TransDate in the WHERE clause which effectively turns it into an INNER JOIN since any records with a NULL TransDate are excluded.

Honestly I'm a little confused by your objective.  Do you want TransDate in your result set or not?  What does "without using the TransDate" mean?
ASKER
pzozulka

To clarify, I do not want TransDate in the result set.

My goal is to get the following in the result set:

393848       $256.07
393848       $374.32
393848       $381.00
393848       $381.00
393848       $381.00
393848       $381.00
393848       $381.00
393848       $381.00
393848       $381.00
Brian Crowe

What else are you changing in the query besides removing it from the SELECT clause?  contents of the SELECT clause have no effect on rowcount.
Experts Exchange has (a) saved my job multiple times, (b) saved me hours, days, and even weeks of work, and often (c) makes me look like a superhero! This place is MAGIC!
Walt Forbes
duttcom

You are using Select DISTINCT, so it sees only three records as being distinctively different. You have 7 amounts of $381 so it is only returned once. Try removing the word DISTINCT from line 1.
Brian Crowe

Thanks duttcom...i completely missed the DISTINCT
ASKER
pzozulka

That's exactly what I thought too, but for some reason this:

Select  DISTINCT I.InvNo AS 'InvNo', PD.Credit AS 'Credit', P.TransDate as 'TransDate'
From Invoices I 
LEFT OUTER JOIN Jobs J					 ON J.JobNo = I.JobNo
LEFT OUTER JOIN Earnings E				 ON E.InvNo = I.InvNo
LEFT OUTER JOIN PaymentsDetail PD		 ON PD.InvNo = I.InvNo
LEFT OUTER JOIN Payments P				 ON P.PmtNo = PD.PmtNo
LEFT OUTER JOIN JobsLoc JL				 ON JL.JobNo = J.JobNo
LEFT OUTER JOIN Firms F					 ON JL.FirmNo = F.FirmNo
LEFT OUTER JOIN Locations L				 ON L.LocNo = JL.LocNo
LEFT OUTER JOIN Resources R				 ON R.RsrcNo = E.RsrcNo
Where datepart(year, P.TransDate) = '2012'
AND I.VoidDate is NULL
AND I.InvNo = 393848
AND
  LEFT(CASE
  WHEN JL.SameLocOrdering = 1						 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 0 THEN JL.LocPostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 1 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 2 THEN L.PostCode
  END,5) IN (
'90001',
'90002',
'90003',
'90004',
'90005',
'90005',
'90006',
'90007',
'90008',
'90010',
'90011',
'90012',
'90013',
'90014',
'90015',
'90016',
'90017',
'90018',
'90019',
'90020',
'90021',
'90023',
'90024',
'90025',
'90026',
'90027',
'90028',
'90029',
'90031',
'90032',
'90033',
'90034',
'90035',
'90036',
'90037',
'90038',
'90039',
'90041',
'90042',
'90043',
'90044',
'90045',
'90046',
'90047',
'90048',
'90049',
'90056',
'90057',
'90058',
'90059',
'90061',
'90062',
'90063',
'90064',
'90065',
'90066',
'90067',
'90068',
'90069',
'90071',
'90077',
'90089',
'90094',
'90095',
'90210',
'90211',
'90212',
'90230',
'90232',
'90245',
'90247',
'90248',
'90272',
'90290',
'90291',
'90292',
'90293',
'90302',
'90402',
'90501',
'90502',
'90710',
'90717',
'90731',
'90732',
'90732',
'90744',
'90810',
'91040',
'91042',
'91214',
'91303',
'91304',
'91306',
'91307',
'91311',
'91316',
'91324',
'91325',
'91326',
'91330',
'91331',
'91335',
'91340',
'91342',
'91343',
'91344',
'91345',
'91352',
'91356',
'91364',
'91367',
'91401',
'91402',
'91403',
'91405',
'91406',
'91411',
'91423',
'91436',
'91504',
'91505',
'91601',
'91602',
'91604',
'91605',
'91606',
'91607',
'91608')

Open in new window


gives me this result:

393848       $256.07       10/15/2012
393848       $374.32       2/13/2012
393848       $381.00       3/12/2012
393848       $381.00       4/16/2012
393848       $381.00       5/14/2012
393848       $381.00       6/11/2012
393848       $381.00       7/17/2012
393848       $381.00       8/13/2012
393848       $381.00       9/17/2012

while this:

Select  DISTINCT I.InvNo AS 'InvNo', PD.Credit AS 'Credit'--, P.TransDate as 'TransDate'
From Invoices I 
LEFT OUTER JOIN Jobs J					 ON J.JobNo = I.JobNo
LEFT OUTER JOIN Earnings E				 ON E.InvNo = I.InvNo
LEFT OUTER JOIN PaymentsDetail PD		 ON PD.InvNo = I.InvNo
LEFT OUTER JOIN Payments P				 ON P.PmtNo = PD.PmtNo
LEFT OUTER JOIN JobsLoc JL				 ON JL.JobNo = J.JobNo
LEFT OUTER JOIN Firms F					 ON JL.FirmNo = F.FirmNo
LEFT OUTER JOIN Locations L				 ON L.LocNo = JL.LocNo
LEFT OUTER JOIN Resources R				 ON R.RsrcNo = E.RsrcNo
Where datepart(year, P.TransDate) = '2012'
AND I.VoidDate is NULL
AND I.InvNo = 393848
AND
  LEFT(CASE
  WHEN JL.SameLocOrdering = 1						 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 0 THEN JL.LocPostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 1 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 2 THEN L.PostCode
  END,5) IN (
'90001',
'90002',
'90003',
'90004',
'90005',
'90005',
'90006',
'90007',
'90008',
'90010',
'90011',
'90012',
'90013',
'90014',
'90015',
'90016',
'90017',
'90018',
'90019',
'90020',
'90021',
'90023',
'90024',
'90025',
'90026',
'90027',
'90028',
'90029',
'90031',
'90032',
'90033',
'90034',
'90035',
'90036',
'90037',
'90038',
'90039',
'90041',
'90042',
'90043',
'90044',
'90045',
'90046',
'90047',
'90048',
'90049',
'90056',
'90057',
'90058',
'90059',
'90061',
'90062',
'90063',
'90064',
'90065',
'90066',
'90067',
'90068',
'90069',
'90071',
'90077',
'90089',
'90094',
'90095',
'90210',
'90211',
'90212',
'90230',
'90232',
'90245',
'90247',
'90248',
'90272',
'90290',
'90291',
'90292',
'90293',
'90302',
'90402',
'90501',
'90502',
'90710',
'90717',
'90731',
'90732',
'90732',
'90744',
'90810',
'91040',
'91042',
'91214',
'91303',
'91304',
'91306',
'91307',
'91311',
'91316',
'91324',
'91325',
'91326',
'91330',
'91331',
'91335',
'91340',
'91342',
'91343',
'91344',
'91345',
'91352',
'91356',
'91364',
'91367',
'91401',
'91402',
'91403',
'91405',
'91406',
'91411',
'91423',
'91436',
'91504',
'91505',
'91601',
'91602',
'91604',
'91605',
'91606',
'91607',
'91608')

Open in new window


gives me this result:

393848       $256.07
393848       $374.32
393848       $381.00
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
SOLUTION
duttcom

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
See how we're fighting big data
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question
ASKER
pzozulka

Here are the results when removing DISTINCT (18 rows, with or without the P.TransDate -- so that's a good start):

InvNo      Credit      TransDate
393848      374.32      2012-02-13 00:00:00
393848      374.32      2012-02-13 00:00:00
393848      381.00      2012-03-12 00:00:00
393848      381.00      2012-03-12 00:00:00
393848      381.00      2012-04-16 00:00:00
393848      381.00      2012-04-16 00:00:00
393848      381.00      2012-05-14 00:00:00
393848      381.00      2012-05-14 00:00:00
393848      381.00      2012-06-11 00:00:00
393848      381.00      2012-06-11 00:00:00
393848      381.00      2012-07-17 00:00:00
393848      381.00      2012-07-17 00:00:00
393848      381.00      2012-08-13 00:00:00
393848      381.00      2012-08-13 00:00:00
393848      381.00      2012-09-17 00:00:00
393848      381.00      2012-09-17 00:00:00
393848      256.07      2012-10-15 00:00:00
393848      256.07      2012-10-15 00:00:00

I'm not sure why 18 rows, but if I include R.DisplayName in the SELECT clause, it shows there are two resources on this invoice:

393848      374.32      2012-02-13 00:00:00      Vivian
393848      374.32      2012-02-13 00:00:00      Alexian
393848      381.00      2012-03-12 00:00:00      Vivian
393848      381.00      2012-03-12 00:00:00      Alexian
393848      381.00      2012-04-16 00:00:00      Vivian
393848      381.00      2012-04-16 00:00:00      Alexian
393848      381.00      2012-05-14 00:00:00      Vivian
393848      381.00      2012-05-14 00:00:00      Alexian
393848      381.00      2012-06-11 00:00:00      Vivian
393848      381.00      2012-06-11 00:00:00      Alexian
393848      381.00      2012-07-17 00:00:00      Vivian
393848      381.00      2012-07-17 00:00:00      Alexian
393848      381.00      2012-08-13 00:00:00      Vivian
393848      381.00      2012-08-13 00:00:00      Alexian
393848      381.00      2012-09-17 00:00:00      Vivian
393848      381.00      2012-09-17 00:00:00      Alexian
393848      256.07      2012-10-15 00:00:00      Vivian
393848      256.07      2012-10-15 00:00:00      Alexian
duttcom

If you don't have a lot of records, you could always just use the code you have (that correctly generates the list you want but with the date column included) to make a view. The you can just select the two columns you want from the view instead. Just a thought.
PortletPaul

DISTINCT

this is the cause of the different results

do it without using "distinct"
Your help has saved me hundreds of hours of internet surfing.
fblack61
ASKER CERTIFIED SOLUTION
Log in to continue reading
Log In
Sign up - Free for 7 days
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
duttcom

See above for the OP's results when taking out the DISTINCT as previously suggested.
SOLUTION
Log in to continue reading
Log In
Sign up - Free for 7 days
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
PortletPaul

@duttcom
hopefully my intent is now clear

the original dilemma (query row differences) is caused purely because of 'distinct'
but the solution is not to simply remove that, the query needs much more
(and hope I've suggested some useful ideas for that)
ASKER
pzozulka

if you reference a left joined table in the where clause you create the impact of an inner join - so use inner join

Can you please explain why this is, or provide a link where I can read/learn more about this? This is the first time I've heard about this.
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
ASKER
pzozulka

Thank you all for your help.

PortletPaul: I have a working query now that uses almost all of your suggestions mentioned above, and I removed 'distinct'. I can't get rid of the LEFT function because some zip codes are listed as '90046-2574'.

Having said that, I went back to try to figure out what specifically caused my query to fail. Using trial and error, after removing 'distinct', I commented out two table joins that were not being used in any part of the query -- Earnings, and Resources, and it started working. I tried commenting out just the Resources join, but that had no effect, so it has to be something to do with the Earnings table, but I'm not sure why. Any ideas? Also, what if I needed the Earnings table? I'm assuming I would have to then go the route you took using a subquery, right?

Also, why is referencing P.TransDate make that join an inner join all of a sudden?


Here's the version that works:
Select I.InvNo AS 'InvNo', PD.Credit AS 'Credit', P.TransDate as 'TransDate'

From Invoices I
LEFT OUTER JOIN Jobs J					 ON J.JobNo = I.JobNo
--LEFT OUTER JOIN Earnings E				 ON E.InvNo = I.InvNo
LEFT OUTER JOIN PaymentsDetail PD		 ON PD.InvNo = I.InvNo
LEFT OUTER JOIN Payments P				 ON P.PmtNo = PD.PmtNo
LEFT OUTER JOIN JobsLoc JL				 ON JL.JobNo = J.JobNo
LEFT OUTER JOIN Firms F					 ON JL.FirmNo = F.FirmNo
LEFT OUTER JOIN Locations L				 ON L.LocNo = JL.LocNo
--LEFT OUTER JOIN Resources R				 ON R.RsrcNo = E.RsrcNo

Where P.TransDate >= '20120101' and P.TransDate < '20130101'
AND I.VoidDate is NULL
AND I.InvNo = 393848
AND
  LEFT(CASE
  WHEN JL.SameLocOrdering = 1						 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 0 THEN JL.LocPostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 1 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 2 THEN L.PostCode
  END,5) IN ( 
        '90001', '90002', '90003', '90004', '90005', '90005', '90006', '90007', '90008', '90010',
        '90011', '90012', '90013', '90014', '90015', '90016', '90017', '90018', '90019', '90020',
        '90021', '90023', '90024', '90025', '90026', '90027', '90028', '90029', '90031', '90032',
        '90033', '90034', '90035', '90036', '90037', '90038', '90039', '90041', '90042', '90043',
        '90044', '90045', '90046', '90047', '90048', '90049', '90056', '90057', '90058', '90059',
        '90061', '90062', '90063', '90064', '90065', '90066', '90067', '90068', '90069', '90071',
        '90077', '90089', '90094', '90095', '90210', '90211', '90212', '90230', '90232', '90245', 
        '90247', '90248', '90272', '90290', '90291', '90292', '90293', '90302', '90402', '90501', 
        '90502', '90710', '90717', '90731', '90732', '90732', '90744', '90810', '91040', '91042', 
        '91214', '91303', '91304', '91306', '91307', '91311', '91316', '91324', '91325', '91326',
        '91330', '91331', '91335', '91340', '91342', '91343', '91344', '91345', '91352', '91356',
        '91364', '91367', '91401', '91402', '91403', '91405', '91406', '91411', '91423', '91436',
        '91504', '91505', '91601', '91602', '91604', '91605', '91606', '91607', '91608'
        )

Open in new window

PortletPaul

removing non-essential joins is a good thing to do in any case :)
You could start be removing any non-essential joins (i.e. if not used in where clause or select clause).

e. there are no clues given on behaviour of E & R, so I've commented them out for testing

>>it has to be something to do with the Earnings table, but I'm not sure why. Any ideas?
yes, that is causing a multiplication of rows probalay a symptom of a many-to-one relations and/or maybe that the join conditions aren't sufficient (could it need more than just invoice no.?)

also note that there is little point declaring a LEFT join then using that table in the where clause (unless you also permit null rows from that table).

>> I can't get rid of the LEFT function because some zip codes are listed as '90046-2574'.
pity, so be it

this is a really good start by the way - well done.
ASKER
pzozulka

Still perplexed by: a. if you reference a left joined table in the where clause you create the impact of an inner join - so use inner join

Here's a test:

Table 1
ID     Age
1       17
2       50
3       55

Table 2
ID    Name
1      John
3      Bill

Select t1.id, t1.age, t2.name
From Table1 t1 left join table2 t2 on t1.id = t2.id
Where t1.age > 5

Open in new window


Results:
1 17 John
2 50 Null
3 55 Bill

If I used inner join instead of left join, the results would have been:
1 17 John
3 55 Bill
This is the best money I have ever spent. I cannot not tell you how many times these folks have saved my bacon. I learn so much from the contributors.
rwheeler23
PortletPaul

>>Also, what if I needed the Earnings table? I'm assuming I would have to then go the route you took using a subquery, right?
that would depend on the expected outcome, but assuming you still wanted to restrict the rows like the current query then - yes, a nested subquery using group by (or perhaps row_number() ) could be used.

nb: row_umber() is useful for locating "the most recent record" which a group by isn't so good at.

in discussing SQL "it depends" is used at lot :)
PortletPaul

Select t1.id, t1.age, t2.name
From Table1 t1
left join table2 t2 on t1.id = t2.id -- T2 is not referenced in the where clause
Where t1.age > 5

Results:
1 17 John
2 50 Null
3 55 Bill

IF however you did this:

Select t1.id, t1.age, t2.name
From Table1 t1
left join table2 t2 on t1.id = t2.id -- T2 IS referenced in the where clause now
Where t1.age > 5
and t2,name like '%ill' or t2.name like '%ohn'

Results:
1 17 John
3 55 Bill

which is the same as the inner join

that was the point

In your query you left join J, JL, F & L
then you use JL, F & L in the where clause (J gets involved due to the linking to JL)
so, for JL, F, L & J
you are getting the effect of inner joins

try it
PortletPaul

just wondering if this is better:

Table 1
ID     Age
1       17
2       50
3       55

Table 2
ID    Name
1      Benny
3      Bill

----------- IN1
Select t1.id, t1.age, t2.name
From Table1 t1
INNER join table2 t2 on t1.id = t2.id
Where t1.age > 5
and t2.name like 'B%'    -- T2 now referenced in where clause

Result: IN1
1 17 Benny
3 55 Bill


----------- OUT1
Select t1.id, t1.age, t2.name
From Table1 t1
LEFT join table2 t2 on t1.id = t2.id -- here T2 is not referenced in the where clause
Where t1.age > 5

Result: OUT1
1 17 Benny
2 50 Null
3 55 Bill


----------- OUT2
Select t1.id, t1.age, t2.name
From Table1 t1
LEFT join table2 t2 on t1.id = t2.id -- T2 IS referenced in the where clause now
Where t1.age > 5
and t2.name like 'B%' -- T2 now referenced

Result: OUT2
1 17 Benny
3 55 Bill


----------- OUT3
Select t1.id, t1.age, t2.name
From Table1 t1
LEFT join table2 t2 on t1.id = t2.id
Where t1.age > 5
and (t2.name like 'B%'   -- T2 now referenced in where clause
     or t2.name is null) -- T2 left join is preseved
Result: OUT3
1 17 Benny
2 50 Null
3 55 Bill


OUT1 = OUT3 (permitting T2 nulls in where clause)

otherwise

IN1 = OUT2 (by ignoring T2 nulls in where clause

Message:
Beware of "where"
if you filter a query in the where clause on an outer joined table
you can create the same conditions as an inner join
thus making the outer join a redundant cost on the query
either:
use an inner join
or
allow nulls via the where clause
or
use join conditions instead of the where clause
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.