Link to home
Start Free TrialLog in
Avatar of pzozulka
pzozulka

asked on

SQL Query Help

I am a little puzzled by the results of the query below. When I look up the invoice number in our application, it correctly shows the following results:

InvNo         Amount   TransactionDate
393848       $256.07       10/15/2012
393848       $374.32       2/13/2012
393848       $381.00       3/12/2012
393848       $381.00       4/16/2012
393848       $381.00       5/14/2012
393848       $381.00       6/11/2012
393848       $381.00       7/17/2012
393848       $381.00       8/13/2012
393848       $381.00       9/17/2012


Since the transaction date is of no importance to me, I tried running a query without TransDate in the SELECT clause, and am receiving these results:

393848       $256.07
393848       $374.32
393848       $381.00

However, when I do include the TransDate, it shows everything correctly, just like the application shows (all 9 records).

How do I get it to show me the same thing the application is showing without using the TransDate?

Select  DISTINCT I.InvNo AS 'InvNo', PD.Credit AS 'Credit', P.TransDate as 'TransDate'
From Invoices I 
LEFT OUTER JOIN Jobs J					 ON J.JobNo = I.JobNo
LEFT OUTER JOIN Earnings E				 ON E.InvNo = I.InvNo
LEFT OUTER JOIN PaymentsDetail PD		 ON PD.InvNo = I.InvNo
LEFT OUTER JOIN Payments P				 ON P.PmtNo = PD.PmtNo
LEFT OUTER JOIN JobsLoc JL				 ON JL.JobNo = J.JobNo
LEFT OUTER JOIN Firms F					 ON JL.FirmNo = F.FirmNo
LEFT OUTER JOIN Locations L				 ON L.LocNo = JL.LocNo
LEFT OUTER JOIN Resources R				 ON R.RsrcNo = E.RsrcNo
Where datepart(year, P.TransDate) = '2012'
AND I.VoidDate is NULL
AND I.InvNo = 393848
AND
  LEFT(CASE
  WHEN JL.SameLocOrdering = 1						 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 0 THEN JL.LocPostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 1 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 2 THEN L.PostCode
  END,5) IN (
'90001',
'90002',
'90003',
'90004',
'90005',
'90005',
'90006',
'90007',
'90008',
'90010',
'90011',
'90012',
'90013',
'90014',
'90015',
'90016',
'90017',
'90018',
'90019',
'90020',
'90021',
'90023',
'90024',
'90025',
'90026',
'90027',
'90028',
'90029',
'90031',
'90032',
'90033',
'90034',
'90035',
'90036',
'90037',
'90038',
'90039',
'90041',
'90042',
'90043',
'90044',
'90045',
'90046',
'90047',
'90048',
'90049',
'90056',
'90057',
'90058',
'90059',
'90061',
'90062',
'90063',
'90064',
'90065',
'90066',
'90067',
'90068',
'90069',
'90071',
'90077',
'90089',
'90094',
'90095',
'90210',
'90211',
'90212',
'90230',
'90232',
'90245',
'90247',
'90248',
'90272',
'90290',
'90291',
'90292',
'90293',
'90302',
'90402',
'90501',
'90502',
'90710',
'90717',
'90731',
'90732',
'90732',
'90744',
'90810',
'91040',
'91042',
'91214',
'91303',
'91304',
'91306',
'91307',
'91311',
'91316',
'91324',
'91325',
'91326',
'91330',
'91331',
'91335',
'91340',
'91342',
'91343',
'91344',
'91345',
'91352',
'91356',
'91364',
'91367',
'91401',
'91402',
'91403',
'91405',
'91406',
'91411',
'91423',
'91436',
'91504',
'91505',
'91601',
'91602',
'91604',
'91605',
'91606',
'91607',
'91608')

Open in new window

Avatar of Brian Crowe
Brian Crowe
Flag of United States of America image

Although you are left-joining to the transaction table you are specifying a value for Payments.TransDate in the WHERE clause which effectively turns it into an INNER JOIN since any records with a NULL TransDate are excluded.

Honestly I'm a little confused by your objective.  Do you want TransDate in your result set or not?  What does "without using the TransDate" mean?
Avatar of pzozulka
pzozulka

ASKER

To clarify, I do not want TransDate in the result set.

My goal is to get the following in the result set:

393848       $256.07
393848       $374.32
393848       $381.00
393848       $381.00
393848       $381.00
393848       $381.00
393848       $381.00
393848       $381.00
393848       $381.00
What else are you changing in the query besides removing it from the SELECT clause?  contents of the SELECT clause have no effect on rowcount.
You are using Select DISTINCT, so it sees only three records as being distinctively different. You have 7 amounts of $381 so it is only returned once. Try removing the word DISTINCT from line 1.
Thanks duttcom...i completely missed the DISTINCT
That's exactly what I thought too, but for some reason this:

Select  DISTINCT I.InvNo AS 'InvNo', PD.Credit AS 'Credit', P.TransDate as 'TransDate'
From Invoices I 
LEFT OUTER JOIN Jobs J					 ON J.JobNo = I.JobNo
LEFT OUTER JOIN Earnings E				 ON E.InvNo = I.InvNo
LEFT OUTER JOIN PaymentsDetail PD		 ON PD.InvNo = I.InvNo
LEFT OUTER JOIN Payments P				 ON P.PmtNo = PD.PmtNo
LEFT OUTER JOIN JobsLoc JL				 ON JL.JobNo = J.JobNo
LEFT OUTER JOIN Firms F					 ON JL.FirmNo = F.FirmNo
LEFT OUTER JOIN Locations L				 ON L.LocNo = JL.LocNo
LEFT OUTER JOIN Resources R				 ON R.RsrcNo = E.RsrcNo
Where datepart(year, P.TransDate) = '2012'
AND I.VoidDate is NULL
AND I.InvNo = 393848
AND
  LEFT(CASE
  WHEN JL.SameLocOrdering = 1						 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 0 THEN JL.LocPostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 1 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 2 THEN L.PostCode
  END,5) IN (
'90001',
'90002',
'90003',
'90004',
'90005',
'90005',
'90006',
'90007',
'90008',
'90010',
'90011',
'90012',
'90013',
'90014',
'90015',
'90016',
'90017',
'90018',
'90019',
'90020',
'90021',
'90023',
'90024',
'90025',
'90026',
'90027',
'90028',
'90029',
'90031',
'90032',
'90033',
'90034',
'90035',
'90036',
'90037',
'90038',
'90039',
'90041',
'90042',
'90043',
'90044',
'90045',
'90046',
'90047',
'90048',
'90049',
'90056',
'90057',
'90058',
'90059',
'90061',
'90062',
'90063',
'90064',
'90065',
'90066',
'90067',
'90068',
'90069',
'90071',
'90077',
'90089',
'90094',
'90095',
'90210',
'90211',
'90212',
'90230',
'90232',
'90245',
'90247',
'90248',
'90272',
'90290',
'90291',
'90292',
'90293',
'90302',
'90402',
'90501',
'90502',
'90710',
'90717',
'90731',
'90732',
'90732',
'90744',
'90810',
'91040',
'91042',
'91214',
'91303',
'91304',
'91306',
'91307',
'91311',
'91316',
'91324',
'91325',
'91326',
'91330',
'91331',
'91335',
'91340',
'91342',
'91343',
'91344',
'91345',
'91352',
'91356',
'91364',
'91367',
'91401',
'91402',
'91403',
'91405',
'91406',
'91411',
'91423',
'91436',
'91504',
'91505',
'91601',
'91602',
'91604',
'91605',
'91606',
'91607',
'91608')

Open in new window


gives me this result:

393848       $256.07       10/15/2012
393848       $374.32       2/13/2012
393848       $381.00       3/12/2012
393848       $381.00       4/16/2012
393848       $381.00       5/14/2012
393848       $381.00       6/11/2012
393848       $381.00       7/17/2012
393848       $381.00       8/13/2012
393848       $381.00       9/17/2012

while this:

Select  DISTINCT I.InvNo AS 'InvNo', PD.Credit AS 'Credit'--, P.TransDate as 'TransDate'
From Invoices I 
LEFT OUTER JOIN Jobs J					 ON J.JobNo = I.JobNo
LEFT OUTER JOIN Earnings E				 ON E.InvNo = I.InvNo
LEFT OUTER JOIN PaymentsDetail PD		 ON PD.InvNo = I.InvNo
LEFT OUTER JOIN Payments P				 ON P.PmtNo = PD.PmtNo
LEFT OUTER JOIN JobsLoc JL				 ON JL.JobNo = J.JobNo
LEFT OUTER JOIN Firms F					 ON JL.FirmNo = F.FirmNo
LEFT OUTER JOIN Locations L				 ON L.LocNo = JL.LocNo
LEFT OUTER JOIN Resources R				 ON R.RsrcNo = E.RsrcNo
Where datepart(year, P.TransDate) = '2012'
AND I.VoidDate is NULL
AND I.InvNo = 393848
AND
  LEFT(CASE
  WHEN JL.SameLocOrdering = 1						 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 0 THEN JL.LocPostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 1 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 2 THEN L.PostCode
  END,5) IN (
'90001',
'90002',
'90003',
'90004',
'90005',
'90005',
'90006',
'90007',
'90008',
'90010',
'90011',
'90012',
'90013',
'90014',
'90015',
'90016',
'90017',
'90018',
'90019',
'90020',
'90021',
'90023',
'90024',
'90025',
'90026',
'90027',
'90028',
'90029',
'90031',
'90032',
'90033',
'90034',
'90035',
'90036',
'90037',
'90038',
'90039',
'90041',
'90042',
'90043',
'90044',
'90045',
'90046',
'90047',
'90048',
'90049',
'90056',
'90057',
'90058',
'90059',
'90061',
'90062',
'90063',
'90064',
'90065',
'90066',
'90067',
'90068',
'90069',
'90071',
'90077',
'90089',
'90094',
'90095',
'90210',
'90211',
'90212',
'90230',
'90232',
'90245',
'90247',
'90248',
'90272',
'90290',
'90291',
'90292',
'90293',
'90302',
'90402',
'90501',
'90502',
'90710',
'90717',
'90731',
'90732',
'90732',
'90744',
'90810',
'91040',
'91042',
'91214',
'91303',
'91304',
'91306',
'91307',
'91311',
'91316',
'91324',
'91325',
'91326',
'91330',
'91331',
'91335',
'91340',
'91342',
'91343',
'91344',
'91345',
'91352',
'91356',
'91364',
'91367',
'91401',
'91402',
'91403',
'91405',
'91406',
'91411',
'91423',
'91436',
'91504',
'91505',
'91601',
'91602',
'91604',
'91605',
'91606',
'91607',
'91608')

Open in new window


gives me this result:

393848       $256.07
393848       $374.32
393848       $381.00
SOLUTION
Avatar of duttcom
duttcom
Flag of Australia image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Here are the results when removing DISTINCT (18 rows, with or without the P.TransDate -- so that's a good start):

InvNo      Credit      TransDate
393848      374.32      2012-02-13 00:00:00
393848      374.32      2012-02-13 00:00:00
393848      381.00      2012-03-12 00:00:00
393848      381.00      2012-03-12 00:00:00
393848      381.00      2012-04-16 00:00:00
393848      381.00      2012-04-16 00:00:00
393848      381.00      2012-05-14 00:00:00
393848      381.00      2012-05-14 00:00:00
393848      381.00      2012-06-11 00:00:00
393848      381.00      2012-06-11 00:00:00
393848      381.00      2012-07-17 00:00:00
393848      381.00      2012-07-17 00:00:00
393848      381.00      2012-08-13 00:00:00
393848      381.00      2012-08-13 00:00:00
393848      381.00      2012-09-17 00:00:00
393848      381.00      2012-09-17 00:00:00
393848      256.07      2012-10-15 00:00:00
393848      256.07      2012-10-15 00:00:00

I'm not sure why 18 rows, but if I include R.DisplayName in the SELECT clause, it shows there are two resources on this invoice:

393848      374.32      2012-02-13 00:00:00      Vivian
393848      374.32      2012-02-13 00:00:00      Alexian
393848      381.00      2012-03-12 00:00:00      Vivian
393848      381.00      2012-03-12 00:00:00      Alexian
393848      381.00      2012-04-16 00:00:00      Vivian
393848      381.00      2012-04-16 00:00:00      Alexian
393848      381.00      2012-05-14 00:00:00      Vivian
393848      381.00      2012-05-14 00:00:00      Alexian
393848      381.00      2012-06-11 00:00:00      Vivian
393848      381.00      2012-06-11 00:00:00      Alexian
393848      381.00      2012-07-17 00:00:00      Vivian
393848      381.00      2012-07-17 00:00:00      Alexian
393848      381.00      2012-08-13 00:00:00      Vivian
393848      381.00      2012-08-13 00:00:00      Alexian
393848      381.00      2012-09-17 00:00:00      Vivian
393848      381.00      2012-09-17 00:00:00      Alexian
393848      256.07      2012-10-15 00:00:00      Vivian
393848      256.07      2012-10-15 00:00:00      Alexian
If you don't have a lot of records, you could always just use the code you have (that correctly generates the list you want but with the date column included) to make a view. The you can just select the two columns you want from the view instead. Just a thought.
DISTINCT

this is the cause of the different results

do it without using "distinct"
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
See above for the OP's results when taking out the DISTINCT as previously suggested.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
@duttcom
hopefully my intent is now clear

the original dilemma (query row differences) is caused purely because of 'distinct'
but the solution is not to simply remove that, the query needs much more
(and hope I've suggested some useful ideas for that)
if you reference a left joined table in the where clause you create the impact of an inner join - so use inner join

Can you please explain why this is, or provide a link where I can read/learn more about this? This is the first time I've heard about this.
Thank you all for your help.

PortletPaul: I have a working query now that uses almost all of your suggestions mentioned above, and I removed 'distinct'. I can't get rid of the LEFT function because some zip codes are listed as '90046-2574'.

Having said that, I went back to try to figure out what specifically caused my query to fail. Using trial and error, after removing 'distinct', I commented out two table joins that were not being used in any part of the query -- Earnings, and Resources, and it started working. I tried commenting out just the Resources join, but that had no effect, so it has to be something to do with the Earnings table, but I'm not sure why. Any ideas? Also, what if I needed the Earnings table? I'm assuming I would have to then go the route you took using a subquery, right?

Also, why is referencing P.TransDate make that join an inner join all of a sudden?


Here's the version that works:
Select I.InvNo AS 'InvNo', PD.Credit AS 'Credit', P.TransDate as 'TransDate'

From Invoices I
LEFT OUTER JOIN Jobs J					 ON J.JobNo = I.JobNo
--LEFT OUTER JOIN Earnings E				 ON E.InvNo = I.InvNo
LEFT OUTER JOIN PaymentsDetail PD		 ON PD.InvNo = I.InvNo
LEFT OUTER JOIN Payments P				 ON P.PmtNo = PD.PmtNo
LEFT OUTER JOIN JobsLoc JL				 ON JL.JobNo = J.JobNo
LEFT OUTER JOIN Firms F					 ON JL.FirmNo = F.FirmNo
LEFT OUTER JOIN Locations L				 ON L.LocNo = JL.LocNo
--LEFT OUTER JOIN Resources R				 ON R.RsrcNo = E.RsrcNo

Where P.TransDate >= '20120101' and P.TransDate < '20130101'
AND I.VoidDate is NULL
AND I.InvNo = 393848
AND
  LEFT(CASE
  WHEN JL.SameLocOrdering = 1						 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 0 THEN JL.LocPostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 1 THEN F.PostCode
  WHEN JL.SameLocOrdering = 0 AND JL.LocLinkType = 2 THEN L.PostCode
  END,5) IN ( 
        '90001', '90002', '90003', '90004', '90005', '90005', '90006', '90007', '90008', '90010',
        '90011', '90012', '90013', '90014', '90015', '90016', '90017', '90018', '90019', '90020',
        '90021', '90023', '90024', '90025', '90026', '90027', '90028', '90029', '90031', '90032',
        '90033', '90034', '90035', '90036', '90037', '90038', '90039', '90041', '90042', '90043',
        '90044', '90045', '90046', '90047', '90048', '90049', '90056', '90057', '90058', '90059',
        '90061', '90062', '90063', '90064', '90065', '90066', '90067', '90068', '90069', '90071',
        '90077', '90089', '90094', '90095', '90210', '90211', '90212', '90230', '90232', '90245', 
        '90247', '90248', '90272', '90290', '90291', '90292', '90293', '90302', '90402', '90501', 
        '90502', '90710', '90717', '90731', '90732', '90732', '90744', '90810', '91040', '91042', 
        '91214', '91303', '91304', '91306', '91307', '91311', '91316', '91324', '91325', '91326',
        '91330', '91331', '91335', '91340', '91342', '91343', '91344', '91345', '91352', '91356',
        '91364', '91367', '91401', '91402', '91403', '91405', '91406', '91411', '91423', '91436',
        '91504', '91505', '91601', '91602', '91604', '91605', '91606', '91607', '91608'
        )

Open in new window

removing non-essential joins is a good thing to do in any case :)
You could start be removing any non-essential joins (i.e. if not used in where clause or select clause).

e. there are no clues given on behaviour of E & R, so I've commented them out for testing

>>it has to be something to do with the Earnings table, but I'm not sure why. Any ideas?
yes, that is causing a multiplication of rows probalay a symptom of a many-to-one relations and/or maybe that the join conditions aren't sufficient (could it need more than just invoice no.?)

also note that there is little point declaring a LEFT join then using that table in the where clause (unless you also permit null rows from that table).

>> I can't get rid of the LEFT function because some zip codes are listed as '90046-2574'.
pity, so be it

this is a really good start by the way - well done.
Still perplexed by: a. if you reference a left joined table in the where clause you create the impact of an inner join - so use inner join

Here's a test:

Table 1
ID     Age
1       17
2       50
3       55

Table 2
ID    Name
1      John
3      Bill

Select t1.id, t1.age, t2.name
From Table1 t1 left join table2 t2 on t1.id = t2.id
Where t1.age > 5

Open in new window


Results:
1 17 John
2 50 Null
3 55 Bill

If I used inner join instead of left join, the results would have been:
1 17 John
3 55 Bill
>>Also, what if I needed the Earnings table? I'm assuming I would have to then go the route you took using a subquery, right?
that would depend on the expected outcome, but assuming you still wanted to restrict the rows like the current query then - yes, a nested subquery using group by (or perhaps row_number() ) could be used.

nb: row_umber() is useful for locating "the most recent record" which a group by isn't so good at.

in discussing SQL "it depends" is used at lot :)
Select t1.id, t1.age, t2.name
From Table1 t1
left join table2 t2 on t1.id = t2.id -- T2 is not referenced in the where clause
Where t1.age > 5

Results:
1 17 John
2 50 Null
3 55 Bill

IF however you did this:

Select t1.id, t1.age, t2.name
From Table1 t1
left join table2 t2 on t1.id = t2.id -- T2 IS referenced in the where clause now
Where t1.age > 5
and t2,name like '%ill' or t2.name like '%ohn'

Results:
1 17 John
3 55 Bill

which is the same as the inner join

that was the point

In your query you left join J, JL, F & L
then you use JL, F & L in the where clause (J gets involved due to the linking to JL)
so, for JL, F, L & J
you are getting the effect of inner joins

try it
just wondering if this is better:

Table 1
ID     Age
1       17
2       50
3       55

Table 2
ID    Name
1      Benny
3      Bill

----------- IN1
Select t1.id, t1.age, t2.name
From Table1 t1
INNER join table2 t2 on t1.id = t2.id
Where t1.age > 5
and t2.name like 'B%'    -- T2 now referenced in where clause

Result: IN1
1 17 Benny
3 55 Bill


----------- OUT1
Select t1.id, t1.age, t2.name
From Table1 t1
LEFT join table2 t2 on t1.id = t2.id -- here T2 is not referenced in the where clause
Where t1.age > 5

Result: OUT1
1 17 Benny
2 50 Null
3 55 Bill


----------- OUT2
Select t1.id, t1.age, t2.name
From Table1 t1
LEFT join table2 t2 on t1.id = t2.id -- T2 IS referenced in the where clause now
Where t1.age > 5
and t2.name like 'B%' -- T2 now referenced

Result: OUT2
1 17 Benny
3 55 Bill


----------- OUT3
Select t1.id, t1.age, t2.name
From Table1 t1
LEFT join table2 t2 on t1.id = t2.id
Where t1.age > 5
and (t2.name like 'B%'   -- T2 now referenced in where clause
     or t2.name is null) -- T2 left join is preseved
Result: OUT3
1 17 Benny
2 50 Null
3 55 Bill


OUT1 = OUT3 (permitting T2 nulls in where clause)

otherwise

IN1 = OUT2 (by ignoring T2 nulls in where clause

Message:
Beware of "where"
if you filter a query in the where clause on an outer joined table
you can create the same conditions as an inner join
thus making the outer join a redundant cost on the query
either:
use an inner join
or
allow nulls via the where clause
or
use join conditions instead of the where clause