Link to home
Start Free TrialLog in
Avatar of DON Swiatek
DON SwiatekFlag for United States of America

asked on

Join query criteria

Trying to come up with a query that will find all CustomerIDs from table A where all Items From table B have been shipped or cancelled

from this data query would return CustomerID 123 and 543.  
234 would be excluded because Table B, ID 6  is false for both shipped and cancelled.

SQL 2012
Table B Shipped and Cancelled are both BIT data type


Table A
OrderID  CustomerID
1               123
2               123
3               234
4               234
5               543

Table B
ID   OrderID(FK) Shipped  Cancelled
1     1                        True     False
2     1                        True     False
3     2                        True     False
4     2                        False    True
5     3                        True     False
6     3                        False    False
7     4                        True     False
8     5                        True     False
9     5                        False    True
10   5                        True     False
Avatar of _agx_
_agx_
Flag of United States of America image

SELECT DISTINCT a.CustomerID
FROM   TableA a INNER JOIN TableB b ON b.OrderID = a.OrderID
WHERE  b.Shipped = 1
OR     b.Cancelled = 1
Avatar of DON Swiatek

ASKER

_AGX_

This returns 234 when I try it.  Table B row 6 should exclude it.

maybe because other rows in Table B satisfy the WHERE?
ASKER CERTIFIED SOLUTION
Avatar of _agx_
_agx_
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Yea.  Only customers who have ALL orders where both are false would make sense.  Customer #234 has 1 order that is False/False and one that is True/False.  It's because of the True/False order that you're getting customer #234.

Think through your logic again....
where ALL orders are either shipped OR cancelled.....

You could check if the condition for exclusion EXISTS


SELECT * FROM TABLEA A
WHERE NOT EXISTS ( SELECT NULL FROM TABLEB B WHERE A.OrderID = B.OrderID AND A.CustomerID = B.CustomerID and B.Shipped = 0 and B.Cancelled = 0)


The good thing about EXISTS is that it stops checking if it finds an instance.

In which case, having a PK, or a secondary index on TABLEB on OrderID,CustomerID and have Shipped and Cacelled as included columns would help a lot.

Because you have ID in the table, I would assume that is an IDENTITY and the PK

So your Secondary index would be

create index idx_tableB_OrderID_CustomerID on tableb (OrderID, CustomerID) include (shipped,Cancelled)


And that way, it should be contained within index
_agx_
that works! Thanks!

Mark Wills
I like the idea as it seems more efficient but,
TableB does not contain CustomerID.  If I take out the AND A.CustomerID = B.CustomerID then it returns 234 with 123 and 543.
Oops.... Sorry about that.

It does need it in the test, and is more efficient... Easy enough to fix :

SELECT * FROM #TABLEA C
WHERE NOT EXISTS ( SELECT NULL FROM #TableA A INNER JOIN  #TABLEB B ON A.OrderID = B.OrderID AND A.CustomerID = C.CustomerID and B.Shipped = 0 and B.Cancelled = 0)


Welcome to EE :)
I had a semi-anti-join solution (24ms) up here which looked 'readable' and worked similarly as Mark's solution (Mark's solution is a true semi-anti-join), however, I deleted my comment earlier because of how fast and efficient _agx_'s query was (4ms).

http://sqlfiddle.com/#!18/26460/26
@DON Swiatek - Do profile both queries.  Mark's suggestion will probably be more efficient with larger volume.. but it's always good to verify it against real data.
with cte as
(select orderid from tablea
 intersect
 select orderid from tableb where shipped = 0 and cancelled = 0),
cte2 as
(select customerid from tablea where orderid in (select orderid from cte))
select customerid from tablea
except
select customerid from cte2;