DON Swiatek
asked on
Join query criteria
Trying to come up with a query that will find all CustomerIDs from table A where all Items From table B have been shipped or cancelled
from this data query would return CustomerID 123 and 543.
234 would be excluded because Table B, ID 6 is false for both shipped and cancelled.
SQL 2012
Table B Shipped and Cancelled are both BIT data type
Table A
OrderID CustomerID
1 123
2 123
3 234
4 234
5 543
Table B
ID OrderID(FK) Shipped Cancelled
1 1 True False
2 1 True False
3 2 True False
4 2 False True
5 3 True False
6 3 False False
7 4 True False
8 5 True False
9 5 False True
10 5 True False
from this data query would return CustomerID 123 and 543.
234 would be excluded because Table B, ID 6 is false for both shipped and cancelled.
SQL 2012
Table B Shipped and Cancelled are both BIT data type
Table A
OrderID CustomerID
1 123
2 123
3 234
4 234
5 543
Table B
ID OrderID(FK) Shipped Cancelled
1 1 True False
2 1 True False
3 2 True False
4 2 False True
5 3 True False
6 3 False False
7 4 True False
8 5 True False
9 5 False True
10 5 True False
ASKER
_AGX_
This returns 234 when I try it. Table B row 6 should exclude it.
maybe because other rows in Table B satisfy the WHERE?
This returns 234 when I try it. Table B row 6 should exclude it.
maybe because other rows in Table B satisfy the WHERE?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Yea. Only customers who have ALL orders where both are false would make sense. Customer #234 has 1 order that is False/False and one that is True/False. It's because of the True/False order that you're getting customer #234.
Think through your logic again....
Think through your logic again....
where ALL orders are either shipped OR cancelled.....
You could check if the condition for exclusion EXISTS
SELECT * FROM TABLEA A
WHERE NOT EXISTS ( SELECT NULL FROM TABLEB B WHERE A.OrderID = B.OrderID AND A.CustomerID = B.CustomerID and B.Shipped = 0 and B.Cancelled = 0)
The good thing about EXISTS is that it stops checking if it finds an instance.
In which case, having a PK, or a secondary index on TABLEB on OrderID,CustomerID and have Shipped and Cacelled as included columns would help a lot.
Because you have ID in the table, I would assume that is an IDENTITY and the PK
So your Secondary index would be
create index idx_tableB_OrderID_Custome rID on tableb (OrderID, CustomerID) include (shipped,Cancelled)
And that way, it should be contained within index
You could check if the condition for exclusion EXISTS
SELECT * FROM TABLEA A
WHERE NOT EXISTS ( SELECT NULL FROM TABLEB B WHERE A.OrderID = B.OrderID AND A.CustomerID = B.CustomerID and B.Shipped = 0 and B.Cancelled = 0)
The good thing about EXISTS is that it stops checking if it finds an instance.
In which case, having a PK, or a secondary index on TABLEB on OrderID,CustomerID and have Shipped and Cacelled as included columns would help a lot.
Because you have ID in the table, I would assume that is an IDENTITY and the PK
So your Secondary index would be
create index idx_tableB_OrderID_Custome
And that way, it should be contained within index
ASKER
_agx_
that works! Thanks!
Mark Wills
I like the idea as it seems more efficient but,
TableB does not contain CustomerID. If I take out the AND A.CustomerID = B.CustomerID then it returns 234 with 123 and 543.
that works! Thanks!
Mark Wills
I like the idea as it seems more efficient but,
TableB does not contain CustomerID. If I take out the AND A.CustomerID = B.CustomerID then it returns 234 with 123 and 543.
Oops.... Sorry about that.
It does need it in the test, and is more efficient... Easy enough to fix :
SELECT * FROM #TABLEA C
WHERE NOT EXISTS ( SELECT NULL FROM #TableA A INNER JOIN #TABLEB B ON A.OrderID = B.OrderID AND A.CustomerID = C.CustomerID and B.Shipped = 0 and B.Cancelled = 0)
Welcome to EE :)
It does need it in the test, and is more efficient... Easy enough to fix :
SELECT * FROM #TABLEA C
WHERE NOT EXISTS ( SELECT NULL FROM #TableA A INNER JOIN #TABLEB B ON A.OrderID = B.OrderID AND A.CustomerID = C.CustomerID and B.Shipped = 0 and B.Cancelled = 0)
Welcome to EE :)
I had a semi-anti-join solution (24ms) up here which looked 'readable' and worked similarly as Mark's solution (Mark's solution is a true semi-anti-join), however, I deleted my comment earlier because of how fast and efficient _agx_'s query was (4ms).
http://sqlfiddle.com/#!18/26460/26
http://sqlfiddle.com/#!18/26460/26
@DON Swiatek - Do profile both queries. Mark's suggestion will probably be more efficient with larger volume.. but it's always good to verify it against real data.
with cte as
(select orderid from tablea
intersect
select orderid from tableb where shipped = 0 and cancelled = 0),
cte2 as
(select customerid from tablea where orderid in (select orderid from cte))
select customerid from tablea
except
select customerid from cte2;
(select orderid from tablea
intersect
select orderid from tableb where shipped = 0 and cancelled = 0),
cte2 as
(select customerid from tablea where orderid in (select orderid from cte))
select customerid from tablea
except
select customerid from cte2;
FROM TableA a INNER JOIN TableB b ON b.OrderID = a.OrderID
WHERE b.Shipped = 1
OR b.Cancelled = 1