Link to home
Start Free TrialLog in
Avatar of DrDamnit
DrDamnitFlag for United States of America

asked on

How to group a subquery for count?

I have rows of orders. Each order has multiple items. But, there are some orders that have been duplicated, and I need tofind them.

Assuming that my table has the following columns:
unique_id (int)
order_number (int)
item (varchar(100)).
quantity_ordered (int) - This is the total quantity for teh entire order.


SO, an example order would be:
1 - 1234 - Some Item - 3
2 - 1234 - Some other item - 3
3 - 1234 - some other item - 3

SO, here, we can see that this or has three items (last column) and that there are, indeed, 3 and only 3 in the order table.

But, here's the problem. Some were submitted twice. So we have:
1 - 1234 - Some Item - 3
2 - 1234 - Some other item - 3
3 - 1234 - some other item - 3
37 - 1234 - Some Item - 3
38 - 1234 - Some other item - 3
39 - 1234 - some other item - 3

So, even thought the order quantity is 3, there are 6 rows with this order number.

I need to write a query that finds all the orders where the number of rows for that order is greater than the ordered quantity (the last column in my example).

I am pretty sure a subquery is in order, but I cannot figure out the logic.
Avatar of Kyle Abrahams, PMP
Kyle Abrahams, PMP
Flag of United States of America image

You could do this with a self join if you're just looking for duplicates:


select p.uniqueid, c.* from table p
join table c on p.order_number  = c.order_number and p.item = c.item --can also add quantity
where
c.uniqueID > p.uniqueid
Avatar of DrDamnit

ASKER

p and c are confusing me.

Please elaborate.
it's an alias.

You only have one table.

I'm calling the first instance  "P" (parent)
the second instance (even though it's the same  table) "C"  (child).

I'm then joining the instances based on the order number and the item.  
then I'm filtering where the child's unique ID > then the parent unique id.
You can use the RANK() function along with the quantity field:
--	Set up sample data -----------------------------------------------
DECLARE @Orders TABLE (unique_id INT, order_number INT, item VARCHAR(100), quantity_ordered INT)

INSERT INTO @Orders ( unique_id ,   order_number ,  item ,  quantity_ordered )
VALUES ( 1 , 1234, 'Item 1', 3),
	( 2 , 1234, 'Item 2', 3),
	( 3 , 1234, 'Item 3', 3),
	( 45 , 1234, 'Item 1', 3),
	( 46 , 1234, 'Item 2', 3),
	( 47 , 1234, 'Item 3', 3)
--	------------------------------------------------------------------
; WITH OrderRank
AS (
	SELECT *,
		RANK() OVER (PARTITION BY order_number ORDER BY unique_id) AS iRank
	FROM @Orders
)

SELECT *
FROM OrderRank
WHERE iRank <= quantity_ordered

Open in new window

Regarding p and c, my table name is tbl_orders.

How do I alias that to p and c? I am not seeing that in the query above?
Can the same item appear multiple times for the same order_number, in different locations?  Or would the same item quantities always be combined?

1 - 1234 - ItemA - 3
2 - 1234 - ItemB - 3 --I changed the item to see a difference: are ids 2 and 3 actually supposed to be different items??
3 - 1234 - ItemC - 3
4 - 1234 - ItemB - 3 --is this valid?
ASKER CERTIFIED SOLUTION
Avatar of Kyle Abrahams, PMP
Kyle Abrahams, PMP
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial