asked on

SQL Select - Help finding duplicate records

I need some help with SQL Select to identify records that contain the same "New Feature" descriptions just in a different order. I have a table containing a list of products with each product containing 3 feature descriptions. I also have a group ID column that groups related sets of products together. Within each product group, I need to identify the group of products that contain the same set of new features. For example;

GroupID Product ID NewFeature01 NewFeature02 NewFeature03
1 21 Apple Peach
1 22 Peach Apple
1 23 Peach Apple
2 24 Cherry
2 25 Orange
3 26 Nuts
3 27 Nuts
3 28
2 24 Grapes
2 25 Candy
Using the above example, only product groups that have the same set of new features. Therefore, the expect result from this SQL Select query would:
GroupID Product ID NewFeature01 NewFeature02 NewFeature03
1 21 Apple Peach
1 22 Peach Apple
1 23 Peach Apple
3 26 Nuts
3 27 Nuts
3 28

ASKER CERTIFIED SOLUTION

Ryan Chong

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Naitik Gamit

declare @Table table
(
	GroupID int,
	Product_ID int,
	NewFeature01 varchar(50),
	NewFeature02 varchar(50),
	NewFeature03 varchar(50)
);
insert into @Table
values
(1, 21, 'Apple', 'Peach', null),
(1, 22, 'Peach', 'Apple', null),
(1, 23, null, 'Peach', 'Apple'),
(2, 24, 'Cherry', null, null),
(2, 25, 'Orange', null, null),
(3, 26, null, 'Nuts', null),
(3, 27, 'Nuts', null, null),
(3, 28, null, null, null),
(2, 24, 'Grapes', null, null),
(2, 25, null, 'Candy', null)

SELECT [@Table].GroupID,[@Table].Product_ID,TBL_DATA.NewFeature01,TBL_DATA.NewFeature02,TBL_DATA.NewFeature03 FROM @Table
INNER JOIN @Table AS TBL_DATA ON TBL_DATA.GroupID = [@Table].GroupID AND TBL_DATA.Product_ID = [@Table].Product_ID
GROUP BY [@Table].GroupID,[@Table].Product_ID,TBL_DATA.NewFeature01,TBL_DATA.NewFeature02,TBL_DATA.NewFeature03
HAVING COUNT([@Table].GroupID)<=1 AND COUNT([@Table].Product_ID)<=1

Open in new window

Ryan Chong

any further assistance needed?

Sharath S

declare @yourTable table
(
	GroupID int,
	[ProductID] int,
	NewFeature01 varchar(50),
	NewFeature02 varchar(50),
	NewFeature03 varchar(50)
);
insert into @yourTable 
values
(1, 21, 'Apple', 'Peach', null),
(1, 22, 'Peach', 'Apple', null),
(1, 23, null, 'Peach', 'Apple'),
(2, 24, 'Cherry', null, null),
(2, 25, 'Orange', null, null),
(3, 26, null, 'Nuts', null),
(3, 27, 'Nuts', null, null),
(3, 28, null, null, null),
(2, 24, 'Grapes', null, null),
(2, 25, null, 'Candy', null),

(4, 21, null, null, 'Grapes'),
(4, 22, null, null, 'Nuts'),
(5, 23, 'New Fruit', null, 'Candy'),
(5, 24, null, 'Candy', 'New Fruit');

;with cte as (
select GroupID, ProductID,
       case rn when 1 then NewFeature01 when 2 then NewFeature02 when 3 then NewFeature03 end NewFeature
  from @yourtable t1
 cross join (select 1 rn union all select 2 union all select 3) t2),
 cte2 as (
 select DISTINCT GroupID, ProductID,
        STUFF((SELECT DISTINCT ',' + NewFeature 
                 FROM  cte AS t2 
                WHERE t1.GroupID = t2.GroupID and t1.ProductID = t2.ProductID ORDER BY 1
                  FOR XML PATH('')), 1, 1, '') NewFeature
  from cte t1)
  select *
    from @yourTable t1 where exists (
  select null
   from cte2 t2 where t1.GroupID = t2.GroupID
   group by GroupID
   having count(distinct NewFeature) = 1)
/*
  	GroupID	Product ID	NewFeature01	NewFeature02	NewFeature03
1	1	21	Apple	Peach	NULL
2	1	22	Peach	Apple	NULL
3	1	23	NULL	Peach	Apple
4	3	26	NULL	Nuts	NULL
5	3	27	Nuts	NULL	NULL
6	3	28	NULL	NULL	NULL
7	5	23	New Fruit	NULL	Candy
8	5	24	NULL	Candy	New Fruit

*/

Open in new window

Sharath S

@Naitik, your query fails for sample data by Ryan. Please check.