Link to home
Start Free TrialLog in
Avatar of tmajor99
tmajor99

asked on

SQL Select - Help finding duplicate records

I need some help with SQL Select to identify records that contain the same "New Feature" descriptions just in a different order.  I have a table containing a list of products with each product containing 3 feature descriptions.  I also have a group ID column that groups related sets of products together.  Within each product group, I need to identify the group of products that contain the same set of new features.    For example;

GroupID  Product ID NewFeature01 NewFeature02 NewFeature03
  1                  21                   Apple            Peach
  1                  22                   Peach            Apple
  1                  23                                          Peach               Apple
  2                  24                    Cherry
  2                  25                    Orange  
  3                  26                                          Nuts
  3                  27                     Nuts
  3                  28                    
  2                  24                    Grapes
  2                  25                                          Candy
Using the above example, only product groups that have the same set of new features.   Therefore, the expect result from this SQL Select query would:
GroupID  Product ID NewFeature01 NewFeature02 NewFeature03
  1                  21                   Apple            Peach
  1                  22                   Peach            Apple
  1                  23                                          Peach               Apple
  3                  26                                          Nuts
  3                  27                     Nuts
  3                  28
ASKER CERTIFIED SOLUTION
Avatar of Ryan Chong
Ryan Chong
Flag of Singapore image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
declare @Table table
(
	GroupID int,
	Product_ID int,
	NewFeature01 varchar(50),
	NewFeature02 varchar(50),
	NewFeature03 varchar(50)
);
insert into @Table
values
(1, 21, 'Apple', 'Peach', null),
(1, 22, 'Peach', 'Apple', null),
(1, 23, null, 'Peach', 'Apple'),
(2, 24, 'Cherry', null, null),
(2, 25, 'Orange', null, null),
(3, 26, null, 'Nuts', null),
(3, 27, 'Nuts', null, null),
(3, 28, null, null, null),
(2, 24, 'Grapes', null, null),
(2, 25, null, 'Candy', null)

SELECT [@Table].GroupID,[@Table].Product_ID,TBL_DATA.NewFeature01,TBL_DATA.NewFeature02,TBL_DATA.NewFeature03 FROM @Table
INNER JOIN @Table AS TBL_DATA ON TBL_DATA.GroupID = [@Table].GroupID AND TBL_DATA.Product_ID = [@Table].Product_ID
GROUP BY [@Table].GroupID,[@Table].Product_ID,TBL_DATA.NewFeature01,TBL_DATA.NewFeature02,TBL_DATA.NewFeature03
HAVING COUNT([@Table].GroupID)<=1 AND COUNT([@Table].Product_ID)<=1

Open in new window

any further assistance needed?
declare @yourTable table
(
	GroupID int,
	[ProductID] int,
	NewFeature01 varchar(50),
	NewFeature02 varchar(50),
	NewFeature03 varchar(50)
);
insert into @yourTable 
values
(1, 21, 'Apple', 'Peach', null),
(1, 22, 'Peach', 'Apple', null),
(1, 23, null, 'Peach', 'Apple'),
(2, 24, 'Cherry', null, null),
(2, 25, 'Orange', null, null),
(3, 26, null, 'Nuts', null),
(3, 27, 'Nuts', null, null),
(3, 28, null, null, null),
(2, 24, 'Grapes', null, null),
(2, 25, null, 'Candy', null),

(4, 21, null, null, 'Grapes'),
(4, 22, null, null, 'Nuts'),
(5, 23, 'New Fruit', null, 'Candy'),
(5, 24, null, 'Candy', 'New Fruit');

;with cte as (
select GroupID, ProductID,
       case rn when 1 then NewFeature01 when 2 then NewFeature02 when 3 then NewFeature03 end NewFeature
  from @yourtable t1
 cross join (select 1 rn union all select 2 union all select 3) t2),
 cte2 as (
 select DISTINCT GroupID, ProductID,
        STUFF((SELECT DISTINCT ',' + NewFeature 
                 FROM  cte AS t2 
                WHERE t1.GroupID = t2.GroupID and t1.ProductID = t2.ProductID ORDER BY 1
                  FOR XML PATH('')), 1, 1, '') NewFeature
  from cte t1)
  select *
    from @yourTable t1 where exists (
  select null
   from cte2 t2 where t1.GroupID = t2.GroupID
   group by GroupID
   having count(distinct NewFeature) = 1)
/*
  	GroupID	Product ID	NewFeature01	NewFeature02	NewFeature03
1	1	21	Apple	Peach	NULL
2	1	22	Peach	Apple	NULL
3	1	23	NULL	Peach	Apple
4	3	26	NULL	Nuts	NULL
5	3	27	Nuts	NULL	NULL
6	3	28	NULL	NULL	NULL
7	5	23	New Fruit	NULL	Candy
8	5	24	NULL	Candy	New Fruit

*/

Open in new window

@Naitik, your query fails for sample data by Ryan. Please check.