Link to home
Create AccountLog in
Avatar of Ryan Wallace
Ryan Wallace

asked on

condensing duplication

I have three columns: ID, FeatureID, and FeatureName the data looks like this:

ID      FeatureID      FeatureName
1      111                      smaller seatbelt
2      111                 smaller seatbelt
3      111                      smaller seatbelt
4      222                      larger wheels
5      333                      more bass
6      333                      more bass


I would like to have the data look like this:

FeatureID      FeatureName                    ID1      ID2      ID3
111                      smaller seatbelt               1      2      3
222                      larger wheels                       4            
333                      more bass                       5      6      

what would be the best approach to accomplish this?

thank you!
Avatar of Bill Prew
Bill Prew

If you have SQL Server 2017 or newer you could do something like:

https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=812d2b58f235e6516369e6edce75ea51

create table tab1(ID integer, FeatureID integer, FeatureName char(30));

insert into tab1 values(1, 111, 'smaller seatbelt');
insert into tab1 values(2, 111, 'smaller seatbelt');
insert into tab1 values(3, 111, 'smaller seatbelt');
insert into tab1 values(4, 222, 'larger wheels');
insert into tab1 values(5, 333, 'more bass');
insert into tab1 values(6, 333, 'more bass');

SELECT FeatureID, FeatureName, STRING_AGG (ID, ',') AS IDlist
FROM tab1
GROUP BY FeatureID, FeatureName
ORDER BY FeatureID, FeatureName;

Open in new window

User generated image
»bp
If running 2014 then you could do something like:

https://dbfiddle.uk/?rdbms=sqlserver_2014&fiddle=ec8502a94c9fab61fa04c2f3bd9cf74e

create table tab1(ID integer, FeatureID integer, FeatureName char(30));

insert into tab1 values(1, 111, 'smaller seatbelt');
insert into tab1 values(2, 111, 'smaller seatbelt');
insert into tab1 values(3, 111, 'smaller seatbelt');
insert into tab1 values(4, 222, 'larger wheels');
insert into tab1 values(5, 333, 'more bass');
insert into tab1 values(6, 333, 'more bass');

select FeatureID, FeatureName,
    STUFF((SELECT ', ' + CAST(ID AS varchar)
           FROM tab1 as t1
           WHERE t1.FeatureID = t2.FeatureID
           AND t1.FeatureName = t2.FeatureName
          FOR XML PATH('')), 1, 1, '') AS IDlist
FROM tab1 as t2
GROUP BY FeatureID, FeatureName
ORDER BY FeatureID, FeatureName;

Open in new window

User generated image

»bp
Avatar of Ryan Wallace

ASKER

is it possible to have each ID in the IDlist in a separate column?
That would make it quite a bit more complicated.  And you would have to set a limit for the number of IDs handled.  What's the value in having them in separate columns?


»bp
I need to use the ID as a  uniqueid put I want to get rid of the duplication under   FeatureName. so my thought was separate all the ID's then use the  
 ID1 column as the uniqueid
ASKER CERTIFIED SOLUTION
Avatar of Bill Prew
Bill Prew

Link to home
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
See answer
oh ya that should work.

would this work dynamically? so if the database updates with new features how would i capture that without having to rewrite anything?
Not sure what you mean?  As those columns are there with the same names the query should work.


»bp
the example data I gave is just a sample. The actual data i am using for this example is 10,000 records so I need to do this adjustment for all the records
Well, from a query point of view it should work okay.  The bigger question I guess would be around your approach to identify a unique identifier for some other purpose, and I can't really speak to that.  But you will always have an ID, and it will be unique I assume, so I don't see a big problem based on the little I know.


»bp