Avatar of Ryan Wallace
Ryan Wallace

asked on 

condensing duplication

I have three columns: ID, FeatureID, and FeatureName the data looks like this:

ID      FeatureID      FeatureName
1      111                      smaller seatbelt
2      111                 smaller seatbelt
3      111                      smaller seatbelt
4      222                      larger wheels
5      333                      more bass
6      333                      more bass


I would like to have the data look like this:

FeatureID      FeatureName                    ID1      ID2      ID3
111                      smaller seatbelt               1      2      3
222                      larger wheels                       4            
333                      more bass                       5      6      

what would be the best approach to accomplish this?

thank you!
Microsoft SQL ServerSQL

Avatar of undefined
Last Comment
Bill Prew
Avatar of Bill Prew
Bill Prew

If you have SQL Server 2017 or newer you could do something like:

https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=812d2b58f235e6516369e6edce75ea51

create table tab1(ID integer, FeatureID integer, FeatureName char(30));

insert into tab1 values(1, 111, 'smaller seatbelt');
insert into tab1 values(2, 111, 'smaller seatbelt');
insert into tab1 values(3, 111, 'smaller seatbelt');
insert into tab1 values(4, 222, 'larger wheels');
insert into tab1 values(5, 333, 'more bass');
insert into tab1 values(6, 333, 'more bass');

SELECT FeatureID, FeatureName, STRING_AGG (ID, ',') AS IDlist
FROM tab1
GROUP BY FeatureID, FeatureName
ORDER BY FeatureID, FeatureName;

Open in new window

User generated image
»bp
Avatar of Bill Prew
Bill Prew

If running 2014 then you could do something like:

https://dbfiddle.uk/?rdbms=sqlserver_2014&fiddle=ec8502a94c9fab61fa04c2f3bd9cf74e

create table tab1(ID integer, FeatureID integer, FeatureName char(30));

insert into tab1 values(1, 111, 'smaller seatbelt');
insert into tab1 values(2, 111, 'smaller seatbelt');
insert into tab1 values(3, 111, 'smaller seatbelt');
insert into tab1 values(4, 222, 'larger wheels');
insert into tab1 values(5, 333, 'more bass');
insert into tab1 values(6, 333, 'more bass');

select FeatureID, FeatureName,
    STUFF((SELECT ', ' + CAST(ID AS varchar)
           FROM tab1 as t1
           WHERE t1.FeatureID = t2.FeatureID
           AND t1.FeatureName = t2.FeatureName
          FOR XML PATH('')), 1, 1, '') AS IDlist
FROM tab1 as t2
GROUP BY FeatureID, FeatureName
ORDER BY FeatureID, FeatureName;

Open in new window

User generated image

»bp
Avatar of Ryan Wallace
Ryan Wallace

ASKER

is it possible to have each ID in the IDlist in a separate column?
Avatar of Bill Prew
Bill Prew

That would make it quite a bit more complicated.  And you would have to set a limit for the number of IDs handled.  What's the value in having them in separate columns?


»bp
Avatar of Ryan Wallace
Ryan Wallace

ASKER

I need to use the ID as a  uniqueid put I want to get rid of the duplication under   FeatureName. so my thought was separate all the ID's then use the  
 ID1 column as the uniqueid
ASKER CERTIFIED SOLUTION
Avatar of Bill Prew
Bill Prew

Blurred text
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
Avatar of Ryan Wallace
Ryan Wallace

ASKER

oh ya that should work.

would this work dynamically? so if the database updates with new features how would i capture that without having to rewrite anything?
Avatar of Bill Prew
Bill Prew

Not sure what you mean?  As those columns are there with the same names the query should work.


»bp
Avatar of Ryan Wallace
Ryan Wallace

ASKER

the example data I gave is just a sample. The actual data i am using for this example is 10,000 records so I need to do this adjustment for all the records
Avatar of Bill Prew
Bill Prew

Well, from a query point of view it should work okay.  The bigger question I guess would be around your approach to identify a unique identifier for some other purpose, and I can't really speak to that.  But you will always have an ID, and it will be unique I assume, so I don't see a big problem based on the little I know.


»bp
Microsoft SQL Server
Microsoft SQL Server

Microsoft SQL Server is a suite of relational database management system (RDBMS) products providing multi-user database access functionality.SQL Server is available in multiple versions, typically identified by release year, and versions are subdivided into editions to distinguish between product functionality. Component services include integration (SSIS), reporting (SSRS), analysis (SSAS), data quality, master data, T-SQL and performance tuning.

171K
Questions
--
Followers
--
Top Experts
Get a personalized solution from industry experts
Ask the experts
Read over 600 more reviews

TRUSTED BY

IBM logoIntel logoMicrosoft logoUbisoft logoSAP logo
Qualcomm logoCitrix Systems logoWorkday logoErnst & Young logo
High performer badgeUsers love us badge
LinkedIn logoFacebook logoX logoInstagram logoTikTok logoYouTube logo