friend_s
asked on
Sql Server table column data concat
Hi, can i please get help with the below. I have a sql server table with close to 200,000 records. The data is similar to the top pattern in the attached file. I need to change it into the bottom pattern(into a new table) in the attached file. Appreciate your response.
ModifyTableData.doc
ModifyTableData.doc
How will you know which values belong to a specific KeyNumber, when it's NULL for all the values that should be grouped by?
Friend_s: Firstly, you need to replace the nulls in KeyNumber with the correct KeyNumber values (I'm assuming that records with IDs 2, 3, & 4 in your example have the same KeyNumber as record 1). (I have no SQL Server here, so please forgive typos!)
(work intervened here - I'll try to do the rest later!)
update ot
set ot.KeyNumber = ( select KeyNumber --get keynumber val
from OriginalTable --for that lower ID
where ID = ( select max(ID) --max. lower ID than current one
from OriginalTable --that has a KeyNumber
where ID < ot.ID
and KeyNumber is not null )
)
where ot.KeyNumber is null
from OriginalTable ot
(work intervened here - I'll try to do the rest later!)
ASKER
DcpKing: This data format is from a public website over which we have no control. We are told to use their data as is. So we also assumed that records with IDs 2, 3, & 4 in my example have the same KeyNumber as record 1. That is the reason i showed the bottom expected results table in that manner. I will look forward to your reply.
Hi Friend_s,
sorry it took so long.
After you've run the first part, on OriginalTable, run this. You'll need to do a few things first -
1. the code below will run perfectly and contains the data within it
2. If you want to run it twice, un-comment the two drop table lines at the top!
3. Replace the references to #ta with your OriginalTable name, and #tb with your destination table, and take out lines 1-21 and 23 to rid yourself of the demo part.
Have fun!!
hth
Mike
sorry it took so long.
After you've run the first part, on OriginalTable, run this. You'll need to do a few things first -
1. the code below will run perfectly and contains the data within it
2. If you want to run it twice, un-comment the two drop table lines at the top!
3. Replace the references to #ta with your OriginalTable name, and #tb with your destination table, and take out lines 1-21 and 23 to rid yourself of the demo part.
Have fun!!
hth
Mike
-- -----------------------------Load demo source table
--drop table #ta
--drop table #tb
create table #ta(ID int, KeyNumber varchar(16), ColumnA varchar(16), ColumnB varchar(16), ColumnC varchar(16))
insert into #ta values
(1, '1111111111', 'ValueA1', 'ValueB1', 'ValueC1'),
(2, '1111111111', 'ValueA2', 'ValueB2', 'ValueC2'),
(3, '1111111111', 'ValueA2', 'ValueB2', 'ValueC3'),
(4, '1111111111', 'ValueA3', 'ValueB3', 'ValueC4'),
(5, '2222222222', 'ValA1', 'ValB1', 'ValC1'),
(6, '2222222222', 'ValA2', 'ValB2', 'ValC2'),
(7, '2222222222', 'ValA3', 'ValB3', 'ValC3'),
(8, '2222222222', 'ValA4', 'ValB4', 'ValC4'),
(9, '2222222222', 'ValA4', 'ValB5', 'ValC4')
select * from #ta
-- -----------------------------
create table #tb( ID int identity(1, 1),
KeyNumber varchar(16),
ColumnA varchar(64),
ColumnB varchar(64),
ColumnC varchar(64))
insert into #tb select distinct KeyNumber, '', '', '' from #ta
select * from #tb
declare @strKeyNumber varchar(16)
declare @strCmd varchar(3000) = ''
while 'True' = 'True'
begin
select @strKeyNumber = min(KeyNumber) from #tb where ColumnA = ''
select @strKeyNumber as '@strKeyNumber'
if @strKeyNumber is null break
-- ---------------------
set @strCmd = ' '
set @strCmd = @strCmd + 'declare @taCol table(id int identity(1, 1), Col varchar(16)); '
set @strCmd = @strCmd + 'declare @strTemp varchar(144) = '''', '
set @strCmd = @strCmd + ' @intLine int, '
set @strCmd = @strCmd + ' @intNrLines int; '
set @strCmd = @strCmd + 'delete from @taCol; '
set @strCmd = @strCmd + 'insert into @taCol '
set @strCmd = @strCmd + ' select distinct ColumnA from #ta where KeyNumber = ''' + @strKeyNumber + '''; '
set @strCmd = @strCmd + 'set @intLine = 1; '
set @strCmd = @strCmd + 'set @intNrLines = (select count(*) from @taCol); '
set @strCmd = @strCmd + 'while ''True'' = ''True'' '
set @strCmd = @strCmd + 'begin '
set @strCmd = @strCmd + ' set @strTemp = @strTemp + (select Col from @taCol where id = @intLine); '
set @strCmd = @strCmd + ' if @intLine = @intNrLines break; '
set @strCmd = @strCmd + ' set @strTemp = @strTemp + '', ''; '
set @strCmd = @strCmd + ' set @intLine = @intLine + 1; '
set @strCmd = @strCmd + 'end; '
set @strCmd = @strCmd + 'update #tb '
set @strCmd = @strCmd + ' set ColumnA = @strTemp '
set @strCmd = @strCmd + ' where KeyNumber = ''' + @strKeyNumber + '''; '
exec(@strCmd)
end
select * from #tb
ASKER
Mike, thank you very much. Your code gave the result for one column. Can you please tell about the other columns? In addition, the real data has 3 more columns like KeyNumber but they are not null. I am attaching the file here. If possible, can you please take a look?
ModifyTableData.doc
ModifyTableData.doc
ok, although you might not want to see this :)
What I did was wrap the dynamic SQL inside another layer, running the whole thing once through for each column, just like I was doing for each KeyNumber.
It looks horrendous, but you'll find that all you do is double every " ' " to a " '' " and wrap, so that
"declare @str varchar(2) = '' "
becomes
"@strCmd = @strCmd + 'declare @str varchar(2) = '''' ' "
One thing: I used a shortcut assigning data into @taColNames - if you're not using SQL Server 2012 you'll need to expand it out (sorry!).
Have fun - hope it helps
Mike
What I did was wrap the dynamic SQL inside another layer, running the whole thing once through for each column, just like I was doing for each KeyNumber.
It looks horrendous, but you'll find that all you do is double every " ' " to a " '' " and wrap, so that
"declare @str varchar(2) = '' "
becomes
"@strCmd = @strCmd + 'declare @str varchar(2) = '''' ' "
One thing: I used a shortcut assigning data into @taColNames - if you're not using SQL Server 2012 you'll need to expand it out (sorry!).
-- -----------------------------Load demo source table
drop table #ta
drop table #tb
create table #ta(ID int, KeyNumber varchar(16), Action varchar(3), Type varchar(8), Code varchar(8), ColumnA varchar(16), ColumnB varchar(16), ColumnC varchar(16))
insert into #ta values
(1, '1111111111', 'AC1', 'Type1', 'Code1', 'ValueA1', 'ValueB1', 'ValueC1'),
(2, '1111111111', 'AC1', 'Type1', 'Code1', 'ValueA2', 'ValueB2', 'ValueC2'),
(3, '1111111111', 'AC1', 'Type1', 'Code1', 'ValueA2', 'ValueB2', 'ValueC3'),
(4, '1111111111', 'AC1', 'Type1', 'Code1', 'ValueA3', 'ValueB3', 'ValueC4'),
(5, '2222222222', 'AC2', 'Type2', 'Code2', 'ValA1', 'ValB1', 'ValC1'),
(6, '2222222222', 'AC2', 'Type2', 'Code2', 'ValA2', 'ValB2', 'ValC2'),
(7, '2222222222', 'AC2', 'Type2', 'Code2', 'ValA3', 'ValB3', 'ValC3'),
(8, '2222222222', 'AC2', 'Type2', 'Code2', 'ValA4', 'ValB4', 'ValC4'),
(9, '2222222222', 'AC2', 'Type2', 'Code2', 'ValA4', 'ValB5', 'ValC4')
select * from #ta
-- -----------------------------
create table #tb( ID int identity(1, 1),
KeyNumber varchar(16),
[Action] varchar(3),
[Type] varchar(8),
Code varchar(8),
ColumnA varchar(64),
ColumnB varchar(64),
ColumnC varchar(64))
insert into #tb select distinct KeyNumber, [Action], [Type], Code, '', '', '' from #ta
select * from #tb
-- -----------------------------
declare @taColNames table (id int identity(1, 1), ColName varchar(16), Done varchar(3))
insert into @taColNames values ('ColumnA', 'No'), ('ColumnB', 'No'), ('ColumnC', 'No')
declare @strOuterCmd varchar(3000),
@strThisColumnName varchar(16),
@intID int
While 'True' = 'True'
begin
set @intID = (select top 1 id from @taColNames where Done = 'No' order by id)
set @strThisColumnName = (select ColName from @taColNames where id = @intID)
if @strThisColumnName is null break --this happens when there aren't any matching records
-- ---------------------------------------------
set @strOuterCmd = ''
set @strOuterCmd = @strOuterCmd + ' declare @strKeyNumber varchar(16) '
set @strOuterCmd = @strOuterCmd + ' declare @strCmd varchar(2000) = '''' '
-- -------------------
set @strOuterCmd = @strOuterCmd + ' while ''True'' = ''True'' '
set @strOuterCmd = @strOuterCmd + ' begin '
set @strOuterCmd = @strOuterCmd + ' select @strKeyNumber = min(KeyNumber) from #tb where ' + @strThisColumnName + ' = '''' '
set @strOuterCmd = @strOuterCmd + ' select @strKeyNumber as ''@strKeyNumber'' '
set @strOuterCmd = @strOuterCmd + ' if @strKeyNumber is null break '
-- ---------------------
set @strOuterCmd = @strOuterCmd + ' set @strCmd = '' '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + ''declare @taCol table(id int identity(1, 1), Col varchar(16)); '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + ''declare @strTemp varchar(144) = '''''''', '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + '' @intLine int, '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + '' @intNrLines int; '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + ''delete from @taCol; '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + ''insert into @taCol '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + '' select distinct ' + @strThisColumnName + ' from #ta where KeyNumber = '''''' + @strKeyNumber + ''''''; '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + ''set @intLine = 1; '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + ''set @intNrLines = (select count(*) from @taCol); '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + ''while ''''True'''' = ''''True'''' '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + ''begin '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + '' set @strTemp = @strTemp + (select Col from @taCol where id = @intLine); '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + '' if @intLine = @intNrLines break; '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + '' set @strTemp = @strTemp + '''', ''''; '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + '' set @intLine = @intLine + 1; '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + ''end; '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + ''update #tb '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + '' set ' + @strThisColumnName + ' = @strTemp '' '
set @strOuterCmd = @strOuterCmd + ' set @strCmd = @strCmd + '' where KeyNumber = '''''' + @strKeyNumber + ''''''; '' '
set @strOuterCmd = @strOuterCmd + ' select @strCmd as ''InnerCommand'' '
set @strOuterCmd = @strOuterCmd + ' exec(@strCmd) '
set @strOuterCmd = @strOuterCmd + ' end '
set @strOuterCmd = @strOuterCmd + ' select * from #tb '
-- ---------------------------------------------
update @taColNames set Done = 'Yes' where id = @intID
select @strOuterCmd as 'OuterCommand'
exec(@strOuterCmd)
end
select * from #tb
Have fun - hope it helps
Mike
ASKER
Mike, thanks a lot. But i noticed it does the concat of those last columns only for KeyNumber. But we need to do the concat of those last columns for this combination : KeyNumber, [Action], [Type], Code. Which means we need something like : where KeyNumber=@strKeyNumber and Action=@strAction and Type=@strType and Code=@strCode. ------------these four are the distinct combination. Can you please help with this?
ASKER
Mike, i tried your code using just the keynumber on the real table of 200,000 plus rows, and it was too slow. I myself wrote different code for this using the sql STUFF function and that was too slow. So i need something that is faster.
To speed thing up some try this:
Change the structure of both #ta and #tb to make KeyNumber a primary key:
In each case change "KeyNumber varchar(16)" to "KeyNumber varchar(16) primary key"
The various queries and inserts are being done depending on this field, so keying it should help.
hth
Mike
Change the structure of both #ta and #tb to make KeyNumber a primary key:
In each case change "KeyNumber varchar(16)" to "KeyNumber varchar(16) primary key"
The various queries and inserts are being done depending on this field, so keying it should help.
hth
Mike
ASKER
Mike, thank you. Can you please see my previous post where i said we need something like : where KeyNumber=@strKeyNumber and Action=@strAction and Type=@strType and Code=@strCode. ------------these four are the distinct combination.
I've been looking at it - it won't be too difficult (I think!). We're on a long holiday weekend here in the USA, so I'm a bit occupied by family! I should get to it in the next day, tho'!
ASKER
Thank you Mike. When you get to it, will it be possible to make your demo table have 200,000 rows (may be insert data into it in a loop) and see if it takes less time with your solution?
Well I could, but you should know that I'm working on a VM on a laptop :)
Friend_s: Make the two tables permanent for testing (just take off the "#"). Then add a non-clustered non-unique index to both tables, on field KeyNumber in each case. That took me from 280 records per minute into tb to 5800 ! I'll admit that it isn't the fastest of processes, as it may take 3 hours to do what you want for all three fields. As I said earlier, though, those numbers are on a laptop in a Windows VM with other things going on too, so you should get better numbers on a server.
ASKER
Mike, thank you for your time. I posted this question on another website and got results from that solution in 30 seconds. So far, this is the fastest among the different approaches i tried. So going ahead with that. Once again, appreciate your time and effort.
Well, I think I demonstrated that T-SQL is a bit of a dog for concentrated string manipulation! A million rows of input took 9 hours! How did you end up doing it?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Interesting. Thanks. Mike
ASKER
The database table was huge, and the code i got from here, as well as the different ways i did, were taking too long on this huge table. Finally this link gave me the faster results.