Avatar of BAM
BAM
Flag for United States of America asked on

I need to merge several rows into one

I have a table that is like this:

Job Code Line Note
111 aaa  1    Short note
111 aaa  2    More note
111 aaa  3    and more
111 bbb  1    Note
111 bbb  2    text
111 bbb  3    more stuff
111 bbb  4    and some more
222 aaa  1    Tiny field
222 aaa  2    won't hold
222 aaa  3    all the data
222 aaa  4    so I need to
222 aaa  5    concatenate

Open in new window


I need to create a single row  for each group by concatenating the note field. The result should be:
Job Code Note
111 aaa  Short note More note and more
111 bbb  Note text more stuff and more stuff
222 aaa  Tiny field won't hold all the data so I need to concatenate

Open in new window


The solution should be obvious, but for me today it is not. Any help would be greatly appreciated.
Microsoft SQL Server

Avatar of undefined
Last Comment
BAM

8/22/2022 - Mon
Éric Moreau

You can use STUFF and FOR XML in TSQL and you will get your result. I have a sample of that at http://emoreau.com/Entries/Blogs/2009/10/SQL-Concatenate-values-multiple-from-multiple-lines.aspx
ASKER CERTIFIED SOLUTION
Anthony Perkins

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
GET A PERSONALIZED SOLUTION
Ask your own question & get feedback from real experts
Find out why thousands trust the EE community with their toughest problems.
Éric Moreau

the problem with CLR is that it very often kills the performance
Anthony Perkins

Eric,

the problem with CLR is that it very often kills the performance
I would encourage you to read the link I posted earlier.  The author has given a very balanced comparison of the different options and when you should use them.

My experience is actually quite the opposite.  T-SQL is lousy with string manipulation and CLR presents a very good alternative.  In this specific case, CLR proved to be a lot faster than using STUFF with FOR XML, which you have to admit is a tad contrived to say the least: Xml was never intended for this and to be quite frank you are relying on a cute trick.  Which brings me to my second point.  Compare these two pieces of code:

This is the code using STUFF() ... FOR XML PATH
SELECT  Job,
        Code,
        STUFF((SELECT   ',' + Note
               FROM     @YourTable i
               WHERE    i.Job = o.Job AND i.Code = o.Code
              FOR
               XML PATH('')
              ), 1, 1, '') AS Note
FROM    @YourTable o
GROUP BY
	Job,
        Code
ORDER BY
	Job,
	Code

Open in new window


Here is the code using GROUP_CONCAT():
SELECT  Job,
        Code,
        dbo.GROUP_CONCAT(Note) Note
FROM    @YourTable
GROUP BY
	Job,
        Code
ORDER BY
	Job, 
	code

Open in new window


Here is the rhetorical question I have why would you have to rethink/rewrite  all the code using STUFF() ... FOR XML PATH when you already have a canned function that does it all?

But I think we should let the author and future readers judge for themselves and pick whatever is best for them.  That in the end is the only part that counts.
This is the best money I have ever spent. I cannot not tell you how many times these folks have saved my bacon. I learn so much from the contributors.
rwheeler23
Éric Moreau

I just made a test comparing the code from CodePlex (Anthony's link) and my pure SQL solution (using STUFF).

I created a table of 1,000,000 random entries.

To my great surprise, the CLR function took 3 seconds to run compared to a big 9 seconds for the pure SQL method. I will have to revisit that tomorrow when I will get more time.
PortletPaul

I would not consider STUFF(  <subquery> FOR XML PATH  ) "pure" sql. Yes it's valid T-SQL but not "pure SQL"

It is a "contrivance" as suggested albeit a very handy one. I'm not a great fan of external code however as maintaining a valid configuration for every sql server gets more complicated.

Personally I would trade some speed (the CLR) for ease of maintenance (stuff et al).
Just my 2 cents worth.
BAM

ASKER
I posted this just before leaving work on Friday, then came in Monday morning to a dead C: drive. I'll have a chance to read through this later today and try out the suggestions.
⚡ FREE TRIAL OFFER
Try out a week of full access for free.
Find out why thousands trust the EE community with their toughest problems.
Anthony Perkins

I would not consider STUFF(  <subquery> FOR XML PATH  ) "pure" sql. Yes it's valid T-SQL but not "pure SQL"
I disagree, it is quite "pure" SQL just simply not the right tool.

It is a "contrivance" as suggested albeit a very handy one.
And no different from using  something like this to retrieve the third sentence:
SELECT PARSENAME('This is the first sentence.This is the second sentence.This is the third sentence.This is the fourth sentence', 2)
And if you look on this site you will find some solutions that I posted and were accepted that used precisely this approach.  Were they the right solution, not really,  Just a cute trick.

Personally I would trade some speed (the CLR) for ease of maintenance (stuff et al).
But that is just it:  Using CLR you get speed and code that does not have to be re-invented every time and easier to maintain.  That to me means that it is the right tool for the job.
PortletPaul

SQL as opposed to T-SQL it isn't "pure"  in that sense.

And by configuration/maintenance I mean making certain that the CLR code is installed and working (and is installed and working whenever you rebuild, upgrade hardware, move instances etc etc); I suggest that this aspect is not overlooked.
Anthony Perkins

I agree.  But that is just one more of dozens of check boxes that need to be checked.
Experts Exchange is like having an extremely knowledgeable team sitting and waiting for your call. Couldn't do my job half as well as I do without it!
James Murphy
BAM

ASKER
I went with this solution, which worked perfectly, for two reasons. It was very easy to implement and it was allegedly the faster way. My table had 9.8 million records so speed was a consideration in this case. Thanks to everyone who contributed. It was an interesting discussion as well as solving my problem.