asked on

I need to merge several rows into one

I have a table that is like this:

Job Code Line Note
111 aaa  1    Short note
111 aaa  2    More note
111 aaa  3    and more
111 bbb  1    Note
111 bbb  2    text
111 bbb  3    more stuff
111 bbb  4    and some more
222 aaa  1    Tiny field
222 aaa  2    won't hold
222 aaa  3    all the data
222 aaa  4    so I need to
222 aaa  5    concatenate

Open in new window

I need to create a single row for each group by concatenating the note field. The result should be:

Job Code Note
111 aaa  Short note More note and more
111 bbb  Note text more stuff and more stuff
222 aaa  Tiny field won't hold all the data so I need to concatenate

Open in new window

The solution should be obvious, but for me today it is not. Any help would be greatly appreciated.

Éric Moreau

You can use STUFF and FOR XML in TSQL and you will get your result. I have a sample of that at http://emoreau.com/Entries/Blogs/2009/10/SQL-Concatenate-values-multiple-from-multiple-lines.aspx

ASKER CERTIFIED SOLUTION

Anthony Perkins

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Éric Moreau

the problem with CLR is that it very often kills the performance

Anthony Perkins

Eric,

the problem with CLR is that it very often kills the performance
I would encourage you to read the link I posted earlier. The author has given a very balanced comparison of the different options and when you should use them.

My experience is actually quite the opposite. T-SQL is lousy with string manipulation and CLR presents a very good alternative. In this specific case, CLR proved to be a lot faster than using STUFF with FOR XML, which you have to admit is a tad contrived to say the least: Xml was never intended for this and to be quite frank you are relying on a cute trick. Which brings me to my second point. Compare these two pieces of code:

This is the code using STUFF() ... FOR XML PATH

SELECT  Job,
        Code,
        STUFF((SELECT   ',' + Note
               FROM     @YourTable i
               WHERE    i.Job = o.Job AND i.Code = o.Code
              FOR
               XML PATH('')
              ), 1, 1, '') AS Note
FROM    @YourTable o
GROUP BY
	Job,
        Code
ORDER BY
	Job,
	Code

Open in new window

Here is the code using GROUP_CONCAT():

SELECT  Job,
        Code,
        dbo.GROUP_CONCAT(Note) Note
FROM    @YourTable
GROUP BY
	Job,
        Code
ORDER BY
	Job, 
	code

Open in new window

Here is the rhetorical question I have why would you have to rethink/rewrite all the code using STUFF() ... FOR XML PATH when you already have a canned function that does it all?

But I think we should let the author and future readers judge for themselves and pick whatever is best for them. That in the end is the only part that counts.

Éric Moreau

I just made a test comparing the code from CodePlex (Anthony's link) and my pure SQL solution (using STUFF).

I created a table of 1,000,000 random entries.

To my great surprise, the CLR function took 3 seconds to run compared to a big 9 seconds for the pure SQL method. I will have to revisit that tomorrow when I will get more time.

PortletPaul

I would not consider STUFF( <subquery> FOR XML PATH ) "pure" sql. Yes it's valid T-SQL but not "pure SQL"

It is a "contrivance" as suggested albeit a very handy one. I'm not a great fan of external code however as maintaining a valid configuration for every sql server gets more complicated.

Personally I would trade some speed (the CLR) for ease of maintenance (stuff et al).
Just my 2 cents worth.

BAM

ASKER

I posted this just before leaving work on Friday, then came in Monday morning to a dead C: drive. I'll have a chance to read through this later today and try out the suggestions.

Anthony Perkins

I would not consider STUFF( <subquery> FOR XML PATH ) "pure" sql. Yes it's valid T-SQL but not "pure SQL"
I disagree, it is quite "pure" SQL just simply not the right tool.

It is a "contrivance" as suggested albeit a very handy one.
And no different from using something like this to retrieve the third sentence:
SELECT PARSENAME('This is the first sentence.This is the second sentence.This is the third sentence.This is the fourth sentence', 2)
And if you look on this site you will find some solutions that I posted and were accepted that used precisely this approach. Were they the right solution, not really, Just a cute trick.

Personally I would trade some speed (the CLR) for ease of maintenance (stuff et al).
But that is just it: Using CLR you get speed and code that does not have to be re-invented every time and easier to maintain. That to me means that it is the right tool for the job.

PortletPaul

SQL as opposed to T-SQL it isn't "pure" in that sense.

And by configuration/maintenance I mean making certain that the CLR code is installed and working (and is installed and working whenever you rebuild, upgrade hardware, move instances etc etc); I suggest that this aspect is not overlooked.

Anthony Perkins

I agree. But that is just one more of dozens of check boxes that need to be checked.

BAM

ASKER

I went with this solution, which worked perfectly, for two reasons. It was very easy to implement and it was allegedly the faster way. My table had 9.8 million records so speed was a consideration in this case. Thanks to everyone who contributed. It was an interesting discussion as well as solving my problem.