Solved

Merge Duplicate Rows with SQL "Merge"

Posted on 2011-02-25
15
535 Views
Last Modified: 2012-05-11
Hi Experts,

I have a SQL table with duplicate rows, that I would like to merge to a single row for each record.  below is what the existing table looks like.  I would like to process the whole table so the final results will just be unique.  Please use SQL to accomplish this task.
Table A as it exists now
ID             Unit    Date           TimeSent                          TimeArrived  
1234          12     1/1/2010    1/1/2010 20:10               1/1/2010 22:00
1235          8       1/2/2010    1/2/2010 20:10               1/2/2010 22:00
1234          35     1/1/2010    1/1/2010 20:10               1/1/2010 22:30
1235          10     1/1/2010    1/2/2010 20:10               1/2/2010 22:11

so the final table would look like this:
ID             Unit1   Unit2   Unit3    Date           TimeSent               TimeArrived1          Time Arrived2  
1234          12         35    NULL  1/1/2010    1/1/2010 20:10       1/1/2010 22:00       1/1/2010 22:30
1235          8           10    NULL  1/2/2010    1/2/2010 20:10       1/2/2010 22:00        1/2/2010 22:11

there are times when there are up to 6 units with 6 arrival times
thanks,

0
Comment
Question by:Aaron Goodwin
  • 8
  • 4
  • 2
  • +1
15 Comments
 
LVL 26

Expert Comment

by:tigin44
ID: 34983586
try this
SELECT ID, MAX([1]),MAX([2]),MAX([3]),MAX([4]),MAX([5]),MAX([6]), Date, TimeSent, MIN(TimeArrived) AS T1, MAX(TimeArrived) AS T2
FROM
(  
	SELECT ID, [1],[2],[3],[4],[5],[6], Date, TimeSent, TimeArrived   
	FROM  
	(   
		SELECT ID, Unit, Date, TimeSent, TimeArrived, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY TimeArrived) AS rowOrder  
		FROM pivotTable
	) AS P
	PIVOT 
	(
		MAX(Unit) FOR rowOrder IN ([1],[2],[3],[4],[5],[6])	
	) AS PVT
) A
GROUP BY ID, Date, TimeSent

Open in new window

0
 

Author Comment

by:Aaron Goodwin
ID: 34996627
trying now
0
 

Author Comment

by:Aaron Goodwin
ID: 34996736
its very close to what I need, thank you very much.  The only problem is when there are more than 2 units.  The time arrived does a min and max which is ok for up to 2 units.  But when I get 3 or more, those time values wont work.  Is there a way to modify your code so the correct arrived time will match the unit?
0
 
LVL 3

Expert Comment

by:KetGuru
ID: 34996826

Full respect to tigin44 suggestion and I agree with tigin44 but i had doubt what if entire row is duplicate for specified column values. so i put extra distinct statement to avoid any such chances.

SELECT ID, MAX([1]),MAX([2]),MAX([3]),MAX([4]),MAX([5]),MAX([6]), Date, TimeSent, MIN(TimeArrived) AS T1, MAX(TimeArrived) AS T2
FROM
(  
      SELECT ID, [1],[2],[3],[4],[5],[6], Date, TimeSent, TimeArrived  
      FROM  
      (  
      SELECT  ID, Unit, Date, TimeSent, TimeArrived , ROW_NUMBER() OVER (PARTITION BY ID  ORDER BY TimeArrived) AS rowOrder  
            FROM
            (
            SELECT DISTINCT ID, UNIT, DATE,TIMESENT, TIMEARRIVED FROM
                  dbo.tblMergeRows
            )AS T
      ) AS P
      PIVOT
      (
            MAX(Unit) FOR rowOrder IN ([1],[2],[3],[4],[5],[6])      
      ) AS PVT
) A
GROUP BY ID, Date, TimeSent
0
 

Author Comment

by:Aaron Goodwin
ID: 34997402
Thanks KetGuru,

i am still having trouble with multiple units and their Time arrived.  In the example I gave, I mentioned that there could be up to 6 units with 6 timeArrived values.  How can I deal with those?

SELECT ID, MAX([1]),MAX([2]),MAX([3]),MAX([4]),MAX([5]),MAX([6]), Date, TimeSent, MIN(TimeArrived) AS T1, MAX(TimeArrived) AS T2

the Min and Max time arrived works fine for up to 2 units.....after that its not working.

Thanks again.
0
 
LVL 31

Expert Comment

by:awking00
ID: 34997528
What would your final table look like if the data were like this?
ID             Unit    Date           TimeSent                          TimeArrived  
1234          12     1/1/2010    1/1/2010 20:10               1/1/2010 22:00
1235          8       1/2/2010    1/2/2010 20:10               1/2/2010 22:00
1234          35     1/1/2010    1/1/2010 20:10               1/1/2010 22:30
1235          10     1/1/2010    1/2/2010 20:10               1/2/2010 22:11
1234          12     1/1/2010    1/1/2010 20:10               1/1/2010 22:45
1236          16     1/1/2010    1/1/2010 20:10               1/1/2010 22:15
0
 
LVL 31

Expert Comment

by:awking00
ID: 34997544
I actually meant to change the unit value for the third id of 1234 to something like 20.
0
Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

 

Author Comment

by:Aaron Goodwin
ID: 34998021
Thanks Awking00,

It would look something like this, but for the case ID 1235 the ID would be from the same date.  It wouldn't have the same caseid on 2 different dates.  So essentially, the table would look like the attachment

table.doc
0
 
LVL 31

Expert Comment

by:awking00
ID: 34998448
See attached.
query.txt
0
 
LVL 3

Accepted Solution

by:
KetGuru earned 300 total points
ID: 34998535
here is my query for you.


SELECT aa.ID, max(aa.u1), max(aa.u2), max(aa.u3), max(aa.u4), max(aa.u5), max(aa.u6) , aa.Date, aa.TimeSent, max(bb.d1), max(bb.d2), max(bb.d3), max(bb.d4), max(bb.d5), max(bb.d6)

from
(
SELECT ID, MAX([1]) as 'u1',MAX([2]) as 'u2',MAX([3]) as 'u3',MAX([4]) as 'u4',MAX([5]) as 'u5',MAX([6]) as 'u6', Date, TimeSent --, MIN(TimeArrived) AS T1, MAX(TimeArrived) AS T2
FROM
(  
      SELECT ID, [1],[2],[3],[4],[5],[6], Date, TimeSent, TimeArrived  
      FROM  
      (  
      SELECT  ID, Unit, Date, TimeSent, TimeArrived , ROW_NUMBER() OVER (PARTITION BY ID  ORDER BY TimeArrived) AS rowOrder  
            FROM
            (
            SELECT DISTINCT ID, UNIT, DATE,TIMESENT, TIMEARRIVED FROM
                  dbo.tblMergeRows
                        --Group by ID
            )AS T
      ) AS P
      PIVOT
      (
            MAX(Unit) FOR rowOrder IN ([1],[2],[3],[4],[5],[6])      
      ) AS PVT
) A
GROUP BY ID, Date, TimeSent, timearrived
)as aa ,

(
SELECT ID, MAX([1])  as 'd1',MAX([2]) as 'd2',MAX([3]) as 'd3',MAX([4]) as 'd4',MAX([5]) as 'd5',MAX([6]) as 'd6', Date, TimeSent --, MIN(TimeArrived) AS T1, MAX(TimeArrived) AS T2
FROM
(  
      SELECT ID, [1],[2],[3],[4],[5],[6], Date, TimeSent
      FROM  
      (  
      SELECT  ID, Unit, Date, TimeSent, TimeArrived , ROW_NUMBER() OVER (PARTITION BY ID  ORDER BY TimeArrived) AS rowOrder  
            FROM
            (
            SELECT DISTINCT ID, UNIT, DATE,TIMESENT, TIMEARRIVED FROM
                  dbo.tblMergeRows
                        --Group by ID
            )AS T
      ) AS P
      PIVOT
      (
            MAX(TimeArrived) FOR rowOrder IN ([1],[2],[3],[4],[5],[6])      
      ) AS PVT
) A
GROUP BY ID, Date, TimeSent
) as bb

where aa.ID = bb.ID and aa.Date= bb.Date and aa.TimeSent= bb.TimeSent
Group by aa.ID, aa.Date, aa.TimeSent
0
 

Author Comment

by:Aaron Goodwin
ID: 35006868
checking out the codes from KetGuru and Awking00....thanks for the replies
0
 

Author Comment

by:Aaron Goodwin
ID: 35007106
Hi Awking00,

I must be doing something wrong, when I tried to run the code you gave me...it gave me syntax errors.  I made some subtle Changes, but when I do get it to run, the results was not a merged row.....is there a second part to this code?

Checking KetGuru's Code
query1-modified.txt
0
 

Author Comment

by:Aaron Goodwin
ID: 35007133
Awking00 these were the results I got from my modified query....again, I apologize if the results were from my own doings Un-merged-Results.xls
0
 
LVL 31

Assisted Solution

by:awking00
awking00 earned 200 total points
ID: 35007903
I think we need to incorporate the max function and a group by to create the "merge" as KetGuru used. Try the attached.
query.txt
0
 

Author Closing Comment

by:Aaron Goodwin
ID: 35010928
Thank you very much for your ideas, works great!
0

Featured Post

IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

Confronted with some SQL you don't know can be a daunting task. It can be even more daunting if that SQL carries some of the old secret codes used in the Ye Olde query syntax, such as: (+)     as used in Oracle;     *=     =*    as used in Sybase …
Occasionally there is a need to clean table columns, especially if you have inherited legacy data. There are obviously many ways to accomplish that, including elaborate UPDATE queries with anywhere from one to numerous REPLACE functions (even within…
Familiarize people with the process of retrieving data from SQL Server using an Access pass-thru query. Microsoft Access is a very powerful client/server development tool. One of the ways that you can retrieve data from a SQL Server is by using a pa…
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

16 Experts available now in Live!

Get 1:1 Help Now