Solved

removing a row with duplicate name by date

Posted on 2011-02-19
6
304 Views
Last Modified: 2012-08-13
Hi
I have a SQL table with various columns, in this table I have a username column and a datetime column, I would like to delete the rows which contain duplicate usernames but only the duplicates with the oldest datetime value
0
Comment
Question by:sykotex
  • 3
  • 2
6 Comments
 
LVL 40

Expert Comment

by:Sharath
ID: 34935754
try this.
DELETE a 
  FROM your_table a 
 WHERE datetime_col NOT IN (SELECT MAX(datetime_col) 
                              FROM your_table b 
                             WHERE a.username = b.username)

Open in new window

0
 
LVL 40

Accepted Solution

by:
Sharath earned 250 total points
ID: 34935757
or this.
DELETE FROM your_table a 
      WHERE EXISTS (SELECT 1 
                      FROM your_table b 
                     WHERE a.username = b.username 
                           AND b.datetime_col < a.datetime_col)

Open in new window

0
 
LVL 3

Assisted Solution

by:clinker83
clinker83 earned 250 total points
ID: 34937127
Borrowing some ideas from Sharath, try the following

DELETE FROM your_table as A
WHERE
COUNT(username) > 1
AND
datetime NOT IN (SELECT MAX(datetime) FROM your_table as B WHERE A.username = B.username)
GROUP BY username

Obviously backup your database before running delete and update statements. Also run statements as select to see if the data you expect to be deleted is returned.

If the above doesn't work you may need to look at creating temp table with all usenames that have count(username) >1 on original table. From there you can delete everything apart from username that satisfys MAX(datetime) grouping by username

Best of Luck!
0
Resolve Critical IT Incidents Fast

If your data, services or processes become compromised, your organization can suffer damage in just minutes and how fast you communicate during a major IT incident is everything. Learn how to immediately identify incidents & best practices to resolve them quickly and effectively.

 
LVL 1

Author Comment

by:sykotex
ID: 34941790
Hi
I dont think I was clear
This is the table below

id         Username                 Date_time
1          Andy                        2/21/2011 12:15:14
2          Andy                        2/21/2011 12:45:16
3          James                      2/21/2011 12:30:45


I would like the job to remove the earliest Andy as this is a duplicate name with the earliest time (which would be id 1 in this case)

Thanks

0
 
LVL 3

Expert Comment

by:clinker83
ID: 34943410
In your example table above

SELECT * FROM your_table as A
WHERE
COUNT(username) > 1
GROUP BY Username

This will return
id         Username                 Date_time
1          Andy                        2/21/2011 12:15:14
2          Andy                        2/21/2011 12:45:16

The statement in my original post :
DELETE FROM your_table as A
WHERE
COUNT(username) > 1
AND
datetime NOT IN (SELECT MAX(datetime) FROM your_table as B WHERE A.username = B.username)
GROUP BY username

Should bring back everything apart from the latest row i.e it will return
1          Andy                        2/21/2011 12:15:14

Therefore when using the DELETE statement that I first posted it should delete every record that has a duplicate apart from the latest one, which is what you want to do.

I'm just looking at this from a logical perspective. Maybe the syntax of the statement is not correct. Have you tried it?
0
 
LVL 40

Expert Comment

by:Sharath
ID: 34943969
Did you try my posts? Those should solve the problem.
0

Featured Post

DevOps Toolchain Recommendations

Read this Gartner Research Note and discover how your IT organization can automate and optimize DevOps processes using a toolchain architecture.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

'Between' is such a common word we rarely think about it but in SQL it has a very specific definition we should be aware of. While most database vendors will have their own unique phrases to describe it (see references at end) the concept in common …
Occasionally there is a need to clean table columns, especially if you have inherited legacy data. There are obviously many ways to accomplish that, including elaborate UPDATE queries with anywhere from one to numerous REPLACE functions (even within…
Although Jacob Bernoulli (1654-1705) has been credited as the creator of "Binomial Distribution Table", Gottfried Leibniz (1646-1716) did his dissertation on the subject in 1666; Leibniz you may recall is the co-inventor of "Calculus" and beat Isaac…
With Secure Portal Encryption, the recipient is sent a link to their email address directing them to the email laundry delivery page. From there, the recipient will be required to enter a user name and password to enter the page. Once the recipient …

856 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question