Solved

In MySQL I need to delete some duplicates but leave more than 1

Posted on 2014-10-20
3
135 Views
Last Modified: 2014-10-21
I have a table with a bunch of duplicate records.  This is by design.  A sample may look like:

ID     Date                  location
-----------------------------------------
1      2014-10-20       1
2      2014-10-20       1
3      2014-10-20       1
4      2014-10-20       1
5      2014-10-20       1
6      2014-10-20       1
7      2014-10-20       2
8      2014-10-20       2
9      2014-10-20       2
10    2014-10-20       2
11    2014-10-20       2
12    2014-10-20       2
13      2014-10-21       1
14      2014-10-21       1
15      2014-10-21       1
16      2014-10-21       1
17      2014-10-21       1
18      2014-10-21       1
19      2014-10-21       2
20      2014-10-21       2
21      2014-10-21       2
22      2014-10-21       2
23      2014-10-21       2
24      2014-10-21       2

I want to remove only a certain number of duplicates.  For instance, each location has 6 duplicate dates (6 records for location 1 on 2014-10-20).

I would like to remove just TWO of the duplicate records from each date-location grouping leaving me with 4 records at location 1 on 2014-10-20.  I need to do this for several hundred such groupings.

The ID is primary and unique.

Any help?
0
Comment
Question by:pmascari
  • 2
3 Comments
 
LVL 54

Expert Comment

by:Julian Hansen
ID: 40393841
Can you not  do something like this
CREATE TEMPORARY TABLE deleteids (id INT);
INSERT INTO deleteids (id) SELECT MAX(id) FROM t1 GROUP BY DATE;
DELETE FROM t1 WHERE id IN (SELECT id FROM deleteids);

Open in new window

And then just run the INSERT and DELETE query twice?
0
 
LVL 8

Author Comment

by:pmascari
ID: 40394452
Thanks for the comment.  Your suggestion could work if there were always 6 duplicates.  I've found that these groups can have up to 12 duplicates.  How can I delete a variable number of records, leaving me with groups of 4 for each date/location?
0
 
LVL 54

Accepted Solution

by:
Julian Hansen earned 500 total points
ID: 40394518
Then something like this
SET @ROW := 0, @DATE := '';
DELETE FROM t1 WHERE id IN  (
SELECT id FROM (
  SELECT id, `date`,
      @ROW := IF(@DATE = `date`, @ROW + 1, 1) AS row_number,
      @DATE := `date` AS dummy
  FROM t1
  ORDER BY `date`
) subq
WHERE subq.row_number > 4
);

Open in new window

0

Featured Post

What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Fore-Foreword Today (2016) Maxmind has a new approach to the distribution of its data sets.  This article may be obsolete.  Instead of using the examples here, have a look at the MaxMind API (https://www.maxmind.com/en/geolite2-developer-package). …
Does the idea of dealing with bits scare or confuse you? Does it seem like a waste of time in an age where we all have terabytes of storage? If so, you're missing out on one of the core tools in every professional programmer's toolbox. Learn how to …
Windows 10 is mostly good. However the one thing that annoys me is how many clicks you have to do to dial a VPN connection. You have to go to settings from the start menu, (2 clicks), Network and Internet (1 click), Click VPN (another click) then fi…
A short tutorial showing how to set up an email signature in Outlook on the Web (previously known as OWA). For free email signatures designs, visit https://www.mail-signatures.com/articles/signature-templates/?sts=6651 If you want to manage em…

786 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question