Solved

Remove Duplicates From A List

Posted on 2014-01-30
7
217 Views
Last Modified: 2014-02-02
I have a list that i need to remove all Duplicates, if I find any,
I need to do this in mysql, due to the sheer number of records - 150,000
Can someone help me with a query to eliminate all duplicates.
See the example goal that I'm looking for below.  Thank you.

DATA
bob@bob.com
bob@bob.com
sue@sue.com
mary@mary.com
mary@mary.com
joe@joe.com
sally@sally.com


GOAL
sue@sue.com
joe@joe.com
sally@sally.com
0
Comment
Question by:U_S_A
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
7 Comments
 
LVL 19

Expert Comment

by:Ken Butters
ID: 39821225
Select Distinct <field> from ....

where field is the email address you have above...
0
 

Author Comment

by:U_S_A
ID: 39821472
@Ken Butters,

Will this give me only records which are NOT duplicated?  or will it give me all unique records.  

I don't want any record that is listed more than once.

Thank you.
0
 
LVL 11

Accepted Solution

by:
John_Vidmar earned 500 total points
ID: 39821527
select <fieldname> from <table> group by <fieldname> having count(*) = 1
0
Simplifying Server Workload Migrations

This use case outlines the migration challenges that organizations face and how the Acronis AnyData Engine supports physical-to-physical (P2P), physical-to-virtual (P2V), virtual to physical (V2P), and cross-virtual (V2V) migration scenarios to address these challenges.

 
LVL 19

Expert Comment

by:Ken Butters
ID: 39821537
Distinct will give you all unique records.  

If it occurs more than once... it will be only be returned once. (dups removed)

If there is only one occurrence... it will be returned once.
0
 

Author Comment

by:U_S_A
ID: 39821573
@Ken

I want to remove any that have more than once.  I don't want to thin out the database to show only unique records.

How can I do this, please?
0
 
LVL 19

Expert Comment

by:Ken Butters
ID: 39821736
Sorry... I wasn't reading close enough.... and answered what I thought you were asking instead of what you really asked.

John's answer above should do what you want.

select <fieldname> from <table> group by <fieldname> having count(*) = 1

where fieldname is the email field in your table.

His query will first get all the records and group them by email address, and then only select the records where the count is equal to 1.

In the case where there is a duplicate... like

bob@bob.com
bob@bob.com

In that case count(*) will be > 1 and will not be returned.
0
 
LVL 25

Expert Comment

by:Tomas Helgi Johannsson
ID: 39822204
Hi!

Using John's query further you simply execute

delete from <table>
where <fieldname> not in ( select <fieldname> from <table> group by <fieldname> having count(*) = 1)

This will remove all duplicate rows.

Regards,
    Tomas Helgi
0

Featured Post

Migrating Your Company's PCs

To keep pace with competitors, businesses must keep employees productive, and that means providing them with the latest technology. This document provides the tips and tricks you need to help you migrate an outdated PC fleet to new desktops, laptops, and tablets.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

'Between' is such a common word we rarely think about it but in SQL it has a very specific definition we should be aware of. While most database vendors will have their own unique phrases to describe it (see references at end) the concept in common …
Occasionally there is a need to clean table columns, especially if you have inherited legacy data. There are obviously many ways to accomplish that, including elaborate UPDATE queries with anywhere from one to numerous REPLACE functions (even within…
Finds all prime numbers in a range requested and places them in a public primes() array. I've demostrated a template size of 30 (2 * 3 * 5) but larger templates can be built such 210  (2 * 3 * 5 * 7) or 2310  (2 * 3 * 5 * 7 * 11). The larger templa…
Are you ready to implement Active Directory best practices without reading 300+ pages? You're in luck. In this webinar hosted by Skyport Systems, you gain insight into Microsoft's latest comprehensive guide, with tips on the best and easiest way…

738 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question