?
Solved

Deduplicate a table

Posted on 2003-10-30
7
Medium Priority
?
511 Views
Last Modified: 2008-02-01
Hello,

I have a table called, "subscribers".

It has many fields, but this question only concerns 2.

email
custom_fields

There are **many** duplicate records on the email field.

I want to deduplicate it.

One catch.  I want to get rid of the duplicate records that have no value
in custom_fields

Does that make sense?

I have the table backed up so I can play with this and try different things.

Thank you very much in advance.

Eric
0
Comment
Question by:esalezone
  • 3
  • 3
7 Comments
 

Author Comment

by:esalezone
ID: 9651251
FYI: There is a primary field: subid
0
 
LVL 4

Expert Comment

by:vk33
ID: 9651535
Hi!

1. Export data into a tab-separated file (mysqldump)
2. Kill duplicates (using some shell script or simple utility)
3. Import it back (mysqlimport)

:)

Regards!
0
 
LVL 4

Expert Comment

by:vk33
ID: 9651625
Oh, sorry, I'm wrong about mysqldump...

1. SELECT * INTO OUTFILE "temp.txt" FIELDS TERMINATED BY "\t" LINES TERMINATED BY "\n"
     FROM subscribers
2. Are you running Windows or UNIX? I would process it with awk...
3. LOAD DATA INFILE "temp.txt" FIELDS TERMINATED BY "\t" LINES TERMINATED BY "\n"

Hope it helps!
0
Get your Disaster Recovery as a Service basics

Disaster Recovery as a Service is one go-to solution that revolutionizes DR planning. Implementing DRaaS could be an efficient process, easily accessible to non-DR experts. Learn about monitoring, testing, executing failovers and failbacks to ensure a "healthy" DR environment.

 

Author Comment

by:esalezone
ID: 9651722

There are 80,000 records.

Is your solution to "eyeball" the dupes and delete them manually?

If so, that's unacceptable.

If not, please tell me how I'm misunderstanding.

Thanks

Eric
0
 
LVL 17

Accepted Solution

by:
Squeebee earned 1000 total points
ID: 9652369
So, you want to replace email with an id number linked to a seperate table? Or just delete the email?

If you want to break email info out into a seperate table, something like this would do it:

CREATE TABLE emails(
id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
email CHAR(255) NOT NULL
);

INSERT INTO emails
SELECT DISTINCT email
FROM other_table;

UPDATE other_table, emails
SET other_table.email = emails.id
WHERE other_table.email = emails.email;
0
 

Author Comment

by:esalezone
ID: 9652408
Forget it, I'm just paying someone to get this done.
0
 
LVL 4

Expert Comment

by:vk33
ID: 9652413
no, no... I suggested processing the dump with some utility.
If you use UNIX (any of them) it's quite easy with sed and awk.
If you use Windows you can write a simple program killing lines with the same e-mail field value...

If you have any problems - feel free to ask!
0

Featured Post

NFR key for Veeam Agent for Linux

Veeam is happy to provide a free NFR license for one year.  It allows for the non‑production use and valid for five workstations and two servers. Veeam Agent for Linux is a simple backup tool for your Linux installations, both on‑premises and in the public cloud.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this blog post, we’ll look at how ClickHouse performs in a general analytical workload using the star schema benchmark test.
In this blog, we’ll look at how improvements to Percona XtraDB Cluster improved IST performance.
In this video, Percona Solution Engineer Rick Golba discuss how (and why) you implement high availability in a database environment. To discuss how Percona Consulting can help with your design and architecture needs for your database and infrastr…
In this video, Percona Solutions Engineer Barrett Chambers discusses some of the basic syntax differences between MySQL and MongoDB. To learn more check out our webinar on MongoDB administration for MySQL DBA: https://www.percona.com/resources/we…
Suggested Courses
Course of the Month16 days, 6 hours left to enroll

850 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question