Solved

Deduplicate a table

Posted on 2003-10-30
7
504 Views
Last Modified: 2008-02-01
Hello,

I have a table called, "subscribers".

It has many fields, but this question only concerns 2.

email
custom_fields

There are **many** duplicate records on the email field.

I want to deduplicate it.

One catch.  I want to get rid of the duplicate records that have no value
in custom_fields

Does that make sense?

I have the table backed up so I can play with this and try different things.

Thank you very much in advance.

Eric
0
Comment
Question by:esalezone
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 3
7 Comments
 

Author Comment

by:esalezone
ID: 9651251
FYI: There is a primary field: subid
0
 
LVL 4

Expert Comment

by:vk33
ID: 9651535
Hi!

1. Export data into a tab-separated file (mysqldump)
2. Kill duplicates (using some shell script or simple utility)
3. Import it back (mysqlimport)

:)

Regards!
0
 
LVL 4

Expert Comment

by:vk33
ID: 9651625
Oh, sorry, I'm wrong about mysqldump...

1. SELECT * INTO OUTFILE "temp.txt" FIELDS TERMINATED BY "\t" LINES TERMINATED BY "\n"
     FROM subscribers
2. Are you running Windows or UNIX? I would process it with awk...
3. LOAD DATA INFILE "temp.txt" FIELDS TERMINATED BY "\t" LINES TERMINATED BY "\n"

Hope it helps!
0
The Eight Noble Truths of Backup and Recovery

How can IT departments tackle the challenges of a Big Data world? This white paper provides a roadmap to success and helps companies ensure that all their data is safe and secure, no matter if it resides on-premise with physical or virtual machines or in the cloud.

 

Author Comment

by:esalezone
ID: 9651722

There are 80,000 records.

Is your solution to "eyeball" the dupes and delete them manually?

If so, that's unacceptable.

If not, please tell me how I'm misunderstanding.

Thanks

Eric
0
 
LVL 17

Accepted Solution

by:
Squeebee earned 500 total points
ID: 9652369
So, you want to replace email with an id number linked to a seperate table? Or just delete the email?

If you want to break email info out into a seperate table, something like this would do it:

CREATE TABLE emails(
id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
email CHAR(255) NOT NULL
);

INSERT INTO emails
SELECT DISTINCT email
FROM other_table;

UPDATE other_table, emails
SET other_table.email = emails.id
WHERE other_table.email = emails.email;
0
 

Author Comment

by:esalezone
ID: 9652408
Forget it, I'm just paying someone to get this done.
0
 
LVL 4

Expert Comment

by:vk33
ID: 9652413
no, no... I suggested processing the dump with some utility.
If you use UNIX (any of them) it's quite easy with sed and awk.
If you use Windows you can write a simple program killing lines with the same e-mail field value...

If you have any problems - feel free to ask!
0

Featured Post

Resolve Critical IT Incidents Fast

If your data, services or processes become compromised, your organization can suffer damage in just minutes and how fast you communicate during a major IT incident is everything. Learn how to immediately identify incidents & best practices to resolve them quickly and effectively.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Foreword In the years since this article was written, numerous hacking attacks have targeted password-protected web sites.  The storage of client passwords has become a subject of much discussion, some of it useful and some of it misguided.  Of cou…
Introduction Since I wrote the original article about Handling Date and Time in PHP and MySQL several years ago, it seemed like now was a good time to update it for object-oriented PHP.  This article does that, replacing as much as possible the pr…
Michael from AdRem Software explains how to view the most utilized and worst performing nodes in your network, by accessing the Top Charts view in NetCrunch network monitor (https://www.adremsoft.com/). Top Charts is a view in which you can set seve…
If you’ve ever visited a web page and noticed a cool font that you really liked the look of, but couldn’t figure out which font it was so that you could use it for your own work, then this video is for you! In this Micro Tutorial, you'll learn yo…

707 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question