Solved

Deduplicate a table

Posted on 2003-10-30
7
497 Views
Last Modified: 2008-02-01
Hello,

I have a table called, "subscribers".

It has many fields, but this question only concerns 2.

email
custom_fields

There are **many** duplicate records on the email field.

I want to deduplicate it.

One catch.  I want to get rid of the duplicate records that have no value
in custom_fields

Does that make sense?

I have the table backed up so I can play with this and try different things.

Thank you very much in advance.

Eric
0
Comment
Question by:esalezone
  • 3
  • 3
7 Comments
 

Author Comment

by:esalezone
ID: 9651251
FYI: There is a primary field: subid
0
 
LVL 4

Expert Comment

by:vk33
ID: 9651535
Hi!

1. Export data into a tab-separated file (mysqldump)
2. Kill duplicates (using some shell script or simple utility)
3. Import it back (mysqlimport)

:)

Regards!
0
 
LVL 4

Expert Comment

by:vk33
ID: 9651625
Oh, sorry, I'm wrong about mysqldump...

1. SELECT * INTO OUTFILE "temp.txt" FIELDS TERMINATED BY "\t" LINES TERMINATED BY "\n"
     FROM subscribers
2. Are you running Windows or UNIX? I would process it with awk...
3. LOAD DATA INFILE "temp.txt" FIELDS TERMINATED BY "\t" LINES TERMINATED BY "\n"

Hope it helps!
0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 

Author Comment

by:esalezone
ID: 9651722

There are 80,000 records.

Is your solution to "eyeball" the dupes and delete them manually?

If so, that's unacceptable.

If not, please tell me how I'm misunderstanding.

Thanks

Eric
0
 
LVL 17

Accepted Solution

by:
Squeebee earned 500 total points
ID: 9652369
So, you want to replace email with an id number linked to a seperate table? Or just delete the email?

If you want to break email info out into a seperate table, something like this would do it:

CREATE TABLE emails(
id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
email CHAR(255) NOT NULL
);

INSERT INTO emails
SELECT DISTINCT email
FROM other_table;

UPDATE other_table, emails
SET other_table.email = emails.id
WHERE other_table.email = emails.email;
0
 

Author Comment

by:esalezone
ID: 9652408
Forget it, I'm just paying someone to get this done.
0
 
LVL 4

Expert Comment

by:vk33
ID: 9652413
no, no... I suggested processing the dump with some utility.
If you use UNIX (any of them) it's quite easy with sed and awk.
If you use Windows you can write a simple program killing lines with the same e-mail field value...

If you have any problems - feel free to ask!
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

A lot of articles have been written on splitting mysqldump and grabbing the required tables. A long while back, when Shlomi (http://code.openark.org/blog/mysql/on-restoring-a-single-table-from-mysqldump) had suggested a “sed” way, I actually shell …
Foreword This is an old article.  Instead of using the MySQL extension that was used in the original code examples, please choose one of the currently supported database extensions instead.  More information is available here: MySQLi / PDO (http://…
Along with being a a promotional video for my three-day Annielytics Dashboard Seminor, this Micro Tutorial is an intro to Google Analytics API data.
Nobody understands Phishing better than an anti-spam company. That’s why we are providing Phishing Awareness Training to our customers. According to a report by Verizon, only 3% of targeted users report malicious emails to management. With compan…

786 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question