Solved

Deduplicate a table

Posted on 2003-10-30
Last Modified: 2008-02-01
Hello,

I have a table called "subscribers".

It has many fields, but this question only concerns two:

email
custom_fields

There are **many** duplicate records on the email field.

I want to deduplicate it.

One catch: among the duplicates, I want to get rid of the records that have no value in custom_fields.

Does that make sense?

I have the table backed up so I can play with this and try different things.

Thank you very much in advance.

Eric
Question by:esalezone
 

Author Comment

by:esalezone
ID: 9651251
FYI: There is a primary key field: subid
 
LVL 4

Expert Comment

by:vk33
ID: 9651535
Hi!

1. Export data into a tab-separated file (mysqldump)
2. Kill duplicates (using some shell script or simple utility)
3. Import it back (mysqlimport)

:)

Regards!
 
LVL 4

Expert Comment

by:vk33
ID: 9651625
Oh, sorry, I'm wrong about mysqldump...

1. SELECT * INTO OUTFILE "temp.txt" FIELDS TERMINATED BY "\t" LINES TERMINATED BY "\n"
     FROM subscribers
2. Remove the duplicate lines from temp.txt. Are you running Windows or UNIX? I would process it with awk...
3. Empty the table (you have a backup), then:
     LOAD DATA INFILE "temp.txt" INTO TABLE subscribers FIELDS TERMINATED BY "\t" LINES TERMINATED BY "\n"

Hope it helps!
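
Step 2 could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not something posted in the thread: the column positions (EMAIL_COL, CUSTOM_COL) are hypothetical and must match the real column order, emails are compared case-insensitively, no field contains an embedded tab or newline, and "duplicate" is read as "keep one row per email, preferring a row that has a value in custom_fields". MySQL writes NULL as \N in OUTFILE dumps, so that is treated as "no value" too.

```python
# Hypothetical column positions in the exported file; adjust them to the
# real column order of the subscribers table.
EMAIL_COL = 1
CUSTOM_COL = 2


def has_value(field):
    """Treat empty strings and \\N (NULL in an OUTFILE dump) as no value."""
    return field.strip() not in ("", r"\N")


def dedupe_rows(rows, email_col=EMAIL_COL, custom_col=CUSTOM_COL):
    """Keep one row per email, preferring a row whose custom_fields has a value."""
    kept = {}  # normalized email -> row; dicts keep first-seen order
    for row in rows:
        # Assumption: emails differing only in case or padding are duplicates.
        key = row[email_col].strip().lower()
        current = kept.get(key)
        if current is None:
            kept[key] = row
        elif not has_value(current[custom_col]) and has_value(row[custom_col]):
            # Swap an empty keeper for a row that carries custom_fields.
            kept[key] = row
    return list(kept.values())


def dedupe_file(src_path, dst_path):
    """Read a tab-separated dump and write the de-duplicated rows to dst_path."""
    with open(src_path, encoding="utf-8", newline="") as src:
        rows = [line.rstrip("\n").split("\t") for line in src if line.rstrip("\n")]
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for row in dedupe_rows(rows):
            dst.write("\t".join(row) + "\n")
```

Something like dedupe_file("temp.txt", "deduped.txt") would then produce the file to load back in step 3. With 80,000 records the whole file fits comfortably in memory, which is why the sketch does not bother streaming.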

Author Comment

by:esalezone
ID: 9651722

There are 80,000 records.

Is your solution to "eyeball" the dupes and delete them manually?

If so, that's unacceptable.

If not, please tell me how I'm misunderstanding.

Thanks

Eric
 
LVL 17

Accepted Solution

by:Squeebee (earned 500 total points)
ID: 9652369
So, you want to replace email with an id number linked to a separate table? Or just delete the email?

If you want to break email info out into a separate table, something like this would do it:

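CREATE TABLE emails(
id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
email CHAR(255) NOT NULL
);

-- Name the target column so that id is filled in by AUTO_INCREMENT
INSERT INTO emails (email)
SELECT DISTINCT email
FROM other_table;

-- Replace each email string with the id of its row in emails
UPDATE other_table, emails
SET other_table.email = emails.id
WHERE other_table.email = emails.email;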
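
If the goal is instead to delete the duplicate rows in place, as the question describes, one possible approach is sketched below. It is an illustration only, with several assumptions: Python's sqlite3 module stands in for MySQL so the example is runnable, the table has just the three columns named in the thread (subid, email, custom_fields), and the rule is "keep one row per email, preferring a row with a value in custom_fields". The derived table (keepers) is there because MySQL typically refuses to delete from a table that the same statement selects from directly; MySQL versions without subquery support would need a temporary table instead.

```python
import sqlite3

# Delete every row except one "keeper" per email. The keeper is the lowest
# subid among rows with a non-empty custom_fields, or the lowest subid
# overall when no row for that email has a value.
DEDUPE_SQL = """
DELETE FROM subscribers
WHERE subid NOT IN (
    SELECT keeper FROM (
        SELECT COALESCE(
                   MIN(CASE WHEN custom_fields IS NOT NULL
                             AND custom_fields <> '' THEN subid END),
                   MIN(subid)
               ) AS keeper
        FROM subscribers
        GROUP BY email
    ) AS keepers
)
"""


def dedupe(conn):
    """Run the delete and return the number of rows removed."""
    with conn:  # commits on success, rolls back on error
        return conn.execute(DEDUPE_SQL).rowcount


def demo():
    """Build a small in-memory table, de-duplicate it, and report the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE subscribers ("
        "subid INTEGER PRIMARY KEY, email TEXT, custom_fields TEXT)"
    )
    conn.executemany(
        "INSERT INTO subscribers VALUES (?, ?, ?)",
        [
            (1, "a@example.com", ""),
            (2, "a@example.com", "vip"),
            (3, "b@example.com", None),
            (4, "b@example.com", ""),
            (5, "c@example.com", "x"),
        ],
    )
    removed = dedupe(conn)
    remaining = conn.execute(
        "SELECT subid FROM subscribers ORDER BY subid"
    ).fetchall()
    conn.close()
    return removed, remaining
```

Running a SELECT with the same inner query first, against the backup copy, is a cheap way to check which rows would be kept before deleting anything.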
 

Author Comment

by:esalezone
ID: 9652408
Forget it, I'm just paying someone to get this done.
 
LVL 4

Expert Comment

by:vk33
ID: 9652413
No, no... I suggested processing the dump with some utility.
If you use UNIX (any flavor), it's quite easy with sed and awk.
If you use Windows, you can write a simple program that drops lines with the same e-mail field value...

If you have any problems - feel free to ask!