MacGyverSolutions
asked on
MySQL Ignore Duplicate Insert on MULTIPLE Columns
I have an interesting one for the crowd. I need to insert data into a table, but ignore duplicates. This isn't the normal "IGNORE INSERT" question that I'm asking. The ignore must happen on multiple non-unique fields.
For example, let's say we have a table with first name, last name, address, birth date, and phone number. If first name, last name, birth date, and address are the same, then ignore the insert, as we can safely assume this is the same person. However, two different people can have the same first and last name, two different people can have the same birth day, and two different people can have the same address. For this reason, a unique index can not be placed on any column. I want to ignore only the SAME person. The rest of the information such as phone number need to be included in the insert if no duplicate was found.
The reason I need this is because I'm importing data from multiple sources, and some rows from these sources may be the same, so I want the result to be a combination of all sources with no duplicates.
I can do this programmatically (PHP) pre or post-insert, but it involves nested loops and is slow and ugly, so I'd rather not do it this way.
Let me know what you think. Thanks!
For example, let's say we have a table with first name, last name, address, birth date, and phone number. If first name, last name, birth date, and address are the same, then ignore the insert, as we can safely assume this is the same person. However, two different people can have the same first and last name, two different people can have the same birth day, and two different people can have the same address. For this reason, a unique index can not be placed on any column. I want to ignore only the SAME person. The rest of the information such as phone number need to be included in the insert if no duplicate was found.
The reason I need this is because I'm importing data from multiple sources, and some rows from these sources may be the same, so I want the result to be a combination of all sources with no duplicates.
I can do this programmatically (PHP) pre or post-insert, but it involves nested loops and is slow and ugly, so I'd rather not do it this way.
Let me know what you think. Thanks!
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Insert ignore only works on a unique column or unique column group. Put a unique index on that column group and it will work.
You can achieve by running following command
CREATE UNIQUE INDEX unique_rec ON table_name(first_name, last_name, birth_date, address);
Please note whenever you fire a query and the record is duplicate you would get a mysql error which can be caught as shown in the sample code.
You might want to change the code as per your needs.
CREATE UNIQUE INDEX unique_rec ON table_name(first_name, last_name, birth_date, address);
Please note whenever you fire a query and the record is duplicate you would get a mysql error which can be caught as shown in the sample code.
You might want to change the code as per your needs.
$sql= 'INSERT INTO `table` (`first_name`,`last_name`,`birth_date`, `address`) ';
$sql.= "VALUES ('$fname', '$lname', '$dob','$address')";
$result= @mysql_query($sql);
if ( mysql_errno() == '1062' )
{
die("I'm sorry, that record is duplicate!.");
} else {
//This is to capture any other mysql error
die('Error with query:<br />' .mysql_error());
}
p.s. If you do not want to capture the error then you can supress it using "INSERT IGNORE" MySQL directive.
ASKER
Ah - I thought a composite index would effectively set all columns to be unique, not unique as a group. I will try this out!
ASKER
Not only did this work, the inserts were fast! Thanks all!