I am working with a pretty large (for me, anyway) MySQL DB: about 260 MB and 500,000 records.
We receive data from a third party as zipped DBF files. They upload to the server 5 or 6 times per day, and each upload builds on the previous one: the DBF uploaded in the morning might have 600 records, while the one at the end of the afternoon might have 6,000 records but will still include the original 600.
I have working scripts, driven by a couple of cron jobs, that unzip the upload, archive the zip, and put the DBF file into a specific location to be read. I also have a script that should INSERT its contents into the large, 500,000-record DB.
The problem is that the import works fine at smaller sizes (<10,000 records) but does nothing to update my large DB. I think it is taking too long because it tries to identify duplicate records so that it doesn't import them.
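Conceptually, the duplicate check amounts to a per-row lookup before each insert, something like this (the table and column names below are simplified placeholders, not my real schema):

```sql
-- Simplified illustration of the per-row duplicate check (placeholder names).
-- For each record parsed from the DBF:
SELECT COUNT(*) FROM big_table WHERE record_id = 12345;

-- ...and only when the count is 0:
INSERT INTO big_table (record_id, col_a, col_b)
VALUES (12345, 'foo', 'bar');
```

With 500,000 existing rows, that is one lookup per incoming record, which I assume is where all the time goes.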
What is a good strategy for importing the new records into the large DB while keeping duplicates out?
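One direction I've been wondering about is letting MySQL enforce uniqueness itself rather than checking in the script. A rough sketch, again with placeholder names, and assuming each record carries a stable natural key like `record_id`:

```sql
-- Candidate approach (placeholder names; assumes each DBF record
-- carries a stable natural key such as record_id):
ALTER TABLE big_table ADD UNIQUE KEY uniq_record (record_id);

-- Duplicates can then be skipped at insert time without a separate lookup:
INSERT IGNORE INTO big_table (record_id, col_a, col_b)
VALUES (12345, 'foo', 'bar');

-- Or, if re-uploaded records may have changed and should overwrite:
INSERT INTO big_table (record_id, col_a, col_b)
VALUES (12345, 'foo', 'bar')
ON DUPLICATE KEY UPDATE col_a = VALUES(col_a), col_b = VALUES(col_b);
```

I'm not sure whether `INSERT IGNORE`, `ON DUPLICATE KEY UPDATE`, or something else entirely (e.g. loading into a staging table first) is the right fit for a table this size.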