One of the tasks I regularly need to do is read through the rows of data in a text file, check whether an entry already exists in the database for each row, and create it if not. This can be quite slow. Having gained a bit more experience with Linux commands, I'm now thinking it would actually be quicker to export all the records from the MySQL table in question into a tab file, compare this tab file to my input file using Linux join, and then mysqlimport the records from the input file that 'fail' the join (i.e. they fail the join because they don't exist in the table).
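A minimal sketch of what I mean, using made-up table, column, and file names (the mysql/mysqlimport steps are shown as comments; the sample data here just simulates an export):

```shell
# Simulate the exported table keys; in practice this would come from something like:
#   mysql -N -e "SELECT name FROM mydb.my_table" > existing.tab
printf 'alpha\nbeta\ngamma\n' > existing.tab

# The input file of candidate records
printf 'beta\ndelta\nalpha\n' > input.tab

# join requires both files to be sorted on the join field
sort existing.tab > existing.sorted
sort input.tab   > input.sorted

# -v 2 prints only the lines of the second file with no match in the first,
# i.e. the input records that do not yet exist in the table
join -v 2 existing.sorted input.sorted > my_table.tab
cat my_table.tab    # delta

# mysqlimport derives the table name from the file name (my_table.tab -> my_table):
#   mysqlimport --local mydb my_table.tab
```

Note that mysqlimport takes the target table name from the file's basename, so the join output needs to be written to a file named after the table.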
This is significantly less coding than writing a Perl script to query whether the records exist and to generate insert statements for the data.
Another thing I often need to do is, given an input file of data, get the database IDs for those input records. I usually do this by iterating over the file and querying the database table to get the ID for each input. However, for thousands of records I'm thinking it might again be easier to export the table into a tab file and use Linux join to join the exported file to the input file. That way the resulting file has the original input data plus the database ID.
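For the ID-lookup case, the sketch looks much the same, again with hypothetical names and the mysql export shown as a comment. The one wrinkle is telling join to treat tab as the field separator so the IDs stay tab-delimited:

```shell
# Simulate an export of key + id from the table; in practice something like:
#   mysql -N -e "SELECT name, id FROM mydb.my_table" > table_dump.tab
printf 'alpha\t101\nbeta\t102\ngamma\t103\n' > table_dump.tab

# The input file of keys we want IDs for
printf 'beta\nalpha\n' > input.tab

# Both files must be sorted on the join field
sort -t "$(printf '\t')" -k1,1 table_dump.tab > table_dump.sorted
sort input.tab > input.sorted

# Inner join on the first field; -t sets tab as the delimiter, so each
# output line is the original input key plus its database id
join -t "$(printf '\t')" table_dump.sorted input.sorted > with_ids.tab
cat with_ids.tab
```

Input records with no match in the table simply drop out of the inner join, so it may be worth also running join -v 2 on the same files to catch keys that found no ID.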
I was wondering if anyone has any comments or reasons not to use this approach. I know one possible caveat is the lack of error checking for correct input format (though you can do a fair bit of that in mysqlimport / LOAD DATA INFILE), but we don't do that in the Perl script anyhow, as we know the data is OK.
Exporting and importing data in MySQL is so quick, and the Linux command line is so fast, that I think this approach is fine.