andieje
asked on
speed up master detail inserts
Hi
Is there any way to speed up master-detail inserts if the master table has an autoinc primary key?
Is there a better way than this:
START TRANSACTION;
INSERT master (field1, field2) VALUES (data1, data2);
INSERT detail (foreign_key, field1, field2) VALUES (LAST_INSERT_ID(), data3, data4);
INSERT master (field1, field2) VALUES (data5, data6);
INSERT detail (foreign_key, field1, field2) VALUES (LAST_INSERT_ID(), data7, data8);
COMMIT;
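The same per-row pattern from a client program looks like this. This is a minimal sketch using Python's sqlite3 module as a stand-in database (with a MySQL connector you would use LAST_INSERT_ID() or cursor.lastrowid the same way); the table and column names are the placeholders from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master (id INTEGER PRIMARY KEY AUTOINCREMENT, field1 TEXT, field2 TEXT);
    CREATE TABLE detail (foreign_key INTEGER, field1 TEXT, field2 TEXT);
""")

# each item: ((master values), (detail values))
rows = [(("a", "b"), ("d1", "d2")), (("c", "d"), ("d3", "d4"))]

with conn:  # one transaction around the whole batch
    cur = conn.cursor()
    for master_vals, detail_vals in rows:
        cur.execute("INSERT INTO master (field1, field2) VALUES (?, ?)", master_vals)
        # cursor.lastrowid plays the role of LAST_INSERT_ID()
        cur.execute(
            "INSERT INTO detail (foreign_key, field1, field2) VALUES (?, ?, ?)",
            (cur.lastrowid,) + detail_vals,
        )

print(conn.execute("SELECT foreign_key, field1 FROM detail ORDER BY foreign_key").fetchall())
```

Wrapping the whole batch in a single transaction (the `with conn:` block) is what avoids a commit per row.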
Is there a way of using mysqlimport or LOAD DATA INFILE for master-detail data in separate tab files when the master table has an autoinc primary key?
thanks
ASKER
Sorry, I'm being a bit thick, but I don't get what you mean. Do you mean to insert all of the master records in one go, such as:
INSERT INTO master (field1, field2) VALUES (a, b), (c, d), (e, f), ...;
INSERT INTO detail (foreign_key, field1, field2) VALUES
  ((SELECT id FROM master WHERE field1 = a AND field2 = b), some_val, some_val),
  ((SELECT id FROM master WHERE field1 = c AND field2 = d), some_val, some_val);
Would this be quicker despite the SELECT in the insert? I don't know how MySQL gets the last insert id, so I can't tell whether that is quicker than a SELECT like this.
I can see that the multirow inserts would be quicker than single row inserts
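The proposal above can be sketched end to end. Again this is a hedged illustration using sqlite3 in place of MySQL, and it assumes (field1, field2) is unique in master so the scalar subquery finds exactly one id:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master (id INTEGER PRIMARY KEY AUTOINCREMENT,
                         field1 TEXT, field2 TEXT,
                         UNIQUE (field1, field2));  -- assumption: natural key is unique
    CREATE TABLE detail (foreign_key INTEGER, field1 TEXT, field2 TEXT);
""")

# each row: (master field1, master field2, detail field1, detail field2)
batch = [("a", "b", "x1", "x2"), ("c", "d", "y1", "y2")]

with conn:
    # one multi-row insert into master
    conn.executemany("INSERT INTO master (field1, field2) VALUES (?, ?)",
                     [(m1, m2) for m1, m2, _, _ in batch])
    # look each id back up via the unique (field1, field2) pair
    conn.executemany(
        """INSERT INTO detail (foreign_key, field1, field2)
           VALUES ((SELECT id FROM master WHERE field1 = ? AND field2 = ?), ?, ?)""",
        [(m1, m2, d1, d2) for m1, m2, d1, d2 in batch],
    )

print(conn.execute("SELECT foreign_key, field1 FROM detail ORDER BY foreign_key").fetchall())
```

With an index backing the unique constraint, each subquery is a key lookup rather than a scan, which is what makes this competitive with the LAST_INSERT_ID() approach.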
ASKER CERTIFIED SOLUTION
ASKER
Hi, thanks for a great answer. Sadly there is nothing I can find that could be unique in the master table. Would it be overkill to create an extra dummy unique field in the master table? The original raw data is a tab-delimited file which contains the master and detail information on the same line. I could loop through this and add an extra column containing the time in milliseconds (or microseconds, if the machine can process the input file faster than one row per millisecond, which I expect it can).
You could do that, but if you will be doing this often, it may be just as easy to create the staging table. In MySQL you cannot have triggers on VIEWs, but nothing stops you from doing the INSERT into a table that has an INSERT trigger on it for this purpose. If any of the rows you are inserting already exist in the database, you could have the trigger write the row to the staging table, which could then double as your error log (failed transactions, say, by having a status column that denotes inserted, failed, etc.), while otherwise using the trigger to insert into master and detail. Since you have the FOR EACH ROW there, it should work just fine. It doesn't necessarily speed things up, but it moves the burden of the separation from the client to the server and this staging table, which only has to be set up once.
http://dev.mysql.com/doc/refman/5.0/en/create-trigger.html
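The trigger idea can be demonstrated in miniature. This sketch uses sqlite3 purely for illustration; note the syntax differences from MySQL (MySQL triggers take FOR EACH ROW and use LAST_INSERT_ID(), where SQLite triggers are per-row by default and use last_insert_rowid()). All names here are placeholders:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging (m1 TEXT, m2 TEXT, d1 TEXT, d2 TEXT);
    CREATE TABLE master  (id INTEGER PRIMARY KEY AUTOINCREMENT, field1 TEXT, field2 TEXT);
    CREATE TABLE detail  (foreign_key INTEGER, field1 TEXT, field2 TEXT);

    -- fires once per staged row and splits it into master + detail
    CREATE TRIGGER split_row AFTER INSERT ON staging
    BEGIN
        INSERT INTO master (field1, field2) VALUES (NEW.m1, NEW.m2);
        INSERT INTO detail (foreign_key, field1, field2)
            VALUES (last_insert_rowid(), NEW.d1, NEW.d2);
    END;
""")

with conn:
    # the client only ever bulk-loads the staging table
    conn.executemany("INSERT INTO staging VALUES (?, ?, ?, ?)",
                     [("a", "b", "x1", "x2"), ("c", "d", "y1", "y2")])

print(conn.execute("SELECT foreign_key, field1 FROM detail ORDER BY foreign_key").fetchall())
```

The client's job shrinks to one bulk insert into staging; the server does the master/detail split.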
ASKER
Sorry for the confusion about generating the unique field in the master table - I hadn't actually seen the last few lines of your previous post!
So to recap you are suggesting:
1. A bulk insert into a staging table
INSERT INTO staging (masterfield1, masterfield2, detailfield1, detailfield2) VALUES
(data1, data2, data3, data4), (data5, data6, data7, data8);
2. An INSERT trigger on the staging table
FOR EACH ROW to insert to master, get last_insert_id, and then insert to detail.
Would this trigger be an AFTER INSERT? Wouldn't this be slow compared to the method in the original question, as any indexes on master and detail would be updated after each insert, whereas the original method uses a transaction and, when batching in a transaction, the index is only modified once (after all the records are added)? Or can I put the FOR EACH ROW inside a transaction in the trigger? (I've never written a trigger.)
But if I did add a dummy unique column to the master table, the INSERT trigger on the staging table could do this, as you previously suggested:
-- insert into master
INSERT INTO master(field1, field2)
SELECT key1, key2
FROM staging;
-- insert into details with join on master
INSERT INTO detail(foreign_key, field1, field2)
SELECT master.key, staging.data1, staging.data2
FROM staging
JOIN master ON staging.key1 = master.field1
AND staging.key2 = master.field2
I'm not sure which method would be better or faster
Hmm. Good points. If you can put a unique key in the master table that you can know ahead of time in your staging table, then I would say that is likely going to be the faster approach, as you can put it in a transaction. I don't think you can combine per-row use of LAST_INSERT_ID() with transactional batch processing in the trigger approach.
ASKER
very helpful - much obliged
Can you count on field1 and field2 being unique? If so, you could load your updates into a staging table with 4 columns. You can insert all the values of the first two columns into your master table and then, using the uniqueness of those values, look up the autoinc id to insert the other two columns into the detail table.
Unfortunately, current versions of MySQL cannot have triggers on views, or else that might be a possibility.