speed up master detail inserts

andieje
Hi

Is there any way to speed up master-detail inserts when the master table has an auto-increment primary key?

Is there a better way than this:

START TRANSACTION;
INSERT INTO master (field1, field2) VALUES (data1, data2);
INSERT INTO detail (foreign_key, field1, field2) VALUES (LAST_INSERT_ID(), data3, data4);
INSERT INTO master (field1, field2) VALUES (data5, data6);
INSERT INTO detail (foreign_key, field1, field2) VALUES (LAST_INSERT_ID(), data7, data8);
COMMIT;

Is there a way of using mysqlimport or LOAD DATA INFILE for master-detail data in separate tab-delimited files when the master table has an auto-increment primary key?

thanks
Kevin Cross, Chief Technology Officer
Most Valuable Expert 2011

Commented:
Hi.

Can you count on field1 and field2 being unique?  If so, you could load your data into a staging table with 4 columns.  Insert all the values of the first two columns into your master table, and then, using the uniqueness of those values, look up the auto-increment id to insert the other two columns into the detail table.

Unfortunately, the current versions of MySQL cannot have triggers on views, or else that might be a possibility.

Author

Commented:
Sorry, I'm being a bit thick, but I don't get what you mean.  Do you mean to insert all of the master records in one go, such as:

insert into master (field1, field2) values (a, b), (c, d), (e, f), ...;
insert into detail (foreign_key, field1, field2) values
  ((select id from master where field1 = a and field2 = b), some_val, some_val),
  ((select id from master where field1 = c and field2 = d), some_val, some_val);

Would this be quicker despite the SELECT inside the insert? I don't know how MySQL obtains the last insert id, so I can't tell whether that is quicker than a SELECT like this.

I can see that multi-row inserts would be quicker than single-row inserts.
Kevin Cross, Chief Technology Officer
Most Valuable Expert 2011
Commented:
Yes, that is close to what I am suggesting.  If the combinations (a, b), (c, d), and (e, f) each uniquely identify a row, then I would do the insert in one mass statement.  Keep in mind you can use INSERT INTO ... SELECT syntax if the data is coming from a table.

http://dev.mysql.com/doc/refman/4.1/en/insert.html

In fact, that is where the power comes in: being able to use that syntax to do one single bulk insert.  If that is true, you can load all your data into a table like:

INSERT INTO staging(key1, key2, data1, data2)
VALUES(a, b, some_val, another_val);

After all your data is in that table, the inserts into the master and detail tables become as simple as:

-- insert into master
INSERT INTO master(field1, field2)
SELECT key1, key2
FROM staging;

-- insert into details with join on master
INSERT INTO detail(foreign_key, field1, field2)
SELECT master.key, staging.data1, staging.data2
FROM staging
JOIN master ON staging.key1 = master.field1
   AND staging.key2 = master.field2
;

This should be more efficient than inserting into master one row at a time, trapping the last inserted id, and then inserting into detail.  In fact, this can make the process very streamlined from your application's standpoint: it can open one connection to MySQL and write records to the staging table.  Your staging table can have a trigger for INSERTs that performs the two inserts needed on your master and detail tables.
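To make the staging-table flow above concrete, here is a minimal, runnable sketch using SQLite via Python's sqlite3 as a stand-in for MySQL (the syntax is close but not identical; the table and column names follow the example above, and the sample data values are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Schema: master has an auto-increment key; detail references it.
cur.executescript("""
CREATE TABLE master  (id INTEGER PRIMARY KEY AUTOINCREMENT, field1 TEXT, field2 TEXT);
CREATE TABLE detail  (foreign_key INTEGER, field1 TEXT, field2 TEXT);
CREATE TABLE staging (key1 TEXT, key2 TEXT, data1 TEXT, data2 TEXT);
""")

# 1. Bulk-load the raw rows into the staging table.
rows = [("a", "b", "d1", "d2"), ("c", "d", "d3", "d4")]
cur.executemany(
    "INSERT INTO staging (key1, key2, data1, data2) VALUES (?, ?, ?, ?)", rows
)

# 2. One INSERT ... SELECT into master.
cur.execute("INSERT INTO master (field1, field2) SELECT key1, key2 FROM staging")

# 3. One INSERT ... SELECT into detail, joining back on the unique (key1, key2) pair.
cur.execute("""
INSERT INTO detail (foreign_key, field1, field2)
SELECT master.id, staging.data1, staging.data2
FROM staging
JOIN master ON staging.key1 = master.field1
           AND staging.key2 = master.field2
""")
conn.commit()

print(cur.execute(
    "SELECT foreign_key, field1, field2 FROM detail ORDER BY foreign_key"
).fetchall())
# → [(1, 'd1', 'd2'), (2, 'd3', 'd4')]
```

The whole load is three statements regardless of row count, which is the point: the per-row LAST_INSERT_ID() round trip disappears, but it only works if (key1, key2) is unique in master.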

Carrying that thought further: if your (a, b) combinations don't uniquely identify a row, you can still use the staging-table idea.  When you insert a row, simply use a FOR EACH ROW trigger to insert into master, get LAST_INSERT_ID(), and then insert into detail.

If you want, you can add a BIT column that marks rows as processed, set to true after a successful update of the master and detail tables.

Hope that helps!

Author

Commented:
Hi, thanks for a great answer. Sadly, there is nothing I can find that could be unique in the master table. Would it be overkill to create an extra dummy unique field in the master table? The original raw data is a tab-delimited file which contains the master and detail information on the same line. I could loop through it and add an extra column containing the time in milliseconds (or microseconds, if the machine can process the input file faster than one row per millisecond, which I expect it can).
Kevin Cross, Chief Technology Officer
Most Valuable Expert 2011

Commented:
You could do that, but if you will be doing this often, it may be just as easy to create the staging table.  In MySQL you cannot have triggers on VIEWs, but nothing stops you from doing the INSERT on a table with an INSERT trigger on it for this purpose.  If any of the rows you are trying to insert already exist in the database, the trigger could leave the row in the staging table, which then becomes your error log (failed transactions, say, via a status column that denotes inserted, failed, etc.), while otherwise the trigger does the inserts on master and detail.  Since you have FOR EACH ROW there, it should work just fine.  It doesn't necessarily speed things up, but it moves the burden of splitting the data from the client to the server and this staging table, which only has to be set up once.

http://dev.mysql.com/doc/refman/5.0/en/create-trigger.html
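A hedged sketch of that trigger idea, again using SQLite through Python's sqlite3 so it stays runnable (SQLite triggers are implicitly FOR EACH ROW, and last_insert_rowid() plays the role of MySQL's LAST_INSERT_ID(); in MySQL this would be CREATE TRIGGER ... AFTER INSERT ON staging FOR EACH ROW with NEW.col references in the body; all names and data here are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE master  (id INTEGER PRIMARY KEY AUTOINCREMENT, field1 TEXT, field2 TEXT);
CREATE TABLE detail  (foreign_key INTEGER, field1 TEXT, field2 TEXT);
CREATE TABLE staging (mfield1 TEXT, mfield2 TEXT, dfield1 TEXT, dfield2 TEXT);

-- Each row written to staging fans out into one master row and one detail row.
-- Inside the trigger body, last_insert_rowid() reflects the master insert just made.
CREATE TRIGGER staging_fanout AFTER INSERT ON staging
BEGIN
    INSERT INTO master (field1, field2) VALUES (NEW.mfield1, NEW.mfield2);
    INSERT INTO detail (foreign_key, field1, field2)
        VALUES (last_insert_rowid(), NEW.dfield1, NEW.dfield2);
END;
""")

# The client now only ever writes to staging; the server splits the rows.
cur.executemany(
    "INSERT INTO staging (mfield1, mfield2, dfield1, dfield2) VALUES (?, ?, ?, ?)",
    [("a", "b", "x", "y"), ("c", "d", "u", "v")],
)
conn.commit()

print(cur.execute(
    "SELECT foreign_key, field1 FROM detail ORDER BY foreign_key"
).fetchall())
# → [(1, 'x'), (2, 'u')]
```

The application code stays a single flat bulk insert, while the master/detail split lives in one place on the server.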

Author

Commented:
Sorry for the confusion about generating the unique field in the master table - I hadn't actually seen the last few lines of your previous post!

So to recap you are suggesting:

1. A bulk insert into a staging table
insert into staging (masterfield1, masterfield2, detailfield1, detailfield2) values
(data1, data2, data3, data4), (data5, data6, data7, data8);

2. An INSERT trigger on the staging table that runs FOR EACH ROW: insert into master, get LAST_INSERT_ID(), then insert into detail.

Would this trigger be an AFTER INSERT? Wouldn't this be slow compared to the method in the original question, since any indexes on master and detail would be updated after each insert, whereas the original method uses a transaction and, when batching in a transaction, the index is only modified once (after all the records are added)? Or can I put the FOR EACH ROW inside a transaction in the trigger? (I've never written a trigger.)


But if I did add a dummy unique column to the master table, the INSERT trigger on the staging table could do this, as you previously suggested:

-- insert into master
INSERT INTO master(field1, field2)
SELECT key1, key2
FROM staging;

-- insert into details with join on master
INSERT INTO detail(foreign_key, field1, field2)
SELECT master.key, staging.data1, staging.data2
FROM staging
JOIN master ON staging.key1 = master.field1
   AND staging.key2 = master.field2

I'm not sure which method would be better or faster.
Kevin Cross, Chief Technology Officer
Most Valuable Expert 2011

Commented:
Hmm.  Good points.  If you can put a unique key in the master table whose value you know ahead of time in your staging table, then I would say that is likely going to be the faster approach, as you can put it in a transaction.  I don't think you can combine per-row use of LAST_INSERT_ID() with transactional batching in the trigger approach.

Author

Commented:
very helpful - much obliged
