Solved

MySQL Multiple table update statement performance question.

Posted on 2008-10-02
13
846 Views
Last Modified: 2013-12-12
Hey folks,

New user here, so I'm not sure if I have the proper forum to ask this question, so my apologies in advance if this belongs elsewhere.

Essentially, here's my scenario. We have a web app that functions as a warehouse management system for locations of products, quantities, etc. A mini ERP basically, but far more simplistic.

I have 3 tables with the following structure: (there are more fields, only the relevant ones are shown)

product:
PKProductID, FKWarehouseLocationID

productgroup:
SKU, FKProductID

Warehouse:
PKWarehouseLocationID, WareHouseLocationName

The foreign key FKProductID from productgroup references the primary key PKProductID in the product table, and the Foreign key FKWarehouseLocationID in the product table references the primary key PKWarehouseLocationID in the warehouse table.

Now, I'm using this query to update individual warehouse locations for a given sku:

UPDATE warehouse, product, productgroup PG
JOIN product p ON (PG.`FKProductID` = p.`PKProductID`)
JOIN warehouse w ON (p.`FKWareHouseLocationID` = w.`PKWareHouseLocationID`)
set p.`FKWareHouseLocationID` = w.`PKWareHouseLocationID`
WHERE PG.`SKU` = 'enter_sku_here'
AND w.`WareHouseLocationName` = 'enter_location_of_product_here'

That query works, but it's ugly; it takes a good 12-13 seconds per record to update.  On a per record basis, that's not a *huge* issue, but we've gotten to the point where there's too many changes to efficiently do singly.

So I created a temp table I use for importing new sheets of warehouse locations with the structure of:

temptable:
SKU, WareHouseLocation

I adjusted the above query to basically update locations based on the values entered into the temptable. the new query looks like this:

UPDATE warehouse, product, temptable tt, productgroup PG
JOIN product p ON (PG.`FKProductID` = p.`PKProductID`)
JOIN warehouse w ON (p.`FKWareHouseLocationID` = w.`PKWareHouseLocationID`)
set p.`FKWareHouseLocationID` = w.`PKWareHouseLocationID`
WHERE PG.`SKU` = 'tt.sku'
AND w.`WareHouseLocationName` = 'tt.WareHouseLocation'

Only problem is, as you can see from my performance numbers, 12-13 seconds per record when you have a sheet of 1,000 rows just isn't viable.

I'm not an SQL wizard by any stretch, so I'm sure there's got to be a better way to accomplish this. Can anyone suggest a better way to write queries that involve multiple, large tables like these? Even a point in the right direction would be appreciated.
0
Comment
Question by:jmoriarty
  • 7
  • 6
13 Comments
 
LVL 2

Expert Comment

by:storeytime
Comment Utility
personally I can;t even figure out how that works but I am assuming that what you are trying to do is find the product with a SKU and then update its location if so try the following if not please clarify:

UPDATE product p
JOIN productgroup pg ON (pg.`FKProductID` = p.`PKProductID`)
SET p.`FKWareHouseLocationID` =
        (SELECT `PKWareHouseLocationID`
         FROM warehouse w
         WHERE w.`WareHouseLocationName` = 'enter_location_of_product_here')
WHERE pg.`SKU` = 'enter_sku_here';


Was't sure on the data structures used for each I simulated with INT UNSIGNED for each and it works.
0
 

Author Comment

by:jmoriarty
Comment Utility
Hi storeytime,

Thanks for the fast reply! I actually tried using something similar to that using a sub-query to choose the productid, but the problem with sub-queries is that, when pulling a large list of skus from a temporary table like in the latter half of my post, the sub-query would return more than one result. Or at least that's the error MySQL was giving me.

For example, using your above query, I would write it similar to this to facilitate the above paragraph:

UPDATE product p
JOIN productgroup pg ON (pg.`FKProductID` = p.`PKProductID`)
SET p.`FKWareHouseLocationID` =
        (SELECT `PKWareHouseLocationID`
         FROM warehouse w
         WHERE w.`WareHouseLocationName` = 'temptable.warehousename')
WHERE pg.`SKU` = 'temptable.sku';

which returns 0 rows affected (presumably due to the sub-query returning multiple results?)

And you mostly were right about the data types - my apologies for forgetting such a crucial detail. All of them are INT UNSIGNED with the exception of WareHouseLocationName which is VARCHAR

Thanks for taking the time to assist me; I appreciate it greatly.
0
 
LVL 2

Expert Comment

by:storeytime
Comment Utility
If sub query returns multiple results then there is error in data not in the code.  All the IDs should be unique and defined as keys/ indexes.  Run the subquery alone to test that theory also run one with SKU WHERE statement
0
 

Author Comment

by:jmoriarty
Comment Utility
Ah, I see, that starts to make sense now. Running the sub-query alone as shown below returns 0 results, even though there are matching warehouse location names in both tables. I.E., if I run a select statement for a specific warehouse name on both tables, I get results, but if I run the query as shown below, it returns 0 results. Could that be due to no indexes/key definitions on the temptable? Basically, I was just creating temptable with the fields as placeholders to run queries against, I hadn't defined any keys/indexes as of yet.

SELECT `PKWareHouseLocationID`
         FROM warehouse w
         WHERE w.`WareHouseLocationName` = 'temptable.warehousename'


Thank you again for taking the time to assist me.
0
 

Author Comment

by:jmoriarty
Comment Utility
Actually, I take back part of what I said in the previous comment - I do have a primary key and index on the temptable assigned to the sku field.
0
 

Author Comment

by:jmoriarty
Comment Utility
Small update - running the subquery as shown below (and above) does work. I had accidentally enclosed the table name in single quotes in the WHERE clause.

So running this:

select w.`pkwarehouseid`
from warehouse w, temptable
where w.WareHouseLocationName = temptable.warehousename

Does produce the right results and shows the pkwarehouseid of each warehousename in temptable, but plugging the sub-query into the update statement like this:

UPDATE product p
JOIN productgroup pg ON (pg.`FKProductID` = p.`PKProductID`)
SET p.`FKWareHouseID` =
        (select w.`pkwarehouseid`
from warehouse w, temptable
where w.WareHouseLocationName = temptable.warehousename)
WHERE pg.`SKU` = temptable.sku;

Throws a mysql error of:

Unknown column 'temptable.sku' in 'where clause'

I'm guessing it's something simple I'm not aware of that needs to be included in the update statement to reference the temptable?


Thanks again!

0
How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

 
LVL 2

Expert Comment

by:storeytime
Comment Utility
you have temptable defined only in the from in the subquery but access it in the WHERE of the UPDATE.

What was the speed of the command using without temptable was it still 12 sec

Best query so far

UPDATE
(((warehouse w INNER JOIN product p ON (w.`PKWareHouseLocationID` = p.`FKWareHouseLocationID`))
INNER JOIN productgroup pg ON (pg.`FKProductID` = p.`PKProductID`))
INNER JOIN temptable tt ON (tt.SKU = pg.SKU))
set p.WID = (select w.`PKWareHouseLocationID`
                     FROM warehouse w
                     WHERE w.`WareHouseLocationName`  = tt.warehousename) ;
0
 
LVL 2

Expert Comment

by:storeytime
Comment Utility
UPDATE
(((warehouse w LEFT INNER JOIN product p ON (w.`PKWareHouseLocationID` = p.`FKWareHouseLocationID`))
LEFT INNER JOIN productgroup pg ON (pg.`FKProductID` = p.`PKProductID`))
LEFT INNER JOIN temptable tt ON (tt.SKU = pg.SKU))
set p.WID = (select w.`PKWareHouseLocationID`
                     FROM warehouse w
                     WHERE w.`WareHouseLocationName`  = tt.warehousename) ;



I forced LEFT INNER JOIN its suppose to be identical but I also read that it can cut down on permutations so it might speed it up with large tables
0
 

Author Comment

by:jmoriarty
Comment Utility
Ah, now I see, you join all tables on common fields, and then update the single record you want to change. Since we're joining temptable to productgroup using sku, it's essentially the same as saying "update warehouseid where this sku is the same as this sku" since only the records that match in both tables are joined when using an inner join, correct?

This query worked well:

UPDATE
(((warehouse w INNER JOIN product p ON (w.`PKWareHouseLocationID` = p.`FKWareHouseLocationID`))
INNER JOIN productgroup pg ON (pg.`FKProductID` = p.`PKProductID`))
INNER JOIN temptable tt ON (tt.SKU = pg.SKU))
set p.WID = (select w.`PKWareHouseLocationID`
                     FROM warehouse w
                     WHERE w.`WareHouseLocationName`  = tt.warehousename) ;

Took about 4.5 seconds to update 11 rows, which is far better than the results I was getting. That's only ~7 minutes or so for 1,000 record update.

In response to your question, on a per SKU basis without running through temptable, the update statement took ~750 MS, so even on an individual update basis, it was still much faster.

Interestingly, your statement using LEFT INNER JOIN failed; could be a data issue?

I'm wondering though, even at ~7 minutes per 1,000, I'm beginning to think it might be faster to run the query through a PHP script and loop through each result of a select statement and update each record individually. What's your opinion?


Thanks!
0
 
LVL 2

Accepted Solution

by:
storeytime earned 500 total points
Comment Utility
I think you are starting to pick up on it.  I haven't done in sql in years so it is slowly coming back to me

I doubt that would be faster depending on how the script is done it will have to be reparsed each time it goes through the loop which adds time to the query

UPDATE
((( productgroup pg INNER JOIN temptable tt ON (tt.SKU = pg.SKU))
INNER JOIN product p ON (pg.`FKProductID` = p.`PKProductID`))
INNER JOIN warehouse w  ON (w.`PKWareHouseLocationID` = p.`FKWareHouseLocationID`))
set p.WID = (select w.`PKWareHouseLocationID`
                     FROM warehouse w
                     WHERE w.`WareHouseLocationName`  = tt.warehousename) ;


I rearranged the joins since temptable is the smallest table is should be in the first join to decrease overall table size should speed it up somemore
0
 

Author Comment

by:jmoriarty
Comment Utility
Ah, I see - so you join from from smallest to largest, smaller tables on the outside, larger on the inside, because your smallest table designated first is what's going to create the initial table. One of the things I never thought of.  And the parenthesis I assume are to ensure that the tables are joined in the proper order?

I've gained a lot of insight from your responses, I thank you once again.
0
 
LVL 2

Expert Comment

by:storeytime
Comment Utility
no problem what is the speed down too
0
 

Author Comment

by:jmoriarty
Comment Utility
Much better with the second query where you adjusted the smaller tables to be first - I just finished testing various sheets of records, ~12.54 seconds for 100 records, thats less than 3 minutes for 1,000 records, which is a 10 fold increase over what I was working with. The largest sheets are usually only around 2-3000 records at most, so that's not bad performance considering what I was working with.

Very much appreciated.
0

Featured Post

Enabling OSINT in Activity Based Intelligence

Activity based intelligence (ABI) requires access to all available sources of data. Recorded Future allows analysts to observe structured data on the open, deep, and dark web.

Join & Write a Comment

Introduction Since I wrote the original article about Handling Date and Time in PHP and MySQL (http://www.experts-exchange.com/articles/201/Handling-Date-and-Time-in-PHP-and-MySQL.html) several years ago, it seemed like now was a good time to updat…
This article discusses four methods for overlaying images in a container on a web page
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…
The viewer will learn how to dynamically set the form action using jQuery.

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now