Solved

How to speed up  SELECT JOIN query for large tables

Posted on 2008-06-20
4
1,984 Views
Last Modified: 2010-04-21
How do i speed up SELECT JOIN query for 2 large tables? (teststock and prdbesarch)

I've been using this code for a while now and it seems to be working fine.

One table is teststock and looks like this:
------------------------------------------------
ID int(12) NO PRI auto_increment
CombArtNr varchar(9) NO
KlantNr int(5) NO 0
besnr int(6) NO 0
itemnr int(4) NO 0

prdbesarch looks like this:
------------------------------
Field Name Field Type AllowNull Key Default Extra
prdbesarch
ID int(25) NO PRI auto_increment
besdate date NO
klantnr int(5) NO
klantnaam varchar(64) NO geen naam
fland varchar(64) NO geen land
opmerkingen varchar(512) NO geen
faktuurnr int(6) NO 0
proformanr int(6) NO 0
leveringsnr int(6) NO 0
finstatus varchar(128) NO
besnr int(6) NO
refklant varchar(64) NO
itemnr int(4) NO
artcode varchar(6) NO
klrcode varchar(3) NO
omschr varchar(256) NO
qty decimal(6,2) NO
price decimal(8,2) NO
totaal decimal(8,2) NO 0.00
kg decimal(6,2) NO
intrastat int(8) NO


Today i've update the teststock to the point where it has around 300 000 lines.
prdbesarch has around 15 000 lines


The code basicly shows me the content of prdbesarch and then for each line it gives me the the total stock (fromstock), and available stock (reststock).

Is it maybe because i'm pushing the limits of MySQL? Or is it because my code is just to complex for 2 tables?

This is an example of the code i use.
SELECT l.*, 

SUM(CASE g.KlantNr WHEN l.klantnr THEN CASE g.besnr WHEN l.besnr THEN CASE g.itemnr WHEN l.itemnr THEN 1 ELSE 0 END ELSE 0 END ELSE 0 END) "FromStock", 

SUM(CASE g.KlantNr WHEN '0' THEN 1 ELSE 0 END)  "RestStock", 

COUNT(CombArtNr) "AllStock" 

FROM ( 

    SELECT 

    prdbesarch.ID, 

    prdbesarch.besdate, 

    prdbesarch.klantnr, 

    prdbesarch.klantnaam, 

    prdbesarch.fland, 

    prdbesarch.opmerkingen, 

    prdbesarch.faktuurnr, 

    prdbesarch.proformanr, 

    prdbesarch.leveringsnr, 

    prdbesarch.finstatus, 

    prdbesarch.besnr, 

    prdbesarch.refklant, 

    prdbesarch.itemnr, 

    prdbesarch.artcode, 

    prdbesarch.klrcode, 

    prdbesarch.omschr, 

    prdbesarch.qty, 

    prdbesarch.price, 

    prdbesarch.totaal, 

    prdbesarch.kg, 

    prdbesarch.intrastat 

    FROM prdbesarch 

    WHERE (artcode LIKE 'A1101') 

    ) l 

    LEFT JOIN teststock g 

    ON CONCAT(l.artcode,'_',l.klrcode) = g.CombArtNr 

    GROUP BY g.CombArtNr, 

    l.ID, 

    l.klantnr, 

    l.besnr, 

    l.refklant, 

    l.itemnr, 

    l.artcode, 

    l.klrcode, 

    l.qty 

ORDER BY l.ID ASC

Open in new window

0
Comment
Question by:24Carat
  • 2
4 Comments
 
LVL 9

Accepted Solution

by:
Rurne earned 350 total points
Comment Utility
The code is unoptimized.  Typically, subqueries (lines 6-29) slow performance, especially when used in joins, because MySQL has to create a temporary table in order to perform the  LEFT JOIN on the alias ("l", line 30).  The join is then performed on a function (line 32, "CONCAT(l.artcode,'_',l.klrcode)"), and so a full-text match has to be used and any index on g.CombArtNr is ignored.  Depending on the data definitions for prdbesarch and teststock, the GROUP BY statement (lines 33-41) are probably being performed on columns without indexes, which will take longer.  Finally, your result set is dependent on two aggregate functions which contain extended switching statements (lines 2-3), which takes considerable logic for the query planner to map and execute.  Finally, you are ordering on a column from a temporary table (line 42), which has no index from the original table.

One question I do have: Line 29 uses the LIKE conditional, but you don't use any '%' wildcards to indicate an incomplete phrase.  Should prdbesarch = 'A1101', or should it simply start with 'A1101' and have characters after it?  I assume it is the latter case in the code below.  In either case, as you're both filtering against and applying GROUP BY to prdbesarch.artcode, I'd recommend creating an index on it, if you have not already done so.

Below is a start, though to get optimum performance, we're going to still have to address the CONCAT issue not optimizing on index.  That's going to be a huge performance drain, but that will require modifying prdbesarch  I'd also recommend checking into the EXPLAIN syntax for MySQL a bit more to help pinpoint where .
SELECT prdbesarch.ID, prdbesarch.klantnr, prdbesarch.besnr, prdbesarch.refklant, prdbesarch.itemnr, prdbesarch.artcode, prdbesarch.klrcode, prdbesarch.qty, 

SUM(CASE g.KlantNr WHEN l.klantnr THEN CASE g.besnr WHEN l.besnr THEN CASE g.itemnr WHEN l.itemnr THEN 1 ELSE 0 END ELSE 0 END ELSE 0 END) "FromStock", 

SUM(CASE g.KlantNr WHEN '0' THEN 1 ELSE 0 END)  "RestStock", 

COUNT(CombArtNr) "AllStock" 

FROM prdbesarch

LEFT JOIN LEFT JOIN teststock 

ON CONCAT(prdbesarch.artcode,'_',prdbesarch.klrcode) = teststock.CombArtNr 

WHERE prdbesarch.artcode LIKE 'A1101%'

GROUP BY teststock.CombArtNr, 

    prdbesarch.ID, 

    prdbesarch.klantnr, 

    prdbesarch.besnr, 

    prdbesarch.refklant, 

    prdbesarch.itemnr, 

    prdbesarch.artcode, 

    prdbesarch.klrcode, 

    prdbesarch.qty 

ORDER BY prdbesarch.ID ASC

Open in new window

0
 
LVL 45

Assisted Solution

by:Kdo
Kdo earned 150 total points
Comment Utility

You've got a potential issue in the join operation, too.

CombArtNr is a VARCHAR (9)

ArtCode is a VARCHAR (6)
KlrCode is a VARCHAR (3)

CONCAT (ArtCode, '_', KlrCode) can therefore be 10 characters, which will never match CombArtNr.

I'm a freak for normalization AND for integer join keys.  (They go together.)  You can make your database a lot more efficient if you'll utilize some of these techniques.

If it makes sense with regard to other processes, you should consider building another table with the CombArtNr values, with an auto_increment primary key.  In the teststock and prdbesarch tables, store the primary key from this table instead of the character values.  Index the column in both tables.

Now the join operation is based on indexed integers.  You'll see a tremendous performance improvement.


Good Luck,
Kent
0
 

Author Closing Comment

by:24Carat
Comment Utility
Optimizing the query has improved the performance a lot.

I've alse created indexes on the teststock table and this helps a lot more even, however i'm unexperienced with indexes.

Wich is the best way to create indexes? Should i simple create 1 index named TSindex on all the field names? Or should i create indexes for each field?

How would you create the indexes on these tables?

Teststock is a table that changes frequently is that also an isue?
0
 
LVL 9

Expert Comment

by:Rurne
Comment Utility
The only benefit to a multicolumn index (especially in MyISAM tables) is to use it as a unique constraint.  A multicolumn index only really hashes on the first column, so you are better off doing individual indexes on each of the columns in the GROUP BY statement.

Then again, it is a matter of perspective.  As you've noted that teststock has frequen inserts/updates, too many indexes can slow performance as each has to be rehashed after the manipulation query.  You may want to see how it performs with and without each of the indexes and to use some sample data to benchmark the insert and update queries:
http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_benchmark
0

Featured Post

Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

Join & Write a Comment

Occasionally there is a need to clean table columns, especially if you have inherited legacy data. There are obviously many ways to accomplish that, including elaborate UPDATE queries with anywhere from one to numerous REPLACE functions (even within…
Composite queries are used to retrieve the results from joining multiple queries after applying any filters. UNION, INTERSECT, MINUS, and UNION ALL are some of the operators used to get certain desired results.​
Illustrator's Shape Builder tool will let you combine shapes visually and interactively. This video shows the Mac version, but the tool works the same way in Windows. To follow along with this video, you can draw your own shapes or download the file…
This video explains how to create simple products associated to Magento configurable product and offers fast way of their generation with Store Manager for Magento tool.

771 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now