Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win

x
?
Solved

How to speed up  SELECT JOIN query for large tables

Posted on 2008-06-20
4
Medium Priority
?
1,998 Views
Last Modified: 2010-04-21
How do i speed up SELECT JOIN query for 2 large tables? (teststock and prdbesarch)

I've been using this code for a while now and it seems to be working fine.

One table is teststock and looks like this:
------------------------------------------------
ID int(12) NO PRI auto_increment
CombArtNr varchar(9) NO
KlantNr int(5) NO 0
besnr int(6) NO 0
itemnr int(4) NO 0

prdbesarch looks like this:
------------------------------
Field Name Field Type AllowNull Key Default Extra
prdbesarch
ID int(25) NO PRI auto_increment
besdate date NO
klantnr int(5) NO
klantnaam varchar(64) NO geen naam
fland varchar(64) NO geen land
opmerkingen varchar(512) NO geen
faktuurnr int(6) NO 0
proformanr int(6) NO 0
leveringsnr int(6) NO 0
finstatus varchar(128) NO
besnr int(6) NO
refklant varchar(64) NO
itemnr int(4) NO
artcode varchar(6) NO
klrcode varchar(3) NO
omschr varchar(256) NO
qty decimal(6,2) NO
price decimal(8,2) NO
totaal decimal(8,2) NO 0.00
kg decimal(6,2) NO
intrastat int(8) NO


Today i've update the teststock to the point where it has around 300 000 lines.
prdbesarch has around 15 000 lines


The code basicly shows me the content of prdbesarch and then for each line it gives me the the total stock (fromstock), and available stock (reststock).

Is it maybe because i'm pushing the limits of MySQL? Or is it because my code is just to complex for 2 tables?

This is an example of the code i use.
SELECT l.*, 
SUM(CASE g.KlantNr WHEN l.klantnr THEN CASE g.besnr WHEN l.besnr THEN CASE g.itemnr WHEN l.itemnr THEN 1 ELSE 0 END ELSE 0 END ELSE 0 END) "FromStock", 
SUM(CASE g.KlantNr WHEN '0' THEN 1 ELSE 0 END)  "RestStock", 
COUNT(CombArtNr) "AllStock" 
FROM ( 
    SELECT 
    prdbesarch.ID, 
    prdbesarch.besdate, 
    prdbesarch.klantnr, 
    prdbesarch.klantnaam, 
    prdbesarch.fland, 
    prdbesarch.opmerkingen, 
    prdbesarch.faktuurnr, 
    prdbesarch.proformanr, 
    prdbesarch.leveringsnr, 
    prdbesarch.finstatus, 
    prdbesarch.besnr, 
    prdbesarch.refklant, 
    prdbesarch.itemnr, 
    prdbesarch.artcode, 
    prdbesarch.klrcode, 
    prdbesarch.omschr, 
    prdbesarch.qty, 
    prdbesarch.price, 
    prdbesarch.totaal, 
    prdbesarch.kg, 
    prdbesarch.intrastat 
    FROM prdbesarch 
    WHERE (artcode LIKE 'A1101') 
    ) l 
    LEFT JOIN teststock g 
    ON CONCAT(l.artcode,'_',l.klrcode) = g.CombArtNr 
    GROUP BY g.CombArtNr, 
    l.ID, 
    l.klantnr, 
    l.besnr, 
    l.refklant, 
    l.itemnr, 
    l.artcode, 
    l.klrcode, 
    l.qty 
ORDER BY l.ID ASC

Open in new window

0
Comment
Question by:24Carat
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
4 Comments
 
LVL 9

Accepted Solution

by:
Rurne earned 1400 total points
ID: 21833821
The code is unoptimized.  Typically, subqueries (lines 6-29) slow performance, especially when used in joins, because MySQL has to create a temporary table in order to perform the  LEFT JOIN on the alias ("l", line 30).  The join is then performed on a function (line 32, "CONCAT(l.artcode,'_',l.klrcode)"), and so a full-text match has to be used and any index on g.CombArtNr is ignored.  Depending on the data definitions for prdbesarch and teststock, the GROUP BY statement (lines 33-41) are probably being performed on columns without indexes, which will take longer.  Finally, your result set is dependent on two aggregate functions which contain extended switching statements (lines 2-3), which takes considerable logic for the query planner to map and execute.  Finally, you are ordering on a column from a temporary table (line 42), which has no index from the original table.

One question I do have: Line 29 uses the LIKE conditional, but you don't use any '%' wildcards to indicate an incomplete phrase.  Should prdbesarch = 'A1101', or should it simply start with 'A1101' and have characters after it?  I assume it is the latter case in the code below.  In either case, as you're both filtering against and applying GROUP BY to prdbesarch.artcode, I'd recommend creating an index on it, if you have not already done so.

Below is a start, though to get optimum performance, we're going to still have to address the CONCAT issue not optimizing on index.  That's going to be a huge performance drain, but that will require modifying prdbesarch  I'd also recommend checking into the EXPLAIN syntax for MySQL a bit more to help pinpoint where .
SELECT prdbesarch.ID, prdbesarch.klantnr, prdbesarch.besnr, prdbesarch.refklant, prdbesarch.itemnr, prdbesarch.artcode, prdbesarch.klrcode, prdbesarch.qty, 
SUM(CASE g.KlantNr WHEN l.klantnr THEN CASE g.besnr WHEN l.besnr THEN CASE g.itemnr WHEN l.itemnr THEN 1 ELSE 0 END ELSE 0 END ELSE 0 END) "FromStock", 
SUM(CASE g.KlantNr WHEN '0' THEN 1 ELSE 0 END)  "RestStock", 
COUNT(CombArtNr) "AllStock" 
FROM prdbesarch
LEFT JOIN LEFT JOIN teststock 
ON CONCAT(prdbesarch.artcode,'_',prdbesarch.klrcode) = teststock.CombArtNr 
WHERE prdbesarch.artcode LIKE 'A1101%'
GROUP BY teststock.CombArtNr, 
    prdbesarch.ID, 
    prdbesarch.klantnr, 
    prdbesarch.besnr, 
    prdbesarch.refklant, 
    prdbesarch.itemnr, 
    prdbesarch.artcode, 
    prdbesarch.klrcode, 
    prdbesarch.qty 
ORDER BY prdbesarch.ID ASC

Open in new window

0
 
LVL 46

Assisted Solution

by:Kent Olsen
Kent Olsen earned 600 total points
ID: 21834215

You've got a potential issue in the join operation, too.

CombArtNr is a VARCHAR (9)

ArtCode is a VARCHAR (6)
KlrCode is a VARCHAR (3)

CONCAT (ArtCode, '_', KlrCode) can therefore be 10 characters, which will never match CombArtNr.

I'm a freak for normalization AND for integer join keys.  (They go together.)  You can make your database a lot more efficient if you'll utilize some of these techniques.

If it makes sense with regard to other processes, you should consider building another table with the CombArtNr values, with an auto_increment primary key.  In the teststock and prdbesarch tables, store the primary key from this table instead of the character values.  Index the column in both tables.

Now the join operation is based on indexed integers.  You'll see a tremendous performance improvement.


Good Luck,
Kent
0
 

Author Closing Comment

by:24Carat
ID: 31469251
Optimizing the query has improved the performance a lot.

I've alse created indexes on the teststock table and this helps a lot more even, however i'm unexperienced with indexes.

Wich is the best way to create indexes? Should i simple create 1 index named TSindex on all the field names? Or should i create indexes for each field?

How would you create the indexes on these tables?

Teststock is a table that changes frequently is that also an isue?
0
 
LVL 9

Expert Comment

by:Rurne
ID: 21835485
The only benefit to a multicolumn index (especially in MyISAM tables) is to use it as a unique constraint.  A multicolumn index only really hashes on the first column, so you are better off doing individual indexes on each of the columns in the GROUP BY statement.

Then again, it is a matter of perspective.  As you've noted that teststock has frequen inserts/updates, too many indexes can slow performance as each has to be rehashed after the manipulation query.  You may want to see how it performs with and without each of the indexes and to use some sample data to benchmark the insert and update queries:
http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_benchmark
0

Featured Post

Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Backups and Disaster RecoveryIn this post, we’ll look at strategies for backups and disaster recovery.
An alternative to the "For XML" way of pivoting and concatenating result sets into strings, and an easy introduction to "common table expressions" (CTEs). Being someone who is always looking for alternatives to "work your data", I came across this …
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…
In this video, Percona Solutions Engineer Barrett Chambers discusses some of the basic syntax differences between MySQL and MongoDB. To learn more check out our webinar on MongoDB administration for MySQL DBA: https://www.percona.com/resources/we…
Suggested Courses

610 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question