MYSQL, query and structure optimization

Hi,

I would like to know what is the best way to query a recordset with the following structure:
1 table containing about 100.000 products
1 table containing product_groups (about 2.000 records).
1 table containing specs about each product in the product table. On average each product has about 35 specs.
1 table containing specs (about 200 records).

Example table structure
Table: products
Field: product_id [unique, indexed]
Field: product_name
Field: product_group_id [indexed]

Table: product_group
Field: product_group_id [unique, indexed]
Field: product_group_name

Table: spec_product
product_id [indexed]
spec_id [indexed]

Table: specs
Field: spec_id [unique, indexed]
Field: spec_name

Now what is the fastet way to find records in a specific group that has one or more specific specs.

If for some reason another table structure is required to optimize querying the dataset please include this in you response. I have no problem altering the structure if this improves speed.

Below an example  how i currently extract data. This example gets all products in product_group_id 100 with the specs 100 and 101.

SELECT product_id from products
Inner Join spec_product as spec_product1 ON product.product_id = spec_product1.product_id AND spec_product1.specs_id = 100
Inner Join spec_product as spec_product2 ON product.product_id = spec_product2.product_id AND spec_product2.specs_id = 101
Where  product_group_id = 100

Open in new window


For some reason when there are more the 7 inner joins the time it takes to get the products increases rapidly.

Well hope anyone can help.
NebukadAsked:
Who is Participating?
 
SharathData EngineerCommented:
can you try this?
SELECT * 
  FROM products p 
 WHERE product_group_id = 2 
       AND EXISTS (  SELECT 1 
                       FROM spec_product sp 
                      WHERE sp.spec_id IN (896,897) 
                            AND p.product_id = sp.product_id 
                   GROUP BY sp.product_id 
                     HAVING COUNT(* ) = 2)

Open in new window

0
 
SharathData EngineerCommented:
You need not to JOIN spec_product multiple times. Can you post some sample data from your tables with expected result?
0
 
virmaiorCommented:
you are specifying the join backwards and you are not indicating an index on both fields

what you wrote:

Inner Join spec_product as spec_product1 ON product.product_id = spec_product1.product_id AND spec_product1.specs_id = 100
Inner Join spec_product as spec_product2 ON product.product_id = spec_product2.product_id AND spec_product2.specs_id = 10

is not equivalent to

Inner Join spec_product as spec_product1 ON spec_product1.product_id = product.product_id AND spec_product1.specs_id = 100
Inner Join spec_product as spec_product2 ON spec_product2.product_id = product.product_id AND spec_product2.specs_id = 10

The former means scan the product table until you find a row that agrees in value with the spec table.  The latter means scan the spec_product table until you find a row that agrees with the product_table

also to do this quickly you need to do it in the order of your indexes (and index the two fields together).
0
Network Scalability - Handle Complex Environments

Monitor your entire network from a single platform. Free 30 Day Trial Now!

 
NebukadAuthor Commented:
@Sharath_123: Thanks for your response below an example with data.

products
product_id # product_name # product_group_id
1 # Aspire AS5732ZG-454G32MN # 1
2 # Aspire AS5736Z-453G32Mnkk # 1
3 # HP Pavilion dm1-3100 # 2
4 # HP Pavilion dv3-4300  # 2

product_group
product_group_id # product_group_name
1 # Acer
2 # HP

specs
spec_id # spec_name
1 # 11.6 inch
2 # 13,3 inch
3 # 15,6 inch
4 # 4 gb ram

spec_product
product_id # spec_id
1 # 3
1 # 4
2 # 3
3 # 1
4 # 1
4 # 4

Text description
Find all products with the product_group_id is 2 (HP) and the specs_id is 1 (11.6 inch) and specs_id is 4 (4 gb ram)

Query
SELECT product_id, product_name from products
Inner Join spec_product as spec_product1 ON product.product_id = spec_product1.product_id AND spec_product1.specs_id = 1
Inner Join spec_product as spec_product2 ON product.product_id = spec_product2.product_id AND spec_product2.specs_id = 4
Where  product_group_id = 2

Result
4 # HP Pavilion dv3-4300
0
 
NebukadAuthor Commented:
@virmaior: Thanks for your response. Not sure if this was a question but there is an index on the fields
product_id and spec_id (they are indexed seperately and together so in total 3 indexes).
I am not sure what you mean by "also to do this quickly you need to do it in the order of your indexes", could you explain this a little more.

Also what is faster:

Inner Join spec_product as spec_product1 ON product.product_id = spec_product1.product_id AND spec_product1.specs_id = 100
Inner Join spec_product as spec_product2 ON product.product_id = spec_product2.product_id AND spec_product2.specs_id = 10

OR

Inner Join spec_product as spec_product1 ON spec_product1.product_id = product.product_id AND spec_product1.specs_id = 100
Inner Join spec_product as spec_product2 ON spec_product2.product_id = product.product_id AND spec_product2.specs_id = 10

The spec_product table contains about 35 times more records that the product table.
0
 
virmaiorCommented:
Two things.
First, there are four possible indexes with two fields:

field1
field2
field1 , field2
field2 , field1

The bottom two differ because an index on two fields is not bi-directional.  it first indexes the one and then the other within that.  Let's say field1 has letters: a, b, c, d, e and field2 numbers 1,2,3,4,5.  then f1,f2 index would contain values like [a {1,3,4,6}, b {2,5,9,11}, c {12, 20, 40} ... ] and the f2,f1 index would be like [1 {a,d} , 2 {b, q } ...]

Second, this means that the fastest query is the one that can use the best index.  But the second consideration is which direction the scanning happens.  The joined table should be scanned rather than vice versa.

0
 
virmaiorCommented:
what does it say when you run:

EXPLAIN SELECT product_id from products
Inner Join spec_product as spec_product1 ON product.product_id = spec_product1.product_id AND spec_product1.specs_id = 100
Inner Join spec_product as spec_product2 ON product.product_id = spec_product2.product_id AND spec_product2.specs_id = 101
Where  product_group_id = 100
0
 
NebukadAuthor Commented:
ACTUAL QUERY
EXPLAIN SELECT p_bach.bach_id_refid
FROM p_bach 
Inner Join p_specslink_bach as join1 ON p_bach.bach_id_refid = join1.bach_id_refid AND join1.specs_id_refid = 897
Inner Join p_specslink_bach as join2 ON p_bach.bach_id_refid = join2.bach_id_refid AND join2.specs_id_refid = 896
WHERE p_bach.city_id_refid =  '440'

Open in new window


RESULT
1	SIMPLE	join1	ref	bach_id_refid,bach_id_refid_2,specs_id_refid	specs_id_refid	5	const	908	Using where
1	SIMPLE	p_bach	ref	city_id_refid,bach_id_refid	bach_id_refid	5	bach.join1.bach_id_refid	1	Using where
1	SIMPLE	join2	ref	bach_id_refid,bach_id_refid_2,specs_id_refid	bach_id_refid	10	bach.join1.bach_id_refid,const	284	Using where; Using index

Open in new window


DLL
CREATE TABLE `p_bach` (
  `city_id_refid` int(11) DEFAULT NULL,
  `bach_id_refid` int(11) DEFAULT NULL,
  KEY `city_id_refid` (`city_id_refid`),
  KEY `bach_id_refid` (`bach_id_refid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8

Open in new window


CREATE TABLE `p_specslink_bach` (
  `bach_id_refid` int(11) DEFAULT NULL,
  `specs_id_refid` int(11) DEFAULT NULL,
  UNIQUE KEY `bach_id_refid` (`bach_id_refid`,`specs_id_refid``) USING BTREE,
  KEY `bach_id_refid_2` (`bach_id_refid`),
  KEY `specs_id_refid` (`specs_id_refid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8

Open in new window

0
 
NebukadAuthor Commented:
Sorry for the late response Sharath_123 (got the flue) anyway i tested your query and it seems a lot faster. So is this the best way to filter the records?

EXPLAIN  SELECT SQL_NO_CACHE p_bach.bach_id_refid
FROM p_bach WHERE p_bach.city_id_refid = 440 
AND EXISTS (SELECT SQL_NO_CACHE 1 FROM p_specslink_bach WHERE specs_id_refid IN (896,897) AND p_bach.bach_id_refid = p_specslink_bach.bach_id_refid
GROUP BY bach_id_refid 
HAVING COUNT(* ) = 2)

Open in new window


1	PRIMARY	p_bach	ref	city_id_refid	city_id_refid	5	const	28	Using where
2	DEPENDENT SUBQUERY	p_specslink_bach	ref	bach_id_refid,bach_id_refid_2,specs_id_refid	bach_id_refid_2	5	bach.p_bach.bach_id_refid	29	Using where

Open in new window

0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.