Solved

Help optimizing this crazy query

Posted on 2013-01-31
41
206 Views
Last Modified: 2013-03-29
We have 4 table relationship, 2 main tables - "assets" and "products", a through table - "assets_products" and a products/price relationship.   Here's the gist of the tables with only the relevant fields:

assets:
+----------------------------------+--------------+------+-----+---------------------+-----------------------------+
| Field                            | Type         | Null | Key | Default             | Extra                       |
+----------------------------------+--------------+------+-----+---------------------+-----------------------------+
| id                               | int(11)      | NO   | PRI | NULL                | auto_increment              |
| in_store                         | tinyint(4)   | YES  |     | 0                   |                             |
| isbn                             | varchar(32)  | YES  | MUL | NULL                |                             |
+----------------------------------+--------------+------+-----+---------------------+-----------------------------+

Open in new window

products:
+------------------------+---------------+------+-----+---------------------+-----------------------------+
| Field                  | Type          | Null | Key | Default             | Extra                       |
+------------------------+---------------+------+-----+---------------------+-----------------------------+
| id                     | int(11)       | NO   | PRI | NULL                | auto_increment              |
| sku                    | varchar(40)   | YES  | UNI | NULL                |                             |
| type                   | varchar(30)   | YES  |     | NULL                |                             |
| asset_id               | int(11)       | YES  |     | NULL                |                             |
| in_store               | tinyint(4)    | NO   | MUL | 0                   |                             |
| parent_id              | int(11)       | YES  | MUL | NULL                |                             |
+------------------------+---------------+------+-----+---------------------+-----------------------------+

Open in new window


assets_products:
+------------+----------+------+-----+---------+----------------+
| Field      | Type     | Null | Key | Default | Extra          |
+------------+----------+------+-----+---------+----------------+
| asset_id   | int(11)  | NO   | MUL | 0       |                |
| product_id | int(11)  | NO   | MUL | 0       |                |
+------------+----------+------+-----+---------+----------------+

Open in new window


prices:
+----------------------+--------------+------+-----+---------+----------------+
| Field                | Type         | Null | Key | Default | Extra          |
+----------------------+--------------+------+-----+---------+----------------+
| price                | decimal(8,2) | NO   |     | 0.00    |                |
| currency             | varchar(3)   | NO   | MUL | USD     |                |
| effective_begin_date | datetime     | YES  |     | NULL    |                |
| effective_end_date   | datetime     | YES  |     | NULL    |                |
| priceable_id         | int(11)      | YES  | MUL | NULL    |                |
| priceable_type       | varchar(255) | YES  | MUL | NULL    |                |
+----------------------+--------------+------+-----+---------+----------------+

Open in new window




The first relationship is called a Single relationship and here's a summary of the relationships:
1. ONE TO ONE MATCH of assets/products through assets_products
2. The product type is set to 'Single' and the assets.isbn MATCHES the products.sku

The second relationship is a called a Rental/Demo relationship:
1. The product type is either 'Rental' or 'Demo'
2. the products.parent_id is set to the id of a 'parent' product which is a Single relationship (ONE TO ONE MATCH of assets/products through assets_products)

The 3rd relationship is called a Package
1. A package is a collection of all the above (Singles, Singles + Rentals or Demos) and they can be seen as "children" of the Package
2. A package can access it's "children" by all the asset_ids that match the Package's product_id in the assets_products table.

Each product is linked to the prices table by the prices.pricable_id = products.id and prices.pricable_type=products.type (Package or Product) and we need to show products with valid prices which are donated by        prices.effective_begin_date <= 'today' AND prices.effective_end_date >= '2020-23-01'

The other main criteria that drives this query is Singles have cached_product_in_store=1 and products have and in_store=1.

Now, the dilemma.   I'm trying to write a search by asset.title and products.name search all in one query and the madness looks like this:

SELECT 
	a1.id AS aid, p1.id AS pid, p1.type, p1.in_store as pin_store, a1.cached_product_in_store, p1.parent_id, a1.title, p1.name
FROM assets a1
INNER JOIN (assets_products, products p1, prices) ON
	 p1.id = assets_products.product_id AND
	 a1.id = assets_products.asset_id AND
	 p1.id = prices.priceable_id AND
	 prices.effective_begin_date <= '2013-23-01' AND
	 prices.effective_end_date >= '2020-23-01' AND
	 prices.priceable_type IN ('Product', 'Package') AND
	 prices.currency = 'USD'
WHERE 
	(CASE
		WHEN ((p1.type='Rental' OR p1.type='Demo') AND p1.parent_id IS NOT NULL) THEN (p1.in_store=(SELECT in_store FROM products p2 WHERE p1.parent_id=p2.id AND p2.in_store=1)) AND
			(LOWER(p1.name) LIKE '%Global%' AND LOWER(p1.name) LIKE '%History%' AND LOWER(p1.name) LIKE '%&%' AND LOWER(p1.name) LIKE '%Geography%')
		WHEN (p1.type='Package') THEN (p1.in_store=1) AND
			(LOWER(p1.name) LIKE '%Global%' AND LOWER(p1.name) LIKE '%History%' AND LOWER(p1.name) LIKE '%&%' AND LOWER(p1.name) LIKE '%Geography%')
		WHEN (p1.type='Single') THEN (a1.cached_product_in_store=1) AND
			(LOWER(a1.title) LIKE '%Global%' AND LOWER(a1.title) LIKE '%History%' AND LOWER(a1.title) LIKE '%&%' AND LOWER(a1.title) LIKE '%Geography%')
	END)
GROUP BY p1.id

Open in new window


It's close but the GROUP BY p1.id is doubling the execution time of the query.

Any ideas?
0
Comment
Question by:clemcrock
  • 18
  • 16
  • +1
41 Comments
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
I would start by being 100% consistent with use of ansi joins - I guess this is the equivalent

SELECT 
	a1.id AS aid, p1.id AS pid, p1.type, p1.in_store as pin_store, a1.cached_product_in_store, p1.parent_id, a1.title, p1.name
FROM assets a1
INNER JOIN assets_products ON a1.id = assets_products.asset_id
INNER JOIN products p1 ON assets_products.product_id = p1.id
INNER JOIN prices  ON p1.id = prices.priceable_id 
                  AND prices.effective_begin_date between '2013-23-01' AND '2020-23-01'
                  AND prices.priceable_type IN ('Product', 'Package') 
                  AND prices.currency = 'USD'
WHERE 
	(CASE
		WHEN ((p1.type='Rental' OR p1.type='Demo') AND p1.parent_id IS NOT NULL) THEN (p1.in_store=(SELECT in_store FROM products p2 WHERE p1.parent_id=p2.id AND p2.in_store=1)) AND
			(LOWER(p1.name) LIKE '%Global%' AND LOWER(p1.name) LIKE '%History%' AND LOWER(p1.name) LIKE '%&%' AND LOWER(p1.name) LIKE '%Geography%')
		WHEN (p1.type='Package') THEN (p1.in_store=1) AND
			(LOWER(p1.name) LIKE '%Global%' AND LOWER(p1.name) LIKE '%History%' AND LOWER(p1.name) LIKE '%&%' AND LOWER(p1.name) LIKE '%Geography%')
		WHEN (p1.type='Single') THEN (a1.cached_product_in_store=1) AND
			(LOWER(a1.title) LIKE '%Global%' AND LOWER(a1.title) LIKE '%History%' AND LOWER(a1.title) LIKE '%&%' AND LOWER(a1.title) LIKE '%Geography%')
	END)
GROUP BY assets_products.product_id

Open in new window

Note I have substituted 'assets_products.product_id' in the group by
(assets_products.product_id = p1.id) this might help performance

You may need an explain plan to delve deeper into the performance issue

also I have used '[date] between [value] and [value]' just to highlight it as the equivalent
0
 
LVL 50

Expert Comment

by:Lowfatspread
Comment Utility
since you don't have any group functions in the query can you explain why you feel you need to group by clause anyway?
0
 

Author Comment

by:clemcrock
Comment Utility
Lowfatspread - Thanks for your reply.   If I don't have the group by clause, I get tons of duplicate records.
0
 
LVL 34

Expert Comment

by:James0628
Comment Utility
Maybe just use DISTINCT instead?  Assuming that they really are duplicate rows.

 James
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
I'm not a great fan of using "select distinct" as it can operate slowly, however as James correctly points out you aren't aggregating/grouping the data so "select distinct" may be sufficient.

I'm also very curious about the where statement in your original. I THINK you should move the case expressions into the field list and hope I have re-interpreted your query accurately.

Also note that
nonbinary string comparisons are case insensitive by default. This means that if you search with col_name LIKE 'a%', you get all column values that start with A or a.
http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html
so there is no need to lower(p1.name) etc

try this:

SELECT DISTINCT
  a1.id AS aid
, p1.id AS pid
, p1.type
, CASE
    WHEN (
            ( p1.type = 'Rental' OR p1.type = 'Demo' ) 
            AND (
                    p1.name LIKE '%Global%' AND p1.name LIKE '%History%' AND p1.name LIKE '%&%' AND p1.name LIKE '%Geography%'
                  )
         )
         THEN 1
    WHEN (
            p1.type = 'Package'
            AND (
                    p1.name LIKE '%Global%' AND p1.name LIKE '%History%' AND p1.name LIKE '%&%' AND p1.name LIKE '%Geography%'
                )
            )
         THEN 1 
    ELSE p1.in_store
   END
  AS pin_store
, CASE
    WHEN (
            p1.type = 'Single'
            AND (
                    a1.title   LIKE '%Global%' AND a1.title LIKE '%History%' AND a1.title LIKE '%&%' AND a1.title LIKE '%Geography%'
                )
         )
         THEN 1 
    ELSE a1.cached_product_in_store
   END
  AS cached_product_in_store
, p1.parent_id
, a1.title
, p1.name
FROM   assets a1 
INNER JOIN assets_products ON a1.id = assets_products.asset_id 
INNER JOIN products p1     ON assets_products.product_id = p1.id 
INNER JOIN prices
           ON p1.id = prices.priceable_id 
          AND prices.effective_begin_date BETWEEN '2013-23-01' AND '2020-23-01' 
          AND prices.priceable_type IN ( 'Product', 'Package' ) 
          AND prices.currency = 'USD' 

Open in new window


There might also be an ability to simplify some of those string comparisons, but this will totally depend on the data. e.g.

 p1.name LIKE '%Global%History%&%Geography%'

Open in new window

0
 

Author Comment

by:clemcrock
Comment Utility
First of all, I'd like to apologize for taking so long to get back to this and also, thank you so much for taking the time to reply to my question.

PortletPaul - I tried your first example and those were great suggestions and I thought they would definitely speed things up, but strangely, they added an extra 3 seconds (9 seconds compared to my 6 seconds) to the query?

I then, tried your DISTINCT version of the query and it clocked in at a whopping 28 seconds!

There's got to be a way man!
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
mmmm, I did say I'm not a fan of "select distinct" and your results prove it to me once again.

I think we need more feedback.
Can you identify which fields are indexed in your tables?

The other thing I'd like to know is does your 6 second query, and my 9 second query access the same number of rows?

select count(*)
FROM assets a1
INNER JOIN (assets_products, products p1, prices) ON
	 p1.id = assets_products.product_id AND
	 a1.id = assets_products.asset_id AND
	 p1.id = prices.priceable_id AND
	 prices.effective_begin_date <= '2013-23-01' AND
	 prices.effective_end_date >= '2020-23-01' AND
	 prices.priceable_type IN ('Product', 'Package') AND
	 prices.currency = 'USD'

Open in new window

select count(*)
FROM assets a1
INNER JOIN assets_products ON a1.id = assets_products.asset_id
INNER JOIN products p1 ON assets_products.product_id = p1.id
INNER JOIN prices  ON p1.id = prices.priceable_id 
                  AND prices.effective_begin_date between '2013-23-01' AND '2020-23-01'
                  AND prices.priceable_type IN ('Product', 'Package') 
                  AND prices.currency = 'USD'   

Open in new window

By the way, this is MySQL isn't it? If it is, you should probably be using str_to_date() on those dates. e.g.

AND prices.effective_begin_date BETWEEN STR_TO_DATE('2013-23-01','%Y-%m-%d') 
AND STR_TO_DATE('2020-23-01','%Y-%m-%d')

Open in new window

The next step would be 'explain plans' on your original query (6 seconds) and then on my first post (9 seconds).

if you have never used an explain plan try:
http://www.lornajane.net/posts/2011/explaining-mysqls-explain

note she suggests using a " \G " - I'm not familiar with this it may or may not be helpful

more: http://dev.mysql.com/doc/refman/5.0/en/execution-plan-information.html
0
 

Author Comment

by:clemcrock
Comment Utility
Thanks again for all your help.   To answer your questions:

This is Mysql and your query and my query return both return the same number of rows.

Here are my indexes for both tables

mysql> SHOW INDEXES FROM assets;
+------------+------------------------------------+--------------+-------------------+
| Non_unique | Key_name                           | Seq_in_index | Column_name       |
+------------+------------------------------------+--------------+-------------------+
|          0 | PRIMARY                            |            1 | id                |
|          0 | guid                               |            1 | guid              |
|          1 | tt                                 |            1 | title             |
|          1 | tt                                 |            2 | type              |
|          1 | ht                                 |            1 | href              |
|          1 | ht                                 |            2 | type              |
|          1 | type                               |            1 | type              |
|          1 | taht                               |            1 | title             |
|          1 | taht                               |            2 | author_name       |
|          1 | taht                               |            3 | href              |
|          1 | taht                               |            4 | type              |
|          1 | at                                 |            1 | author_name       |
|          1 | at                                 |            2 | type              |
|          1 | index_assets_on_imprint_id         |            1 | imprint_id        |
|          1 | index_assets_on_parent_id          |            1 | parent_id         |
|          1 | index_assets_on_isbn               |            1 | isbn              |
|          1 | index_assets_on_isbn_10            |            1 | isbn_10           |
|          1 | index_assets_on_isbn_13            |            1 | isbn_13           |
|          1 | index_assets_on_isbn_canonical     |            1 | isbn_canonical    |
|          1 | index_assets_on_e_isbn             |            1 | e_isbn            |
|          1 | index_assets_on_cached_company_id  |            1 | cached_company_id |
|          1 | assets_isbn_type_index             |            1 | isbn              |
|          1 | assets_isbn_type_index             |            2 | type              |
|          1 | assets_tims_lookup_isbn_type_index |            1 | tims_metadata     |
|          1 | assets_tims_lookup_isbn_type_index |            2 | isbn              |
|          1 | assets_tims_lookup_isbn_type_index |            3 | type              |
|          1 | assets_eisbn_canonical_index       |            1 | eisbn_canonical   |
+------------+------------------------------------+--------------+-------------------+

mysql> SHOW INDEXES FROM products;
+------------+---------------------------------+---------------+
| Non_unique | Key_name                        | Column_name   |
+------------+---------------------------------+---------------+
|          0 | PRIMARY                         | id            |
|          0 | sku                             | sku           |
|          0 | products_public_id_index        | public_id     |
|          1 | sku_id_type                     | sku           |
|          1 | sku_id_type                     | id            |
|          1 | sku_id_type                     | type          |
|          1 | id_lock                         | id            |
|          1 | id_lock                         | lock_version  |
|          1 | in_store                        | in_store      |
|          1 | products_parent_id_type         | parent_id     |
|          1 | products_parent_id_type         | type          |
|          1 | index_products_on_sku_canonical | sku_canonical |
|          1 | index_products_on_company_id    | company_id    |
|          1 | name                            | name          |
+------------+---------------------------------+---------------+

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
are there indexes on assets_products & prices?

thanks for confirming the record counts as the same - good to know
0
 

Author Comment

by:clemcrock
Comment Utility
Here's the explain from my query:

+----+--------------------+-----------------+--------+--------------------------------------------------------------------------------------+----------------------------------+---------+-------------------------------------------+--------+---------------------------------+
| id | select_type        | table           | type   | possible_keys                                                                        | key                              | key_len | ref                                       | rows   | Extra                           |
+----+--------------------+-----------------+--------+--------------------------------------------------------------------------------------+----------------------------------+---------+-------------------------------------------+--------+---------------------------------+
|  1 | PRIMARY            | p1              | ALL    | PRIMARY,id_lock                                                                      | NULL                             | NULL    | NULL                                      | 273431 | Using temporary; Using filesort |
|  1 | PRIMARY            | assets_products | ref    | assets_products_asset_id_index,assets_products_product_id_index                      | assets_products_product_id_index | 4       | phoenix_1_3_13.p1.id                      |      1 |                                 |
|  1 | PRIMARY            | a1              | eq_ref | PRIMARY                                                                              | PRIMARY                          | 4       | phoenix_1_3_13.assets_products.asset_id   |      1 | Using where                     |
|  1 | PRIMARY            | prices          | ref    | index_prices_on_priceable_id,index_prices_on_priceable_type,index_prices_on_currency | index_prices_on_priceable_id     | 5       | phoenix_1_3_13.assets_products.product_id |      3 | Using where                     |
|  2 | DEPENDENT SUBQUERY | p2              | eq_ref | PRIMARY,id_lock,in_store                                                             | PRIMARY                          | 4       | phoenix_1_3_13.p1.parent_id               |      1 | Using where                     |
+----+--------------------+-----------------+--------+--------------------------------------------------------------------------------------+----------------------------------+---------+-------------------------------------------+--------+---------------------------------+

Open in new window


and explain on your query:

+----+--------------------+-----------------+--------+--------------------------------------------------------------------------------------+--------------------------------+---------+-------------------------------------------+--------+----------------------------------------------+
| id | select_type        | table           | type   | possible_keys                                                                        | key                            | key_len | ref                                       | rows   | Extra                                        |
+----+--------------------+-----------------+--------+--------------------------------------------------------------------------------------+--------------------------------+---------+-------------------------------------------+--------+----------------------------------------------+
|  1 | PRIMARY            | a1              | ref    | PRIMARY,type,index_assets_on_parent_id                                               | type                           | 93      | const                                     | 123577 | Using where; Using temporary; Using filesort |
|  1 | PRIMARY            | assets_products | ref    | assets_products_asset_id_index,assets_products_product_id_index                      | assets_products_asset_id_index | 4       | phoenix_1_3_13.a1.id                      |      2 |                                              |
|  1 | PRIMARY            | p1              | eq_ref | PRIMARY,id_lock                                                                      | PRIMARY                        | 4       | phoenix_1_3_13.assets_products.product_id |      1 | Using where                                  |
|  1 | PRIMARY            | prices          | ref    | index_prices_on_priceable_id,index_prices_on_priceable_type,index_prices_on_currency | index_prices_on_priceable_id   | 5       | phoenix_1_3_13.p1.id                      |      3 | Using where                                  |
|  2 | DEPENDENT SUBQUERY | p2              | eq_ref | PRIMARY,id_lock,in_store                                                             | PRIMARY                        | 4       | phoenix_1_3_13.p1.parent_id               |      1 | Using where                                  |
+----+--------------------+-----------------+--------+--------------------------------------------------------------------------------------+--------------------------------+---------+-------------------------------------------+--------+----------------------------------------------+

Open in new window


And the only indexes from the assets_products are the asset_id and product_id foreign keys.

Thanks again for your help.
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
6 seconds: your submitted query; 9 Seconds ID: 38842521

Excellent, thanks for the 'execution plans'. Seems to me however that the second execution plan is just a little better than the first (first step of second query uses an index ['const' type] and expects to scan 123577 rows as opposed to 273431). However the first query  is 6 seconds and the second 9 seconds. I admit however I don't know why mySQL has used the index with name 'type' (possibly not the best field or index name to use but that's another story).

I think the difference between these 2 queries (aside from the linking approach) has been the 'group by' content

6 second query: group by p1.id
9 second query: group by assets_products.product_id

duh! p1.id is a unique identifier but assets_products.product_id is not
(so the 9 second query is doing more summarization then the 6 second)

We also know that the core linking performed by the 2 queries produces the same number of rows, so I suggest you compare the 2 queries, but this time both should use the same group by detail

try both with:

"GROUP BY a1.id"

Now, precisely which "group by" makes sense will really depend on you - but when comparing the queries both should use the same fields in the group by. You could also compare without any grouping.

Also note, caching can influence results, see: http://dev.mysql.com/doc/refman/5.1/en/query-cache-status-and-maintenance.html

I don't know if I'm going to get your query really fast but I think there is progress.
By the way what would be a successful outcome in your view?
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
Clem, is your 'crazy query' working now or does it need more?
0
 

Author Comment

by:clemcrock
Comment Utility
Hello and big apologies about the long reply - had a huge deploy week and finished last night and now, back on this task full time.

I tried to group by a1.id instead of p1.id and it went from 6 seconds to 8 seconds.      I'm stumped, right now the way we're pulling all this data is by fusing together the results of multiple queries.    

Then we search for all products that have the type 'Package' and have a name like '%title_query%' .

Then we search for all products that have the type 'Rental' or 'Demo' and have a name like '%title_query%' .    

For each of these records in  a loop, we then fire off queries to test for in_store and prices.

You would think that all this would make this search take forever, but it's still faster than trying to do it all in one query.

I don't get it?
0
 

Author Comment

by:clemcrock
Comment Utility
So a bit more detail on what we're currently doing in our search by title method.

We break this into a series of methods that break all this into about 7 different queries that concatenate their results into one big collection which is then paginated.   All this feels completely ridiculous to me.

Here's the pseudocode:
1. METHOD 1: Search asset table for all entries with a title LIKE '%search_term%' and exclude any that don't have cached_product_in_store=1
2. METHOD 2: Filter all those results more by testing if any of them have a cached_company_id in a list of company ids
3. METHOD 3: For all the assets found in the first 2 methods search for ALL assets in the assets_products through table that match all products have a type of 'Package'
4. METHOD 4: find all products that have a name LIKE '%search_term%', type='Package' and in_store=1
5. METHOD 5: find all products that have a name LIKE '%search_term%', type='Rental' or type='Demo' and in_store=1
6. METHOD 6: Sort the entire collection
7. METHOD 7: paginate the entire collection
8. Then we display the items by looping through each one and then we test if we're showing a purchase button by testing if the product has in_store=1 and making this call to see if they have a valid price:
 SELECT * FROM `prices` WHERE (`prices`.priceable_id = 36216 AND `prices`.priceable_type = 'Product' AND (currency = 'USD' and (effective_begin_date <= '2013-02-11 16:52:11' and effective_end_date >= '2013-02-11 16:52:11'))) ORDER BY created_on DESC, currency ASC, created_on DESC LIMIT 1

Open in new window


As you can see - this feels redundant and crazy as hell BUT, when trying to condense it all into one QUERY, it's STILL slower than doing it the way we're doing it now!!!
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
clemcrock, did you intend to include all the sql for the multiple step approach? There is just a one-line snippet against the prices table in the post above.
Perhaps put all relevant sql into into a text file and attach that?
0
 

Author Comment

by:clemcrock
Comment Utility
Good suggestion Here's the actual steps in code:

STEP 1:
SELECT
 *
 FROM
 assets
 WHERE
 type = 'VitalBook' and
 cached_product_in_store = 1 and
 (cached_company_id is null or cached_company_id not in (209)) AND 
 (LOWER(replace(title, "'", '')) LIKE '%biology%' OR
 LOWER(title) LIKE '%biology%' OR
 LOWER(replace(isbn, '-','')) LIKE '%biology%' OR
 LOWER(isbn_canonical) LIKE '%biology%' OR 
 LOWER(isbn_10) LIKE '%biology%' OR
 LOWER(isbn_13) LIKE '%biology%' OR
 LOWER(replace(e_isbn, '-','')) LIKE '%biology%' OR
 LOWER(author_name) LIKE '%biology%')
 AND parent_id IS NULL
 AND block_search != 1
 ORDER BY
 title ASC

Open in new window




STEP 2:
SELECT
        DISTINCT assets.id
      FROM
        assets
    
      WHERE
        assets.cached_product_in_store = 1 and assets.type = 'VitalBook' and assets.id IN (123322,168524,37506,167833,177302,47588,125665,126730,164627,118803,43288,137506,51755,51756,197964,141651,179289,41312,201314,57186,39787,37741,37742,37744,57986,58003,61095,105659,105660,105661,39621,38860,38861,38862,38863,129754,46067,46068,46069,46070,43514,46051,45303,46052,43513,45304,44534,46053,43610,43830,43943,43611,43829,46054,46055,46056,46057,46058,43286,46059,46060,43287,46061,44053,46062,43944,46063,46064,46065,65275,43612,65276,46066,57926,45305,45306,43865,45307,43609,38065,43450,37405,43157,43274,43448,81075,44274,42497,42492,43550,38944,137512,39734,80974,51792,51793,44120,43898,43899,105095,199070,37802,37803,39639,125670,44027,44160,44357,43666,43664,44072,43665,44212,44213,44073,52567,44356,44291,44259,44292,44358,44161,44293,44383,43896,43897,43243,43244,43973,61118,44119,43355,43549,44162,44163,40844,80970,197965,130648,105097,105678,43432,43823,42514,42518,85401,20807,123113,44744,131500,202651,64221,64378,44183,37888,179632,119813,81092,40300,44814,48211,85419,101811,38280,101071,147539,39942,54712,45413,38792,57261,167430,167186,110101,106797,226407,167015,112257,112258,25528,25547,226508,99288,226270,109294,131323,109686,119624,38258,69128,38983,125223,179235,39053,44890,39941,104990,90819,45416,105100) and ((assets.cached_company_id is null) OR assets.cached_company_id NOT IN (209))

Open in new window



STEP 3:
SELECT
 products.*
 FROM
 products
 INNER JOIN
 assets_products
 ON
 products.id = assets_products.product_id
 WHERE
 assets_products.asset_id
 IN
 (20807,25528,25547,37405,37506,37741,37742,37744,37802,37803,37888,38065,38258,38280,38792,38860,38861,38862,38863,38944,38983,39053,39621,39639,39734,39787,39941,39942,40300,40844,41312,42492,42497,42514,42518,43157,43243,43244,43274,43286,43287,43288,43355,43432,43448,43450,43513,43514,43549,43550,43609,43610,43611,43612,43664,43665,43666,43823,43829,43830,43865,43896,43897,43898,43899,43943,43944,43973,44027,44053,44072,44073,44119,44120,44160,44161,44162,44163,44183,44212,44213,44259,44274,44291,44292,44293,44356,44357,44358,44383,44534,44744,44814,44890,45303,45304,45305,45306,45307,45413,45416,46051,46052,46053,46054,46055,46056,46057,46058,46059,46060,46061,46062,46063,46064,46065,46066,46067,46068,46069,46070,47588,48211,51755,51756,51792,51793,52567,54712,57186,57261,57926,57986,58003,61095,61118,64221,64378,65275,65276,69128,80970,80974,81075,81092,85401,85419,90819,99288,101071,101811,104990,105095,105097,105100,105659,105660,105661,105678,106797,109294,109686,110101,112257,112258,118803,119624,119813,123113,123322,125223,125665,125670,126730,129754,130648,131323,131500,137506,137512,141651,147539,164627,167015,167186,167430,167833,168524,177302,179235,179289,179632,197964,197965,199070,201314,202651,226270,226407,226508)

Open in new window



STEP 4:
SELECT id, name, parent_id, sku, description, type, asset_id, in_store, edition, publisher, author_name, sku_canonical FROM products AS p1 WHERE p1.type='Package' AND p1.in_store = 1 AND (p1.company_id IS NULL OR p1.company_id NOT IN (209)) AND (LOWER(replace(replace(replace(replace(p1.name, '-', ' '), " ", ''), '&', ''), ':', '') ) LIKE '%biology%' OR LOWER(p1.name) LIKE '%biology%')
 OR p1.sku='%biology%'

Open in new window



STEP 5:
SELECT id, name, parent_id, sku, description, type, asset_id, in_store, edition, publisher, author_name, sku_canonical FROM products AS p1 WHERE (p1.type='Rental' OR p1.type='Demo') AND p1.in_store = (SELECT in_store FROM products AS p2 WHERE p2.in_store=1 AND p1.parent_id = p2.id) AND (p1.company_id IS NULL OR p1.company_id NOT IN (209)) AND (LOWER(replace(replace(replace(replace(p1.name, '-', ' '), " ", ''), '&', ''), ':', '') ) LIKE '%biology%' OR LOWER(p1.name) LIKE '%biology%')
 OR p1.sku='%biology%'

Open in new window



STEP 6:
  @results.sort!{|a,b|
          a_title = a.is_a?(Asset) ? a.title : a.name
          b_title = b.is_a?(Asset) ? b.title : b.name
          a_title <=> b_title
        }

Open in new window


STEP 7:
@results = paginate_collection(@results, {:per_page => MAX_ITEMS_PER_PAGE})

Open in new window

STEP 8:
 SELECT * FROM `prices` WHERE (`prices`.priceable_id = 173220 AND `prices`.priceable_type = 'Product' AND (currency = 'USD' and (effective_begin_date <= '2013-02-12 12:12:53' and effective_end_date >= '2013-02-12 12:12:53'))) ORDER BY created_on DESC, currency ASC, created_on DESC LIMIT 1

Open in new window

0
 

Author Comment

by:clemcrock
Comment Utility
One thing I'm noticing when I do an explain on my queries is that asset.title and products.name aren't listed as possible keys.   Would that be because we're using LIKES on those fields?   Also, another thing that concerns me is that many indexes are duplicates - IE: assets.type

show indexes from assets;
+--------+------------+------------------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table  | Non_unique | Key_name                           | Seq_in_index | Column_name       | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------+------------+------------------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| assets |          0 | PRIMARY                            |            1 | id                | A         |      245360 |     NULL | NULL   |      | BTREE      |         |               |
| assets |          0 | guid                               |            1 | guid              | A         |      245360 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | tt                                 |            1 | title             | A         |      245360 |       25 | NULL   | YES  | BTREE      |         |               |
| assets |          1 | tt                                 |            2 | type              | A         |      245360 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | ht                                 |            1 | href              | A         |      245360 |       50 | NULL   | YES  | BTREE      |         |               |
| assets |          1 | ht                                 |            2 | type              | A         |      245360 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | type                               |            1 | type              | A         |          11 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | taht                               |            1 | title             | A         |      245360 |       25 | NULL   | YES  | BTREE      |         |               |
| assets |          1 | taht                               |            2 | author_name       | A         |      245360 |       15 | NULL   | YES  | BTREE      |         |               |
| assets |          1 | taht                               |            3 | href              | A         |      245360 |       25 | NULL   | YES  | BTREE      |         |               |
| assets |          1 | taht                               |            4 | type              | A         |      245360 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | at                                 |            1 | author_name       | A         |      245360 |       15 | NULL   | YES  | BTREE      |         |               |
| assets |          1 | at                                 |            2 | type              | A         |      245360 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | index_assets_on_imprint_id         |            1 | imprint_id        | A         |        3186 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | index_assets_on_parent_id          |            1 | parent_id         | A         |        4810 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | index_assets_on_isbn               |            1 | isbn              | A         |      245360 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | index_assets_on_isbn_10            |            1 | isbn_10           | A         |      245360 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | index_assets_on_isbn_13            |            1 | isbn_13           | A         |      245360 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | index_assets_on_isbn_canonical     |            1 | isbn_canonical    | A         |      245360 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | index_assets_on_e_isbn             |            1 | e_isbn            | A         |       81786 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | index_assets_on_cached_company_id  |            1 | cached_company_id | A         |         796 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | assets_isbn_type_index             |            1 | isbn              | A         |      245360 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | assets_isbn_type_index             |            2 | type              | A         |      245360 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | assets_tims_lookup_isbn_type_index |            1 | tims_metadata     | A         |          26 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | assets_tims_lookup_isbn_type_index |            2 | isbn              | A         |      245360 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | assets_tims_lookup_isbn_type_index |            3 | type              | A         |      245360 |     NULL | NULL   | YES  | BTREE      |         |               |
| assets |          1 | assets_eisbn_canonical_index       |            1 | eisbn_canonical   | A         |      245360 |     NULL | NULL   | YES  | BTREE      |         |               |
+--------+------------+------------------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Open in new window


show indexes from products;
+----------+------------+---------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table    | Non_unique | Key_name                        | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+---------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| products |          0 | PRIMARY                         |            1 | id            | A         |      245936 |     NULL | NULL   |      | BTREE      |         |               |
| products |          0 | sku                             |            1 | sku           | A         |      245936 |     NULL | NULL   | YES  | BTREE      |         |               |
| products |          0 | products_public_id_index        |            1 | public_id     | A         |      245936 |     NULL | NULL   | YES  | BTREE      |         |               |
| products |          1 | sku_id_type                     |            1 | sku           | A         |      245936 |     NULL | NULL   | YES  | BTREE      |         |               |
| products |          1 | sku_id_type                     |            2 | id            | A         |      245936 |     NULL | NULL   |      | BTREE      |         |               |
| products |          1 | sku_id_type                     |            3 | type          | A         |      245936 |     NULL | NULL   | YES  | BTREE      |         |               |
| products |          1 | id_lock                         |            1 | id            | A         |      245936 |     NULL | NULL   |      | BTREE      |         |               |
| products |          1 | id_lock                         |            2 | lock_version  | A         |      245936 |     NULL | NULL   |      | BTREE      |         |               |
| products |          1 | in_store                        |            1 | in_store      | A         |           3 |     NULL | NULL   |      | BTREE      |         |               |
| products |          1 | products_parent_id_type         |            1 | parent_id     | A         |       61484 |     NULL | NULL   | YES  | BTREE      |         |               |
| products |          1 | products_parent_id_type         |            2 | type          | A         |       61484 |     NULL | NULL   | YES  | BTREE      |         |               |
| products |          1 | index_products_on_sku_canonical |            1 | sku_canonical | A         |      245936 |     NULL | NULL   | YES  | BTREE      |         |               |
| products |          1 | index_products_on_company_id    |            1 | company_id    | A         |         244 |     NULL | NULL   | YES  | BTREE      |         |               |
| products |          1 | name                            |            1 | name          | A         |      245936 |     NULL | NULL   | YES  | BTREE      |         |               |
+----------+------------+---------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
"another thing that concerns me is that many indexes are duplicates - IE: assets.type"

I would suggest you post a separate question on this, needs level of mySQL expertise I don't possess. Suspect this detail will be lost by blockbuster current question.
0
Highfive + Dolby Voice = No More Audio Complaints!

Poor audio quality is one of the top reasons people don’t use video conferencing. Get the crispest, clearest audio powered by Dolby Voice in every meeting. Highfive and Dolby Voice deliver the best video conferencing and audio experience for every meeting and every room.

 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
Her's a crack at "the crazy query" but do please read the embedded comments:

SELECT id
 , name
 , parent_id
 , sku
 , description
 , type
 , asset_id
 , in_store
 , edition
 , publisher
 , author_name
 , sku_canonical
FROM products AS p1
INNER JOIN (
            SELECT DISTINCT
            products.id
            FROM assets_products ON products.id = assets_products.product_id
            INNER JOIN (
                            SELECT assets.id  /* don't use "select *" specify the fields you need */
                            FROM (
                                            SELECT assets.id  /* don't use "select *" specify the fields you need */
                                            FROM assets
                                            WHERE type = 'VitalBook'
                                            AND cached_product_in_store = 1
                                            AND
                                               (
                                               cached_company_id IS NULL
                                            OR cached_company_id NOT IN (209) /* just use != ?? */
                                               )
                                            AND parent_id IS NULL
                                            AND block_search != 1
                                  ) AS ASSET_FILT1
                            WHERE
                                    /* MMM, IT IS PROBABLE THAT REPLACE() IS ALSO REMOVING ABILITY TO LEVERAGE INDEXES ... AVOID IF POSSIBLE */
                                    /* ALSO AM ASSUMING mySQL IS CASE INSENSITIVE */
                               (
                                   REPLACE(title, "'", '') LIKE '%biology%' /* this line covers the next line... */
                                /* OR LOWER(title) LIKE '%biology%' */
                                OR REPLACE(isbn, '-','') LIKE '%biology%'
                                OR isbn_canonical LIKE '%biology%'
                                OR isbn_10 LIKE '%biology%'
                                OR isbn_13 LIKE '%biology%'
                                OR REPLACE(e_isbn, '-','') LIKE '%biology%'
                                OR author_name LIKE '%biology%'
                               )
                        ) AS ASSET_FILT2 ON assets_products.asset_id = ASSET_FILT2.id
            ) AS PROD_FILT1 ON products.id = PROD_FILT1.id
WHERE (
        (
          (
            AND p1.type='Package'
            AND p1.in_store = 1
           )
         OR
          (
              (
                   p1.type='Rental'
                OR p1.type='Demo'
               )
            AND p1.in_store = /* is this subquery needed? won't it always return 1 or NULL ? so: "p1.in_store = 1" is the same thing right? (cannot compare null through =) */
               (SELECT in_store
               FROM products AS p2
               WHERE p2.in_store=1
               AND p1.parent_id = p2.id
               )
          )      
        )
        AND
           (
               p1.company_id IS NULL
            OR p1.company_id NOT IN (209) /* only one value for not in? could use != */
           )
        AND
           (
               LOWER(REPLACE(REPLACE(REPLACE(REPLACE(p1.name, '-', ' '), " ", ''), '&', ''), ':', '') ) LIKE '%biology%'
            OR LOWER(p1.name) LIKE '%biology%'
           )
      )
OR p1.sku='%biology%' /* will this do anything? do you mean "like" instead of = */

Open in new window

You may want to try this cut-down version first - see comments inside
/* try this cut-down logic, does it work? is it relatively efficient? */
SELECT id , name , parent_id , sku , description , type , asset_id , in_store , edition , publisher , author_name , sku_canonical
FROM products AS p1
INNER JOIN (
            SELECT DISTINCT
            products.id
            FROM assets_products ON products.id = assets_products.product_id
            INNER JOIN (
                            SELECT assets.id 
                            FROM (
                                    /* trying to keep this free of text searches to maximize indexing */
                                            SELECT assets.id
                                            FROM assets
                                            WHERE type = 'VitalBook'
                                            AND cached_product_in_store = 1
                                            AND
                                               (
                                               cached_company_id IS NULL
                                            OR cached_company_id NOT IN (209) /* just use != ?? */
                                               )
                                            AND parent_id IS NULL
                                            AND block_search != 1
                                   ) AS ASSET_FILT1
                            /* do text field seaches here (I think). 
                               Might even want to break this into 2 blocks, 
                               first WITHOUT 'replace', then
                               second involving 'replace'
                            */
                            WHERE 1=1
                        ) AS ASSET_FILT2 ON assets_products.asset_id = ASSET_FILT2.id
            /* at this point hopefully DISTINCT will be effective and not too expensive */
            ) AS PROD_FILT1 ON products.id = PROD_FILT1.id
WHERE 1=1
/* I worry about the final "OR p1.sku='%biology%'"
   perhaps try the full crazy query WITHOUT this final line
   If this final line is expensive MAYBE a union for this condition would work better?
 */

Open in new window

as it existed I attached my initial thoughts for reference
Q-28015532.observations.sql
0
 

Author Comment

by:clemcrock
Comment Utility
PortletPaul - thanks so much for your detailed reply.    I'm trying your queries and they're both crapping out on this line:  

SELECT DISTINCT
            products.id
            FROM assets_products ON products.id = assets_products.product_id

Open in new window


I was wondering if this was valid syntax for different db's.    

I get the gist of your query and I like your approach and it looks promising.    Playing around with this as we speak.

Also, to respond to your observations you pointed out in the attachment:
1. You're exactly right about the case insensitive observations.    That's the existing code as I walked into the project and that's all coming out in the new versions.
2. can those exclusions go into a table? - that's a good observation and I'm investigating this.
3. All my new queries aren't doing a SELECT *, they're all pulling ONLY the relavant fields.
4. for the "OR p1.sku='%biology%'", this is a "general" search so we're including this field into the search.

Again, thanks for your extensive assistance!

Eric
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
should just read:

SELECT DISTINCT
            products.id
            FROM assets_products

(I think) bit frazzled right now (its late)

to answer your exact latest question "if this was valid syntax " = no = mistake

Oh, and I try to structure those nested queries so you can try them one by one (start at the deepest of course) as you "move out" you will be testing the various joins

I would start with the simplified version, get the "structure" correct, then put in the selection criteria - make sense?

at that final "or" may make the entire thing inefficient...
see if you can get "crazy" to work without this last or

then, lets see what impact that last 'or' has (if much)

sorry...

try this separately, a lot depends on this - if this is slow we need another way:

SELECT DISTINCT
            products.id
            FROM assets_products

I'm NOT a fan of distinct!!! but if the indexes suit it may be ok.- here
0
 

Author Comment

by:clemcrock
Comment Utility
Just quick feed back on your stripped down version.    I updated it so it looks like this now:

SELECT p1.id, p1.name, parent_id, sku, type , asset_id , in_store , edition , publisher , author_name , sku_canonical
FROM products AS p1
INNER JOIN (
            SELECT DISTINCT
            product_id
            FROM assets_products
            INNER JOIN (SELECT ASSET_FILT1.id 
                            FROM (
                                    SELECT assets.id
                                    FROM assets
                                    WHERE type = 'VitalBook'
                                    AND cached_product_in_store = 1
                                    AND ( cached_company_id IS NULL OR cached_company_id NOT IN (209))
                                    AND parent_id IS NULL
                                    AND block_search != 1
                                   ) AS ASSET_FILT1
                            WHERE 1=1
                        ) AS ASSET_FILT2 ON assets_products.asset_id = ASSET_FILT2.id
            ) AS PROD_FILT1 ON p1.id = PROD_FILT1.product_id
WHERE 1=1

Open in new window


cold it pulled 103444 rows in set (9.67 sec)
warmed up it pulled 103444 rows in set (3.57 sec)!

To me, that's pretty impressive.    That leads me to a side question - what causes the difference between the first (cold) and second time (warmed up) - mysql caching?
0
 

Author Comment

by:clemcrock
Comment Utility
Well - this, to me, looks both promising and impressive.

I ran this version of your long query and it found 173 rows in 1.38 seconds!!!!

SELECT p1.id, p1.name, sku, type, p1.in_store
FROM products AS p1
INNER JOIN (
            SELECT DISTINCT
            products.id
            FROM assets_products 
			INNER JOIN products ON products.id = assets_products.product_id
            INNER JOIN (
                            SELECT ASSET_FILT1.id
                            FROM(
									SELECT id
                                            FROM assets
                                            WHERE type = 'VitalBook' AND
                                            cached_product_in_store = 1 AND
                                            (cached_company_id IS NULL OR cached_company_id NOT IN (209)) AND 
                                            (parent_id IS NULL AND block_search != 1) AND
											(
												REPLACE(title, "'", '') LIKE '%biology%'
			                                	OR REPLACE(isbn, '-','') LIKE '%biology%'
			                                	OR isbn_canonical LIKE '%biology%'
			                                	OR isbn_10 LIKE '%biology%'
			                                	OR isbn_13 LIKE '%biology%'
			                                	OR REPLACE(e_isbn, '-','') LIKE '%biology%'
			                                	OR author_name LIKE '%biology%'
											)
                                  ) AS ASSET_FILT1
                        ) AS ASSET_FILT2 ON assets_products.asset_id = ASSET_FILT2.id
            ) AS PROD_FILT1 ON p1.id = PROD_FILT1.id
WHERE (
        (
          (p1.type='Package' AND p1.in_store = 1) OR
          ( (p1.type='Rental'  OR p1.type='Demo') AND p1.in_store = (SELECT in_store FROM products AS p2 WHERE p2.in_store=1 AND p1.parent_id = p2.id))      
        )
		AND
        (p1.company_id IS NULL  OR p1.company_id NOT IN (209)) AND
        (LOWER(REPLACE(REPLACE(REPLACE(REPLACE(p1.name, '-', ' '), " ", ''), '&', ''), ':', '') ) LIKE '%biology%' OR LOWER(p1.name) LIKE '%biology%')
      )
OR p1.sku='%biology%'

Open in new window



You'll notice I stripped out some of the field list just so I can concentrate on the ones that really matter.  

I'm trying to figure out how to pull some of the fields from the asset association and if I can do that, I may be pretty close.   Working on that now.
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
be v.careful with this bit:

I'm trying to figure out how to pull some of the fields from the asset association and if I can do that, I may be pretty close.   Working on that now.

this could introduce "multiple rows per product" which I think we don't want.
I didn't see anything in your 6 steps that required fields from that table did I?

re: what causes the difference between the first (cold) and second time (warmed up) - mysql caching? =  almost certainly. When comparing "speed" you need to take cache into consideration - purists might clear cache to timing tests between runs (so they are comparable) - or - ignore the first run of each query. Basically that's a looong yes.

oh, and I like this, continue to pull as little as possible:
I stripped out some of the field list just so I can concentrate on the ones that really matter.  
0
 

Author Comment

by:clemcrock
Comment Utility
Ok - so it's still promising but when I break the query into individual parts, I'm realizing I'm getting far fewer records than I'm expecting.

Here's the query as it is now and the speed is amazing and it's returning 173 rows in (1.48 sec)
 
SELECT 
 	p1.id, p1.sku, PROD_FILT1.isbn, p1.type as ptype, PROD_FILT1.type, SUBSTRING(p1.name, 1, 20), SUBSTRING(PROD_FILT1.title, 1, 20), PROD_FILT1.sort_title,
 	p1.in_store, PROD_FILT1.author_name, PROD_FILT1.author_first_name, PROD_FILT1.author_last_name, PROD_FILT1.href, PROD_FILT1.description
 FROM products AS p1
 	INNER JOIN(
 		SELECT
 			 products.id, a1.type, a1.title, a1.sort_title, a1.author_name, a1.author_first_name, 
 			 a1.author_last_name, a1.isbn, a1.cached_product_in_store, a1.href, a1.description
 		FROM assets_products 
 			INNER JOIN products ON products.id = assets_products.product_id
 			INNER JOIN(
 						SELECT
 							id, type, title, sort_title, author_name, author_first_name, author_last_name, isbn, 
 							cached_product_in_store, href, description
 						FROM(
 					 	 	SELECT 
 					 		 	a2.id, a2.type, a2.title, a2.sort_title, a2.author_name, a2.author_first_name, a2.author_last_name, 
 					 		 	a2.isbn, a2.cached_product_in_store, a2.href, a2.description
 							FROM 
 					 		 	assets a2
 							WHERE 
 					 		 	(type = 'VitalBook' AND cached_product_in_store = 1) AND
 								(cached_company_id IS NULL OR cached_company_id NOT IN (209)) AND
 								(parent_id IS NULL AND block_search != 1) AND
 					 		 	(a2.title LIKE '%biology%' OR a2.isbn LIKE '%biology%')
 					 )AS ASSET_FILT1
 		) 
 		AS a1 ON assets_products.asset_id = a1.id
 	) 
 	AS PROD_FILT1 ON p1.id = PROD_FILT1.id
 WHERE (
 (
 (p1.type='Package' AND p1.in_store = 1) OR
 ( (p1.type='Rental' OR p1.type='Demo') AND p1.in_store = (SELECT in_store FROM products AS p2 WHERE p2.in_store=1 AND p1.parent_id = p2.id)) 
 )
 		AND
 (p1.company_id IS NULL OR p1.company_id NOT IN (209)) AND
 (LOWER(REPLACE(REPLACE(REPLACE(REPLACE(p1.name, '-', ' '), " ", ''), '&', ''), ':', '') ) LIKE '%biology%' OR LOWER(p1.name) LIKE '%biology%')
 )
 OR p1.sku='%biology%'

Open in new window



But, when I run the queries individually I get far more entries:

 ---------------------------------------------------------------------------------------------------------
ASSETS  938 total rows
---------------------------------------------------------------------------------------------------------
SELECT a1.id, a1.type, a1.in_store, a1.cached_product_in_store, a1.build_status, a1.title as title
FROM assets a1 WHERE (a1.title LIKE '%biology%') AND 
(a1.cached_product_in_store=1 AND a1.type='VitalBook') AND 
(a1.parent_id IS NULL AND a1.block_search != 1) AND
(a1.build_status='approved' OR (a1.rework=1 && a1.build_status NOT IN ('destroyed', 'unavailable', 'out_of_distribution')))
---------------------------------------------------------------------------------------------------------

---------------------------------------------------------------------------------------------------------
PRODUCTS - 441 total rows
---------------------------------------------------------------------------------------------------------
SELECT p1.id, p1.type, p1.in_store, p1.sku, p1.parent_id, SUBSTRING(p1.name, 1, 70) as name
FROM products p1 WHERE (p1.name LIKE '%biology%') AND p1.in_store=1 AND type='Package'    		//1 row in set (0.70 sec)

SELECT p1.id, p1.type, p1.in_store, p1.sku, p1.parent_id, SUBSTRING(p1.name, 1, 70) as name  	//435 rows in set (0.36 sec)
FROM products p1 WHERE (p1.name LIKE '%biology%') AND p1.in_store=1 AND type='Rental'

SELECT p1.id, p1.type, p1.in_store, p1.sku, p1.parent_id, SUBSTRING(p1.name, 1, 70) as name
FROM products p1 WHERE (p1.name LIKE '%biology%') AND p1.in_store=1 AND type='Demo'				//5 rows in set (0.36 sec)
---------------------------------------------------------------------------------------------------------

Open in new window



I'm currently looking into this right now....
0
 

Author Comment

by:clemcrock
Comment Utility
After double checking the outside products and inside assets part of the query I'm confirming that they are both returning the correct number of rows:

PRODUCTS (278 rows):
 SELECT 
 	p1.id, p1.sku,  SUBSTRING(p1.name, 1, 20), p1.in_store
 FROM products AS p1
 WHERE (
 (
 (p1.type='Package' AND p1.in_store = 1) OR
 ( (p1.type='Rental' OR p1.type='Demo') AND p1.in_store = (SELECT in_store FROM products AS p2 WHERE p2.in_store=1 AND p1.parent_id = p2.id)) 
 )
 AND
 (p1.company_id IS NULL OR p1.company_id NOT IN (209)) AND
 (LOWER(REPLACE(REPLACE(REPLACE(REPLACE(p1.name, '-', ' '), " ", ''), '&', ''), ':', '') ) LIKE '%biology%' OR LOWER(p1.name) LIKE '%biology%')
 )
 OR p1.sku='%biology%'

Open in new window



ASSETS (943 rows):
 SELECT 
 	a2.id, a2.type, a2.title, a2.sort_title, a2.author_name, a2.author_first_name, a2.author_last_name, 
 	a2.isbn, a2.cached_product_in_store, a2.href, a2.description AND
	(a2.build_status='approved' OR (a2.rework=1 && a2.build_status NOT IN ('destroyed', 'unavailable', 'out_of_distribution')))
FROM 
 	assets a2
WHERE 
 	(type = 'VitalBook' AND cached_product_in_store = 1) AND
	(cached_company_id IS NULL OR cached_company_id NOT IN (209)) AND
	(parent_id IS NULL AND block_search != 1) AND
 	(a2.title LIKE '%biology%' OR a2.isbn LIKE '%biology%')

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
might indicate joins aren't correct...

this might help:
select 'products        :' as tbl, count(*) as numrows from products
union all
select 'assets          :' as tbl, count(*) as numrows from assets
union all
select 'asset products  :' as tbl, count(*) as numrows from asset_products
union all
select 'distinct asset  :' as tbl, count(distinct asset_id) as numrows from asset_products
union all
select 'distinct product:' as tbl, count(distinct product_id) as numrows from asset_products

Open in new window

0
 

Author Comment

by:clemcrock
Comment Utility
Just an FIY - I simplified it a bit and this seems to pull the same data and gives easier access to both asset and product fields:

SELECT
 	p1.id, a1.id, p1.sku, p1.type, SUBSTRING(p1.name, 1, 20), SUBSTRING(a1.title, 1, 20)
 FROM products AS p1
	INNER JOIN assets_products ON p1.id = assets_products.product_id
	INNER JOIN( 
				SELECT 
				 	a2.id, a2.type, a2.title, a2.sort_title, a2.author_name, a2.author_first_name, a2.author_last_name, 
				 	a2.isbn, a2.cached_product_in_store, a2.href, a2.description
				FROM 
				 	assets a2
				WHERE 
				 	(type = 'VitalBook' AND cached_product_in_store = 1) AND
					(cached_company_id IS NULL OR cached_company_id NOT IN (209)) AND
					(parent_id IS NULL AND block_search != 1) AND
				 	(a2.title LIKE '%biology%' OR a2.isbn LIKE '%biology%') AND
					(a2.build_status='approved' OR (a2.rework=1 && a2.build_status NOT IN ('destroyed', 'unavailable', 'out_of_distribution')))
		) 
		AS a1 ON assets_products.asset_id = a1.id
 WHERE((
        (p1.type='Package' AND p1.in_store = 1) OR
 		((p1.type='Rental' OR p1.type='Demo') AND p1.in_store = (SELECT in_store FROM products AS p2 WHERE p2.in_store=1 AND p1.parent_id = p2.id))) AND
 		(p1.company_id IS NULL OR p1.company_id NOT IN (209)) AND
 		(REPLACE(REPLACE(REPLACE(REPLACE(p1.name, '-', ' '), " ", ''), '&', ''), ':', '') LIKE '%biology%' OR p1.name LIKE '%biology%'))
 OR p1.sku='%biology%'

Open in new window

0
 

Author Comment

by:clemcrock
Comment Utility
PortletPaul - an answer to your question above here's the output of your union example:

select 'products        :' as tbl, count(*) as numrows from products
union all
select 'assets          :' as tbl, count(*) as numrows from assets
union all
select 'asset products  :' as tbl, count(*) as numrows from assets_products
union all
select 'distinct asset  :' as tbl, count(distinct asset_id) as numrows from assets_products
union all
select 'distinct product:' as tbl, count(distinct product_id) as numrows from assets_products
+-------------------+---------+
| tbl               | numrows |
+-------------------+---------+
| products        : |  328503 |
| assets          : |  222651 |
| asset products  : | 1162996 |
| distinct asset  : |  222084 |
| distinct product: |  353245 |
+-------------------+---------+

Open in new window

0
 

Author Comment

by:clemcrock
Comment Utility
I took another stab that seems like it would totally work because each inner join, by themselves, returns the correct number of records but together they don't:

SELECT
 	p1.id, a1.id, p1.sku, p1.type, a1.title
 FROM assets_products
 	INNER JOIN( 
				SELECT DISTINCT
				 	p2.id, p2.sku, p2.type, SUBSTRING(p2.name, 1, 20)
				FROM 
				 	products p2
				WHERE((
				        (p2.type='Package' AND p2.in_store = 1) OR
				 		((p2.type='Rental' OR p2.type='Demo') AND p2.in_store = (SELECT in_store FROM products AS p3 WHERE p3.in_store=1 AND p2.parent_id = p3.id))) AND
				 		(p2.company_id IS NULL OR p2.company_id NOT IN (209)) AND
				 		(REPLACE(REPLACE(REPLACE(REPLACE(p2.name, '-', ' '), " ", ''), '&', ''), ':', '') LIKE '%biology%' OR p2.name LIKE '%biology%'))
				 OR p2.sku='%biology%'
		) 
		AS p1 ON assets_products.product_id = p1.id
		
	 INNER JOIN( 
				SELECT DISTINCT
				 	a2.id, a2.type, a2.title, a2.sort_title, a2.author_name, a2.author_first_name, a2.author_last_name, 
				 	a2.isbn, a2.cached_product_in_store, a2.href, a2.description
				FROM 
				 	assets a2
				WHERE 
				 	(type = 'VitalBook' AND cached_product_in_store = 1) AND
					(cached_company_id IS NULL OR cached_company_id NOT IN (209)) AND
					(parent_id IS NULL AND block_search != 1) AND
				 	(a2.title LIKE '%biology%' OR a2.isbn LIKE '%biology%') AND
					(a2.build_status='approved' OR (a2.rework=1 && a2.build_status NOT IN ('destroyed', 'unavailable', 'out_of_distribution')))
		) 
		AS a1 ON assets_products.asset_id = a1.id

Open in new window


It seems like this is so close to being a UNION  but we're trying to basically glue two queries together by INNER JOINING products ON assets_products.product_id AND INNER JOINING assets ON assets_products.asset_id
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 500 total points
Comment Utility
You may be on to something by this approach :)
If my understanding is partially right, you are trying to list "things" that you have available - right? If so, I think that's from the products side, in which case this should be the inner join, and assets a left join (I think). I don't believe you need 'distinct' - or I hope you don't. Like this:
FROM assets_products
 	  INNER JOIN( 
				SELECT /* DISTINCT */
                    ...
				FROM 
				 	products p2
				WHERE ...
		) 
		AS p1 ON assets_products.product_id = p1.id
		
	 LEFT JOIN( 
				SELECT /* DISTINCT */
                    ...
				FROM 
				 	assets a2
				WHERE ...
		) 
		AS a1 ON assets_products.asset_id = a1.id

Open in new window

Those row counts help. Looks like asset_products has some redundant records (e.g. more distinct assets than in the asset table) - at some point you may wantto consider locating and removing useless records in assets_products. Another time/question :)
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
HI, just wanted to know how this crazy query was progressing. Is it done with?
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
clemcrock, did you settle on a "crazy query"? is this question now complete?
0
 

Author Comment

by:clemcrock
Comment Utility
Thanks for everyone's help and sorry for the long replies
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
my best guess is that ID: http:#a38891347 contributed most to the final outcome.
0
 

Author Comment

by:clemcrock
Comment Utility
Do you guys need anything else from me to close this issue?
0

Featured Post

6 Surprising Benefits of Threat Intelligence

All sorts of threat intelligence is available on the web. Intelligence you can learn from, and use to anticipate and prepare for future attacks.

Join & Write a Comment

Foreword This is an old article.  Instead of using the MySQL extension that was used in the original code examples, please choose one of the currently supported database extensions instead.  More information is available here: MySQLi / PDO (http://…
Composite queries are used to retrieve the results from joining multiple queries after applying any filters. UNION, INTERSECT, MINUS, and UNION ALL are some of the operators used to get certain desired results.​
In this seventh video of the Xpdf series, we discuss and demonstrate the PDFfonts utility, which lists all the fonts used in a PDF file. It does this via a command line interface, making it suitable for use in programs, scripts, batch files — any pl…
This video shows how to remove a single email address from the Outlook 2010 Auto Suggestion memory. NOTE: For Outlook 2016 and 2013 perform the exact same steps. Open a new email: Click the New email button in Outlook. Start typing the address: …

728 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

13 Experts available now in Live!

Get 1:1 Help Now