[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 387
  • Last Modified:

MySQL Join Problem - WHERE NOT IN

Using MySql 5.1.30.

I have an order_history table and a order_history_products table.  The products table has many rows for each order in the orders table.

I need to know what orders do not contain certain products, for example:

SELECT * FROM `order_history` WHERE order_id NOT IN (SELECT  order_history.order_id
      FROM order_history INNER JOIN order_history_products ON order_history.order_id = order_history_products.order_id
      WHERE order_history_products.product_name = 'Pencils'
      GROUP BY order_history.order_id)

In theory this query would work, but the query takes 10+ minutes to execute.

If I execute the query separately it takes only seconds to finish (the inner join).

Example of tables:

---order_history---
row 1
{
    order_id: 5000
    order_fullname: John Doe
}

---order_history_products---
row 1
{
    order_id:5000
    product_name: Pencils
}
row 2
{
    order_id:5000
    product_name: Paper
}
row 3
{
    order_id:5000
    product_name: Paperclips
}


-----------------------

Also, in the orders_history table, there are multiple rows for the same customer, so I'll be grouping the end result by order_fullname.  Just FYI.

I'm open to any suggestions, even reconstructing the structure if necessary.

Thank you.


0
divineanarchy
Asked:
divineanarchy
  • 14
  • 7
  • 2
1 Solution
 
Pratima PharandeCommented:
use Distinct intade of group by

SELECT * FROM `order_history` WHERE order_id NOT IN (SELECT Distinct order_history.order_id
      FROM order_history INNER JOIN order_history_products ON
      order_history.order_id = order_history_products.order_id
      WHERE order_history_products.product_name = 'Pencils'
     )
0
 
Pratima PharandeCommented:
Or try this


SELECT * FROM order_history
WHERE order_id NOT IN ( select Distinct order_history_products.order_id From order_history_products
Where   order_history.order_id = order_history_products.order_id
      WHERE order_history_products.product_name = 'Pencils')
0
 
divineanarchyAuthor Commented:
No difference, still waiting for query to finish (3+ minutes)
0
Microsoft Certification Exam 74-409

VeeamĀ® is happy to provide the Microsoft community with a study guide prepared by MVP and MCT, Orin Thomas. This guide will take you through each of the exam objectives, helping you to prepare for and pass the examination.

 
divineanarchyAuthor Commented:
INNER JOINS seem to be the only timely options- I tried temp tables as well to isolate the order_id, but doesn't speed up the process of NOT IN
0
 
mahomeCommented:
As far as I know, mysql has a problem with using indices in "IN" and "NOT IN". Can you as always in performance problems please post the query plan (EXPLAIN ...)
0
 
divineanarchyAuthor Commented:
Can I use some sort of LEFT JOIN looking for NULL values?
0
 
divineanarchyAuthor Commented:
Query plan:

Get all orders that do not contain specific products.  An order's product(s) is defined in another table with one or many rows.  Each product row has an order_id.


0
 
mahomeCommented:
No I meant the result of:

EXPLAIN SELECT * FROM `order_history` WHERE order_id NOT IN (SELECT Distinct order_history.order_id
      FROM order_history INNER JOIN order_history_products ON
      order_history.order_id = order_history_products.order_id
      WHERE order_history_products.product_name = 'Pencils'
     )              

That's called query or execution plan: http://dev.mysql.com/doc/refman/5.1/de/explain.html

With LEFT JOIN it should work, but I just can't figure out the exact query.
0
 
divineanarchyAuthor Commented:
id      select_type      table      type      possible_keys      key      key_len      ref      rows      Extra
1      PRIMARY      order_history      ALL      NULL      NULL      NULL      NULL      52732      Using where
2      DEPENDENT SUBQUERY      order_history      eq_ref      PRIMARY      PRIMARY      3      func      1      Using index; Using temporary
2      DEPENDENT SUBQUERY      order_history_products      ALL      NULL      NULL      NULL      NULL      154040      Using where; Distinct; Using join buffer
0
 
divineanarchyAuthor Commented:
This EXPLAIN is my original query I posted:

id      select_type      table      type      possible_keys      key      key_len      ref      rows      Extra
1      PRIMARY      order_history      ALL      NULL      NULL      NULL      NULL      52527      Using where
2      DEPENDENT SUBQUERY      order_history_products      ALL      NULL      NULL      NULL      NULL      154621      Using where; Using temporary; Using filesort
2      DEPENDENT SUBQUERY      order_history      eq_ref      PRIMARY      PRIMARY      3      headblade_dbo.order_history_products.ORDER_ID      1      Using where; Using index
0
 
mahomeCommented:
Looking at the plans, the first one is better because it has no filesort. The distinct is unnecessary, just remove it.

Difficult, I only can guess because I have no test data.

Try the following and again post the plan:

SELECT order_history.* FROM `order_history` left join  (SELECT order_history.order_id as order_id
     FROM order_history INNER JOIN order_history_products ON
     order_history.order_id = order_history_products.order_id
     WHERE order_history_products.product_name = 'Pencils'
 ) as subquery on (order_history.order_id = subquery.order_id)
where subquery.order_id is null

Open in new window

0
 
divineanarchyAuthor Commented:
It looks like that will work.  The first test did.  I was hoping it would be a solution like this.

Thank you
0
 
divineanarchyAuthor Commented:
OK that was good.  Thank you.

The orders table has a CustomerName field that I need to group by.  

Can I put a group by CustomerName in the subquery statement to get Customers who have never purchased a product?

The orders have multiple rows for each customer, so I want to see if they have ever purchased a product.

Thank you for your help so far.
0
 
mahomeCommented:
I only can answer according to the history tables I see. If it has nothing to do with it just post a new  question and describe the relevant tables as in this question.

order with no products

SELECT order_history.*
FROM order_history LEFT JOIN order_history_products ON order_history.order_id = order_history_products.order_id
where order_history_products.order_id is null

Open in new window

0
 
divineanarchyAuthor Commented:
I did mention the grouping by customer at the end of my post.

I have no problem posting as a new question, but I'd like your input on it.

The order_history table has a Customer_Name field.  But order_history has many rows to one Customer_Name.  I want the same functionality you built, for Grouped customers.  Does that make sense?

 
0
 
divineanarchyAuthor Commented:
I now know what orders do not contain products.  But I need to know unique which customers across those resulting orders.  This is difficult to explain.

I have rows of orders that do not contain a product. Is there a way to step before that and find Customers who never had orders with product in it?

0
 
mahomeCommented:
OK I understand. I think of it.
0
 
mahomeCommented:
Group by is good.

1. Gives you customers which never bought any product.
2. Gives you customers which never bought a specific product.

SELECT order_fullname 
FROM order_history LEFT JOIN order_history_products ON order_history.order_id = order_history_products.order_id
group by order_fullname
having count(order_history_products.order_id) = 0
 
 
SELECT order_fullname FROM order_history left join  (SELECT order_history.order_id as order_id
     FROM order_history INNER JOIN order_history_products ON
     order_history.order_id = order_history_products.order_id
     WHERE order_history_products.product_name = 'Pencil'
 ) as subquery on (order_history.order_id = subquery.order_id)
group by order_fullname
having count(subquery.order_id) = 0

Open in new window

0
 
divineanarchyAuthor Commented:
Awesome!  Will this work with another group by?
group by order_fullname, order_email   ?

This is great, thank you!
0
 
mahomeCommented:
Yes that should work.
0
 
divineanarchyAuthor Commented:
Can I say != 'Paper' in this query for Customers who have orders Paper, but have not ordered Pencils?

Thank you!

0
 
divineanarchyAuthor Commented:
Brilliant
0
 
divineanarchyAuthor Commented:
mahome, Thank you very much for your assistance.

Would it be possible to modify this query to find customers who have bought something but not the other?  
0

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

  • 14
  • 7
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now