?
Solved

MySQL: Subtract one SELECT with multiple WHEREs from another SELECT with multiple WHEREs

Posted on 2011-09-04
11
Medium Priority
?
393 Views
Last Modified: 2012-05-12
A table of Contacts (names & addresses) is linked to a table of Groups in a many to many relationship, which relationship is indexed in the table groups_has_contacts.

I need a query that will provide the selection of all the contacts in more than one of the groups without duplicating any of the contacts. The query also needs to allow for the subtraction of contacts in any number of the other groups a contact might be in.

The following gets close, but any contact which is in both of the selected groups (in the WHERE clause) will be duplicated in the results.

SELECT `contacts`.`idContacts`
     , `contacts`.`First_Name`, `contacts`.`Last_Name`
     , `Contacts_idContacts` , `contacts`.`Email1` , `Groups_idGroup`, `Group`
FROM `libertytools`.`contacts`
JOIN `libertytools`.`groups_has_contacts`
   ON Contacts_idContacts=idContacts
JOIN `libertytools`.`groups` ON Groups_idGroup=idGroup
WHERE (Groups_idGroup = 20 OR Groups_idGroup = 21)
AND NOT EXISTS (
   SELECT 1
   FROM `libertytools`.`groups_has_contacts` gc2
   WHERE (gc2.Groups_idGroup = 350 OR gc2.Groups_idGroup = 250)
   AND gc2.Contacts_idContacts=idContacts
)
AND (`contacts`.`Email1`
IS NOT NULL
AND `contacts`.`Email1` <> '')
order by `contacts`.`Last_Name` ASC;

SELECT `contacts`.`idContacts` 
     , `contacts`.`First_Name`, `contacts`.`Last_Name`
     , `Contacts_idContacts` , `contacts`.`Email1` , `Groups_idGroup`, `Group`
FROM `libertytools`.`contacts` 
JOIN `libertytools`.`groups_has_contacts` 
   ON Contacts_idContacts=idContacts 
JOIN `libertytools`.`groups` ON Groups_idGroup=idGroup
WHERE (Groups_idGroup = 20 OR Groups_idGroup = 21)
AND NOT EXISTS (
   SELECT 1
   FROM `libertytools`.`groups_has_contacts` gc2
   WHERE (gc2.Groups_idGroup = 350 OR gc2.Groups_idGroup = 250)
   AND gc2.Contacts_idContacts=idContacts
)
AND (`contacts`.`Email1` 
IS NOT NULL 
AND `contacts`.`Email1` <> '') 
order by `contacts`.`Last_Name` ASC;

Open in new window


Note that the last clause that checks for an email needs to be appendable to the end of the query -- we toggle it on and off.

Thanks!
0
Comment
Question by:Ron1959
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 6
  • 5
11 Comments
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 36481938
Hi. Do you need to maintain the data from the `groups` and `groups_has_contacts` tables in the final selection?
0
 

Author Comment

by:Ron1959
ID: 36481949
No, we only need the data from the Contacts table.  We may want the name of the groups, but we can grab that in another query, if need be.

The main select could be:

SELECT `contacts`.`idContacts`,  `contacts`.`First_Name`, `contacts`.`Last_Name`,
    `contacts`.`Email1`
0
 
LVL 60

Accepted Solution

by:
Kevin Cross earned 800 total points
ID: 36481961
Okay, so here is my first thought. It is based on only needing contact information. What you can do is convert the inclusion in a similar fashion as the exclusion, i.e., use EXISTS.

SELECT `contacts`.`idContacts`, `contacts`.`First_Name`
     , `contacts`.`Last_Name`, `contacts`.`Email1`
FROM `libertytools`.`contacts` 
WHERE EXISTS ( -- include contacts from specific groups
   SELECT 1
   FROM `libertytools`.`groups_has_contacts` gc2
   WHERE gc2.Groups_idGroup IN ( 20, 21 )
   AND gc2.Contacts_idContacts=idContacts
)
AND NOT EXISTS ( -- exclude those in other groups
   SELECT 1
   FROM `libertytools`.`groups_has_contacts` gc2
   WHERE gc2.Groups_idGroup IN ( 350, 250 )
   AND gc2.Contacts_idContacts=idContacts
)
-- direct comparision of Email1 should eliminate NULLs
AND (`contacts`.`Email1` <> '') 
ORDER BY `contacts`.`Last_Name` ASC;

Open in new window


Now, if you need to include the group information, you can take your original query and add a GROUP BY on all the `contacts` columns and then use GROUP_CONCAT() to show a comma-separated list of the group names to which the contact belongs. The latter option, might look something like this:

SELECT `contacts`.`idContacts` 
     , `contacts`.`First_Name`, `contacts`.`Last_Name`
     , `contacts`.`Email1`
     , GROUP_CONCAT(`Group` ORDER BY `idGroup`) `Groups`
FROM `libertytools`.`contacts` 
JOIN `libertytools`.`groups_has_contacts` 
   ON Contacts_idContacts=idContacts 
JOIN `libertytools`.`groups` ON Groups_idGroup=idGroup
WHERE Groups_idGroup IN ( 20, 21 )
AND NOT EXISTS (
   SELECT 1
   FROM `libertytools`.`groups_has_contacts` gc2
   WHERE gc2.Groups_idGroup IN ( 350, 250 )
   AND gc2.Contacts_idContacts=idContacts
)
AND `contacts`.`Email1` <> ''
ORDER BY `contacts`.`Last_Name` ASC;

Open in new window

0
Ransomware: The New Cyber Threat & How to Stop It

This infographic explains ransomware, type of malware that blocks access to your files or your systems and holds them hostage until a ransom is paid. It also examines the different types of ransomware and explains what you can do to thwart this sinister online threat.  

 

Author Closing Comment

by:Ron1959
ID: 36481993
Kevin, you make this look too easy!

I can see that I've just been scraping the surface of MySQL capability.  I look forward to learning more.

Thanks, again!
0
 

Author Comment

by:Ron1959
ID: 36482006
The WHERE EXISTS syntax seems more straight forward than joins.

Other than not being able to draw field names from the other tables, are there any disadvantages to using WHERE EXISTS rather than JOINS?
0
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 36482007
You are most welcome. With an attitude like yours towards "learning," I am certain it will become "easy" for you also. Good luck in your adventures. If you ever need guidance, you know where to find us.

Best regards and happy coding,
Kevin
0
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 36482022
You named it. But as you have seen here, the disadvantage may be an advantage when there is a one-to-many relationship but you need unique rows. You have saved on overhead of a GROUP BY and as such even if EXISTS added a small amount of expense to the query, it should be less than having to group the final result. The EXISTS is a short-circuiting method, so as soon as it finds a row it is satisfied in either case. On the contrary, a NOT IN clause would have to check every value to ensure that there was definitely not a match later done the line. In short, the INNER JOIN and the EXISTS probably are treated the same by the optimizer.

To see for yourself, try doing an EXPLAIN or DESCRIBE on both queries and examine the result in comparison to each other.
0
 

Author Comment

by:Ron1959
ID: 36482046
Hummm....  Upon further reflection, I do see a need to pull data unique to each contact from the groups_has_contacts table.

I had forgotten that we started to track the dates contacts were added to the groups.  We have a groups_has_contacts.Create_Date field.

Does that put me back to JOINS or is there still a way to use the EXISTS calls?
0
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 36482073
Back to the joins. And sorry for my type-o. I said GROUP BY, but didn't include it.

SELECT `contacts`.`idContacts` 
     , `contacts`.`First_Name`, `contacts`.`Last_Name`
     , `contacts`.`Email1`
     , GROUP_CONCAT(`Group` ORDER BY `idGroup`) `Groups`
FROM `libertytools`.`contacts` 
JOIN `libertytools`.`groups_has_contacts` 
   ON Contacts_idContacts=idContacts 
JOIN `libertytools`.`groups` ON Groups_idGroup=idGroup
WHERE Groups_idGroup IN ( 20, 21 )
AND NOT EXISTS (
   SELECT 1
   FROM `libertytools`.`groups_has_contacts` gc2
   WHERE gc2.Groups_idGroup IN ( 350, 250 )
   AND gc2.Contacts_idContacts=idContacts
)
AND `contacts`.`Email1` <> ''
GROUP BY `contacts`.`idContacts` 
     , `contacts`.`First_Name`, `contacts`.`Last_Name`
     , `contacts`.`Email1`
ORDER BY `contacts`.`Last_Name` ASC;

Open in new window


You can add a MAX() on date or use GROUP_CONCAT() on that also.
0
 

Author Comment

by:Ron1959
ID: 36482166
I'm impressed that you caught the need to accommodate more than one group name!

This works great, but I don't understand the GROUP BY.  (Without it, the query only returns the first record in the Contacts table.)  

- Ron
0
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 36482179
Yes, that is a MySQL thing. It was definitely a type-o to not include the GROUP BY earlier. In other platforms like MS SQL, if you leave off the GROUP BY on columns not in an aggregate, then you will get an error saying column is not in aggregated or included in grouping. In MySQL, you can leave off columns in the SELECT no in an aggregate and MySQL will just grab a row. Think FIRST or LAST function of MS Access.

i.e., GROUP BY `contacts`.`idContacts`

Likely is sufficient for MySQL, but I prefer to stay with the syntax more clear to what I am doing. And it allows me to use copy and paste more readily to other platforms.
0

Featured Post

PowerShell Core for Advanced Linux Administrators

Understand advanced principals around Powershell Core with a focus on the Linux Administrator.  This course covers how to administer numerous environments across multiple platforms including Linux, Azure, AWS, and Google Cloud from a single shell instance.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

It is possible to export the data of a SQL Table in SSMS and generate INSERT statements. It's neatly tucked away in the generate scripts option of a database.
In this blog, we’ll look at how improvements to Percona XtraDB Cluster improved IST performance.
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…
In this video, Percona Solution Engineer Rick Golba discuss how (and why) you implement high availability in a database environment. To discuss how Percona Consulting can help with your design and architecture needs for your database and infrastr…
Suggested Courses

752 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question