Solved

MYSQL 1 Row Less :(

Posted on 2014-03-08
8
353 Views
Last Modified: 2014-04-02
Hi All,

Im pretty new to MYSQL and between you guys and You Tube I seem to be getting there.  basically I have a marketing system that has "Email Subscribers" and "Excluded Emails".  If I have an email in my subscriber table which also exists in my Excluded table then when I try to send out emails to my group any subscriber email that has a duplicate in the exclusion table will not go out.  Which sounds perfect unless sometimes people make mistakes when adding contacts and you end up adding emails to the exclusion table when you should be adding to the subscriber table.

Any hooo I have written some SQL to remove these duplicates which looks good apart from one key fact that it looks like its 1 row short.   This is what I did:


select count(*) from ACDEV1.em_exclusion
Inner join ACDEV1.em_subscriber
ON em_exclusion.email = em_subscriber.email

** I ran the above which gave me 71,477 duplicates

create table ACDEV1.temp_ids (id int(10))

** I ran the above to give me a temp table

insert into ACDEV1.temp_ids select em_exclusion.id from ACDEV1.em_exclusion  Inner join ACDEV1.em_subscriber  ON em_exclusion.email = em_subscriber.email

*** Did the above to copy all the IDS of the excluded emails that duplicate with the subscriber table (Do I can delete against later) Which again was 71,477 inserts

delete ACDEV1.em_exclusion_list from ACDEV1.em_exclusion_list inner join ACDEV1.temp_ids on em_exclusion_list.id=temp_ids.id

** Then did the above to delete all items from the exclusion.list table (This is just a data table that holds settings per exluded user).  This only returned 71,476 (1 short)

delete ACDEV1.em_exclusion from ACDEV1.em_exclusion inner join ACDEV1.temp_ids on em_exclusion.id=temp_ids.id

*** I then did the above to remove the actual emails themselves from the email table, which ended up being 1 short again 71,476

select count(*) from ACDEV1.em_exclusion
Inner join ACDEV1.em_subscriber
ON em_exclusion.email = em_subscriber.email

*** now when I run the above it shows 0 rows :(  

WHAT HAPPENS to the 1 missing row


So the question is,  is there a way I should test this / is this a known thing or a header row or something I dont know :(

Basically this looks bullet proof to me but why would there be 1 row less ?

Terry
0
Comment
Question by:ncomper
  • 5
  • 3
8 Comments
 
LVL 5

Author Comment

by:ncomper
ID: 39914532
And just to add I suppose the thing I would be looking for is to know what would you experts do in this situation how would you test what went wrong to find the missing record ?
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 39914816
I would say this means that in this table:
ACDEV1.temp_ids

you have 1 value of email_id being duplicated as such.

to find that value:
select email from 
ACDEV1.temp_ids
group by email 
having count(*) > 1

Open in new window

0
 
LVL 5

Author Comment

by:ncomper
ID: 39914858
Hi it was actually

SELECT id
FROM ACDEV1.temp_ids
GROUP BY id
HAVING COUNT( * ) >1
LIMIT 0 , 30

However it showed 24 ID that had duplicates, which is a step forward,  but would, be interested how experts would trouble shoot this "1 row less" issue further ?
0
VMware Disaster Recovery and Data Protection

In this expert guide, you’ll learn about the components of a Modern Data Center. You will use cases for the value-added capabilities of Veeam®, including combining backup and replication for VMware disaster recovery and using replication for data center migration.

 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 39914880
by knowing this "issue" that a JOIN can return more rows if you have a "many-to-many" relationship, resp in both tables the same values more than once.
the GROUP BY ,... HAVING COUNT(*) > 1 is the key to identify the actual values in such scenarios
0
 
LVL 5

Author Comment

by:ncomper
ID: 39914892
OK cool thanks so if it returned 24 rows i.e 24 duplicate ID's should that be enough for me to satisfy myself that my code is correct or would you still continue to find why there was 1 row less,  I dont know what is acceptable at being "correct" I need this script to be bullet proof
0
 
LVL 142

Accepted Solution

by:
Guy Hengel [angelIII / a3] earned 500 total points
ID: 39916152
you will need to learn that COUNT(*) on a JOIN may be "tricky".
say we consider a column (ID) in 2 tables, in which you join later.
if you have now in table1 2 rows with the same ID or 2 different ID's,
a SELECT COUNT(*) on the table1 will return 2 in both cases.
however, a SELECT COUNT(DISTINCT ID) on table 1 will return different results.
if the 2 rows have the same value for ID, it will return as result 1, otherwise 2!

now, if you JOIN that table with another table on the ID column, and consider that table has also 2 rows, with either 2 different values or 2 times the same value.
a select count(*) on the JOIN will return the following results:
same value on ID on both columns on both tables (all 4 values matching): 4
same value on ID in table 1, and different values on second table (only 1 value matching between table1 and table2): 2
different values on ID in both tables, with matching values on ID on the 2 tables: 2

so, before JOINing and then COUNTint, you need to first do a count for each value, before the join ...

select email, count(*) from table1 group by email
select email, count(*) from table2 group by email

select t1.email, t2.email, t1.cnt, t2.cnt
  from ( select email, count(*) cnt from table1 group by email ) t1
 full outer join ( select email, count(*) cnt from table2 group by email ) t2
   on t1.email = t2.email

...
0
 
LVL 5

Author Comment

by:ncomper
ID: 39916334
thanks will have a play with this tonight ;)
0
 
LVL 5

Author Closing Comment

by:ncomper
ID: 39971971
Thanks
0

Featured Post

Windows Server 2016: All you need to know

Learn about Hyper-V features that increase functionality and usability of Microsoft Windows Server 2016. Also, throughout this eBook, you’ll find some basic PowerShell examples that will help you leverage the scripts in your environments!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Clean text to insert in database 9 63
Easy to use inventory management software 7 94
myqsl update statement on phpMyAdmin 8 30
mysql joining from the same table 6 42
Fore-Foreword Today (2016) Maxmind has a new approach to the distribution of its data sets.  This article may be obsolete.  Instead of using the examples here, have a look at the MaxMind API (https://www.maxmind.com/en/geolite2-developer-package). …
I use MySQL for many of my development projects in a Windows environment. To manage my databases (and perform queries) for years I used a tool called MySQL administrator.  This tool has since been replaced by MySQL Workbench. So I decided to m…
This Micro Tutorial demonstrates using Microsoft Excel pivot tables, how to reverse engineer competitors' marketing strategies through backlinks.
This video shows how to quickly and easily add an email signature for all users on Exchange 2016. The resulting signature is applied on a server level by Exchange Online. The email signature template has been downloaded from: www.mail-signatures…

773 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question