Solved

MySQL: Left Join on Two Fields Problem

Posted on 2014-02-13
14
694 Views
Last Modified: 2014-02-14
Hi

Below is a snippet of code that provides intermittent results.

The first two Left Joins Work Fine.

The last one gives intermittent results. (ie: 50% of the time it returns the correct result the other 50% it returns NULL when there is definitely data in the related table)

It's worth noting that there is a UNIQUE KEY constraint on
 and [codes_region.region_code]

Any ideas what the problem is?

[code]
FROM log
LEFT OUTER JOIN ip 
ON log.ip = ip.ip

LEFT OUTER JOIN codes_country 
ON ip.country_code = codes_country.code

LEFT OUTER JOIN codes_region 
ON ip.country_code = codes_region.country_code 
AND ip.region_code = codes_region.region_code

Open in new window

0
Comment
Question by:AdrianSmithUK
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 8
  • 5
14 Comments
 
LVL 110

Expert Comment

by:Ray Paseur
ID: 39855785
I don't really have an answer, but I'd like to ask for a little more information so we can complete the SSCCE.  Please show us the table structures via the CREATE TABLE statements, some test data so we can reproduce your results and the complete query that fails.  Armed with that we may be able to help you get it working right.  Thanks, ~Ray
0
 

Author Comment

by:AdrianSmithUK
ID: 39855935
Hi Ray

I've slimmed the files down as much as possible.

Here are the create table statements

CREATE TABLE `log` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `ip` varchar(40) DEFAULT NULL,
  PRIMARY KEY (`id`)
);

CREATE TABLE `ip` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `ip` varchar(128) NOT NULL DEFAULT '0',
  `country_code` varchar(128) DEFAULT '0',
  `region_code` varchar(128) DEFAULT '0',
  PRIMARY KEY (`id`),
  UNIQUE KEY `ip` (`ip`)
);

CREATE TABLE `codes_country` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `code` varchar(2) DEFAULT NULL,
  `name` varchar(128) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `code_UNIQUE` (`code`)
);

CREATE TABLE `codes_region` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `country_code` varchar(2) DEFAULT NULL,
  `region_code` varchar(2) DEFAULT NULL,
  `region_name` varchar(128) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `country_code_unique` (`country_code`,`region_code`)
);

Open in new window


Here is the query:

SELECT
log.ip,
ip.country_code,
codes_country.name,
ip.region_code,
codes_region.region_name
LEFT OUTER JOIN ip ON log.ip = ip.ip
LEFT OUTER JOIN codes_country ON ip.country_code = codes_country.code
LEFT OUTER JOIN codes_region ON ip.country_code = codes_region.country_code AND ip.region_code = codes_region.region_code
WHERE 1;

Open in new window



Here a screenshot showing the results:

Notice AR/7 and AR/13

Here is a screenshot showing the region codes.

 Notice AR/7 and AR/13
Hopefully you can see that AR/13 and AR/7 are identical scenarios, yet AR/13 produces the correct result and AR/7 does not.

Does that help?

Kind Regards,
Adrian
0
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 39856249
I think the issue is you need an INNER JOIN between the codes_region and the ip tables; therefore, you can try like this.
...
FROM log
LEFT OUTER JOIN (
    ip JOIN codes_country ON ip.country_code = codes_country.code
    JOIN codes_region ON ip.country_code = codes_region.country_code 
     AND ip.region_code = codes_region.region_code
) ON log.ip = ip.ip
...

Open in new window


You still get the OUTER JOIN to log, but if you do have a match to ip table there must be a match in the codes_country and codes_region table.  If you may have an ip without a country, then you may need to go another level by making sure that codes_country.code = codes_region.country_code.  However, I would try what I just sent first, then we can adjust based on the data.
0
PeopleSoft Has Never Been Easier

PeopleSoft Adoption Made Smooth & Simple!

On-The-Job Training Is made Intuitive & Easy With WalkMe's On-Screen Guidance Tool.  Claim Your Free WalkMe Account Now

 

Author Comment

by:AdrianSmithUK
ID: 39856293
Hi Kevin

Many thanks for the suggestion.

Unfortunately the snippet threw up hundreds of NULL values in the [country_code] column.

In the previous example the [country_code] column was 100% full (as expected).

Here is a screen shot:

Notice the NULLs in  the [country_code] column.
0
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 39856305
Okay, so you definitely have IP addresses without codes_regions.
Try:
...
FROM log
LEFT OUTER JOIN (
    ip JOIN codes_country ON ip.country_code = codes_country.code
    LEFT OUTER JOIN codes_region 
      ON ip.country_code = codes_region.country_code 
     AND ip.region_code = codes_region.region_code
) ON log.ip = ip.ip
...

Open in new window


EDIT: if you still get inconsistent results, make sure that the region_code is consistent in the ip table with the value in the codes_region table.  If you provide sample data of the rows that are failing with the supporting entries in the other tables along with the expected results, we can work through the solution with you.
0
 

Author Comment

by:AdrianSmithUK
ID: 39856320
Unfortunately still a problem. Notice that AR/7 is still NULL

AR/7 = NULL
0
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 39856339
Notice the region code says '07' versus 7.  Check that the data types are the same between the two tables.
0
 

Author Comment

by:AdrianSmithUK
ID: 39856376
Did you miss the create statements at the top of this post.
0
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 39856415
Yes, sorry.

`country_code` varchar(128) DEFAULT '0'
  `region_code` varchar(128) DEFAULT '0'

In your other two tables, though, you define:
`country_code` varchar(2) DEFAULT NULL
`region_code` varchar(2) DEFAULT NULL

Therefore, i would expect the columns in the ip table to match as well as, optionally, define REFERENCES to the respective country and region table.

The reason they are not matching is you have different data - '07' != '7'.  Maybe you intended for all the region codes to be two digits exactly (e.g., '05', '10', et cetera), so one fix is to update the region table since it likely has less records given the ip and other referencing tables consistently use two digits already.  If they have some records with '7' and some with '07', you have to fix the existing data and any automated processes to make sure you get consistency.
0
 

Author Comment

by:AdrianSmithUK
ID: 39856455
Good Observation. Give me 20 mins and I'll fix it.
0
 

Author Comment

by:AdrianSmithUK
ID: 39856563
OK - I've cleaned it up:

Here are the new create statements:

CREATE TABLE `codes_country` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `code` varchar(2) DEFAULT NULL,
  `name` varchar(128) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `code_unique` (`code`)
) ENGINE=InnoDB AUTO_INCREMENT=255 DEFAULT CHARSET=latin1;

CREATE TABLE `codes_region` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `country_code` varchar(2) DEFAULT NULL,
  `region_code` varchar(2) DEFAULT NULL,
  `region_name` varchar(128) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `country_code_unique` (`country_code`,`region_code`)
) ENGINE=InnoDB AUTO_INCREMENT=4063 DEFAULT CHARSET=latin1;

CREATE TABLE `ip` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `ip` varchar(40) DEFAULT '0',
  `country_code` varchar(2) DEFAULT NULL,
  `region_code` varchar(2) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `ip` (`ip`)
) ENGINE=MyISAM AUTO_INCREMENT=3043 DEFAULT CHARSET=latin1;

CREATE TABLE `log` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `ip` varchar(40) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `uid` (`uid`,`instance`)
) ENGINE=MyISAM AUTO_INCREMENT=6622 DEFAULT CHARSET=latin1;

Open in new window


And here is the screenshot of the results.

AR/7 is NULL
Notice AR/7 is unfortunately still NULL ... grrr!

I also tried your previous snippet and still same results.

Sorry Kevin, I've got to leave for the evening now and I can't test anything at home because I'm using an EC2 server that won't let me in, so unfortunately I'll have to come back to this tomorrow.

Thanks for all the help so far.

Kind Regards,
Adrian
0
 
LVL 60

Accepted Solution

by:
Kevin Cross earned 500 total points
ID: 39856570
Cleaning up the create statements is one thing.  You have to clean the data.  AR/7 is not the same as AR/07.
0
 

Author Comment

by:AdrianSmithUK
ID: 39856586
Genius!!!

I bet that's the problem.

That's why AR/13 works and not AR/7

I'll test it first thing in the morning and if that's the problem I'll close the question.

Many thanks.
Adrian
0
 

Author Closing Comment

by:AdrianSmithUK
ID: 39858455
The original code was fine.

As Kevin suggested, it was the fact that one table had two char codes (eg: 07) and the other had only one (eg: 7).

Many thanks Kevin.
0

Featured Post

Ransomware: The New Cyber Threat & How to Stop It

This infographic explains ransomware, type of malware that blocks access to your files or your systems and holds them hostage until a ransom is paid. It also examines the different types of ransomware and explains what you can do to thwart this sinister online threat.  

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Nothing in an HTTP request can be trusted, including HTTP headers and form data.  A form token is a tool that can be used to guard against request forgeries (CSRF).  This article shows an improved approach to form tokens, making it more difficult to…
This article discusses how to create an extensible mechanism for linked drop downs.
The viewer will learn how to dynamically set the form action using jQuery.
The viewer will learn how to look for a specific file type in a local or remote server directory using PHP.

737 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question