• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 727
  • Last Modified:

MySQL: Left Join on Two Fields Problem

Hi

Below is a snippet of code that provides intermittent results.

The first two Left Joins Work Fine.

The last one gives intermittent results. (ie: 50% of the time it returns the correct result the other 50% it returns NULL when there is definitely data in the related table)

It's worth noting that there is a UNIQUE KEY constraint on
 and [codes_region.region_code]

Any ideas what the problem is?

[code]
FROM log
LEFT OUTER JOIN ip 
ON log.ip = ip.ip

LEFT OUTER JOIN codes_country 
ON ip.country_code = codes_country.code

LEFT OUTER JOIN codes_region 
ON ip.country_code = codes_region.country_code 
AND ip.region_code = codes_region.region_code

Open in new window

0
AdrianSmithUK
Asked:
AdrianSmithUK
  • 8
  • 5
1 Solution
 
Ray PaseurCommented:
I don't really have an answer, but I'd like to ask for a little more information so we can complete the SSCCE.  Please show us the table structures via the CREATE TABLE statements, some test data so we can reproduce your results and the complete query that fails.  Armed with that we may be able to help you get it working right.  Thanks, ~Ray
0
 
AdrianSmithUKAuthor Commented:
Hi Ray

I've slimmed the files down as much as possible.

Here are the create table statements

CREATE TABLE `log` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `ip` varchar(40) DEFAULT NULL,
  PRIMARY KEY (`id`)
);

CREATE TABLE `ip` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `ip` varchar(128) NOT NULL DEFAULT '0',
  `country_code` varchar(128) DEFAULT '0',
  `region_code` varchar(128) DEFAULT '0',
  PRIMARY KEY (`id`),
  UNIQUE KEY `ip` (`ip`)
);

CREATE TABLE `codes_country` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `code` varchar(2) DEFAULT NULL,
  `name` varchar(128) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `code_UNIQUE` (`code`)
);

CREATE TABLE `codes_region` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `country_code` varchar(2) DEFAULT NULL,
  `region_code` varchar(2) DEFAULT NULL,
  `region_name` varchar(128) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `country_code_unique` (`country_code`,`region_code`)
);

Open in new window


Here is the query:

SELECT
log.ip,
ip.country_code,
codes_country.name,
ip.region_code,
codes_region.region_name
LEFT OUTER JOIN ip ON log.ip = ip.ip
LEFT OUTER JOIN codes_country ON ip.country_code = codes_country.code
LEFT OUTER JOIN codes_region ON ip.country_code = codes_region.country_code AND ip.region_code = codes_region.region_code
WHERE 1;

Open in new window



Here a screenshot showing the results:

Notice AR/7 and AR/13

Here is a screenshot showing the region codes.

 Notice AR/7 and AR/13
Hopefully you can see that AR/13 and AR/7 are identical scenarios, yet AR/13 produces the correct result and AR/7 does not.

Does that help?

Kind Regards,
Adrian
0
 
Kevin CrossChief Technology OfficerCommented:
I think the issue is you need an INNER JOIN between the codes_region and the ip tables; therefore, you can try like this.
...
FROM log
LEFT OUTER JOIN (
    ip JOIN codes_country ON ip.country_code = codes_country.code
    JOIN codes_region ON ip.country_code = codes_region.country_code 
     AND ip.region_code = codes_region.region_code
) ON log.ip = ip.ip
...

Open in new window


You still get the OUTER JOIN to log, but if you do have a match to ip table there must be a match in the codes_country and codes_region table.  If you may have an ip without a country, then you may need to go another level by making sure that codes_country.code = codes_region.country_code.  However, I would try what I just sent first, then we can adjust based on the data.
0
Free learning courses: Active Directory Deep Dive

Get a firm grasp on your IT environment when you learn Active Directory best practices with Veeam! Watch all, or choose any amount, of this three-part webinar series to improve your skills. From the basics to virtualization and backup, we got you covered.

 
AdrianSmithUKAuthor Commented:
Hi Kevin

Many thanks for the suggestion.

Unfortunately the snippet threw up hundreds of NULL values in the [country_code] column.

In the previous example the [country_code] column was 100% full (as expected).

Here is a screen shot:

Notice the NULLs in  the [country_code] column.
0
 
Kevin CrossChief Technology OfficerCommented:
Okay, so you definitely have IP addresses without codes_regions.
Try:
...
FROM log
LEFT OUTER JOIN (
    ip JOIN codes_country ON ip.country_code = codes_country.code
    LEFT OUTER JOIN codes_region 
      ON ip.country_code = codes_region.country_code 
     AND ip.region_code = codes_region.region_code
) ON log.ip = ip.ip
...

Open in new window


EDIT: if you still get inconsistent results, make sure that the region_code is consistent in the ip table with the value in the codes_region table.  If you provide sample data of the rows that are failing with the supporting entries in the other tables along with the expected results, we can work through the solution with you.
0
 
AdrianSmithUKAuthor Commented:
Unfortunately still a problem. Notice that AR/7 is still NULL

AR/7 = NULL
0
 
Kevin CrossChief Technology OfficerCommented:
Notice the region code says '07' versus 7.  Check that the data types are the same between the two tables.
0
 
AdrianSmithUKAuthor Commented:
Did you miss the create statements at the top of this post.
0
 
Kevin CrossChief Technology OfficerCommented:
Yes, sorry.

`country_code` varchar(128) DEFAULT '0'
  `region_code` varchar(128) DEFAULT '0'

In your other two tables, though, you define:
`country_code` varchar(2) DEFAULT NULL
`region_code` varchar(2) DEFAULT NULL

Therefore, i would expect the columns in the ip table to match as well as, optionally, define REFERENCES to the respective country and region table.

The reason they are not matching is you have different data - '07' != '7'.  Maybe you intended for all the region codes to be two digits exactly (e.g., '05', '10', et cetera), so one fix is to update the region table since it likely has less records given the ip and other referencing tables consistently use two digits already.  If they have some records with '7' and some with '07', you have to fix the existing data and any automated processes to make sure you get consistency.
0
 
AdrianSmithUKAuthor Commented:
Good Observation. Give me 20 mins and I'll fix it.
0
 
AdrianSmithUKAuthor Commented:
OK - I've cleaned it up:

Here are the new create statements:

CREATE TABLE `codes_country` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `code` varchar(2) DEFAULT NULL,
  `name` varchar(128) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `code_unique` (`code`)
) ENGINE=InnoDB AUTO_INCREMENT=255 DEFAULT CHARSET=latin1;

CREATE TABLE `codes_region` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `country_code` varchar(2) DEFAULT NULL,
  `region_code` varchar(2) DEFAULT NULL,
  `region_name` varchar(128) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `country_code_unique` (`country_code`,`region_code`)
) ENGINE=InnoDB AUTO_INCREMENT=4063 DEFAULT CHARSET=latin1;

CREATE TABLE `ip` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `ip` varchar(40) DEFAULT '0',
  `country_code` varchar(2) DEFAULT NULL,
  `region_code` varchar(2) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `ip` (`ip`)
) ENGINE=MyISAM AUTO_INCREMENT=3043 DEFAULT CHARSET=latin1;

CREATE TABLE `log` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `ip` varchar(40) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `uid` (`uid`,`instance`)
) ENGINE=MyISAM AUTO_INCREMENT=6622 DEFAULT CHARSET=latin1;

Open in new window


And here is the screenshot of the results.

AR/7 is NULL
Notice AR/7 is unfortunately still NULL ... grrr!

I also tried your previous snippet and still same results.

Sorry Kevin, I've got to leave for the evening now and I can't test anything at home because I'm using an EC2 server that won't let me in, so unfortunately I'll have to come back to this tomorrow.

Thanks for all the help so far.

Kind Regards,
Adrian
0
 
Kevin CrossChief Technology OfficerCommented:
Cleaning up the create statements is one thing.  You have to clean the data.  AR/7 is not the same as AR/07.
0
 
AdrianSmithUKAuthor Commented:
Genius!!!

I bet that's the problem.

That's why AR/13 works and not AR/7

I'll test it first thing in the morning and if that's the problem I'll close the question.

Many thanks.
Adrian
0
 
AdrianSmithUKAuthor Commented:
The original code was fine.

As Kevin suggested, it was the fact that one table had two char codes (eg: 07) and the other had only one (eg: 7).

Many thanks Kevin.
0

Featured Post

[Webinar] Cloud and Mobile-First Strategy

Maybe you’ve fully adopted the cloud since the beginning. Or maybe you started with on-prem resources but are pursuing a “cloud and mobile first” strategy. Getting to that end state has its challenges. Discover how to build out a 100% cloud and mobile IT strategy in this webinar.

  • 8
  • 5
Tackle projects and never again get stuck behind a technical roadblock.
Join Now