Solved

SQL _ Cartesian Product

Posted on 2014-11-17
11
226 Views
Last Modified: 2014-11-24
I'm getting a Cartesian Product on the query below.
I'm getting 42 records and should be getting 3.
There are 14 extra's for that job... 42/3 = 14.

This is the part of the statement that returns the Cartesian Product.
- LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID)

See attached ERD

 SELECT
 j.JOBID 'Job',
 j.DSCR 'Job Name',
 ex.ExtraID 'Extra',
 c.CostCode 'Cost Code',
 e.EMPID 'Emp ID',
 l.HOURS 'Hours'
 FROM labor l
 LEFT JOIN emp e ON (e.EMP_ID = l.EMP_ID)
 LEFT JOIN job j ON (j.JOB_ID = l.JOB_ID)
 LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID)
 LEFT JOIN ccode c ON (c.EXTRA_ID = ex.EXTRA_ID)
 WHERE DATE_WORK = '2014-11-3'
 AND l.DELETED = 'N'
 AND j.JOB_ID = 37014;
ERDTheEdgeLaborEntry.jpg
0
Comment
Question by:hdcowboyaz
  • 6
  • 5
11 Comments
 
LVL 22

Expert Comment

by:Steve Wales
ID: 40448045
Primary Key for Extra appears to be Extra ID / Job ID / WO_ID

You are joining just on job ID.

Do you have multiple Job ID's there ?

For Example:

Extra / Job / WO
1 / 1 / 1
2 / 1 / 2
3 / 1 / 3

In this case, joining just on Job ID would give you three matching rows back - if you just join on Job ID of "1", you'd get back a match for each row, thus returning 3 rows on your join (Extra 1, Extra 2 and Extra 3)
0
 

Author Comment

by:hdcowboyaz
ID: 40448078
The Primary Key for the extra table is EXTRA_ID.
Yes, there are multiple jobs in the extra table. The extra table is a child table to the job table.
WO is not part of this, in fact all records are blank for it at this point.
0
 
LVL 22

Accepted Solution

by:
Steve Wales earned 500 total points
ID: 40448094
Are there multiple "Extra_ID" values per Job in the table ?  If that's the case, then that's your problem.

Any time you have a situation like that and you can't refine the scope of your join, you'll get one match per row that matches returned - hence your cartesian product.
0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 

Author Comment

by:hdcowboyaz
ID: 40448116
Of course there are multiple Extra_ID's for a job, its a child table. There is a many to one relationship. That's the whole concept of a database.
0
 
LVL 22

Expert Comment

by:Steve Wales
ID: 40448147
Then what further help do you need ?

If you understand that there's multiple rows, and there are in fact  multiple rows on the table, the query is working correctly.  It's returning EXACTLY what you're telling it to return based upon the format of the query.

Result set without that join is X records.  Add in that join and you're going to get, for each JOB ID, X time Y rows returned where Y is the number of child rows.

If you're after some way to alter that, I don't know what to tell you - you state  you understand the concept of a many to one child relationship and seem surprised by the multiplicative result set.
0
 

Author Comment

by:hdcowboyaz
ID: 40448158
In the labor table there is only three rows of data. That is what I'm trying to get.

My query returns 42 rows because the statement has a Cartesian product in it. That is what I'm trying to solve.
0
 
LVL 22

Expert Comment

by:Steve Wales
ID: 40448184
For each of the 3 rows in the labor table, how many corresponding rows for each job are in the extras table.

If your jobs table has jobs A, B and C.

How many extra rows for job A ?  Call this value X.
How many extra rows for job B ?  Call this value Y.
How many extra rows for job C ?  Call this value Z.

Then add up the following:  (3 * X) + (3 * Y) + (3 * Z).

Does this add up to 42 ?
0
 

Author Comment

by:hdcowboyaz
ID: 40448203
I'm getting 42 records and should be getting 3. There are 14 extra's for that job... 42/3 = 14.

This is the part of the statement that returns the Cartesian Product.
 - LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID)

I have identified where the problems lies. I just have q query syntax issue.
0
 
LVL 22

Expert Comment

by:Steve Wales
ID: 40448264
OK let me try to to explain this differently.

Before the join to extras you have 3 rows returned.

I'm assuming it's something like one job, three different cost codes and the hours etc charged to each cost code.

Then you try to link in extras, where there are 14 rows for the job.

A join will return every rows that matches the join condition and for an outer join, those rows that also exist and don't match the join condition.

Your join condition for extras is on the job number.

What the engine is doing is returning, for each row in the left table (your 3 rows), all rows that match job number for the second table (extras).

So, the first row matches all 14 extra rows, your second row matches all 14 extra rows and your third row matches all 14 extra rows.  42 total.

Since you can't give a 1:1 join condition, you're going to get a cartesian product.

It's not that your syntax is bad (it's not).

You maybe need to think of another way to represent your data so that you can get a 1:1 relationship?
0
 

Author Comment

by:hdcowboyaz
ID: 40448480
I've requested that this question be deleted for the following reason:

I had a friend solve this query for me.
0
 
LVL 22

Expert Comment

by:Steve Wales
ID: 40448481
Moderators,

I'm going to ask this to be subject to a review please.

Based upon the question posted here: http://www.experts-exchange.com/Programming/Languages/SQL_Syntax/Q_28564096.html his solution was to change this:

LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID)

to this:

LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID AND ex.EXTRA_ID = c.EXTRA_ID)

Which is exactly what I indicated he should do in my response http:#a40448094 when I said "Any time you have a situation like that and you can't refine the scope of your join, you'll get one match per row that matches returned - hence your cartesian product."

Adding the extra condition is refining the scope of the join.
0

Featured Post

Microsoft Certification Exam 74-409

Veeam® is happy to provide the Microsoft community with a study guide prepared by MVP and MCT, Orin Thomas. This guide will take you through each of the exam objectives, helping you to prepare for and pass the examination.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

More Fun with XML and MySQL – Parsing Delimited String with a Single SQL Statement Are you ready for another of my SQL tidbits?  Hopefully so, as in this adventure, I will be covering a topic that comes up a lot which is parsing a comma (or other…
Creating and Managing Databases with phpMyAdmin in cPanel.
Microsoft Active Directory, the widely used IT infrastructure, is known for its high risk of credential theft. The best way to test your Active Directory’s vulnerabilities to pass-the-ticket, pass-the-hash, privilege escalation, and malware attacks …
Nobody understands Phishing better than an anti-spam company. That’s why we are providing Phishing Awareness Training to our customers. According to a report by Verizon, only 3% of targeted users report malicious emails to management. With compan…

832 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question