Solved

SQL _ Cartesian Product

Posted on 2014-11-17
11
233 Views
Last Modified: 2014-11-24
I'm getting a Cartesian Product on the query below.
I'm getting 42 records and should be getting 3.
There are 14 extra's for that job... 42/3 = 14.

This is the part of the statement that returns the Cartesian Product.
- LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID)

See attached ERD

 SELECT
 j.JOBID 'Job',
 j.DSCR 'Job Name',
 ex.ExtraID 'Extra',
 c.CostCode 'Cost Code',
 e.EMPID 'Emp ID',
 l.HOURS 'Hours'
 FROM labor l
 LEFT JOIN emp e ON (e.EMP_ID = l.EMP_ID)
 LEFT JOIN job j ON (j.JOB_ID = l.JOB_ID)
 LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID)
 LEFT JOIN ccode c ON (c.EXTRA_ID = ex.EXTRA_ID)
 WHERE DATE_WORK = '2014-11-3'
 AND l.DELETED = 'N'
 AND j.JOB_ID = 37014;
ERDTheEdgeLaborEntry.jpg
0
Comment
Question by:hdcowboyaz
  • 6
  • 5
11 Comments
 
LVL 22

Expert Comment

by:Steve Wales
ID: 40448045
Primary Key for Extra appears to be Extra ID / Job ID / WO_ID

You are joining just on job ID.

Do you have multiple Job ID's there ?

For Example:

Extra / Job / WO
1 / 1 / 1
2 / 1 / 2
3 / 1 / 3

In this case, joining just on Job ID would give you three matching rows back - if you just join on Job ID of "1", you'd get back a match for each row, thus returning 3 rows on your join (Extra 1, Extra 2 and Extra 3)
0
 

Author Comment

by:hdcowboyaz
ID: 40448078
The Primary Key for the extra table is EXTRA_ID.
Yes, there are multiple jobs in the extra table. The extra table is a child table to the job table.
WO is not part of this, in fact all records are blank for it at this point.
0
 
LVL 22

Accepted Solution

by:
Steve Wales earned 500 total points
ID: 40448094
Are there multiple "Extra_ID" values per Job in the table ?  If that's the case, then that's your problem.

Any time you have a situation like that and you can't refine the scope of your join, you'll get one match per row that matches returned - hence your cartesian product.
0
Visualize your virtual and backup environments

Create well-organized and polished visualizations of your virtual and backup environments when planning VMware vSphere, Microsoft Hyper-V or Veeam deployments. It helps you to gain better visibility and valuable business insights.

 

Author Comment

by:hdcowboyaz
ID: 40448116
Of course there are multiple Extra_ID's for a job, its a child table. There is a many to one relationship. That's the whole concept of a database.
0
 
LVL 22

Expert Comment

by:Steve Wales
ID: 40448147
Then what further help do you need ?

If you understand that there's multiple rows, and there are in fact  multiple rows on the table, the query is working correctly.  It's returning EXACTLY what you're telling it to return based upon the format of the query.

Result set without that join is X records.  Add in that join and you're going to get, for each JOB ID, X time Y rows returned where Y is the number of child rows.

If you're after some way to alter that, I don't know what to tell you - you state  you understand the concept of a many to one child relationship and seem surprised by the multiplicative result set.
0
 

Author Comment

by:hdcowboyaz
ID: 40448158
In the labor table there is only three rows of data. That is what I'm trying to get.

My query returns 42 rows because the statement has a Cartesian product in it. That is what I'm trying to solve.
0
 
LVL 22

Expert Comment

by:Steve Wales
ID: 40448184
For each of the 3 rows in the labor table, how many corresponding rows for each job are in the extras table.

If your jobs table has jobs A, B and C.

How many extra rows for job A ?  Call this value X.
How many extra rows for job B ?  Call this value Y.
How many extra rows for job C ?  Call this value Z.

Then add up the following:  (3 * X) + (3 * Y) + (3 * Z).

Does this add up to 42 ?
0
 

Author Comment

by:hdcowboyaz
ID: 40448203
I'm getting 42 records and should be getting 3. There are 14 extra's for that job... 42/3 = 14.

This is the part of the statement that returns the Cartesian Product.
 - LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID)

I have identified where the problems lies. I just have q query syntax issue.
0
 
LVL 22

Expert Comment

by:Steve Wales
ID: 40448264
OK let me try to to explain this differently.

Before the join to extras you have 3 rows returned.

I'm assuming it's something like one job, three different cost codes and the hours etc charged to each cost code.

Then you try to link in extras, where there are 14 rows for the job.

A join will return every rows that matches the join condition and for an outer join, those rows that also exist and don't match the join condition.

Your join condition for extras is on the job number.

What the engine is doing is returning, for each row in the left table (your 3 rows), all rows that match job number for the second table (extras).

So, the first row matches all 14 extra rows, your second row matches all 14 extra rows and your third row matches all 14 extra rows.  42 total.

Since you can't give a 1:1 join condition, you're going to get a cartesian product.

It's not that your syntax is bad (it's not).

You maybe need to think of another way to represent your data so that you can get a 1:1 relationship?
0
 

Author Comment

by:hdcowboyaz
ID: 40448480
I've requested that this question be deleted for the following reason:

I had a friend solve this query for me.
0
 
LVL 22

Expert Comment

by:Steve Wales
ID: 40448481
Moderators,

I'm going to ask this to be subject to a review please.

Based upon the question posted here: http://www.experts-exchange.com/Programming/Languages/SQL_Syntax/Q_28564096.html his solution was to change this:

LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID)

to this:

LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID AND ex.EXTRA_ID = c.EXTRA_ID)

Which is exactly what I indicated he should do in my response http:#a40448094 when I said "Any time you have a situation like that and you can't refine the scope of your join, you'll get one match per row that matches returned - hence your cartesian product."

Adding the extra condition is refining the scope of the join.
0

Featured Post

Comprehensive Backup Solutions for Microsoft

Acronis protects the complete Microsoft technology stack: Windows Server, Windows PC, laptop and Surface data; Microsoft business applications; Microsoft Hyper-V; Azure VMs; Microsoft Windows Server 2016; Microsoft Exchange 2016 and SQL Server 2016.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Foreword In the years since this article was written, numerous hacking attacks have targeted password-protected web sites.  The storage of client passwords has become a subject of much discussion, some of it useful and some of it misguided.  Of cou…
Creating and Managing Databases with phpMyAdmin in cPanel.
Although Jacob Bernoulli (1654-1705) has been credited as the creator of "Binomial Distribution Table", Gottfried Leibniz (1646-1716) did his dissertation on the subject in 1666; Leibniz you may recall is the co-inventor of "Calculus" and beat Isaac…

679 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question