Solved

SQL _ Cartesian Product

Posted on 2014-11-17
11
214 Views
Last Modified: 2014-11-24
I'm getting a Cartesian Product on the query below.
I'm getting 42 records and should be getting 3.
There are 14 extra's for that job... 42/3 = 14.

This is the part of the statement that returns the Cartesian Product.
- LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID)

See attached ERD

 SELECT
 j.JOBID 'Job',
 j.DSCR 'Job Name',
 ex.ExtraID 'Extra',
 c.CostCode 'Cost Code',
 e.EMPID 'Emp ID',
 l.HOURS 'Hours'
 FROM labor l
 LEFT JOIN emp e ON (e.EMP_ID = l.EMP_ID)
 LEFT JOIN job j ON (j.JOB_ID = l.JOB_ID)
 LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID)
 LEFT JOIN ccode c ON (c.EXTRA_ID = ex.EXTRA_ID)
 WHERE DATE_WORK = '2014-11-3'
 AND l.DELETED = 'N'
 AND j.JOB_ID = 37014;
ERDTheEdgeLaborEntry.jpg
0
Comment
Question by:hdcowboyaz
  • 6
  • 5
11 Comments
 
LVL 22

Expert Comment

by:Steve Wales
ID: 40448045
Primary Key for Extra appears to be Extra ID / Job ID / WO_ID

You are joining just on job ID.

Do you have multiple Job ID's there ?

For Example:

Extra / Job / WO
1 / 1 / 1
2 / 1 / 2
3 / 1 / 3

In this case, joining just on Job ID would give you three matching rows back - if you just join on Job ID of "1", you'd get back a match for each row, thus returning 3 rows on your join (Extra 1, Extra 2 and Extra 3)
0
 

Author Comment

by:hdcowboyaz
ID: 40448078
The Primary Key for the extra table is EXTRA_ID.
Yes, there are multiple jobs in the extra table. The extra table is a child table to the job table.
WO is not part of this, in fact all records are blank for it at this point.
0
 
LVL 22

Accepted Solution

by:
Steve Wales earned 500 total points
ID: 40448094
Are there multiple "Extra_ID" values per Job in the table ?  If that's the case, then that's your problem.

Any time you have a situation like that and you can't refine the scope of your join, you'll get one match per row that matches returned - hence your cartesian product.
0
 

Author Comment

by:hdcowboyaz
ID: 40448116
Of course there are multiple Extra_ID's for a job, its a child table. There is a many to one relationship. That's the whole concept of a database.
0
 
LVL 22

Expert Comment

by:Steve Wales
ID: 40448147
Then what further help do you need ?

If you understand that there's multiple rows, and there are in fact  multiple rows on the table, the query is working correctly.  It's returning EXACTLY what you're telling it to return based upon the format of the query.

Result set without that join is X records.  Add in that join and you're going to get, for each JOB ID, X time Y rows returned where Y is the number of child rows.

If you're after some way to alter that, I don't know what to tell you - you state  you understand the concept of a many to one child relationship and seem surprised by the multiplicative result set.
0
Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

 

Author Comment

by:hdcowboyaz
ID: 40448158
In the labor table there is only three rows of data. That is what I'm trying to get.

My query returns 42 rows because the statement has a Cartesian product in it. That is what I'm trying to solve.
0
 
LVL 22

Expert Comment

by:Steve Wales
ID: 40448184
For each of the 3 rows in the labor table, how many corresponding rows for each job are in the extras table.

If your jobs table has jobs A, B and C.

How many extra rows for job A ?  Call this value X.
How many extra rows for job B ?  Call this value Y.
How many extra rows for job C ?  Call this value Z.

Then add up the following:  (3 * X) + (3 * Y) + (3 * Z).

Does this add up to 42 ?
0
 

Author Comment

by:hdcowboyaz
ID: 40448203
I'm getting 42 records and should be getting 3. There are 14 extra's for that job... 42/3 = 14.

This is the part of the statement that returns the Cartesian Product.
 - LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID)

I have identified where the problems lies. I just have q query syntax issue.
0
 
LVL 22

Expert Comment

by:Steve Wales
ID: 40448264
OK let me try to to explain this differently.

Before the join to extras you have 3 rows returned.

I'm assuming it's something like one job, three different cost codes and the hours etc charged to each cost code.

Then you try to link in extras, where there are 14 rows for the job.

A join will return every rows that matches the join condition and for an outer join, those rows that also exist and don't match the join condition.

Your join condition for extras is on the job number.

What the engine is doing is returning, for each row in the left table (your 3 rows), all rows that match job number for the second table (extras).

So, the first row matches all 14 extra rows, your second row matches all 14 extra rows and your third row matches all 14 extra rows.  42 total.

Since you can't give a 1:1 join condition, you're going to get a cartesian product.

It's not that your syntax is bad (it's not).

You maybe need to think of another way to represent your data so that you can get a 1:1 relationship?
0
 

Author Comment

by:hdcowboyaz
ID: 40448480
I've requested that this question be deleted for the following reason:

I had a friend solve this query for me.
0
 
LVL 22

Expert Comment

by:Steve Wales
ID: 40448481
Moderators,

I'm going to ask this to be subject to a review please.

Based upon the question posted here: http://www.experts-exchange.com/Programming/Languages/SQL_Syntax/Q_28564096.html his solution was to change this:

LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID)

to this:

LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID AND ex.EXTRA_ID = c.EXTRA_ID)

Which is exactly what I indicated he should do in my response http:#a40448094 when I said "Any time you have a situation like that and you can't refine the scope of your join, you'll get one match per row that matches returned - hence your cartesian product."

Adding the extra condition is refining the scope of the join.
0

Featured Post

Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

Join & Write a Comment

Fore-Foreword Today (2016) Maxmind has a new approach to the distribution of its data sets.  This article may be obsolete.  Instead of using the examples here, have a look at the MaxMind API (https://www.maxmind.com/en/geolite2-developer-package). …
This guide whil teach how to setup live replication (database mirroring) on 2 servers for backup or other purposes. In our example situation we have this network schema (see atachment). We need to replicate EVERY executed SQL query on server 1 to…
Internet Business Fax to Email Made Easy - With eFax Corporate (http://www.enterprise.efax.com), you'll receive a dedicated online fax number, which is used the same way as a typical analog fax number. You'll receive secure faxes in your email, fr…
Access reports are powerful and flexible. Learn how to create a query and then a grouped report using the wizard. Modify the report design after the wizard is done to make it look better. There will be another video to explain how to put the final p…

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

18 Experts available now in Live!

Get 1:1 Help Now