Avatar of hdcowboyaz
Flag for United States of America asked on

SQL _ Cartesian Product

I'm getting a Cartesian Product on the query below.
I'm getting 42 records and should be getting 3.
There are 14 extra's for that job... 42/3 = 14.

This is the part of the statement that returns the Cartesian Product.
- LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID)

See attached ERD

 j.JOBID 'Job',
 j.DSCR 'Job Name',
 ex.ExtraID 'Extra',
 c.CostCode 'Cost Code',
 e.EMPID 'Emp ID',
 l.HOURS 'Hours'
 FROM labor l
 LEFT JOIN emp e ON (e.EMP_ID = l.EMP_ID)
 LEFT JOIN job j ON (j.JOB_ID = l.JOB_ID)
 LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID)
 LEFT JOIN ccode c ON (c.EXTRA_ID = ex.EXTRA_ID)
 WHERE DATE_WORK = '2014-11-3'
 AND j.JOB_ID = 37014;
MySQL Server

Avatar of undefined
Last Comment
Steve Wales

8/22/2022 - Mon
Steve Wales

Primary Key for Extra appears to be Extra ID / Job ID / WO_ID

You are joining just on job ID.

Do you have multiple Job ID's there ?

For Example:

Extra / Job / WO
1 / 1 / 1
2 / 1 / 2
3 / 1 / 3

In this case, joining just on Job ID would give you three matching rows back - if you just join on Job ID of "1", you'd get back a match for each row, thus returning 3 rows on your join (Extra 1, Extra 2 and Extra 3)

The Primary Key for the extra table is EXTRA_ID.
Yes, there are multiple jobs in the extra table. The extra table is a child table to the job table.
WO is not part of this, in fact all records are blank for it at this point.
Steve Wales

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question

Of course there are multiple Extra_ID's for a job, its a child table. There is a many to one relationship. That's the whole concept of a database.
Experts Exchange is like having an extremely knowledgeable team sitting and waiting for your call. Couldn't do my job half as well as I do without it!
James Murphy
Steve Wales

Then what further help do you need ?

If you understand that there's multiple rows, and there are in fact  multiple rows on the table, the query is working correctly.  It's returning EXACTLY what you're telling it to return based upon the format of the query.

Result set without that join is X records.  Add in that join and you're going to get, for each JOB ID, X time Y rows returned where Y is the number of child rows.

If you're after some way to alter that, I don't know what to tell you - you state  you understand the concept of a many to one child relationship and seem surprised by the multiplicative result set.

In the labor table there is only three rows of data. That is what I'm trying to get.

My query returns 42 rows because the statement has a Cartesian product in it. That is what I'm trying to solve.
Steve Wales

For each of the 3 rows in the labor table, how many corresponding rows for each job are in the extras table.

If your jobs table has jobs A, B and C.

How many extra rows for job A ?  Call this value X.
How many extra rows for job B ?  Call this value Y.
How many extra rows for job C ?  Call this value Z.

Then add up the following:  (3 * X) + (3 * Y) + (3 * Z).

Does this add up to 42 ?
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.

I'm getting 42 records and should be getting 3. There are 14 extra's for that job... 42/3 = 14.

This is the part of the statement that returns the Cartesian Product.
 - LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID)

I have identified where the problems lies. I just have q query syntax issue.
Steve Wales

OK let me try to to explain this differently.

Before the join to extras you have 3 rows returned.

I'm assuming it's something like one job, three different cost codes and the hours etc charged to each cost code.

Then you try to link in extras, where there are 14 rows for the job.

A join will return every rows that matches the join condition and for an outer join, those rows that also exist and don't match the join condition.

Your join condition for extras is on the job number.

What the engine is doing is returning, for each row in the left table (your 3 rows), all rows that match job number for the second table (extras).

So, the first row matches all 14 extra rows, your second row matches all 14 extra rows and your third row matches all 14 extra rows.  42 total.

Since you can't give a 1:1 join condition, you're going to get a cartesian product.

It's not that your syntax is bad (it's not).

You maybe need to think of another way to represent your data so that you can get a 1:1 relationship?

I've requested that this question be deleted for the following reason:

I had a friend solve this query for me.
This is the best money I have ever spent. I cannot not tell you how many times these folks have saved my bacon. I learn so much from the contributors.
Steve Wales


I'm going to ask this to be subject to a review please.

Based upon the question posted here: https://www.experts-exchange.com/Programming/Languages/SQL_Syntax/Q_28564096.html his solution was to change this:

LEFT JOIN extra ex ON (ex.JOB_ID = j.JOB_ID)

to this:


Which is exactly what I indicated he should do in my response http:#a40448094 when I said "Any time you have a situation like that and you can't refine the scope of your join, you'll get one match per row that matches returned - hence your cartesian product."

Adding the extra condition is refining the scope of the join.