• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 534
  • Last Modified:

MySQL Join, Left Join unnexpected result

Hey,

I seem to be doing something wrong in one of my queries. There are 3 tables:

-Documents
-Documents_Analyst_Mapping
-Documents_Analyst


Table structure simplified

Documents:
id | id_doc | title

Documents_Analyst_Mapping:
id_doc | analyst_id

Documents_Analyst
analyst_id | analyst_name

I am trying to display a result that looks like this:
id | id_doc | title | analyst_name


My query is bellow and it gives me more rows than expected (I guess I am making up rows as I go along...) unfortunatly I am not so great with left joins, any help is much appreciated!


SELECT docs.ida, docs.Id_Document, docs.Title, analyst.name AS Analyst_Name
FROM Documents AS docs

LEFT JOIN Documents_Analyst_Mapping AS analyst_map ON docs.Id_Document = analyst_map.Id_Document

LEFT JOIN Documents_Analyst AS analyst ON analyst_map.Analyst_Id = analyst.Analyst_Id

WHERE analyst_map.Analyst_Id = analyst.Analyst_Id
AND docs.Language = 'en'

Open in new window

0
cdaeppen
Asked:
cdaeppen
2 Solutions
 
KalpanCommented:
please modify your query

SELECT docs.ida, docs.Id_Document, docs.Title, analyst.name AS Analyst_Name
FROM Documents AS docs

LEFT JOIN Documents_Analyst_Mapping AS analyst_map ON docs.Id_Document = analyst_map.Id_Document

LEFT JOIN Documents_Analyst AS analyst ON analyst_map.Analyst_Id = analyst.Analyst_Id

WHERE docs.Language = 'en'
0
 
McNeticCommented:
In which way do you get more results than expected? You should get one row for each row in document_analyst_mapping. Btw, the "WHERE" clause is unnecessary and should anyways not be used to restrict what is joined.
0
 
mkobrinCommented:
it looks to me that you should actually be using an inner join here, because the ID fields will never be null, so I would do it as follows:


SELECT docs.id, docs.Id_Doc, docs.Title, analyst.Analyst_Name
FROM Documents AS docs

INNER JOIN Documents_Analyst_Mapping AS analyst_map ON analyst_map.Id_Doc = docs.Id_Doc

INNER JOIN Documents_Analyst AS analyst ON analyst.Analyst_Id = analyst_map.Analyst_Id

WHERE docs.Language = 'en'

Open in new window

0
Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

 
cdaeppenAuthor Commented:
@ kalmax: I've had this and it returns a lot more records than:
SELECT count(*) FROM Documents WHERE Language = "en" <- which I am using as my control

@McNetic: "You should get one row for each row in document_analyst_mapping" -> I am trying to get one row for each row in Documents, that's the point, I want to have the author name in my result next to the document title
0
 
rmonsenCommented:
Remember that this query will create one row for each valid document <-> analyst combination, so if a document has more than one analyst, you will get more than one row for each document, which means that row count will and should be larger than your "control" query.
As McNetic correctly stated, you should expect one row for each row in document_analyst_mapping.

If you only want one row for each documents, you need to explain how you wish documents with more than one analyst to be handled.
0
 
cdaeppenAuthor Commented:
Thanks mkobrin and rmonsen! I think mkobrin has the right idea and rmonsen made me think that maybe there are multible analysts per document giving me the funny result (though only one should be possible the tables are very old and have well over 20k records....)
0
 
cdaeppenAuthor Commented:
So it turns out there can be more than 1 analyst per document, I guess group_concat is what I am looking for here?
0
 
MurfurFull Stack DeveloperCommented:
There are basically, two types of join - INNER and OUTER.
LEFT and RIGHT joins merely dictate which way to process the results of an OUTER join.

An INNER join will select all rows from the joined tables ONLY where there is a match between the columns specified.

An OUTER join will select all rows from the joined tables REGARDLESS of whether there is a match between the columns specified.

A LEFT join will select ALL records from the left half of the join and only return data from the right where there is a match.

A RIGHT join is the opposite.

So in your example, a left join would list all book, show who has read them and if some books are not read, there would be no match in the mapping table, so the analyst name result would be null.

In these situations, I find it easiest to write out all of the fields in the query and only when it works do I start removing the columns I don't need. This way, any errors like ambiguous columns are reported.

I did this:
select d.*, dam.*, da.*
from Documents d
left join Documents_Analyst_Mapping dam on d.id_doc = dam.id_doc
left join Documents_Analyst da on dam.id_analyst = da.id_analyst 

Open in new window


and in my data, no-one had read book #3 so I got this result:

 id_doc     title     language     id      id_doc     id_analyst     id_analyst     analyst_name    
 ---------  --------  -----------  ------  ---------  -------------  -------------  --------------- 
 1          Book 1    en           1       1          1              1              John            
 1          Book 1    en           2       1          2              2              Steve           
 1          Book 1    en           3       1          3              3              Mary            
 1          Book 1    en           4       1          4              4              Simon           
 2          Book 2    en           6       2          2              2              Steve           
 2          Book 2    en           7       2          3              3              Mary            
 3          book 3    en           (null)  (null)     (null)         (null)         (null)          
 4          Book 4    en           11      4          2              2              Steve           
 4          Book 4    en           12      4          4              4              Simon  

Open in new window


Then I filtered the output columns required:

select d.id_doc, d.title, da.analyst_name
from Documents d
left join Documents_Analyst_Mapping dam on d.id_doc = dam.id_doc
left join Documents_Analyst da on dam.id_analyst = da.id_analyst 

Open in new window


and I got this:

 
id_doc     title     analyst_name    
 ---------  --------  --------------- 
 1          Book 1    John            
 1          Book 1    Steve           
 1          Book 1    Mary            
 1          Book 1    Simon           
 2          Book 2    Steve           
 2          Book 2    Mary            
 3          book 3    (null)          
 4          Book 4    Steve           
 4          Book 4    Simon 

Open in new window

0
 
cdaeppenAuthor Commented:
I added GROUP_CONCAT(analyst.name) and GROUP BY docs.Id and I am getting a result that I was expecting :) Thanks everyone for your help.
0

Featured Post

Veeam Disaster Recovery in Microsoft Azure

Veeam PN for Microsoft Azure is a FREE solution designed to simplify and automate the setup of a DR site in Microsoft Azure using lightweight software-defined networking. It reduces the complexity of VPN deployments and is designed for businesses of ALL sizes.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now