MySQL Join, Left Join unnexpected result

Hey,

I seem to be doing something wrong in one of my queries. There are 3 tables:

-Documents
-Documents_Analyst_Mapping
-Documents_Analyst


Table structure simplified

Documents:
id | id_doc | title

Documents_Analyst_Mapping:
id_doc | analyst_id

Documents_Analyst
analyst_id | analyst_name

I am trying to display a result that looks like this:
id | id_doc | title | analyst_name


My query is bellow and it gives me more rows than expected (I guess I am making up rows as I go along...) unfortunatly I am not so great with left joins, any help is much appreciated!


SELECT docs.ida, docs.Id_Document, docs.Title, analyst.name AS Analyst_Name
FROM Documents AS docs

LEFT JOIN Documents_Analyst_Mapping AS analyst_map ON docs.Id_Document = analyst_map.Id_Document

LEFT JOIN Documents_Analyst AS analyst ON analyst_map.Analyst_Id = analyst.Analyst_Id

WHERE analyst_map.Analyst_Id = analyst.Analyst_Id
AND docs.Language = 'en'

Open in new window

cdaeppenAsked:
Who is Participating?
 
mkobrinConnect With a Mentor Commented:
it looks to me that you should actually be using an inner join here, because the ID fields will never be null, so I would do it as follows:


SELECT docs.id, docs.Id_Doc, docs.Title, analyst.Analyst_Name
FROM Documents AS docs

INNER JOIN Documents_Analyst_Mapping AS analyst_map ON analyst_map.Id_Doc = docs.Id_Doc

INNER JOIN Documents_Analyst AS analyst ON analyst.Analyst_Id = analyst_map.Analyst_Id

WHERE docs.Language = 'en'

Open in new window

0
 
KalpanCommented:
please modify your query

SELECT docs.ida, docs.Id_Document, docs.Title, analyst.name AS Analyst_Name
FROM Documents AS docs

LEFT JOIN Documents_Analyst_Mapping AS analyst_map ON docs.Id_Document = analyst_map.Id_Document

LEFT JOIN Documents_Analyst AS analyst ON analyst_map.Analyst_Id = analyst.Analyst_Id

WHERE docs.Language = 'en'
0
 
McNeticCommented:
In which way do you get more results than expected? You should get one row for each row in document_analyst_mapping. Btw, the "WHERE" clause is unnecessary and should anyways not be used to restrict what is joined.
0
What Kind of Coding Program is Right for You?

There are many ways to learn to code these days. From coding bootcamps like Flatiron School to online courses to totally free beginner resources. The best way to learn to code depends on many factors, but the most important one is you. See what course is best for you.

 
cdaeppenAuthor Commented:
@ kalmax: I've had this and it returns a lot more records than:
SELECT count(*) FROM Documents WHERE Language = "en" <- which I am using as my control

@McNetic: "You should get one row for each row in document_analyst_mapping" -> I am trying to get one row for each row in Documents, that's the point, I want to have the author name in my result next to the document title
0
 
rmonsenCommented:
Remember that this query will create one row for each valid document <-> analyst combination, so if a document has more than one analyst, you will get more than one row for each document, which means that row count will and should be larger than your "control" query.
As McNetic correctly stated, you should expect one row for each row in document_analyst_mapping.

If you only want one row for each documents, you need to explain how you wish documents with more than one analyst to be handled.
0
 
cdaeppenAuthor Commented:
Thanks mkobrin and rmonsen! I think mkobrin has the right idea and rmonsen made me think that maybe there are multible analysts per document giving me the funny result (though only one should be possible the tables are very old and have well over 20k records....)
0
 
cdaeppenAuthor Commented:
So it turns out there can be more than 1 analyst per document, I guess group_concat is what I am looking for here?
0
 
MurfurConnect With a Mentor Full Stack DeveloperCommented:
There are basically, two types of join - INNER and OUTER.
LEFT and RIGHT joins merely dictate which way to process the results of an OUTER join.

An INNER join will select all rows from the joined tables ONLY where there is a match between the columns specified.

An OUTER join will select all rows from the joined tables REGARDLESS of whether there is a match between the columns specified.

A LEFT join will select ALL records from the left half of the join and only return data from the right where there is a match.

A RIGHT join is the opposite.

So in your example, a left join would list all book, show who has read them and if some books are not read, there would be no match in the mapping table, so the analyst name result would be null.

In these situations, I find it easiest to write out all of the fields in the query and only when it works do I start removing the columns I don't need. This way, any errors like ambiguous columns are reported.

I did this:
select d.*, dam.*, da.*
from Documents d
left join Documents_Analyst_Mapping dam on d.id_doc = dam.id_doc
left join Documents_Analyst da on dam.id_analyst = da.id_analyst 

Open in new window


and in my data, no-one had read book #3 so I got this result:

 id_doc     title     language     id      id_doc     id_analyst     id_analyst     analyst_name    
 ---------  --------  -----------  ------  ---------  -------------  -------------  --------------- 
 1          Book 1    en           1       1          1              1              John            
 1          Book 1    en           2       1          2              2              Steve           
 1          Book 1    en           3       1          3              3              Mary            
 1          Book 1    en           4       1          4              4              Simon           
 2          Book 2    en           6       2          2              2              Steve           
 2          Book 2    en           7       2          3              3              Mary            
 3          book 3    en           (null)  (null)     (null)         (null)         (null)          
 4          Book 4    en           11      4          2              2              Steve           
 4          Book 4    en           12      4          4              4              Simon  

Open in new window


Then I filtered the output columns required:

select d.id_doc, d.title, da.analyst_name
from Documents d
left join Documents_Analyst_Mapping dam on d.id_doc = dam.id_doc
left join Documents_Analyst da on dam.id_analyst = da.id_analyst 

Open in new window


and I got this:

 
id_doc     title     analyst_name    
 ---------  --------  --------------- 
 1          Book 1    John            
 1          Book 1    Steve           
 1          Book 1    Mary            
 1          Book 1    Simon           
 2          Book 2    Steve           
 2          Book 2    Mary            
 3          book 3    (null)          
 4          Book 4    Steve           
 4          Book 4    Simon 

Open in new window

0
 
cdaeppenAuthor Commented:
I added GROUP_CONCAT(analyst.name) and GROUP BY docs.Id and I am getting a result that I was expecting :) Thanks everyone for your help.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.