# Continuing the Question on Cleaning up data

After testing Further to Question on Cleaning up data the solution here its almost correct.

Here is the returned data for season 5 and season 6

Season 5
id      ClubId      TeamId      FirstName      Surname      TeamPosition      status      position      showit      Cost       seasonid
24417      2      47      Adam      Hills      2      1      Defenders      0      NULL      NULL
24371      2      47      Andrew      Clarke      4      1      Strikers      0      80      5
24377      2      47      Bobby      Bloggs      4      1      Strikers      0      70      5
24420      2      47      David      Cranbrook      3      1      Midfielders      0      NULL      NULL
24423      2      47      Jason      Admin      4      1      Strikers      0      NULL      NULL
24405      2      47      Jason      Batt      4      1      Strikers      0      NULL      NULL
24395      2      47      Philip      Batt      1      1      Goalkeepers      0      NULL      NULL
24376      2      47      Stewart      Clifton      1      1      Goalkeepers      0      NULL      NULL

Season 6
id      ClubId      TeamId      FirstName      Surname      TeamPosition      status      position      showit      Cost       seasonid
24417      2      47      Adam      Hills      2      1      Defenders      0      NULL      NULL
24420      2      47      David      Cranbrook      3      1      Midfielders      0      NULL      NULL
24423      2      47      Jason      Admin      4      1      Strikers      0      NULL      NULL
24405      2      47      Jason      Batt      4      1      Strikers      0      NULL      NULL
24395      2      47      Philip      Batt      1      1      Goalkeepers      0      NULL      NULL
24376      2      47      Stewart      Clifton      1      1      Goalkeepers      0      NULL      NULL

As I know I have no entries for season 6 in the 2nd table ie the one with the cost and seasonid, I need  to see ALL 8 records with NULL's for Cost and seasonid not just the entries that are NULL

How do I tweak this?
###### Who is Participating?

Software DeveloperCommented:
There is one simple query, that'll give you all 8 teammembers always, and then join in more info:

SELECT * FROM teammembers LEFT JOIN ...

So a way to see costs for season 6 would be

SELECT * FROM teammembers t LEFT JOIN costs c on t.id = c.teammemberid AND c.seasonid=6

Once you put the condition seasonid=6 into the WHERE clause of the overall query you force a join, you make the left join an inner join, and you're only left with the team members having a join with seasonid=6 and other team members have seasinid=NULL, which doesn't match with any condition.

In the WHERE clause of any query having OUTER joins (like LEFT, RIGHT), you can filter columns in the table you query all data, not in the table you join losely, that's never working.

Besides all this, your data points out errors in the position of teammmebers, it contradicts the id and its meaning according to static_position.

Bye, Olaf.
0

Software DeveloperCommented:
What does this give you?

SELECT d.id,d.ClubId, d.TeamId, d.FirstName, d.Surname, d.TeamPosition, d.status, d.Seasonid, c.position, d.showit
, t.cost
FROM (SELECT * , ROW_NUMBER() OVER (PARTITION BY firstname, surname, teamposition
ORDER BY ID DESC) AS rn
FROM teammembers
) AS d
join Static_Position as c
on 1=1

I'd assume this'll give you 6 records for season 6.

The "left side" of the data already has to have the 8 records you intend to have. The query of angelIII is not limiting data of the last joined table3 to only null records, the nulls also don't come from that table, they come from finding no join.

Bye, Olaf.
0

Author Commented:
It repeats the data 5 times one row for each position (which comes from the position table (Static_Position)
id      ClubId      TeamId      FirstName      Surname      TeamPosition      status      Seasonid      position      showit
24377      2      47      Bobby      Bloggs      4      1      6      Goalkeepers      0
24377      2      47      Bobby      Bloggs      4      1      6      Defenders      0
24377      2      47      Bobby      Bloggs      4      1      6      Midfielders      0
24377      2      47      Bobby      Bloggs      4      1      6      Strikers      0
24377      2      47      Bobby      Bloggs      4      1      6      Management      0
24372      2      47      fred      blogs      1      1      NULL      Goalkeepers      1
24372      2      47      fred      blogs      1      1      NULL      Defenders      1
24372      2      47      fred      blogs      1      1      NULL      Midfielders      1
24372      2      47      fred      blogs      1      1      NULL      Strikers      1
24372      2      47      fred      blogs      1      1      NULL      Management      1
24375      2      47      fred      blogs      5      1      NULL      Goalkeepers      1
24375      2      47      fred      blogs      5      1      NULL      Defenders      1
24375      2      47      fred      blogs      5      1      NULL      Midfielders      1
24375      2      47      fred      blogs      5      1      NULL      Strikers      1
24375      2      47      fred      blogs      5      1      NULL      Management      1
0

Software DeveloperCommented:
Then I don't see where the different first and surnames came from, you show in your question.

Anyway, you can only get joined data matching this part of the query. What angelIII did was Left join table3 t on t.teammemberid = d.id, so that join is on the first field with values 24377, 24372, 24375

I don't see how and why you assume you'll ever get 8 joins. There are just 5 rows woth seasonid 6 and only one id value 24377.

How is the data really related, what is the raw data from each table itself alone and how is it to be joined? It doesn't get clear.

Bye, Olaf.
0

Author Commented:
To try and explain what I'm doing a friend helps at a football club and is trying to work out what the players owe the club his database has 3 tables, Teammembers, Static_position and costs.  To confuse matters the season was originally in teammembers table BUT this meant they were recreating an entry for each player and season and they didn't always fill it out so lots of nulls.

They now want to create a table that for any season shows whose paid and who hasn't. The complications are that the teammembers table has effectively duplicate entries hence the use of RowNumbers, and the Costs tables doesn't have all the data (the reason for the joins ) so they can for every season see whose paid and who hasn't. I.e. next season they will have initially all the players and no one having paid so no entries in the costs table and they want to show this.

Hopefully the sample data from the attached excel spreadsheets shows this
SeasonJoin.xlsx
0

Software DeveloperCommented:
OK, I can see better now, but have not the final idea about the meaning of this.

You don't have any record in teammembers for seasonid 6, so what do you expect?

Actually it could even be vice versa, if the connection of teammembers and seasons is done in the costs table, there should be no seasonid column in teammembers, to get the data of one season you would join via costs and filter for costs.seasonid=6.

I would expect only 4 joins, though, since costs only has 4 rows for seasonid=6.

I don't know where to begin with. Do you have an even longer question history about this data? The previous question you join to also doesn't show enough explanation to me.

Bye, Olaf.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.