Solved

Continuing the Question on Cleaning up data

Posted on 2016-07-24
6
50 Views
Last Modified: 2016-07-24
After testing Further to Question on Cleaning up data the solution here its almost correct.

Here is the returned data for season 5 and season 6

Season 5
id      ClubId      TeamId      FirstName      Surname      TeamPosition      status      position      showit      Cost       seasonid
24417      2      47      Adam      Hills      2      1      Defenders      0      NULL      NULL
24371      2      47      Andrew      Clarke      4      1      Strikers      0      80      5
24377      2      47      Bobby      Bloggs      4      1      Strikers      0      70      5
24420      2      47      David      Cranbrook      3      1      Midfielders      0      NULL      NULL
24423      2      47      Jason      Admin      4      1      Strikers      0      NULL      NULL
24405      2      47      Jason      Batt      4      1      Strikers      0      NULL      NULL
24395      2      47      Philip      Batt      1      1      Goalkeepers      0      NULL      NULL
24376      2      47      Stewart      Clifton      1      1      Goalkeepers      0      NULL      NULL
                                                            
Season 6                                                            
id      ClubId      TeamId      FirstName      Surname      TeamPosition      status      position      showit      Cost       seasonid
24417      2      47      Adam      Hills      2      1      Defenders      0      NULL      NULL
24420      2      47      David      Cranbrook      3      1      Midfielders      0      NULL      NULL
24423      2      47      Jason      Admin      4      1      Strikers      0      NULL      NULL
24405      2      47      Jason      Batt      4      1      Strikers      0      NULL      NULL
24395      2      47      Philip      Batt      1      1      Goalkeepers      0      NULL      NULL
24376      2      47      Stewart      Clifton      1      1      Goalkeepers      0      NULL      NULL

As I know I have no entries for season 6 in the 2nd table ie the one with the cost and seasonid, I need  to see ALL 8 records with NULL's for Cost and seasonid not just the entries that are NULL

How do I tweak this?
0
Comment
Question by:markej
  • 4
  • 2
6 Comments
 
LVL 29

Expert Comment

by:Olaf Doschke
ID: 41726524
What does this give you?

SELECT d.id,d.ClubId, d.TeamId, d.FirstName, d.Surname, d.TeamPosition, d.status, d.Seasonid, c.position, d.showit
, t.cost
FROM (SELECT * , ROW_NUMBER() OVER (PARTITION BY firstname, surname, teamposition
                             ORDER BY ID DESC) AS rn
FROM teammembers
) AS d
join Static_Position as c
on 1=1

Open in new window


I'd assume this'll give you 6 records for season 6.

The "left side" of the data already has to have the 8 records you intend to have. The query of angelIII is not limiting data of the last joined table3 to only null records, the nulls also don't come from that table, they come from finding no join.

Bye, Olaf.
0
 

Author Comment

by:markej
ID: 41726568
It repeats the data 5 times one row for each position (which comes from the position table (Static_Position)
id      ClubId      TeamId      FirstName      Surname      TeamPosition      status      Seasonid      position      showit
24377      2      47      Bobby      Bloggs      4      1      6      Goalkeepers      0
24377      2      47      Bobby      Bloggs      4      1      6      Defenders      0
24377      2      47      Bobby      Bloggs      4      1      6      Midfielders      0
24377      2      47      Bobby      Bloggs      4      1      6      Strikers      0
24377      2      47      Bobby      Bloggs      4      1      6      Management      0
24372      2      47      fred      blogs      1      1      NULL      Goalkeepers      1
24372      2      47      fred      blogs      1      1      NULL      Defenders      1
24372      2      47      fred      blogs      1      1      NULL      Midfielders      1
24372      2      47      fred      blogs      1      1      NULL      Strikers      1
24372      2      47      fred      blogs      1      1      NULL      Management      1
24375      2      47      fred      blogs      5      1      NULL      Goalkeepers      1
24375      2      47      fred      blogs      5      1      NULL      Defenders      1
24375      2      47      fred      blogs      5      1      NULL      Midfielders      1
24375      2      47      fred      blogs      5      1      NULL      Strikers      1
24375      2      47      fred      blogs      5      1      NULL      Management      1
0
 
LVL 29

Expert Comment

by:Olaf Doschke
ID: 41726580
Then I don't see where the different first and surnames came from, you show in your question.

Anyway, you can only get joined data matching this part of the query. What angelIII did was Left join table3 t on t.teammemberid = d.id, so that join is on the first field with values 24377, 24372, 24375

I don't see how and why you assume you'll ever get 8 joins. There are just 5 rows woth seasonid 6 and only one id value 24377.

How is the data really related, what is the raw data from each table itself alone and how is it to be joined? It doesn't get clear.

Bye, Olaf.
0
DevOps Toolchain Recommendations

Read this Gartner Research Note and discover how your IT organization can automate and optimize DevOps processes using a toolchain architecture.

 

Author Comment

by:markej
ID: 41726595
To try and explain what I'm doing a friend helps at a football club and is trying to work out what the players owe the club his database has 3 tables, Teammembers, Static_position and costs.  To confuse matters the season was originally in teammembers table BUT this meant they were recreating an entry for each player and season and they didn't always fill it out so lots of nulls.

They now want to create a table that for any season shows whose paid and who hasn't. The complications are that the teammembers table has effectively duplicate entries hence the use of RowNumbers, and the Costs tables doesn't have all the data (the reason for the joins ) so they can for every season see whose paid and who hasn't. I.e. next season they will have initially all the players and no one having paid so no entries in the costs table and they want to show this.

Hopefully the sample data from the attached excel spreadsheets shows this
SeasonJoin.xlsx
0
 
LVL 29

Expert Comment

by:Olaf Doschke
ID: 41726602
OK, I can see better now, but have not the final idea about the meaning of this.

You don't have any record in teammembers for seasonid 6, so what do you expect?

Actually it could even be vice versa, if the connection of teammembers and seasons is done in the costs table, there should be no seasonid column in teammembers, to get the data of one season you would join via costs and filter for costs.seasonid=6.

I would expect only 4 joins, though, since costs only has 4 rows for seasonid=6.

I don't know where to begin with. Do you have an even longer question history about this data? The previous question you join to also doesn't show enough explanation to me.

Bye, Olaf.
0
 
LVL 29

Accepted Solution

by:
Olaf Doschke earned 500 total points
ID: 41726607
There is one simple query, that'll give you all 8 teammembers always, and then join in more info:

SELECT * FROM teammembers LEFT JOIN ...

Open in new window


So a way to see costs for season 6 would be

SELECT * FROM teammembers t LEFT JOIN costs c on t.id = c.teammemberid AND c.seasonid=6

Open in new window


Once you put the condition seasonid=6 into the WHERE clause of the overall query you force a join, you make the left join an inner join, and you're only left with the team members having a join with seasonid=6 and other team members have seasinid=NULL, which doesn't match with any condition.

In the WHERE clause of any query having OUTER joins (like LEFT, RIGHT), you can filter columns in the table you query all data, not in the table you join losely, that's never working.

Besides all this, your data points out errors in the position of teammmebers, it contradicts the id and its meaning according to static_position.

Bye, Olaf.
0

Featured Post

Efficient way to get backups off site to Azure

This user guide provides instructions on how to deploy and configure both a StoneFly Scale Out NAS Enterprise Cloud Drive virtual machine and Veeam Cloud Connect in the Microsoft Azure Cloud.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Everyone has problem when going to load data into Data warehouse (EDW). They all need to confirm that data quality is good but they don't no how to proceed. Microsoft has provided new task within SSIS 2008 called "Data Profiler Task". It solve th…
Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
Via a live example, show how to backup a database, simulate a failure backup the tail of the database transaction log and perform the restore.
Via a live example, show how to shrink a transaction log file down to a reasonable size.

914 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now