Link to home
Create AccountLog in
Avatar of highlawn
highlawnFlag for United Kingdom of Great Britain and Northern Ireland

asked on

Database Query Assistance

I have a database that manages items for a sports club. There are the following tables present that I wish to query:

Member - has an autonumber ID and contains name details and other personal info
Player - if this member is a player then a row exists on the player table. The ID is an autonumber and there is a member_id column which contains the ID of the member associated with this player.

Selection - this table indicates when a player has been selected for a fixture. It has an id of the combination of fixture_id and player_id. It has no other relevant info.

Fixture - This table contains details of the fixture to be played (date, time, KO, opposition, location, referee etc.). It is keyed by an autonumber ID

The question I want to answer is "When was the last game for all players" and to show if they haven't yet been selected for a game.

So, I thought MEMBER JOIN PLAYER on player.member_id=member.id and PLAYER LEFT JOIN onto SELECTION player.id=selection.player_id and then SELECTION JOIN to FIXTURE on selection.fixture_id=fixture.id

The reasons being:

1. I ONLY want Members who ARE Players - so standard join
2. I want ALL Players - regardless of if they have been selected - so LEFT JOIN with player at the Left.
3. If there is a Selection - I want the Fixture - but I only want one fixture - which was the most recent one played.
4. There are 2 further columns on Fixture - STATUS which is an ENUM and can have values not played, played, cancelled WE ONLY WANT played and DATE which is a standard My SQL DATE field.

There are some other predicates needed - but the above is the nuts and bolts. For completeness, I only want ADULTS - so I check the Member Birth Date and I only want certain Teams (so I join FIXTURE with HOME_TEAM) and I want to know the title of the position by joining the FIXTURE with the POSITION table.

Below is the FULL query that I have come up with but it times out - possibly unsurprisingly - so my question is if there is a better way - even if it means duplicating things on tables.

SELECT m.membership_id , m.id , m.lastname , m.firstname , m.gender , 
m.date_of_birth , m.active , p.active , p.type , p.section_id , f.date , home_team.name , position.description  

FROM member m, player p
LEFT JOIN selection s ON s.player_id=player.id 
JOIN fixture f ON f.id=s.fixture_id 
JOIN home_team ON f.home_team_id=home_team.id 
AND home_team.id IN(1,2,4,5,6,7,281) 
JOIN position ON s.position_id=position.id

WHERE p.member_id=m.id 
AND (m.date_of_birth<"1993-09-01" OR date_of_birth="0000-00-00") 
AND m.gender="male"  

AND f.id=(SELECT substring_index(group_concat(fs.id ORDER BY fs.date DESC), ',', 1 ) FROM fixture fs, selection ss WHERE fs.status="played" 
AND fs.home_team_id IN(1,2,4,5,7,281) 
AND fs.id=ss.fixture_id AND ss.player_id=s.player_id )

GROUP BY p.id 
ORDER by p.active DESC, 
m.active DESC, 
f.date DESC

Open in new window


The idea behind the last predicate is to get the fixture whose date is the latest but was played in by the player.

Any help gratefully received.
Avatar of mkiredjian
mkiredjian
Flag of Lebanon image

You have to provide me the tables sql and sample data to help you but as a start:
 Start your select from the player table since you only want the players

Select m.membership_id , m.id , m.lastname , m.firstname , m.gender ,
m.date_of_birth , m.active , p.active , p.type , p.section_id , f.date , home_team.name , position.description  
from player p
inner join member m on p.member_id=m.id

the other things I need your tables sql and sample data to tell you what you can fix in this query


Avatar of highlawn

ASKER

Table structures added and data for the two small tables. I'll have to create an extract for the others and got to dash to a meeting now. Thanks tables.sql
One thing is I may not be able to load the member table as it contains real names and we have the DPA over here unless I can encrypt them. But the member table is not the real issue - it is the joining of data between player - selection and fixture that is the difficulty.

Highlawn I just need data so that your query give me some records I have executed your sql files. When I runed your query the result was table player does not exists please provide me the tables and some records that will return result for your query and I will provide you with an  optimized query
Sorry- I must have missed that one. Give me 30 minutes and I'll have them here. Like I said, we may need to remove member - but I'll see what I can do quickly.
give me some of memebers few players and few not players and change their real names in the sql file put anonymous1,anonymous2,anonymous3,anonymous4
OK - I've extracted the tables and data and messed around with the identifying data and removed a whole load of rows to allow a smaller subset. I've zipped them all into one file attached.

I ran my sql on it but the result isn't what I want - at first glance, but at least it doesn't time out!

I should say I made a minor error in copying my original sql above and the below is what I have been running.

SELECT m.membership_id , m.id , m.lastname , m.firstname , m.gender , 
m.date_of_birth , m.active , p.active , p.type , p.section_id , f.date , home_team.name , position.description  

FROM member m, player p
LEFT JOIN selection s ON s.player_id=p.id 
JOIN fixture f ON f.id=s.fixture_id 
JOIN home_team ON f.home_team_id=home_team.id 
AND home_team.id IN(1,2,4,5,6,7,281) 
JOIN position ON s.position_id=position.id

WHERE p.member_id=m.id 
AND (m.date_of_birth<"1993-09-01" OR date_of_birth="0000-00-00") 
AND m.gender="male"  

AND f.id=(SELECT substring_index(group_concat(fs.id ORDER BY fs.date DESC), ',', 1 ) FROM fixture fs, selection ss WHERE fs.status="played" 
AND fs.home_team_id IN(1,2,4,5,7,281) 
AND fs.id=ss.fixture_id AND ss.player_id=s.player_id )

GROUP BY p.id 
ORDER by p.active DESC, 
m.active DESC, 
f.date DESC

Open in new window


 selection-and-data.zip
Ok This is the most optimized query you can use try it


SELECT s.player_id,f.id,m.membership_id , m.id , m.lastname , m.firstname , m.gender ,
m.date_of_birth , m.active , p.active , p.type , p.section_id , f.date , home_team.name , position.description  
FROM player p
INNER JOIN member m ON m.id=p.member_id AND (m.date_of_birth<"1993-09-01" OR date_of_birth="0000-00-00") AND m.gender="male"
LEFT JOIN selection s ON s.player_id=p.id
JOIN fixture f ON f.id=s.fixture_id
AND f.home_team_id IN(1,2,4,5,7,281) AND f.status="played"
INNER JOIN fixture ff ON ff.id IN (f.id) AND ff.date=(SELECT MAX(DATE) FROM fixture WHERE fixture.id IN(f.id))
JOIN home_team ON f.home_team_id=home_team.id
JOIN POSITION ON s.position_id=position.id
WHERE 1
GROUP BY p.id
ORDER BY p.active DESC,
m.active DESC,
f.date DESC
OK - many thanks for that. It certainly matches all the players with their last selections based upon the ones that I have checked. My worry is that it doesn't seem to show all the players though.

What I wanted - and the reason for the left join on player to selection - was to show ALL players - including those who haven't yet been selected for any fixtures.

I haven't checked this thoroughly yet - just there are no players listed without selection and fixture data.

On my tables there are:

3188 members
1351 players

Of these 1351 players, 571 of them qualify based upon the extract of membership info - i.e. they have gender="male" and their DOB is either unset (0000-00-00) or is before "1993-09-01".

So, I guess I'm expecting a list of 571 players, of which 259 (the number returned from your query) show the matching selection, fixture, home_team and position data, whilst the remaining (non matched to selection) 311 players have null/zero/spaces returned instead.

Am I expecting the wrong things here?

Many Thanks.
I've just checked some of the players where I know their games and it is picking up the wrong fixture details.

An example using the data that I gave you.

Player id 88 in your query using the data that I gave you appears in the second page of the 58 returned rows - about half way down. He is shown as having last played in fixture 3884 on 2011-08-13 whereas he also played in fixture 3865 which took place on 2011-10-29. There are others - but I'm sure the example serves.

Thanks for your assistance.
The thing is that "left join on player to selection " has no meaning since later you are using

JOIN fixture f ON f.id=s.fixture_id  this will remove the players that don't have fixtures you are expecting wrong things here in my opinion
Possibly so - I sort of want to LEFT JOIN player to the already JOINed selection/fixture/home_team/position.

As I said in the initial post what I want to achieve is:

The question I want to answer is "When was the last game for all players" and to show if they haven't yet been selected for a game.
.

So - in English or pseudo code I want:

All players who have member rows with gender="male" and have a date of birth either before 1st Sep 20110 OR the DOB is unset.

This should be the ONLY criteria for listing the player and member data.

Then, I wish to augment that data with details of the LAST (date wise) fixture they played in. In order to do this, there must be  matching player_id on selection and that must be onward joined to fixture. Then, the minor detail of adding on the home team name and position description.

Possibly I need to approach it differently.
Try this

SELECT s.player_id,f.id,m.membership_id , m.id , m.lastname , m.firstname , m.gender ,
m.date_of_birth , m.active , p.active , p.type , p.section_id , f.date , home_team.name , position.description  
FROM player p
INNER JOIN member m ON m.id=p.member_id AND (m.date_of_birth<"1993-09-01" OR date_of_birth="0000-00-00") AND m.gender="male"
LEFT JOIN selection s ON s.player_id=p.id
LEFT JOIN fixture f ON f.id=s.fixture_id
AND f.home_team_id IN(1,2,4,5,7,281) AND f.status="played"
LEFT JOIN fixture ff ON ff.id IN (f.id) AND ff.date=(SELECT MAX(DATE) FROM fixture WHERE fixture.id IN(f.id))
LEFT JOIN home_team ON f.home_team_id=home_team.id
JOIN POSITION ON s.position_id=position.id
WHERE 1
GROUP BY p.id
ORDER BY p.active DESC,
m.active DESC,
f.date DESC
OK - that returns 306 rows and, some of them do look like they have NULL entries in them. However, the data is not always correct and there aren't enough rows.

I've been doing some reading on this and I'm not sure it is possible to do what I wish. The crux of the query is "to find the last date this player took part in a fixture". If we - for a moment - ignore everything else we can see that we can ONLY get the date from the fixture table. It doesn't exist anywhere else.

So, if we just queried fixture, we could get a series of fixtures - for the teams in question. However, if we constrain that date in any way - MAX ing it for instance - we are only talking about getting the last date that TEAM had a fixture - being as player is not part of the fixture data.

If we then introduce player - he may not match AT ALL because he didn't play in that chronologically last game for the team.

Now, I do see that you are only constraining the date by the results of ids and not the team - but I haven't quite got my head round your double join of fixture and using it a third time in the subselect!

BTW POSITION in your query should be position and I think that should be LEFT JOINed as well in the latest example.

Cheers
ASKER CERTIFIED SOLUTION
Avatar of mkiredjian
mkiredjian
Flag of Lebanon image

Link to home
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
See answer
Like I said in the previous post - I'm not sure we can do what I'm trying to do. So far, the query doesn't give the last game in every instance.

In some instances it gives values in the position.description with null values in the team and fixture data - which is very strange as I can't see how there can be a selection without a team or a fixture.

Also, some rows are giving null values in the player id and fixture id - which is incorrect. If they are a player then they have an id - same for fixture. In the same row - it has a value for player type which can only be when there is a player row - and that MUST have an id.

So, like I said - what we currently have doesn't do what I want. My guess is that it is the GROUP BY player.id as when I read details on GROUP BY it suggested that the one it picked was pretty random.

I've been a bit busy, but I was thinking of writing a PHP script and breaking the query down somewhat.

So, the bottom line is that this query doesn't do what I need it to.
The help was good, but we couldn't really solve the underlying problem. I've done things in a PHP script now in several queries so this can be closed.