Link to home
Start Free TrialLog in
Avatar of Big Monty
Big MontyFlag for United States of America

asked on

trouble with query

hello experts :)

This is a continuation from a previous question - https://www.experts-exchange.com/questions/27774164/need-some-help-adjusting-my-query.html. To summarize, I'm building a hockey website for a local league in my area and am currently working on getting the standings working properly.

thanks to the efforts of lowfatspread, i've come up with the query below, which I now need to add on to. I'm now trying to determine the number of regulation wins/losses/ties per team so that I can apply a points system based off of these numbers (3pts for a reg win, 2 pts for an OT win, 1 pt for an OT loss).

i have a table called tPlayerStats that keeps track of each goal scored by each player per game. In this table is a column called periodNum, which determines which period the goal was scored in. if the periodNum is equal to 4, its considered overtime.

right now, i'm getting the correct number of OTWins/OTLosses/OTTies (i think, I have fully tested it, but so far so good) but now my gamesPlayed column is way off, due to i'm joining the tPlayerStats table. Since there are multiple rows in tPlayerStats per game, its adding these rows into the count.

Can someone suggest a better way to accomplish what I want? Below is my code for the full query, the part that I believe is killing the count is:

	inner join tPlayerStats as ps
     on g.gameID = ps.gameID and ( g.homeTeam = ps.teamID or g.awayTeam = ps.teamID )

Open in new window


ALTER PROCEDURE [dbo].[sp_getStandings]
	@divisionID int,
	@season int,
	@gameType varchar(10)
AS
BEGIN
	SET NOCOUNT ON;

SELECT    *
FROM 
	(SELECT    t.Team, COUNT(*) AS GamesPlayed, t.color, captain = isNull( p.firstName + ' ' + p.lastName, 'N/A' ), 
					SUM(CASE 
								WHEN g.homeTeam = t.teamID AND g.homeScore > g.awayScore THEN 1  
								WHEN g.awayTeam = t.teamID AND g.homeScore < g.awayScore THEN 1 
								ELSE 0  
							END) AS Wins,  
					SUM(CASE 
								WHEN g.homeTeam = t.teamID AND g.homeScore < g.awayScore THEN 1  
								WHEN g.awayTeam = t.teamID AND g.homeScore > g.awayScore THEN 1  
								ELSE 0  
							END) AS Losses,  
					SUM(CASE 
								WHEN g.homeTeam = t.teamID AND g.homeScore > g.awayScore and divreq='Y' THEN 1  
								WHEN g.awayTeam = t.teamID  AND g.homeScore < g.awayScore and divreq='Y' THEN 1 
								ELSE 0  
							END) AS DivisionalWins,  
					SUM(CASE 
								WHEN g.homeTeam = t.teamID AND g.homeScore < g.awayScore and divreq='Y' THEN 1  
								WHEN g.awayTeam = t.teamID AND g.homeScore > g.awayScore and divreq='Y' THEN 1  
								ELSE 0  
							END) AS DivisionalLosses, 
				    SUM(CASE 
								WHEN g.homeScore = g.awayScore and divreq='Y' THEN 1  
								ELSE 0  
							END) AS DivisionalTies, 
					SUM(CASE 
								WHEN g.homeScore = g.awayScore THEN 1  
								ELSE 0  
							END) AS Ties,  
					SUM(CASE 
						    WHEN g.homeTeam = t.teamID THEN g.homeScore 
					        WHEN g.awayTeam = t.teamID THEN g.awayScore 
							ELSE 0 
                       END) as GoalsFor, 
					SUM(CASE 
						    WHEN g.homeTeam = t.teamID THEN g.awayScore 
					        WHEN g.awayTeam = t.teamID THEN g.homeScore 
							ELSE 0 
                       END) as GoalsAgainst,
					SUM(CASE
							WHEN g.homeTeam = t.teamID AND g.homeScore > g.awayScore AND ps.period < 4 THEN 1  
							WHEN g.awayTeam = t.teamID AND g.homeScore < g.awayScore AND ps.period < 4  THEN 1 
							ELSE 0
						END ) as regulationWins,
					SUM(CASE
							WHEN g.homeTeam = t.teamID AND g.homeScore > g.awayScore AND ps.period = 4 THEN 1  
							WHEN g.awayTeam = t.teamID AND g.homeScore < g.awayScore AND ps.period = 4  THEN 1 
							ELSE 0
						END ) as OTWins,
					SUM(CASE
							WHEN g.homeTeam = t.teamID AND g.homeScore < g.awayScore AND ps.period = 4 THEN 1  
							WHEN g.awayTeam = t.teamID AND g.homeScore > g.awayScore AND ps.period = 4  THEN 1 
							ELSE 0
						END ) as OTLosses
	FROM				
	    (select g.*,h.divisionid as homediv,a.divisionid as awaydiv
	           ,Case When h.divisionid=a.divisionid then 'Y' else 'N' end as samediv
	           ,Case When h.divisionid=a.divisionid and H.divisionid=@divisionid then 'Y' else 'N' end as divreq
	       from tGames  as g  
	        inner join tteams as H
	         on g.hometeam = h.teamid
	        inner join tteams as a
	         on g.awayteam = a.teamid
	    ) as g
	JOIN				tTeams t 
	  ON g.homeTeam = t.teamID 
	  OR g.awayTeam = t.teamID 
	LEFT OUTER JOIN     tPlayers p 
	  ON p.playerID = t.captainID 
	inner join tPlayerStats as ps
     on g.gameID = ps.gameID and ( g.homeTeam = ps.teamID or g.awayTeam = ps.teamID )
	WHERE				g.season = @season 
	and g.gameType = @gameType 
	AND t.divisionID = @divisionID 
	GROUP BY			t.Team, color, p.firstName, p.lastName) AS a
	ORDER BY			team ASC;
						
END

Open in new window

Avatar of David Todd
David Todd
Flag of New Zealand image

Hi,

I agree with your assessment that the join to player stats is killing the count.

In effect what is happening here is that your base query being okay, you are multiplying the number of rows by the number of players.

You are almost doing too much with the one query.

If necessary, store results in a temp table or two, update additional columns with other results, then select from the temp table for the final output.

Do watch the select * at the outer query - doesn't look good. All columns should be specified, all calculations should have a column alias, all columns should have a table alias.

Try this:
ALTER PROCEDURE [dbo].[sp_getStandings]
	@divisionID int,
	@season int,
	@gameType varchar(10)
AS
BEGIN
	SET NOCOUNT ON;

SELECT    *
FROM 
	(SELECT    t.Team, COUNT(*) AS GamesPlayed, t.color, captain = isNull( p.firstName + ' ' + p.lastName, 'N/A' ), 
					SUM(CASE 
								WHEN g.homeTeam = t.teamID AND g.homeScore > g.awayScore THEN 1  
								WHEN g.awayTeam = t.teamID AND g.homeScore < g.awayScore THEN 1 
								ELSE 0  
							END) AS Wins,
					SUM(CASE 
								WHEN g.homeTeam = t.teamID AND g.homeScore < g.awayScore THEN 1
								WHEN g.awayTeam = t.teamID AND g.homeScore > g.awayScore THEN 1  
								ELSE 0  
							END) AS Losses,  
					SUM(CASE 
								WHEN g.homeTeam = t.teamID AND g.homeScore > g.awayScore and divreq='Y' THEN 1  
								WHEN g.awayTeam = t.teamID  AND g.homeScore < g.awayScore and divreq='Y' THEN 1 
								ELSE 0  
							END) AS DivisionalWins,  
					SUM(CASE 
								WHEN g.homeTeam = t.teamID AND g.homeScore < g.awayScore and divreq='Y' THEN 1  
								WHEN g.awayTeam = t.teamID AND g.homeScore > g.awayScore and divreq='Y' THEN 1  
								ELSE 0  
							END) AS DivisionalLosses, 
				    SUM(CASE 
								WHEN g.homeScore = g.awayScore and divreq='Y' THEN 1  
								ELSE 0  
							END) AS DivisionalTies, 
					SUM(CASE 
								WHEN g.homeScore = g.awayScore THEN 1  
								ELSE 0  
							END) AS Ties,  
					SUM(CASE 
						    WHEN g.homeTeam = t.teamID THEN g.homeScore 
					        WHEN g.awayTeam = t.teamID THEN g.awayScore 
							ELSE 0 
                       END) as GoalsFor, 
					SUM(CASE 
						    WHEN g.homeTeam = t.teamID THEN g.awayScore 
					        WHEN g.awayTeam = t.teamID THEN g.homeScore 
							ELSE 0 
                       END) as GoalsAgainst,
					SUM(CASE
							WHEN g.homeTeam = t.teamID AND g.homeScore > g.awayScore AND ps.period < 4 THEN 1  
							WHEN g.awayTeam = t.teamID AND g.homeScore < g.awayScore AND ps.period < 4  THEN 1 
							ELSE 0
						END ) as regulationWins,
					SUM(CASE
							WHEN g.homeTeam = t.teamID AND g.homeScore > g.awayScore AND ps.period = 4 THEN 1  
							WHEN g.awayTeam = t.teamID AND g.homeScore < g.awayScore AND ps.period = 4  THEN 1 
							ELSE 0
						END ) as OTWins,
					SUM(CASE
							WHEN g.homeTeam = t.teamID AND g.homeScore < g.awayScore AND ps.period = 4 THEN 1  
							WHEN g.awayTeam = t.teamID AND g.homeScore > g.awayScore AND ps.period = 4  THEN 1 
							ELSE 0
						END ) as OTLosses
	FROM				
	    (select g.*,h.divisionid as homediv,a.divisionid as awaydiv
	           ,Case When h.divisionid=a.divisionid then 'Y' else 'N' end as samediv
	           ,Case When h.divisionid=a.divisionid and H.divisionid=@divisionid then 'Y' else 'N' end as divreq
	       from tGames  as g  
	        inner join tteams as H
	         on g.hometeam = h.teamid
	        inner join tteams as a
	         on g.awayteam = a.teamid
	    ) as g
	JOIN				tTeams t 
	  ON g.homeTeam = t.teamID 
	  OR g.awayTeam = t.teamID 
	LEFT OUTER JOIN     tPlayers p 
	  ON p.playerID = t.captainID 
	WHERE				g.season = @season
	and g.gameType = @gameType
	AND t.divisionID = @divisionID
	GROUP BY			t.Team, color, p.firstName, p.lastName) AS a
left outer join tPlayerStats as ps
     on ps.gameID = a.gameID and ( a.homeTeam = ps.teamID or a.awayTeam = ps.teamID )

	ORDER BY			a.team ASC;
						
END

Open in new window


Regards
   David
Is it possible to post some sample data (maybe an excel spreadsheet) reflecting where it is going wrong ?

The join isnt correct for your aggregations. A quick demo below will show that the tGames will be replicated and so will the scores.


create table #tGames (id int identity, hometeam varchar(20), awayteam varchar(20), homescore int, awayscore int)
create table #tTeams (id int identity, teamid varchar(20))

insert #tGames values ('H1','A1',3,1)
insert #tTeams values ('H1')
insert #tTeams values ('A1')

select * 
from #tgames g 
join #tteams t on g.hometeam = t.teamid or g.awayteam = t.teamid

Open in new window


The difficulty in using tGames would appear to be having both aspects of "Home" and "Away" on the same row

Ideally we normalise it a bit before we start to aggregate, or, maybe unpivot (or union etc).

Anyway, if you can provide some sample data, can definitely help fix.
Avatar of Big Monty

ASKER

dtodd:

I'm getting errors when I run your suggested fix:

Msg 4104, Level 16, State 1, Procedure sp_getStandings, Line 9
The multi-part identifier "ps.period" could not be bound.
Msg 4104, Level 16, State 1, Procedure sp_getStandings, Line 9
The multi-part identifier "ps.period" could not be bound.
Msg 4104, Level 16, State 1, Procedure sp_getStandings, Line 9
The multi-part identifier "ps.period" could not be bound.
Msg 4104, Level 16, State 1, Procedure sp_getStandings, Line 9
The multi-part identifier "ps.period" could not be bound.
Msg 4104, Level 16, State 1, Procedure sp_getStandings, Line 9
The multi-part identifier "ps.period" could not be bound.
Msg 4104, Level 16, State 1, Procedure sp_getStandings, Line 9
The multi-part identifier "ps.period" could not be bound.
Msg 207, Level 16, State 1, Procedure sp_getStandings, Line 85
Invalid column name 'gameID'.
Msg 207, Level 16, State 1, Procedure sp_getStandings, Line 85
Invalid column name 'homeTeam'.
Msg 207, Level 16, State 1, Procedure sp_getStandings, Line 85
Invalid column name 'awayTeam'.

Open in new window


if i was to break this down into multiple temp tables (i do agree, its getting a bit much to manage all in one query, what would be the best way to do it?

mark_willis:
tGames is just used as a general summary of each game, thats why home and away appear for each record, to capture each teams info. tPlayerStats records each goal and associated data) as an individual record. Would there be a better way to normalize that data?

i attached two files showing tGames and tPlayerStats
tGames.png
tPlayerStats.png
OK, so there should be an entry in tplayerstats for each and every goal ?
correct, each goal has its own record. The screen shot was simplified for readability :)
OK, got it...

Note that the problem is not just with the playerstats, but your join to teams (the second one).

I can see why you are doing it - to get discrete teamid's for the captain, team and color. But it is not quite correct - it will be duplicating certain aspects (as per my example above).

I will have a quick look  (it is 5:20am "downunder" so might fall asleep :D)
if i don't do any join on tPlayerStats and omit the columns from that table, everything gets returned as it should...

I'm just wondering if I should start creating a few temp tables and dump the info into them that way. Would that be better performance-wise, as I wouldnt have so many joins?

go get some sleep :)
Dont see a need for temp tables at this stage. Just sorting out the best way to tackle...
ASKER CERTIFIED SOLUTION
Avatar of Mark Wills
Mark Wills
Flag of Australia image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
this looks to be working great :) I'll do some more testing but so far so good!

2 questions:

- if I want to include teams that haven't played yet, would it be relatively easy to alter the query to allow for that?

- I want to add another column called points, this is a column thats calculated based upon the number of wins/ties/OT wins/ OT losses. do i need to create a temp table with the results of this table, then calculate the points column or is there a way to incorporate this column in the first select statement?

thanks again for all of your help on this!
Only 2 questions ? Phew *laughing*

1) Probably - but where do future draws live ?
2) Can do it directly without temp tables

And we dont really need the second link to tTeams either - it is already linked "inside" the union query. But will wait for feedback about the unplayed games.

As for points, we can incorporate - but the above indicates something more than the original comment : "3pts for a reg win, 2 pts for an OT win, 1 pt for an OT loss"
For points it is simply a matter of listing the requirements e.g.

       -- 3pts for a reg win, 2 pts for an OT win, 1 pt for an OT loss).
       , sum(case when OT = 'N' and wins = 1 then 3
                  when OT = 'Y' and wins = 1 then 2
                  when OT = 'Y' and losses = 1 then 1
             else 0 end) as points 

Open in new window

well, i always have lots of questions, just not that many related to this :)

in our league, there technically are no OT's, in the traditional sense of continued play, but rather it goes directly to a shootout of 5 players per team, and if the score is still tied then, its declared a tie. not my rules or preference, but since time is a precious commodity (we only have the gym we rent for so long) it makes sense.

this looks good, i'll test this out tonight and award the points right after. thanks so much for your assistance!
Hi Mark,

I had put this issue aside after my initial testing and never really tested with teams that haven't played yet. I recently came back to this page (with a much larger data set making it easier to test)  and noticed that we never got that part actually working. is this something you could help me with? I'll gladly open another question if you like for it, I figured I'd ask before in order to give you a chance to get the points.

Cheers
Josh
Hi Josh,

Sure can help, just need a bit of direction from you as to where we are now up to, and what we need to do next...

Just post the new question link here in this thread, or, post your next comment so we can move forward with whatever needs to happen to make you happy (from an SQL perspective that is :) )