CraigBFG
asked on
Personal best times, by Age & Distance
This question follows on from a previous which was expertly solved, but i've now made changes to the db and requirements, it is a little more complex.
https://www.experts-exchange.com/questions/24525039/Top-3-times-per-group-multiple-tables.html
The db is for a running club. I need to return the top 3 personal best times, by distance and age category. The query that we have working produces the results, but is VERY slow to produce them - so the question is, what can be done to improve the queries efficiency - have we got it very wrong?
My top 3 personal bests, I refer to individuals. For example, one person alone may hold the top 3 best times themselves, but for this exercise, the only their best counts,
eg
Paul G - 0:20:00
Paul G - 0:20:21
PaulG - 0:20:22
FredB - 0:20:35
FredB - 0:20:40
GaryC - 0:21:00
So the top 3 in this case needs to show PaulG - 0:20:00 / FredB - 0:20:35 / GaryC - 0:21:00
(Gold, silver & bronze positions)
So as a grid, the results should look something like ...
--> Distance (10k)
------> RaceCat (35) - this is an age (35/40/45/50 etc)
------------->Pos1 / Pos2 / Pos3
I have attached our db for real world reference - the results itself table has over 23000 records.
Many thanks in advance for assistance.
Craig
HHdB.zip
https://www.experts-exchange.com/questions/24525039/Top-3-times-per-group-multiple-tables.html
The db is for a running club. I need to return the top 3 personal best times, by distance and age category. The query that we have working produces the results, but is VERY slow to produce them - so the question is, what can be done to improve the queries efficiency - have we got it very wrong?
My top 3 personal bests, I refer to individuals. For example, one person alone may hold the top 3 best times themselves, but for this exercise, the only their best counts,
eg
Paul G - 0:20:00
Paul G - 0:20:21
PaulG - 0:20:22
FredB - 0:20:35
FredB - 0:20:40
GaryC - 0:21:00
So the top 3 in this case needs to show PaulG - 0:20:00 / FredB - 0:20:35 / GaryC - 0:21:00
(Gold, silver & bronze positions)
So as a grid, the results should look something like ...
--> Distance (10k)
------> RaceCat (35) - this is an age (35/40/45/50 etc)
------------->Pos1 / Pos2 / Pos3
I have attached our db for real world reference - the results itself table has over 23000 records.
Many thanks in advance for assistance.
Craig
HHdB.zip
ASKER
thanks, but the query has moved on somewhat since the accepted example in the previous question. If you take a look at the attached db - the remaining query is the one which we are working with presently.
thanks.
thanks.
ASKER
Hi all
As this is the final step in our project - its really important to us. As such, a further 1,000 points will be made available to the accepted solution.
thanks
C.
As this is the final step in our project - its really important to us. As such, a further 1,000 points will be made available to the accepted solution.
thanks
C.
ASKER
cheers for the heads up - not used ee for a while, things have changed :-)
anyhow, the comment I put is still valid, in general.
any condition (join . ON <condition>) is where you have to look at.
for example:
tblDist INNER JOIN tblRaces ON tblDist.DistID = tblRaces.DistID
ensure that you have a index on DistID for table tblRaces
WHERE (((tblDist.HoF)=True) AND ((tblRunners.RunnerClub)=" Hermitage" ) AND ((tblRunners.RunnerGender) ="M") AND ((tblResults.ResultID)
ensure you have 1 index on the fields RunnerClub + ResultID
the fields Hof and RunnderGender are not really "selective", so won't help for the index/query
the full sql:
any condition (join . ON <condition>) is where you have to look at.
for example:
tblDist INNER JOIN tblRaces ON tblDist.DistID = tblRaces.DistID
ensure that you have a index on DistID for table tblRaces
WHERE (((tblDist.HoF)=True) AND ((tblRunners.RunnerClub)="
ensure you have 1 index on the fields RunnerClub + ResultID
the fields Hof and RunnderGender are not really "selective", so won't help for the index/query
the full sql:
SELECT tblRaces.DistID, tblRaces.RaceName, tblResults.RaceCat, tblRaceDates.RaceDate, tblResults.RaceTime, tblDist.HoF, tblRunners.RunnerClub, tblRunners.RunnerGender, tblRunners.RunnerName
FROM tblRunners INNER JOIN ((tblDist INNER JOIN tblRaces ON tblDist.DistID = tblRaces.DistID) INNER JOIN (tblRaceDates INNER JOIN tblResults ON tblRaceDates.RaceDateID = tblResults.RaceDateID) ON tblRaces.RaceID = tblRaceDates.RaceID) ON tblRunners.RunnerKey = tblResults.RunnerKey
WHERE (((tblDist.HoF)=True) AND ((tblRunners.RunnerClub)="Hermitage") AND ((tblRunners.RunnerGender)="M") AND ((tblResults.ResultID) In
(
SELECT TOP 3 d.ResultID FROM tblRunners AS e INNER JOIN ((tblDist AS a INNER JOIN tblRaces AS b ON a.DistID = b.DistID) INNER JOIN (tblRaceDates AS c INNER JOIN tblResults AS d ON c.RaceDateID = d.RaceDateID) ON b.RaceID = c.RaceID) ON e.RunnerKey = d.RunnerKey
WHERE (d.RaceCat = tblResults.RaceCat) AND (a.DistID = tblDist.DistID) AND ((e.RunnerClub)="hermitage") AND ((e.RunnerGender)="m") AND ((a.HoF)=True) AND (((d.ResultID) In
(
SELECT TOP 1 i.ResultID FROM ((tblDist AS f INNER JOIN tblRaces AS g ON f.DistID = g.DistID) INNER JOIN (tblRaceDates AS h INNER JOIN tblResults AS i ON h.RaceDateID = i.RaceDateID) ON g.RaceID = h.RaceID) INNER JOIN tblRunners j ON i.RunnerKey = j.RunnerKey
WHERE (i.RaceCat = tblResults.RaceCat) AND (f.DistID = tblDist.DistID) AND (j.RunnerKey = e.RunnerKey) AND ((j.RunnerClub)="hermitage") AND ((j.RunnerGender)="m") AND ((f.HoF)=True)
ORDER BY f.PaceFactor, i.RaceCat, i.RaceTime )
)
)
ORDER BY a.PaceFactor, d.RaceCat, d.RaceTime )))
ORDER BY tblDist.PaceFactor, tblResults.RaceCat, tblResults.RaceTime;
ASKER
thanks - the indexes were inplace, but I added one against "RunnerClub". There was perhaps a 2 sec inprovement, but it still takes 25 secs to resolve the results - which is far too long.
q: is this a standalone ms access db, or does it actually have a ms sql server db behind?
if it has a sql server behind, I would suggest we solve the issue in mssql server (with a view), and the ms access shows the data from the view.
I know that with sql 2005, this "top 3" subquery can be improved a lot using the row_number() function...
if it has a sql server behind, I would suggest we solve the issue in mssql server (with a view), and the ms access shows the data from the view.
I know that with sql 2005, this "top 3" subquery can be improved a lot using the row_number() function...
ASKER
it is presently a MDB which will most likely be upsized to MS SQL 2000 or possibly 2008 (buit not 2005). Whilst we are "messing" with the raw data, I find Access a better tool.
with the new index and a compacted db, the query takes 19 secs to run.
with the new index and a compacted db, the query takes 19 secs to run.
>I find Access a better tool.
please create the db in sql server anyhow, and then use a ms access project to simply interface with the existing sql database (access 2007 can't do that, though)
you will have the full power of sql from ms sql server, while with the ms access interface it's indeed much easier to "mess" with the data.
please create the db in sql server anyhow, and then use a ms access project to simply interface with the existing sql database (access 2007 can't do that, though)
you will have the full power of sql from ms sql server, while with the ms access interface it's indeed much easier to "mess" with the data.
ASKER
Ok
I've upsized the MDB to SQL2008 - and have run the same query - and the performance is no better - so, what is the next step.??
I've upsized the MDB to SQL2008 - and have run the same query - and the performance is no better - so, what is the next step.??
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Hi
Ok, well that returns the same list as I have previously in Access & SQL - without the grouping in place - and runs a quickly.
SELECT TOP (100) PERCENT dbo.tblDist.RaceDist, dbo.tblRaces.RaceName, dbo.tblRaceDates.RaceDate, dbo.tblResults.RaceTime,
dbo.tblRunners.RunnerClub, dbo.tblRunners.RunnerGende r
FROM dbo.tblDist INNER JOIN
dbo.tblRaces ON dbo.tblDist.DistID = dbo.tblRaces.DistID INNER JOIN
dbo.tblRaceDates ON dbo.tblRaces.RaceID = dbo.tblRaceDates.RaceID INNER JOIN
dbo.tblResults ON dbo.tblRaceDates.RaceDateI D = dbo.tblResults.RaceDateID INNER JOIN
dbo.tblRunners ON dbo.tblResults.RunnerKey = dbo.tblRunners.RunnerKey
WHERE (dbo.tblDist.HoF = 1) AND (dbo.tblRunners.RunnerClub = N'Hermitage') AND (dbo.tblRunners.RunnerGend er = N'M')
ORDER BY dbo.tblDist.PaceFactor
Ok, well that returns the same list as I have previously in Access & SQL - without the grouping in place - and runs a quickly.
SELECT TOP (100) PERCENT dbo.tblDist.RaceDist, dbo.tblRaces.RaceName, dbo.tblRaceDates.RaceDate,
dbo.tblRunners.RunnerClub,
FROM dbo.tblDist INNER JOIN
dbo.tblRaces ON dbo.tblDist.DistID = dbo.tblRaces.DistID INNER JOIN
dbo.tblRaceDates ON dbo.tblRaces.RaceID = dbo.tblRaceDates.RaceID INNER JOIN
dbo.tblResults ON dbo.tblRaceDates.RaceDateI
dbo.tblRunners ON dbo.tblResults.RunnerKey = dbo.tblRunners.RunnerKey
WHERE (dbo.tblDist.HoF = 1) AND (dbo.tblRunners.RunnerClub
ORDER BY dbo.tblDist.PaceFactor
SELECT b.RaceName, b.DistID, a.ResTime, a.ResName
FROM tblResults a
INNER JOIN tblRaces b on a.RaceID = b.RaceID
where a.ResultsID in (select top 3 i1.ResultsID from tblResults i1, tblRaces i2 where i1.RaceID = i2.RaceID and i2.DistID = b.DistID order by i2.DistID, i1.ResTime)
ORDER BY b.DistID, a.ResTime
=> you need 1 index on the fields RaceID + DistID ...
if that is not enough, you would need to denormalize, for example nightly, to extract the information as needed, into separate tables, to make queries on those reduced data more efficient.