SQL query conversion to MS Access query

I need help converting SQL queries to MS Access queries for an Access database. I have an SQL query sequence to pull data from an SQL database and creates some named data fields in the process etc. I am creating a VB6 application/utility to pull this data out in MS Access. The database I'm using is the Lahman Baseball Database----it's the historical database of Major League Baseball statistics that's freely accessible to the public.

The 2012 version of the Lahman database (13.1mb ZIP) is attached to this post. This is the exact database I will be connecting to.

Here are my queries in SQL:

--BATTING
SELECT Batter = (Master.nameFirst + Master.nameLast), Team = (Batting.teamID) + CAST(Batting.yearID AS nvarchar),
Batting.POS, Batting.G, Batting.AB, Batting.R, Batting.H, Batting.[2B], Batting.[3B],
Batting.HR, Batting.RBI, Batting.TB, Batting.BB, Batting.SO, Batting.SB, Batting.CS,
OBP = ROUND(((SELECT CAST(Batting.HBP AS float) + CAST(Batting.BB AS float) + CAST(Batting.H AS float)) / (CAST(Batting.AB AS float) + CAST(Batting.BB AS float) + CAST(Batting.HBP AS float))),3),
SLG = ROUND(((SELECT CAST(Batting.H AS float) + CAST(Batting.[2B]*2 AS float) + CAST(Batting.[3B]*3 AS float) + CAST(Batting.
*4 AS float)) / (CAST(Batting.AB AS float))),3),
AVG = ROUND(((SELECT CAST(Batting.H AS float)) / (CAST(Batting.AB AS float))),3),
FPCT = 0
FROM Master INNER JOIN Batting ON Master.playerID = Batting.playerID
WHERE (((Master.nameLast)Like '%') AND ((Batting.yearID)=1951) AND ((Batting.teamID)='bro') AND AB > 99)
UNION

--PITCHING
SELECT Batter = (Master.nameFirst + Master.nameLast), Team = (Pitching.teamID) + CAST(Pitching.yearID AS nvarchar), POS = '' + 'P', Pitching.BAOpp,
Pitching.ERA, Pitching.G, Pitching.GS, Pitching.CG, Pitching.SHO, Pitching.SV, Pitching.BFP,
Pitching.IPOuts, Pitching.H, Pitching.R, Pitching.ER, Pitching.HR, Pitching.HBP, Pitching.BB,
Pitching.SO, Pitching.FPCT
FROM Master INNER JOIN Pitching ON Master.playerID = Pitching.playerID
WHERE (((Master.nameLast)Like '%') AND ((Pitching.yearID)=1951) AND ((Pitching.teamID)='bro') AND IPouts > 99)
ORDER BY POS

--FIELDING
SELECT *,FPCT = ROUND(((CAST(PO AS float) + CAST(A AS float)) / (CAST(PO AS float) + CAST(A AS float) + CAST(E AS float))),3)
FROM(
SELECT Fielding.PlayerID, Master.nameFirst, Master.nameLast, Fielding.yearID,
Fielding.teamID, PO = SUM(Fielding.PO), A = SUM(Fielding.A), E = SUM(Fielding.E)
FROM Master INNER JOIN Fielding ON Master.playerID = Fielding.playerID
WHERE((Fielding.yearID)=1985) AND (nameLast LIKE '%') AND (Fielding.teamID)='kca' AND PO > 0
GROUP BY Fielding.playerID, nameFirst, nameLast, yearID, teamID) as t1


If it helps any, my ultimate goal is to fill a series of 20 data field columns in an MSFlexGrid by selecting a year and a team from a couple of dropdown lists.

MSFlexGrids would have the following column header names...

For batters/position player data:
Batter, Team, POS, G, AB, R, H, 2B, 3B, HR, RBI, TB, BB, SO, SB, CS, OBP, SLG, AVG, FPCT

For pitcher data:
Pitcher, Team, POS, AVG, ERA, G, GS, CG, SHO, SAV, PA, IP, H, R, ER, HR, HBP, BB, SO, FPCT


***** Any help is GREATLY appreciated..thanks *****
lahman2012-ms.zip
LVL 4
jazjefAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

jerryb30Commented:
You would be better served here by posting a much truncated database.
0
jerryb30Commented:
The actual data would allow checking, but I think a db with just the table structure would suffice.
For me, an mdb is what I can handle.
0
jazjefAuthor Commented:
Here's a truncated version of the Lahman MS Access database that has only the 4 tables that cover the scope of my SQL queries. Thanks...
lahman2012--truncated-.zip
0
10 Tips to Protect Your Business from Ransomware

Did you know that ransomware is the most widespread, destructive malware in the world today? It accounts for 39% of all security breaches, with ransomware gangsters projected to make $11.5B in profits from online extortion by 2019.

Patrick MatthewsCommented:
jazjef,

Before we go any further, unless you specifically obtained permission from Mr Lahman to omit the attribution, you are in violation of the license for this database.

The source was: http://www.seanlahman.com/baseball-archive/statistics/

On to your stats.  I see the following problems, and there may well be more:
The Batting table does not have POS or TB columns.  TB can be calculated from other columns.  POS would presumably have to be resolved by going to the fielding table.  In the case of players who appeared at multiple positions, how would you want position determined?
Your formula for OBP is wrong.  You have to include SF in the denominator
Your formula for SLG is wrong, as you are over-valuing extra-base hits
Why would you want to use UNION to combine the results for hitting and pitching?
Not necessarily a problem, per se, but the query as written will grab the stats only accrued for the 1951 Brooklyn Dodgers.  For any players who appeared for other teams that year, did you want to include or exclude their stats with other teams?

Patrick
0
Patrick MatthewsCommented:
I ignored your union and wrote separate queries for batting and pitching, as comparing them makes no sense at all.  I did not write a separate query for fielding.

I came up with this for the batting query:

PARAMETERS [Year] Long, [Team] Text ( 255 );
SELECT m.nameLast & m.nameFirst AS Batter, 
b.teamID & b.yearID AS Team, 
f.POS, 
b.G, 
b.AB, 
b.R, 
b.H, 
b.[2B], 
b.[3B], 
b.HR, 
b.RBI, 
b.H+b.[2B]+b.[3B]*2+b.HR*3 AS TB, 
b.BB, 
b.SO, 
b.SB, 
b.CS, 
Round((b.H+b.BB+b.HBP)/IIf(b.AB+b.BB+b.HBP+Nz(b.SF,0),b.AB+b.BB+b.HBP+Nz(b.SF,0),Null),3) AS OBP, 
Round((b.H+b.[2B]+b.[3B]*2+b.HR*3)/IIf(b.AB,b.AB,Null),3) AS SLG, 
Round(b.H / b.AB, 3) AS [AVG], 
Round((f.PO + f.A) / IIf(f.PO + f.A + f.E, f.PO + f.A + f.E, Null), 3) AS FldPct
FROM ((Master AS m INNER JOIN 
    Batting AS b ON m.playerID = b.playerID) INNER JOIN 
    Fielding AS f ON (b.yearID = f.yearID) AND (b.playerID = f.playerID) AND (b.teamID = f.teamID)) INNER JOIN 
    (SELECT f2.playerID, f2.yearID, f2.teamID, Max(f2.G) AS MaxOfG
    FROM Fielding f2
    WHERE f2.yearID = [Year] AND f2.teamID = [Team]
    GROUP BY f2.playerID, f2.yearID, f2.teamID)  AS fmax ON (f.playerID = fmax.playerID) AND (f.yearID = fmax.yearID) AND (f.teamID = fmax.teamID) AND (f.G = fmax.MaxOfG)
WHERE (((b.yearID)=[Year]) AND ((b.teamID)=[Team]) AND (([b].[AB]+[b].[BB])>99));

Open in new window


Note the following:
SF were not an official stat in 1951, but I included it because it is part of the current definition of OBP.  I used Nz to handle nulls there
In the rate stats I test for a zero denominator to avoid an error.  Should the denominator be zero I force the expression to null, which is NOT the same as zero
For players who appeared on more than one team in a given year, this returns ONLY the stats for the current team
TB and SLG are calculated correctly
Position is determined as the position for which that player appeared in the most games for that team for that year, and fielding stats are based solely on performance with that indicated team at that one position


For the pitching query, which was much easier:

PARAMETERS [Year] Long, [Team] Text ( 255 );
SELECT m.nameLast & m.nameFirst AS Pitcher, 
p.teamID & p.yearID AS Team, 
"P" AS POS, 
Round([p].[H]/([p].[H]+[p].[IPouts]),3) AS [AVG], 
p.ERA, 
p.G, 
p.GS, 
p.CG, 
p.SHO, 
p.SV, 
p.BFP, 
Round(p.[IPouts]/3,2) AS IP, 
p.H, 
p.R, 
p.ER, 
p.HR, 
p.HBP, 
p.BB, 
p.SO, 
Round(([f].[PO]+[f].[A])/IIf([f].[PO]+[f].[A]+[f].[E],[f].[PO]+[f].[A]+[f].[E],Null),3) AS FldPct
FROM (Pitching AS p INNER JOIN 
    Master AS m ON p.playerID = m.playerID) INNER JOIN 
    Fielding AS f ON (p.teamID = f.teamID) AND (p.yearID = f.yearID) AND (p.playerID = f.playerID)
WHERE (((p.yearID)=[Year]) AND ((p.teamID)=[Team]) AND ((p.IPouts)>99) AND ((f.POS)="P"));

Open in new window


Note the following:
In the rate stats I test for a zero denominator to avoid an error.  Should the denominator be zero I force the expression to null, which is NOT the same as zero
For players who appeared on more than one team in a given year, this returns ONLY the stats for the current team
Fielding stats are based solely on performance as P with that indicated team
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
jazjefAuthor Commented:
This really looks terrific Patrick---you clearly have top-drawer database skills; I'll test them when I get home.

1) Thank you immensely for mentioning my compliance with the terms of using the Lahman database; I'm not sure what I have to post to be in compliancee, but I'm guessing it's in the documentation. If I put it in the bottom of the question area that should suffice shoudn't it? I certainly don't want to be on the wrong side of a compliance issue.

2) The pitching and batting each have the same number of variable columns---so they go in the same MSFlexgrid for a single team; I have a 'toggle' button that toggles the headers and brings the pitcher data to the top with the proper headers (it uses the POS to do this by identifying all the players with a P (pitchers) from all non-P POS values), and when the toggle button is clicked again, it brings the non-P player data back to the top. This is the reason for the 'union' of pitcher/batter data---it maximizes the use of a single MSFlexgrid to allow both pitcher and batter data. The FPCT (fielding percentage) has to be done separately because it has fewer fields than the pitching and batting. I have manually add each FPCT to the FPCT columns created in the pitching and batting queries [there's a way to do this by combining all 3 statements, but I've yet to figure it out].

3) Thanks for analyzing the SF and SLG; that's a HUGE bonus; never saw that

4) The presence of the team name and the year is because I manually change these when I run the SQL queries in MSSQL. What would like to do is have a dropdown list for the year, and a drop down list for the teams; then you just pick what you want, and load it.

5) I think that most games played by postion is the best for POS, but I'm not too picky on this.... I'll explain it in #6

6) I'm integrating this into a free non-commercial game I made over the last couple of years. I just decided I would like to be able to be able to pull any team. Right now, the program loads teams from CSV file data where I manually create the teams from the SQL queries. It's gotten to be a real hassle. One has to eventually go to MS Access or SQL to make accessing large numbers of teams practical.

7) Once I get the database to generate a 'complete' table, then I will post a question at some point asking how to populate dropdowns with the years and teams respectivley. I can do this dropdown type of thing with ASP.Net and MSSQL within Visual Web Developer ... but I'm not sure how to do it with VB6 and MS Access.

8) I didn't post a request to merge all 3 of these queries so they would UNION to a single set of records (my ultimate goal) because I was not sure how hard it would be and I didn't want to appear 'greedy' by asking too much in one question.

Many, many thanks for weighing-in and assisting me Patrick...
0
Patrick MatthewsCommented:
Before getting into this post, I am sorry for the tone of my original comment.  In re-reading it now...wow, all I can say is that I must have been cranky because I hadn't had my lunch yet.

This really looks terrific Patrick---you clearly have top-drawer database skills; I'll test them when I get home.

1) Thank you immensely for mentioning my compliance with the terms of using the Lahman database; I'm not sure what I have to post to be in compliancee, but I'm guessing it's in the documentation. If I put it in the bottom of the question area that should suffice shoudn't it? I certainly don't want to be on the wrong side of a compliance issue.

Again, I'm sorry that I was being cranky.  You did mention the source of the data.  To be in full compliance, I would think the URL is all you would have had to add, and I posted the URL in my comment, so I think you are square there.  Just make sure that, when you post your game, you make it very clear where you got the data :)

2) The pitching and batting each have the same number of variable columns---so they go in the same MSFlexgrid for a single team; I have a 'toggle' button that toggles the headers and brings the pitcher data to the top with the proper headers (it uses the POS to do this by identifying all the players with a P (pitchers) from all non-P POS values), and when the toggle button is clicked again, it brings the non-P player data back to the top. This is the reason for the 'union' of pitcher/batter data---it maximizes the use of a single MSFlexgrid to allow both pitcher and batter data. The FPCT (fielding percentage) has to be done separately because it has fewer fields than the pitching and batting. I have manually add each FPCT to the FPCT columns created in the pitching and batting queries [there's a way to do this by combining all 3 statements, but I've yet to figure it out].

I get it now, but I still think it's an awful idea.  For example, you are combining columns that are integer counting stats for batters with decimal rate stats for pitchers, and that's bad practice.

Also, you've got another problem: some pitchers will cross your 100 plate appearance threshold (e.g., Don Newcombe for the '51 Dodgers), and thus end up with batting and pitching records.  As currently written, it would be difficult to tell whether or not a given row where the position is "P" is a batting record or a pitching record.  While you could modify the queries further to have a "type" column indicating whether the result row is batting or pitching stats, it seems to me to just have the extra flexgrid so you can keep your data clean.

3) Thanks for analyzing the SF and SLG; that's a HUGE bonus; never saw that

Glad to help :)

4) The presence of the team name and the year is because I manually change these when I run the SQL queries in MSSQL. What would like to do is have a dropdown list for the year, and a drop down list for the teams; then you just pick what you want, and load it.

It's simple enough to use a form in access to pass parameter values to queries.  I figured you were doing something like that, so that is why I went with the parameter approach.

5) I think that most games played by postion is the best for POS, but I'm not too picky on this.... I'll explain it in #6

A better approach would have been to do it by most defensive IPouts, but those data are unavailable for many of the seasons in the database.  Note also that you could try to refine the positions by joining to the FieldingOF table to break out OF into LF, CF, and RF, but it had already taken me too long to write the queries as is :)

6) I'm integrating this into a free non-commercial game I made over the last couple of years. I just decided I would like to be able to be able to pull any team. Right now, the program loads teams from CSV file data where I manually create the teams from the SQL queries. It's gotten to be a real hassle. One has to eventually go to MS Access or SQL to make accessing large numbers of teams practical.

7) Once I get the database to generate a 'complete' table, then I will post a question at some point asking how to populate dropdowns with the years and teams respectivley. I can do this dropdown type of thing with ASP.Net and MSSQL within Visual Web Developer ... but I'm not sure how to do it with VB6 and MS Access.

Access makes that pretty easy.  For example, you could use as the source for your year dropdown a query like this:

SELECT yearID
FROM Batter
GROUP BY yearID

Open in new window


Then use a similar query to get the distinct teams for the selected year.

8) I didn't post a request to merge all 3 of these queries so they would UNION to a single set of records (my ultimate goal) because I was not sure how hard it would be and I didn't want to appear 'greedy' by asking too much in one question.

Many, many thanks for weighing-in and assisting me Patrick...

I hear you, but I still think you are best off separating each of them into separate queries.
0
jazjefAuthor Commented:
LMAO Patrick.... there's a bunch of Snickers candy bar commercials about being 'cranky' when you're hungry--check them out on YouTube Snickers--You're not you when you're hungry.....

This code is AWESOME. In less than 2 minutes I was able to use it to pull out a team recordset that used to take me 20 minutes to put together with the MSSQL. And I t hink I will apply your suggestion about breaking down the pitching and batting data into separate grids; I have the space because each grid would be about half the size of the single grid I use now.

The way you have it set up is great---it will make it easier to eventually get the dropdown lists going. There's only one thing I didn't anticipate---players playing the exact same amount of games at two different positions. It lists the player twice when/if this occurs---not a big deal, but I guess randomly selecting one or the other would be fine. I'm not sure how the other baseball programs out there deal with this....any suggestions?

Two-tons of thanks for this....it works like a dream.
0
jazjefAuthor Commented:
GREAT solution; just what I needed. Thanks.....
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft Access

From novice to tech pro — start learning today.