purplesoup
asked on
Calculating the Probability that a Soccer player with a high salary scores a lot of goals
Suppose I have a table of soccer player salaries (in terms of high/medium/low):
High: 99
Medium: 165
Low: 157
I also have a table of soccer player goals scored (again in terms of high/medium/low):
High: 167
Medium: 112
Low: 142
And suppose I have a third table showing a breakdown of soccer player salaries and goals:
                    Salaries
                    Low    Medium    High
Goals    Low         70      52       20
         Medium      43      47       22
         High        44      66       57
Now suppose I pick a soccer player at random. What is the chance that I will pick one with a high salary who scores a lot of goals?
It would seem there are two ways of calculating this. One uses the separate goals and salaries tables: find the chance of picking a player with a high salary, and combine it with the probability of picking one who scores a lot of goals. The other is to look at the combined table and calculate the probability from that.
The trouble is, each method gives a different answer.
Is there a way of finding what the correct answer is?
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
You might want to leave the goal keepers out of your calculations, some of those are very highly paid and are still described as soccer players.
ASKER
sdstuber - here's my problem. Yes, you can just use the main table and do 57/142 = 0.13 to 2 dp.
But using the other two tables, the chance of getting a high salary is 99/421 and the chance of scoring a high number of goals is 167/421. Shouldn't combining those two figures also get me to 0.13? It doesn't; it works out as 0.09 to 2 dp.
TommySzalapski - sorry I'm still not getting it. I can see that the other two smaller tables are actually just repeating the data in the main table - so if you add up all the low salaries in the main table you get 157 which is in the small table, and if you add up all the low scorers in the main table you get 142, the same as the figure in the small table.
So I can see that the chance of being a low scorer, or of having a low salary, taken from one of the smaller tables is consistent with the matching row or column of the larger table.
My confusion - and clearly I'm wrong here, I just can't see why - is that I thought it would have been possible to use the data from the two smaller tables to get the same value as from the larger table.
Put simply, is it possible to find out - just using the two smaller tables - the chance of picking a player with a high salary and a high number of goals (or a low salary and a low number of goals), with a value that is consistent with the figure calculated from the larger table?
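To make the discrepancy concrete, here is a minimal Python sketch of both calculations using the figures from the question. The variable names are my own, and it assumes the combined table is read as goals in rows and salaries in columns. It first checks that the two small tables are just the row and column sums of the combined table, then compares the direct figure (57/421, about 0.135) with the product of the two separate probabilities (about 0.093).

```python
from fractions import Fraction

# Combined table from the question, read as counts[goals][salary].
counts = {
    "low":    {"low": 70, "medium": 52, "high": 20},
    "medium": {"low": 43, "medium": 47, "high": 22},
    "high":   {"low": 44, "medium": 66, "high": 57},
}
levels = ("low", "medium", "high")

total = sum(counts[g][s] for g in levels for s in levels)

# Row sums reproduce the goals table; column sums reproduce the salaries table.
goals_totals = {g: sum(counts[g][s] for s in levels) for g in levels}
salary_totals = {s: sum(counts[g][s] for g in levels) for s in levels}

# Method 1: read the high-salary, high-goals cell straight from the combined table.
p_from_big_table = Fraction(counts["high"]["high"], total)

# Method 2: multiply the two separate probabilities (only valid if independent).
p_product = Fraction(salary_totals["high"], total) * Fraction(goals_totals["high"], total)

print("total players:", total)
print("goals totals:", goals_totals)
print("salary totals:", salary_totals)
print(f"from the combined table:  {float(p_from_big_table):.4f}")
print(f"product of the marginals: {float(p_product):.4f}")
```

The two results differ because the product only equals the joint probability when salary and goals are independent, which the combined table shows they are not.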
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Ok thanks - that's clear.
I had a typo in my previous reply - I wrote
"sdstuber - here's my problem. Yes, you can just use the main table and do 57/142 = 0.13 to 2 dp."
I should have written 421 instead of 142 - the answer is right (i.e. 57/421 = 0.13 2 dp).
Anyway, for the other replies, thanks for the clarification. The two smaller tables are not needed to answer the question, although their data can be used to confirm the answer; all the information that is needed is in the larger table.
I assume the reason you can't just multiply the two values from the smaller tables together is that they aren't independent?
Anyway, I'll share out the points, thanks again.
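As a rough illustration of why plain multiplication fails here, the general rule is P(A and B) = P(A) x P(B given A). The sketch below (names are my own, figures are from the question) shows that the conditional probability of high goals given a high salary needs the 57 from the combined table, so the two small tables alone cannot supply it.

```python
from fractions import Fraction

total = 421                  # all players
high_salary = 99             # from the salaries table
high_goals = 167             # from the goals table
high_salary_high_goals = 57  # only available from the combined table

p_high_salary = Fraction(high_salary, total)
p_high_goals = Fraction(high_goals, total)

# P(high goals | high salary): restrict attention to the 99 high-salary players.
p_high_goals_given_high_salary = Fraction(high_salary_high_goals, high_salary)

# Multiplication rule: P(A and B) = P(A) * P(B | A).
p_both = p_high_salary * p_high_goals_given_high_salary

print("P(high salary and high goals) =", p_both)
print(f"P(high goals | high salary) = {float(p_high_goals_given_high_salary):.4f}")
print(f"P(high goals)               = {float(p_high_goals):.4f}")
```

If salary and goals were independent, the last two printed values would be equal; here the conditional probability is noticeably larger.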
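That is correct. If they were independent events, then the expected proportion in each cell of the big table would be the product of the corresponding probabilities from the smaller tables, and the expected count would be that product times the 421 players.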
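A small sketch of that idea, under the assumption that the combined table is read as goals in rows and salaries in columns (the names are mine, not from the thread): it works out the count each cell would be expected to hold if salary and goals were independent, and prints it next to the observed count.

```python
# Combined table from the question, read as counts[goals][salary].
counts = {
    "low":    {"low": 70, "medium": 52, "high": 20},
    "medium": {"low": 43, "medium": 47, "high": 22},
    "high":   {"low": 44, "medium": 66, "high": 57},
}
levels = ("low", "medium", "high")

total = sum(counts[g][s] for g in levels for s in levels)
goals_totals = {g: sum(counts[g][s] for s in levels) for g in levels}
salary_totals = {s: sum(counts[g][s] for g in levels) for s in levels}

# Under independence: expected count = total * P(goals level) * P(salary level)
#                                    = row total * column total / total.
expected = {
    g: {s: goals_totals[g] * salary_totals[s] / total for s in levels}
    for g in levels
}

for g in levels:
    for s in levels:
        print(f"goals={g:<6} salary={s:<6} "
              f"observed={counts[g][s]:>3} expected={expected[g][s]:6.2f}")
```

For the high-salary, high-goals cell the expected count comes out at roughly 39 against an observed 57, which is the same gap as 0.09 versus the figure from the combined table.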
ASKER
Thanks - thinking about it some more, that's obvious really.
If we have two people, of whom 50% are tall and 50% are bank managers, we can't say there is a 25% probability that one picked at random will be a tall bank manager, because we don't know whether the tall person is the bank manager or not.
On the other hand, if there is a 50% chance I will meet a tall person at the gym on Monday and a 50% chance I'll meet a bank manager on my train journey on Tuesday, there is a 25% chance I'll meet both.
Thanks again.
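The two-person example can be sketched in a few lines of Python. The two populations below are hypothetical (they are the two possibilities described above, with names of my own choosing): both have 50% tall and 50% bank managers, yet the chance of picking a tall bank manager is 1/2 in one and 0 in the other, and neither equals the 1/4 that multiplying would suggest.

```python
from fractions import Fraction

# Each person is a pair (is_tall, is_bank_manager).
same_person = [(True, True), (False, False)]       # the tall one is the manager
different_people = [(True, False), (False, True)]  # the tall one is not the manager

def marginals(people):
    """Return (P(tall), P(bank manager)) for a randomly picked person."""
    n = len(people)
    p_tall = Fraction(sum(1 for tall, _ in people if tall), n)
    p_manager = Fraction(sum(1 for _, manager in people if manager), n)
    return p_tall, p_manager

def p_tall_bank_manager(people):
    """Return P(tall and bank manager) for a randomly picked person."""
    hits = sum(1 for tall, manager in people if tall and manager)
    return Fraction(hits, len(people))

for name, people in [("same person", same_person),
                     ("different people", different_people)]:
    p_tall, p_manager = marginals(people)
    print(name, "| marginals:", p_tall, p_manager,
          "| product:", p_tall * p_manager,
          "| actual joint:", p_tall_bank_manager(people))
```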