Solved

# Probability question

Posted on 2011-09-13
Hi

I am trying to solve a problem involving probabilities.

Suppose there are N1 trials and n successes, with each trial having a success probability of p. It is easy to show that the probability of exactly n successes occurring is

N1Cn p^n (1-p)^(N1-n)

What is different about my problem is that rather than varying n I am varying N1.

That is, I want to ask, for example, what range of N1 values can I predict so that I am 95% certain of getting n successes?

As a more specific example, suppose n = 5 and p = 0.3. What is the range of N values which I can be 95% certain that I would get 5 successes? e.g. N = 10 to 30.

Issac
Question by:IssacJones

LVL 18

Expert Comment

ID: 36529788
Could you use a program or spreadsheet here to determine the answer? Or are you looking for a mathematical formula that you could apply?

Author Comment

ID: 36529894
I suspect a mathematical formula would be extremely difficult. I would be happy with a program (algorithm) or method to work it out.

LVL 37

Expert Comment

ID: 36530032
>>  As a more specific example, suppose n = 5 and p = 0.3. What is the range of N values which I can be 95% certain that I would get 5 successes? e.g. N = 10 to 30.

You can't. It's impossible. N = n/p = 17 gives you the highest possibility of getting 5 successes and the probability is 20.81%

LVL 37

Assisted Solution

TommySzalapski earned 200 total points
ID: 36530056
N = n/p will always give you the highest probability. Then you can just loop adding one to N until you pass the target probability and do the same thing on the other side subtracting one from the best N.
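
A minimal Python sketch of that loop, assuming "the target probability" means a cutoff on the chance of exactly n successes. The 5% cutoff and the names `prob_exact` and `likely_range` are illustrative and not from the thread. A quick check with it suggests the peak sits at the largest whole N not exceeding n/p, which is 16 here, with N = 17 close behind at the 20.81% quoted above.

```python
from math import comb


def prob_exact(N, n, p):
    """Probability of exactly n successes in N independent trials."""
    if N < n:
        return 0.0
    return comb(N, n) * p**n * (1 - p) ** (N - n)


def likely_range(n, p, cutoff):
    """Return (peak, lo, hi): the most likely N and the run of N around it
    whose chance of exactly n successes is at least `cutoff`.

    `cutoff` is an assumed stopping rule; it must be positive so the
    upward loop is guaranteed to stop.
    """
    if not (0 < p <= 1) or cutoff <= 0:
        raise ValueError("need 0 < p <= 1 and cutoff > 0")
    peak = max(n, int(n / p))  # start near n/p
    # Step to the true peak in case rounding left the start one off.
    while prob_exact(peak + 1, n, p) > prob_exact(peak, n, p):
        peak += 1
    while peak > n and prob_exact(peak - 1, n, p) > prob_exact(peak, n, p):
        peak -= 1
    lo = hi = peak
    # Subtract one at a time on the low side, then add one on the high side.
    while lo > n and prob_exact(lo - 1, n, p) >= cutoff:
        lo -= 1
    while prob_exact(hi + 1, n, p) >= cutoff:
        hi += 1
    return peak, lo, hi


peak, lo, hi = likely_range(5, 0.3, 0.05)
print(peak, lo, hi)
```

Changing N by one each step keeps the code short; a binary search would only matter for very large ranges.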

LVL 37

Expert Comment

ID: 36530082
The complicated mess I posted for finding the range in the other question is a method of using binary search, so it's very fast. But on a modern computer, you could do it the easy way (changing N by 1 every time) and it will still get you the answer quickly enough.

LVL 27

Expert Comment

ID: 36530098
There is a calculator here:   http://stattrek.com/tables/binomial.aspx

You have to go to N=28 to get 95% probability of 5 or more successes.

LVL 27

Expert Comment

ID: 36530147

Author Comment

ID: 36530187
Hi Tommy

"You can't. It's impossible. N = n/p = 17 gives you the highest possibility of getting 5 successes and the probability is 20.81%"

I think you've misunderstood what I was saying. For example, if I choose N = 10 to 30, it is possible to calculate probabilities for 5 successes. I'm not asking for a specific N to give a 95% probability. Rather, I am attempting to work out the range of values for which I could be pretty certain that the value of N was realistic. For example, the probability of only 5 successes in 1000 trials would be extremely remote.

As such, I'm looking for the most likely range of N's which would give me the best bet.

LVL 27

Expert Comment

ID: 36530227
Do you mean "exactly 5 successes"   or "5 or more successes"?

LVL 27

Expert Comment

ID: 36530240
If you do 1000 trials with p=0.3  you are most likely to get 300 successes.

LVL 27

Expert Comment

ID: 36530250
>>  As such, I'm looking for the most likely range of N's which would give me the best bet.

What exactly are the terms of this bet??

Author Comment

ID: 36530273
Hi d-glitch

I don't think your method will work.

In the calculator you mention, n is varied to obtain the 95% i.e. you are keeping N fixed. This is not the same thing I am considering.

Thanks for trying.

Author Comment

ID: 36530311
Hi again tommy

I have written some code which generates all the probabilities for varying N and as we have found they sum to 1/p when N tends to infinity.

What I have then tried to do is multiply the calculated probabilities by p. This means that my new series sums to 1. Do you think I can do the following? Iterate through the probabilities until their accumulated sum is 2.5% (store the N value) and then on to 97.5% (and store the next N value). This gives me the range of N values.

Do you think this would give me the 95% certainty I'm looking for? There is a niggling doubt at the back of my mind that it isn't quite right.
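
One way to sketch that procedure in Python. It assumes every N is treated as equally plausible before the result is seen; that reading is an interpretation, and `n_interval` and `max_N` are hypothetical names. Multiplying each probability by p makes the weights sum to 1, and the loop reads off where the running total crosses 2.5% and 97.5%.

```python
from math import comb


def n_interval(n, p, tail=0.025, max_N=100_000):
    """Central range of N from the p-scaled probabilities of exactly n successes.

    weight(N) = p * C(N, n) * p**n * (1 - p)**(N - n), for N = n, n + 1, ...
    These weights sum to 1, so the running total behaves like a cumulative
    distribution over N. max_N is only a safety stop for the loop.
    """
    if not (0 < p < 1) or not (0 < tail < 0.5):
        raise ValueError("need 0 < p < 1 and 0 < tail < 0.5")
    total = 0.0
    lo = None
    for N in range(n, max_N + 1):
        total += p * comb(N, n) * p**n * (1 - p) ** (N - n)
        if lo is None and total >= tail:
            lo = N  # first N at which the lower tail is used up
        if total >= 1 - tail:
            return lo, N  # first N at which only the upper tail remains
    raise ValueError("max_N reached before the upper tail was covered")


print(n_interval(5, 0.3))
```

Whether the result deserves to be called a 95% certainty depends on accepting that equal-weighting assumption; the arithmetic itself is only a normalised running sum.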

Author Comment

ID: 36530321
d-glitch -> exactly 5 successes but with N varying.

LVL 27

Expert Comment

ID: 36530355
In the calculator I showed, you can change N and see the results.
I checked the values of N from 20 to 30 to get the 95% confidence level.

We can calculate anything, but you really have to tell us what you are trying to do/find out.

What exactly are the terms of the bet or game you are considering?
What does a single trial consist of?  Where does p=0.3 come from?

You talk about a 95% certainty.  But a 95% certainty of what?

LVL 27

Expert Comment

ID: 36530399
>>  d-glitch -> exactly 5 successes but with N varying.

So in this game, you run trials until you have 5 successes, then you quit.
And you would like to know with 95% probability how many trials it will take???

Author Comment

ID: 36530465
p = 0.3 is the probability of success on each trial (search for Bernoulli trials for more information), but it could be any probability.

The terms of the bet or its nature aren't important. This is merely a mathematical query.

In the calculator method you are looking at you are considering a completely different problem.

Look at it this way, if I were to take N=16 there is roughly a 21% chance of getting 5 occurrences. So, we can agree, that if we got 5 occurrences, it is fairly likely that the N we started off with was N=16. Similarly, N=17 would be a reasonable answer.

However, the probability of getting 5 successes in N=1000000000 is incredibly small.

As such, there must be a range of values for N which we could be almost certain (well, 95% certain) that these were the number of trials that gave us 5 successes.

Sorry if I haven't explained myself clearly enough.

LVL 27

Expert Comment

ID: 36530514

``````
From the calculator above:

If N=7  and p=0.3   you will get exactly 5 successes  2.5% of the time.
If N=8  and p=0.3   you will get exactly 5 successes  4.7% of the time.

If N=30 and p=0.3   you will get 4 or fewer successes 3.0% of the time.

So I think the range of N you need for 95% confidence is 7 to 30.
``````

LVL 27

Expert Comment

ID: 36530552
>>  Look at it this way, if I were to take N=16 there is roughly a 21% chance of getting 5 occurrences.
So, we can agree, that if we got 5 occurrences, it is fairly likely that the N we started off with was N=16.

I don't think we can agree to that at all.  What do you mean by fairly likely?

I think we might be able to agree that N is between 7 and 30.


LVL 27

Expert Comment

ID: 36530581
When you talk about a 95% confidence level on the range, do you want to assign 2.5% to the low end and 2.5% to the high end?

That's what I tried to do when I picked the 7 to 30 range.

LVL 18

Accepted Solution

deighton earned 400 total points
ID: 36530960
try this 'SolveForAlpha'

r = your number of successes required
p = probability
alpha = tolerance level, so for 95% success level, alpha = .95

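It is important to note that I've defined alpha the other way around from the usual convention: here it is the confidence level, not the significance level.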

If you add a module to Excel, the code will give you a built-in function 'SolveForAlpha'.

e.g

+SolveForAlpha(10, .256, .999)

``````
Function SolveForAlpha(r As Integer, p As Double, alpha As Double) As Integer
    ' Smallest number of trials q for which P(r or more successes) >= alpha.
    Dim q As Integer
    ' lowest test n we can have is r
    q = r
    Do Until probGE(q, r, p) >= alpha
        q = q + 1
    Loop
    SolveForAlpha = q
End Function

Function probGE(n As Integer, r As Integer, p As Double) As Double
    ' P(r or more successes in n trials): sum the binomial terms from r to n.
    Dim c As Integer
    For c = r To n
        probGE = probGE + prob(n, c, p)
    Next
End Function

Function prob(n As Integer, r As Integer, p As Double) As Double
    ' P(exactly r successes in n trials).
    prob = p ^ r * (1 - p) ^ (n - r) * nCr(n, r)
End Function

Function nCr(n As Integer, r As Integer) As Double
    ' Binomial coefficient n! / (r! (n - r)!).
    nCr = fact(n) / fact(r) / fact(n - r)
End Function

Function fact(ByVal n As Integer) As Double
    ' Recursive factorial; a Double overflows above 170!.
    If n <= 0 Then
        fact = 1
    Else
        fact = fact(n - 1) * n
    End If
End Function
``````

LVL 18

Expert Comment

ID: 36530987
By the way, that is '5 or more successes'. It clearly has to be that way; otherwise, for large values of n, the chance of any particular r becomes small.
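
For readers without Excel, roughly the same search can be sketched in Python. This is a cross-check under the same "r or more successes" reading, not a line-by-line translation, and `solve_for_alpha` simply mirrors the VBA name. It sums the complement (r - 1 or fewer successes), which keeps the sum short and avoids large factorials.

```python
from math import comb


def prob_at_least(N, r, p):
    """P(r or more successes in N trials), computed as 1 - P(r - 1 or fewer)."""
    below = sum(
        comb(N, k) * p**k * (1 - p) ** (N - k)
        for k in range(min(r, N + 1))
    )
    return 1.0 - below


def solve_for_alpha(r, p, alpha):
    """Smallest N for which P(at least r successes) >= alpha."""
    if not (0 < p <= 1) or not (0 < alpha < 1):
        raise ValueError("need 0 < p <= 1 and 0 < alpha < 1")
    N = r  # fewer than r trials can never give r successes
    while prob_at_least(N, r, p) < alpha:
        N += 1
    return N


print(solve_for_alpha(5, 0.3, 0.95))
```

With r = 5, p = 0.3 and alpha = 0.95 this should agree with the N = 28 reported from the online calculator earlier in the thread.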

Author Comment

ID: 36531157
d-glitch:" When you talk about a 95% confidence level on the range, do you want to assign 2.5% to the low end and 2.5% to the high end?" Yes

re: fairly likely. Well, it is more likely than it being from N=10000000000000. Apologies for the wording.


LVL 27

Assisted Solution

d-glitch earned 400 total points
ID: 36531403
I ran the following simulation in Excel 200 times:

Flip your p=0.3 coin 32 times.  Keep track of the trial where the 5th success occurs.
When you're done, sort the data and throw away the ten lowest and ten highest numbers.

In one of my cases they were    6 6 6 7 7 7 8 8 8 8   ......   30 30 30 31 31 31 32 32 32 32

So the range you are looking for is something like 8 to 30
ExEx-Probability-1.pdf
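
The same experiment can be sketched in Python. This version differs from the spreadsheet in ways that are assumptions on my part: there is no 32-flip cap, it uses many more than 200 runs to steady the tails, and it trims 2.5% from each end (dropping ten of 200 at each end leaves the central 90%). The seed, run count, and function names are arbitrary.

```python
import random


def trial_of_nth_success(n, p, rng):
    """Flip a p-coin until the n-th success; return how many flips it took."""
    flips = 0
    successes = 0
    while successes < n:
        flips += 1
        if rng.random() < p:
            successes += 1
    return flips


def simulated_range(n, p, runs=20_000, tail=0.025, seed=1):
    """Sort the simulated stopping times and drop `tail` of the runs at each end."""
    rng = random.Random(seed)
    times = sorted(trial_of_nth_success(n, p, rng) for _ in range(runs))
    cut = int(runs * tail)  # number of runs discarded at each end
    return times[cut], times[runs - cut - 1]


print(simulated_range(5, 0.3))
```

Because the result is random, the endpoints can shift by a trial or so between seeds, but they should land close to the range found in the spreadsheet.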

LVL 27

Expert Comment

ID: 36531448
Note that this is all consistent with the results from the on-line calculator.
