Solved

Probability question

Posted on 2011-09-13
Last Modified: 2012-08-13
Hi

I am trying to solve a problem involving probabilities.

Suppose there are N1 trials and n successes, with each trial having a success probability of p. It is easy to show that the probability of exactly n successes occurring is

N1Cn p^n (1-p)^(N1-n)

What is different about my problem is that rather than varying n I am varying N1.

That is, I want to ask, for example, what range of N1 values can I predict so that I am 95% certain of getting n successes?

As a more specific example, suppose n = 5 and p = 0.3. What is the range of N values which I can be 95% certain that I would get 5 successes? e.g. N = 10 to 30.

Thanks in advance

Issac
Question by:IssacJones
25 Comments
 
LVL 18

Expert Comment

by:deighton
ID: 36529788
Could you use a program or spreadsheet here to determine the answer? Or are you looking for a mathematical formula which you could apply?
0
 

Author Comment

by:IssacJones
ID: 36529894
I suspect a mathematical formula would be extremely difficult. I would be happy with a program (algorithm) or method to work it out.
0
 
LVL 37

Expert Comment

by:TommySzalapski
ID: 36530032
As a more specific example, suppose n = 5 and p = 0.3. What is the range of N values which I can be 95% certain that I would get 5 successes? e.g. N = 10 to 30.

You can't. It's impossible. N near n/p ≈ 16.7 gives you the highest probability of getting exactly 5 successes, and even there the probability is only about 21% (20.99% at N = 16, 20.81% at N = 17).
0
 
LVL 37

Assisted Solution

by:TommySzalapski
TommySzalapski earned 200 total points
ID: 36530056
N = n/p, rounded down to a whole number, will always give you the highest probability. Then you can just loop, adding one to N until you pass the target probability, and do the same thing on the other side, subtracting one from the best N.
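
A short derivation, added here as a sketch, of where the peak sits. The name f(N) is shorthand introduced for this note only; it stands for the probability of exactly n successes in N trials.

```latex
f(N) = \binom{N}{n} p^{n} (1-p)^{N-n}, \qquad
\frac{f(N+1)}{f(N)} = \frac{(N+1)(1-p)}{N+1-n}.

\frac{f(N+1)}{f(N)} \ge 1
\iff (N+1)(1-p) \ge N+1-n
\iff N \le \frac{n}{p} - 1 .
```

So f(N) rises while N ≤ n/p − 1 and falls afterwards, which puts the maximum at n/p rounded down (with a tie between n/p − 1 and n/p when n/p is a whole number). For n = 5 and p = 0.3 that is N = 16.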
0
 
LVL 37

Expert Comment

by:TommySzalapski
ID: 36530082
The complicated mess I posted for finding the range in the other question is a method of using binary search, so it's very fast. But on a modern computer, you could do it the easy way (changing N by 1 every time) and it will still get you the answer quickly enough.
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530098
There is a calculator here:   http://stattrek.com/tables/binomial.aspx

You have to go to N=28 to get 95% probability of 5 or more successes.
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530147
0
 

Author Comment

by:IssacJones
ID: 36530187
Hi Tommy

"You can't. It's impossible. N = n/p = 17 gives you the highest possibility of getting 5 successes and the probability is 20.81%"

I think you've misunderstood what I was saying. For example, if I choose N = 10 to 30, it is possible to calculate probabilities for 5 successes. I'm not asking for a specific N to give a 95% probability. Rather, I am attempting to work out a range of values within which I could be pretty certain that the value of N was realistic. For example, the probability of only 5 successes in 1000 trials would be extremely remote.

As such, I'm looking for the most likely range of N's which would give me the best bet.
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530227
Do you mean "exactly 5 successes"   or "5 or more successes"?
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530240
If you do 1000 trials with p=0.3  you are most likely to get 300 successes.
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530250
>>  As such, I'm looking for the most likely range of N's which would give me the best bet.

What exactly are the terms of this bet??
0
 

Author Comment

by:IssacJones
ID: 36530273
Hi d-glitch

I don't think your method will work.

In the calculator you mention, n is varied to obtain the 95% i.e. you are keeping N fixed. This is not the same thing I am considering.

Thanks for trying.
0
 

Author Comment

by:IssacJones
ID: 36530311
Hi again tommy

I have written some code which generates all the probabilities for varying N and as we have found they sum to 1/p when N tends to infinity.

What I have then tried to do is multiply the calculated probabilities by p. This means that my new series sums to 1. Do you think I can do the following? Iterate through the probabilities until their accumulated sum is 2.5% (store the N value) and then until it is 97.5% (and store the next N value). This gives me the range of N values.

Do you think this would give me the 95% certainty I'm looking for? There is a niggling doubt at the back of my mind that it isn't quite right.
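
For reference, the cumulative-sum idea described in this post could be sketched in VBA roughly as follows. The function name NAtCumulative and its parameter names are illustrative, not from the thread. Scaling by p amounts to treating every N as equally likely beforehand and then conditioning on exactly n successes; that starting assumption is a modelling choice, not something established here.

```vba
' Sketch only: returns the first N at which the running sum of
'   p * C(N, n) * p^n * (1-p)^(N-n),   N = n, n+1, ...
' reaches the requested target (e.g. 0.025 or 0.975).
' Assumes 0 < p < 1 and 0 < target < 1; otherwise the loop may not finish.
Function NAtCumulative(successes As Long, p As Double, target As Double) As Long

    Dim trials As Long
    Dim term As Double      ' probability of exactly `successes` in `trials`
    Dim total As Double     ' running sum of p * term

    trials = successes
    term = p ^ successes    ' every trial is a success
    total = p * term

    Do While total < target
        ' f(N+1) = f(N) * (N+1) * (1-p) / (N+1-n)
        trials = trials + 1
        term = term * trials * (1 - p) / (trials - successes)
        total = total + p * term
    Loop

    NAtCumulative = trials

End Function
```

Calling NAtCumulative(5, 0.3, 0.025) and NAtCumulative(5, 0.3, 0.975) would give the two ends of the range under this scheme. Updating the term by a ratio avoids computing large factorials.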
0
 

Author Comment

by:IssacJones
ID: 36530321
d-glitch -> exactly 5 successes but with N varying.
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530355
In the calculator I showed, you can change N and see the results.
I checked the values of N from 20 to 30 to get the 95% confidence level.

We can calculate anything, but you really have to tell us what you are trying to do/find out.

What exactly are the terms of the bet or game you are considering?  
What does a single trial consist of?  Where does p=0.3 come from?  

You talk about a 95% certainty.  But a 95% certainty of what?
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530399
d-glitch -> exactly 5 successes but with N varying.

So in this game, you run trials until you have 5 successes, then you quit.
And you would like to know with 95% probability how many trials it will take???
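
If that reading is the intended one, the number of trials needed to reach the r-th success follows the negative binomial (Pascal) distribution. The symbols T, r and k below are introduced for this note and are not used elsewhere in the thread.

```latex
P(T = k) = \binom{k-1}{r-1} p^{r} (1-p)^{k-r}, \qquad k = r, r+1, \dots

P(T \le k) = P(\text{at least } r \text{ successes in } k \text{ trials})
           = \sum_{j=r}^{k} \binom{k}{j} p^{j} (1-p)^{k-j}.
```

The second line is why an ordinary binomial calculator, with N changed by hand, can be used to read off the tails of T.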
0
 

Author Comment

by:IssacJones
ID: 36530465
p = 0.3 is the success probability (do a Google search on Bernoulli trials for more information), but it could be any probability.

The terms of the bet or its nature aren't important. This is merely a mathematical query.

In the calculator method you are looking at you are considering a completely different problem.

Look at it this way, if I were to take N=16 there is roughly a 21% chance of getting 5 occurrences. So, we can agree, that if we got 5 occurrences, it is fairly likely that the N we started off with was N=16. Similarly, N=17 would be a reasonable answer.

However, the probability of getting 5 successes in N=1000000000 is incredibly small.

As such, there must be a range of values for N which we could be almost certain (well, 95% certain) that these were the number of trials that gave us 5 successes.

Sorry if I haven't explained myself clearly enough.
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530514

From the calculator above:    If N=7  and p=0.3   you will get exactly 5 successes  2.5% of the time.
                              If N=8  and p=0.3   you will get exactly 5 successes  4.7% of the time.

                              If N=30 and p=0.3   you will get 4 or fewer successes 3.0% of the time.

So I think the range of N you need for 95% confidence is 7 to 30.
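
A rough VBA sketch of how the two ends could be computed directly, assuming the "run trials until the 5th success" reading and 2.5% in each tail. The function name TrialsQuantile is hypothetical, and this is not the calculator's own method.

```vba
' Sketch only: smallest k such that P(T <= k) >= q, where T is the trial
' on which the r-th success occurs. Assumes r >= 1, 0 < p < 1, 0 < q < 1.
Function TrialsQuantile(r As Long, p As Double, q As Double) As Long

    Dim k As Long
    Dim pmf As Double       ' P(T = k)
    Dim cdf As Double       ' P(T <= k)

    k = r
    pmf = p ^ r             ' P(T = r): the first r trials all succeed
    cdf = pmf

    Do While cdf < q
        ' P(T = k+1) = P(T = k) * k * (1-p) / (k+1-r)
        pmf = pmf * k * (1 - p) / (k + 1 - r)
        k = k + 1
        cdf = cdf + pmf
    Loop

    TrialsQuantile = k

End Function
```

By my arithmetic, TrialsQuantile(5, 0.3, 0.025) gives 7 and TrialsQuantile(5, 0.3, 0.975) gives 31. Stopping at N = 30 leaves about 3.0% in the upper tail, as the calculator figure above shows, so 31 is the first N that brings that tail under 2.5%.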

0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530552
>>  Look at it this way, if I were to take N=16 there is roughly a 21% chance of getting 5 occurrences.
       So, we can agree, that if we got 5 occurrences, it is fairly likely that the N we started off with was N=16.


I don't think we can agree to that at all.  What do you mean by fairly likely?

I think we might be able to agree that N is between 7 and 30.

0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530581
When you talk about a 95% confidence level on the range, do you want to assign 2.5% to the low end and 2.5% to the high end?

That's what I tried to do when I picked the 7 to 30 range.
0
 
LVL 18

Accepted Solution

by:
deighton earned 400 total points
ID: 36530960
Try this 'SolveForAlpha':

r = the number of successes required
p = probability of success on each trial
alpha = confidence level, so for a 95% success level, alpha = .95

Important to note that I've done that the other way around from the usual convention (where alpha would be .05).

If you add a module to Excel, then the code will give you a built-in function 'SolveForAlpha', e.g.

+SolveForAlpha(10, .256, .999)

Function SolveForAlpha(r As Integer, p As Double, alpha As Double) As Integer

    ' Smallest number of trials q for which P(at least r successes) >= alpha.
    ' Assumes 0 < p < 1 and 0 < alpha < 1; otherwise the loop may not terminate.
    Dim q As Integer
    'lowest test n we can have is r
    q = r

    Do Until probGE(q, r, p) >= alpha
        q = q + 1
    Loop

    SolveForAlpha = q

End Function

Function probGE(n As Integer, r As Integer, p As Double) As Double

    ' P(r or more successes in n trials): sum the exact-count terms r..n.
    Dim c As Integer

    For c = r To n
        probGE = probGE + prob(n, c, p)
    Next

End Function

Function prob(n As Integer, r As Integer, p As Double) As Double

    ' Binomial probability of exactly r successes in n trials.
    prob = p ^ r * (1 - p) ^ (n - r) * nCr(n, r)

End Function

Function nCr(n As Integer, r As Integer) As Double

    ' Build the coefficient term by term instead of dividing factorials,
    ' because 171! and above overflow a Double.
    Dim i As Integer

    nCr = 1
    For i = 1 To r
        nCr = nCr * (n - r + i) / i
    Next

End Function

0
 
LVL 18

Expert Comment

by:deighton
ID: 36530987
By the way, that is for '5 or more successes'. It clearly has to be that way; otherwise, for large values of n, the chance of any particular r becomes small.
0
 

Author Comment

by:IssacJones
ID: 36531157
d-glitch: "When you talk about a 95% confidence level on the range, do you want to assign 2.5% to the low end and 2.5% to the high end?" Yes.

Re: fairly likely. Well, it is more likely than it being from N=10000000000000. Apologies for the wording.
0
 
LVL 27

Assisted Solution

by:d-glitch
d-glitch earned 400 total points
ID: 36531403
I ran the following simulation in Excel 200 times:

  Flip your p=0.3 coin 32 times.  Keep track of the trial where the 5th success occurs.
  When you're done, sort the data and throw away the ten lowest and ten highest numbers.

In one of my cases they were    6 6 6 7 7 7 8 8 8 8   ......   30 30 30 31 31 31 32 32 32 32

So the range you are looking for is something like 8 to 30
ExEx-Probability-1.pdf
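
For anyone wanting to repeat an experiment along these lines, here is a minimal VBA sketch. It is not the workbook attached above. The names TrialOfRthSuccess and RunSimulation are made up for illustration, and unlike the description above it does not cap each run at 32 flips.

```vba
' Sketch only: flip a p-coin until the r-th success and report how many
' flips that took. Assumes r >= 1 and 0 < p <= 1.
Function TrialOfRthSuccess(r As Long, p As Double) As Long

    Dim flips As Long
    Dim hits As Long

    Do While hits < r
        flips = flips + 1
        If Rnd() < p Then hits = hits + 1
    Loop

    TrialOfRthSuccess = flips

End Function

' Writes 200 simulated results into A1:A200 of the active sheet,
' overwriting whatever is there.
Sub RunSimulation()

    Dim i As Long

    Randomize
    For i = 1 To 200
        Cells(i, 1).Value = TrialOfRthSuccess(5, 0.3)
    Next i

End Sub
```

Sorting column A afterwards and discarding the same number of values from each end gives an empirical range. With 200 runs, dropping five from each end corresponds to 2.5% per tail.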
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36531448
Note that this is all consistent with the results from the on-line calculator.
0
