Solved

Probability question

Posted on 2011-09-13
Last Modified: 2012-08-13
Hi

I am trying to solve a problem involving probabilities.

Suppose there are N1 trials, each succeeding with probability p. It is easy to show that the probability of exactly n successes occurring is

N1Cn p^n (1-p)^(N1-n)

What is different about my problem is that rather than varying n I am varying N1.

That is, I want to ask, for example, what range of N1 values can I predict so that I am 95% certain of getting n successes?

As a more specific example, suppose n = 5 and p = 0.3. What is the range of N values which I can be 95% certain that I would get 5 successes? e.g. N = 10 to 30.

Thanks in advance

Issac
0
Comment
Question by:IssacJones
25 Comments
 
LVL 18

Expert Comment

by:deighton
ID: 36529788
could you use a program or spreadsheet here to determine the answer?   Or are you looking for a mathematical formula which you could apply?
0
 

Author Comment

by:IssacJones
ID: 36529894
I suspect a mathematical formula would be extremely difficult. I would be happy with a program (algorithm) or method to work it out.
0
 
LVL 37

Expert Comment

by:TommySzalapski
ID: 36530032
>>  As a more specific example, suppose n = 5 and p = 0.3. What is the range of N values which I can be 95% certain that I would get 5 successes? e.g. N = 10 to 30.

You can't. It's impossible. N = n/p ≈ 17 gives you about the highest probability of getting exactly 5 successes, and that probability is only 20.81%.
0
 
LVL 37

Assisted Solution

by:TommySzalapski
TommySzalapski earned 50 total points
ID: 36530056
N = n/p (rounded down to a whole number) will always give you the highest probability. Then you can just loop, adding one to N until you pass the target probability, and do the same thing on the other side, subtracting one from the best N.
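One way to read the loop described above, sketched in Python rather than taken from the thread: start at the peak and step outwards in each direction while the probability of exactly n successes stays at or above a chosen cutoff. The function names and the `cutoff` parameter are illustrative assumptions, and the "target probability" is interpreted here as a cutoff on each individual probability.

```python
from math import comb, floor

def prob_exact(N, n, p):
    """Probability of exactly n successes in N trials."""
    return comb(N, n) * p**n * (1 - p)**(N - n)

def range_around_peak(n, p, cutoff):
    """Return (low, peak, high): the N values around the peak whose
    probability of exactly n successes is at least `cutoff`."""
    if cutoff <= 0:
        raise ValueError("cutoff must be positive")
    peak = max(n, floor(n / p))      # most likely N: n/p rounded down
    if prob_exact(peak, n, p) < cutoff:
        raise ValueError("cutoff is above the peak probability")
    low = peak
    while low - 1 >= n and prob_exact(low - 1, n, p) >= cutoff:
        low -= 1                     # walk down from the peak
    high = peak
    while prob_exact(high + 1, n, p) >= cutoff:
        high += 1                    # walk up from the peak
    return low, peak, high

low, peak, high = range_around_peak(5, 0.3, 0.05)
print(low, peak, high)
```

Changing N by one at a time like this is the "easy way"; the probabilities fall off quickly on both sides of the peak, so the loops stay short.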
0
 
LVL 37

Expert Comment

by:TommySzalapski
ID: 36530082
The complicated mess I posted for finding the range in the other question is a method of using binary search, so it's very fast. But on a modern computer, you could do it the easy way (changing N by 1 every time) and it will still get you the answer quickly enough.
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530098
There is a calculator here:   http://stattrek.com/tables/binomial.aspx

You have to go to N=28 to get 95% probability of 5 or more successes.
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530147
0
 

Author Comment

by:IssacJones
ID: 36530187
Hi Tommy

"You can't. It's impossible. N = n/p = 17 gives you the highest possibility of getting 5 successes and the probability is 20.81%"

I think you've misunderstood what I was saying. For example, if I choose N = 10 to 30, it is possible to calculate probabilities for 5 successes. I'm not asking for a specific N to give a 95% probability. Rather, I am attempting to work out the range of values within which I could be pretty certain that the value of N was realistic. For example, the probability of only 5 successes in 1000 trials would be extremely remote.

As such, I'm looking for the most likely range of N's which would give me the best bet.
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530227
Do you mean "exactly 5 successes"   or "5 or more successes"?
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530240
If you do 1000 trials with p=0.3  you are most likely to get 300 successes.
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530250
>>  As such, I'm looking for the most likely range of N's which would give me the best bet.

What exactly are the terms of this bet??
0
 

Author Comment

by:IssacJones
ID: 36530273
Hi d-glitch

I don't think your method will work.

In the calculator you mention, n is varied to obtain the 95%, i.e. you are keeping N fixed. This is not the same thing as I am considering.

Thanks for trying.
0
 

Author Comment

by:IssacJones
ID: 36530311
Hi again tommy

I have written some code which generates all the probabilities for varying N and as we have found they sum to 1/p when N tends to infinity.

What I have then tried to do is multiply the calculated probabilities by p. This means that my new series sums to 1. Do you think I can do the following? Iterate through the probabilities until their accumulated sum is 2.5% (store the N value) and then until it is 97.5% (and store the next N value). This gives me the range of N values.

Do you think this would give me the 95% certainty I'm looking for? There is a niggling doubt at the back of my mind that it isn't quite right.
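A minimal Python sketch of the procedure described above, under the assumption that the two stored values are the first N at which the running total reaches each cut-off. The function name, the `mass` parameter and the `max_trials` safety limit are made up for illustration.

```python
from math import comb

def central_range(n, p, mass=0.95, max_trials=10_000):
    """Scale P(exactly n successes | N) by p so the values sum to 1 over
    N = n, n+1, ..., then return the N values at which the running total
    first reaches the lower and upper tail cut-offs."""
    tail = (1.0 - mass) / 2.0        # 2.5% in each tail when mass = 0.95
    running = 0.0
    low = None
    for N in range(n, max_trials + 1):
        running += p * comb(N, n) * p**n * (1 - p)**(N - n)
        if low is None and running >= tail:
            low = N                  # lower end of the range
        if running >= 1.0 - tail:
            return low, N            # upper end of the range
    raise ValueError("upper cut-off not reached; increase max_trials")

print(central_range(5, 0.3))
```

Normalising the probabilities this way amounts to assuming that, before the 5 successes were seen, every value of N was equally likely. That may be the source of the niggling doubt: the range is only as good as that assumption.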
0
 

Author Comment

by:IssacJones
ID: 36530321
d-glitch -> exactly 5 successes but with N varying.
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530355
In the calculator I showed, you can change N and see the results.
I checked the values of N from 20 to 30 to get the 95% confidence level.

We can calculate anything, but you really have to tell us what you are trying to do/find out.

What exactly are the terms of the bet or game you are considering?  
What does a single trial consist of?  Where does p=0.3 come from?  

You talk about a 95% certainty.  But a 95% certainty of what?
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530399
>>  d-glitch -> exactly 5 successes but with N varying.

So in this game, you run trials until you have 5 successes, then you quit.
And you would like to know with 95% probability how many trials it will take???
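If the game really is "run trials until the 5th success, then quit", the number of trials follows a negative binomial distribution, and a central 95% range can be read off its running total. The following Python sketch assumes that reading; the function name and the `max_trials` limit are hypothetical.

```python
from math import comb

def trials_needed_range(n, p, mass=0.95, max_trials=10_000):
    """Central `mass` range for the trial on which the n-th success occurs."""
    tail = (1.0 - mass) / 2.0
    running = 0.0
    low = None
    for N in range(n, max_trials + 1):
        # The n-th success lands exactly on trial N when the first N-1
        # trials contain n-1 successes and trial N is itself a success.
        running += comb(N - 1, n - 1) * p**n * (1 - p)**(N - n)
        if low is None and running >= tail:
            low = N
        if running >= 1.0 - tail:
            return low, N
    raise ValueError("upper cut-off not reached; increase max_trials")

print(trials_needed_range(5, 0.3))
```

The running total after N trials equals the probability of 5 or more successes in N trials, which is the quantity the online binomial calculator reports.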
0
 

Author Comment

by:IssacJones
ID: 36530465
p = 0.3 is the probability of success in a single trial (do a Google search on Bernoulli trials for more information), but it could be any probability.

The terms of the bet or its nature aren't important. This is merely a mathematical query.

In the calculator method you are looking at, you are considering a completely different problem.

Look at it this way, if I were to take N=16 there is roughly a 21% chance of getting 5 occurrences. So, we can agree, that if we got 5 occurrences, it is fairly likely that the N we started off with was N=16. Similarly, N=17 would be a reasonable answer.

However, the probability of getting 5 successes in N=1000000000 is incredibly small.

As such, there must be a range of values for N which we could be almost certain (well, 95% certain) that these were the number of trials that gave us 5 successes.

Sorry if I haven't explained myself clearly enough.
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530514

From the calculator above:    If N=7  and p=0.3   you will get exactly 5 successes  2.5% of the time.
                              If N=8  and p=0.3   you will get exactly 5 successes  4.7% of the time.

                              If N=30 and p=0.3   you will get 4 or fewer successes 3.0% of the time.

So I think the range of N you need for 95% confidence is 7 to 30.

0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530552
>>  Look at it this way, if I were to take N=16 there is roughly a 21% chance of getting 5 occurrences.
       So, we can agree, that if we got 5 occurrences, it is fairly likely that the N we started off with was N=16.


I don't think we can agree to that at all.  What do you mean by fairly likely?

I think we might be able to agree that N is between 7 and 30.

0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36530581
When you talk about a 95% confidence level on the range, do you want to assign 2.5% to the low end and 2.5% to the high end?

That's what I tried to do when I picked the 7 to 30 range.
0
 
LVL 18

Accepted Solution

by:
deighton earned 100 total points
ID: 36530960
try this 'SolveForAlpha'

r = your number of successes required
p = probability
alpha = tolerance level, so for 95% success level, alpha = .95

important to note I've done that the other way around


if you add this code to a module in Excel, it will give you a worksheet function 'SolveForAlpha'

e.g

+SolveForAlpha(10, .256, .999)

Function SolveForAlpha(r As Long, p As Double, alpha As Double) As Long

    ' Smallest number of trials for which the probability of at least
    ' r successes reaches alpha. Needs 0 < p <= 1 and alpha < 1,
    ' otherwise the loop never ends.
    Dim q As Long
    'lowest test n we can have is r
    q = r

    Do Until probGE(q, r, p) >= alpha

        q = q + 1

    Loop

    SolveForAlpha = q

End Function

Function probGE(n As Long, r As Long, p As Double) As Double

    ' Probability of r or more successes in n trials
    Dim c As Long

    For c = r To n

        probGE = probGE + prob(n, c, p)

    Next

End Function

Function prob(n As Long, r As Long, p As Double) As Double

    ' Probability of exactly r successes in n trials
    prob = p ^ r * (1 - p) ^ (n - r) * nCr(n, r)

End Function


Function nCr(n As Long, r As Long) As Double

    ' Multiplicative form of n! / (r! (n - r)!). Computing the
    ' factorials directly overflows a Double once n exceeds 170.
    Dim i As Long

    nCr = 1

    For i = 1 To r
        nCr = nCr * (n - r + i) / i
    Next

End Function

0
 
LVL 18

Expert Comment

by:deighton
ID: 36530987
by the way that is '5 or more successes', it has to be that way clearly, otherwise for large values of n, the chance of any particular r can become small.
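For readers without Excel, the same "5 or more successes" search can be sketched in Python. This is a rough port, not the poster's code: the function names are illustrative, and `max_trials` is an added safety limit, kept at 1000 so the binomial coefficients still fit in a float.

```python
from math import comb

def prob_at_least(N, r, p):
    """Probability of r or more successes in N trials."""
    return sum(comb(N, k) * p**k * (1 - p)**(N - k) for k in range(r, N + 1))

def solve_for_alpha(r, p, alpha, max_trials=1000):
    """Smallest N for which P(at least r successes in N trials) >= alpha."""
    if not (0 < p <= 1 and 0 < alpha < 1):
        raise ValueError("need 0 < p <= 1 and 0 < alpha < 1")
    for N in range(r, max_trials + 1):   # fewer than r trials cannot give r successes
        if prob_at_least(N, r, p) >= alpha:
            return N
    raise ValueError("alpha not reached within max_trials")

print(solve_for_alpha(5, 0.3, 0.95))
```

For r = 5, p = 0.3 and alpha = 0.95 this search stops at N = 28, matching the figure read from the online calculator earlier in the thread.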
0
 

Author Comment

by:IssacJones
ID: 36531157
d-glitch:" When you talk about a 95% confidence level on the range, do you want to assign 2.5% to the low end and 2.5% to the high end?" Yes

re: fairly likely. Well, it is more likely than it being from N=10000000000000. Apologies for the wording.






0
 
LVL 27

Assisted Solution

by:d-glitch
d-glitch earned 100 total points
ID: 36531403
I ran the following simulation in Excel 200 times:

  Flip your p=0.3 coin 32 times.  Keep track of the trial where the 5th success occurs.
  When you're done, sort the data and throw away the ten lowest and ten highest numbers.

In one of my cases they were    6 6 6 7 7 7 8 8 8 8   ......   30 30 30 31 31 31 32 32 32 32

So the range you are looking for is something like 8 to 30
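The simulation can be reproduced outside Excel. The Python sketch below is an approximation of it, not the original spreadsheet: it assumes each run keeps flipping until the fifth success instead of stopping at 32 flips, and the function names, `trim` parameter and fixed `seed` are illustrative.

```python
import random

def nth_success_trial(n, p, rng):
    """Flip a coin with success probability p until the n-th success;
    return the number of flips that took."""
    flips = successes = 0
    while successes < n:
        flips += 1
        if rng.random() < p:
            successes += 1
    return flips

def simulated_range(n, p, runs=200, trim=10, seed=0):
    """Run the experiment `runs` times, sort the results, discard the
    `trim` lowest and `trim` highest, and return the remaining range."""
    if 2 * trim >= runs:
        raise ValueError("trim is too large for the number of runs")
    rng = random.Random(seed)        # fixed seed so the run is repeatable
    results = sorted(nth_success_trial(n, p, rng) for _ in range(runs))
    kept = results[trim:runs - trim]
    return kept[0], kept[-1]

print(simulated_range(5, 0.3))
```

Discarding ten values from each end of 200 runs removes 5% per tail, so the surviving range covers about 90% of the runs. Setting `trim=5` would correspond to 2.5% in each tail.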
ExEx-Probability-1.pdf
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 36531448
Note that this is all consistent with the results from the on-line calculator.
0
