What is the general form of this probability problem?

I need help finding the general mathematical form (similar to "n choose k") of a counting/probability problem. It arose after a coworker implemented a simple JS fix to prevent something from displaying multiple times when sampled randomly and I wondered what the odds of it happening were. As formally as I could state it is as follows:

Problem Statement:
Given n distinct options with equal probability of occurring, what is the probability of at least one grouping of type n_{i} of size k or greater appearing after s independent samplings?

I've looked into multinomial coefficients but couldn't figure out how to generalize it so it didn't matter which pair happened or that larger pairs should count towards it. A Poisson Distribution also looked promising but the simplest form seems to only be concerned with the probability of a single thing happening over a given time frame. Instead I hacked together some JavaScript to at least get a sense of its behavior.

Example
Roll a die 10 times. What is the probability that any of the numbers occurs at least 3 times?

Monte Carlo Solution
This will happen approximately 93% of the time given by the following code:

function pairsPerIndependentSampling(){ console.log("<<---------------------------New Run--------------------------->>"); let numOptions = 6; let numSamplesPerTrial = 10; let grouping = 3; let numTrials = 10000; let slots = []; let trialTotal = []; let totals = []; let pairResults = []; let wasPairInTrialTotal = 0; // initialize results for (i = 0; i < numOptions; i++) { totals.push(0); pairResults.push(0); } // run trials for (i = 0; i < numTrials; i++) { // console.log("Trial number: " + i); slots = []; // reset slots for new trial let wasPairThisTrial = false; // fill the slots for this trial for (k = 0; k < numSamplesPerTrial; k++) { slots.push(randOption(numOptions)); } // tally trial result for (k = 0; k < slots.length; k++) { totals[slots[k]] += 1; } let countbyOption = _.countBy(slots); for (var option in countbyOption) { if(countbyOption[option] >= grouping){ pairResults[option] += 1; if (!wasPairThisTrial) { wasPairInTrialTotal++; wasPairThisTrial = true; } } } } console.log("Options: %i, Samples: %i, Grouping %i", numOptions, numSamplesPerTrial, grouping); console.log("Number of trials: " + numTrials); console.log("trials with at least one grouping: " + wasPairInTrialTotal); console.log("Percentage with at least one grouping: %.2f %", (wasPairInTrialTotal/numTrials)*100); /* console.log("Totals: "); console.log(totals); console.log("Pair Results"); console.log(pairResults); */ function randOption(numOptions) { return Math.floor((Math.random() * numOptions)); }}pairsPerIndependentSampling();

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Roll a die 10 times. What is the probability that any of the numbers occurs at least 3 times?

Consider that there are 10 outcomes and that they can be represented as a 10-digit number that has no zeros in it.
So, the number of unique outcomes from 10 rolls is 9^10.
Now, just for visualization purposes, sort the outcomes from most-occurring digits on the left and least-occurring digits on the right.
(It should be clear that the *order* of them is unimportant because we only care about the number of occurrences).
Now assume a case where the first 3 digits in the sorted outcomes are the same.
There are 9 outcomes where 3 digits are the same: 111 .... 999
The probability of the same number appearing 3 times would appear to be:
9 outcomes out of the total number of possible outcomes = 9/9^10 = 1/9^9

That gives at least 3 matches in 10 rolls 93.24% of the time, which agrees with your Monte Carlo results.

0

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden with our free interactive tool and use it to determine the right price for your IT services. Download your free eBook now!

Michael ArciniegaExperts Exchange QAAuthor Commented:

@phoffric the size k is arbitrary and in the example it happens to be 3. It obviously has to less than or equal to the number of times you sample or else its zero.

@Fred and @d-glitch thanks for the conceptual framework! I'm going to start from there and see if I can codify it in the problem statement's variables.

That leaves 35 cases to consider. We were very lucky that we only needed to evaluate two of them.

You might also want to look at the write up for the Unknown Formula in Murderous Maths. They have a useful format for describing the outcome of multiple dice rolls and enumerating the possibilities. http://www.murderousmaths.co.uk/books/unknownform.htm

1

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Problem Statement:
Given n distinct options with equal probability of occurring, what is the probability of at least one grouping of type ni of size k or greater appearing after s independent samplings?

The "after s independent samplings" is immaterial if, indeed, the "n distinct options" do have "equal probability of occurring". That is, the sequence of outcomes is immaterial. Perhaps you meant "within s independent outcomes".
I might state this slightly differently:

Given n distinct possible outcomes per trial, each with equal probability of occurring, what is the probability of at least one grouping of outcomes type ni of size k or greater appearing within s independent trials?

Example: Given 6 possible outcomes per trial, each with equal probability of occurring, what is the probability of at least one grouping of outcomes type [1,2,3,4,5,6] of size 3 or greater appearing within 10 independent trials?
The example you gave:

Roll a die 10 times. What is the probability that any of the numbers occurs at least 3 times?

appears different than the problem statement. The problem statement appears to seek *groupings* of identical outcomes in s outcomes. Whereas this example appears to seek mere occurrence count of identical outcomes in s outcomes.
Either that or you'd have to say:

Roll a die 10 times. What is the probability that any of the numbers occurs at least 3 times in succession?

Either:
1233345621 is an example for which to calculate probability of a grouping of length 3 of 3's.
or
1323435621 is an example for which to calculate probability of 3 3's without regard to grouping together.
Which is it?

Michael ArciniegaExperts Exchange QAAuthor Commented:

@Fred I meant to pose the statement so that the ordering did not matter. The following outcomes of rolling a die 10 times would count as a positive case of a grouping of three or more: 5348037559 7291770367
7594444444

This would too even if it happened twice with two different numbers: 6667888993

Fun fact:
Given n distinct options with equal probability of occurring: Probability(at least one grouping of type n_{i} of size k or greater appearing after s trials) = 1, ifs > (k-1)*n

It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for freeEdge Out The Competitionfor your dream job with proven skills and certifications.Get started todayStand Outas the employee with proven skills.Start learning today for freeMove Your Career Forwardwith certification training in the latest technologies.Start your trial today

What is k? How do you determine k?