# What is the general form of this probability problem?

I need help finding the general mathematical form (similar to "n choose k") of a counting/probability problem. It arose after a coworker implemented a simple JS fix to prevent something from displaying multiple times when sampled randomly and I wondered what the odds of it happening were. As formally as I could state it is as follows:

Problem Statement:
Given n distinct options with equal probability of occurring, what is the probability of at least one grouping of type ni of size k or greater appearing after s independent samplings?

I've looked into multinomial coefficients but couldn't figure out how to generalize it so it didn't matter which pair happened or that larger pairs should count towards it. A Poisson Distribution also looked promising but the simplest form seems to only be concerned with the probability of a single thing happening over a given time frame. Instead I hacked together some JavaScript to at least get a sense of its behavior.

Example
Roll a die 10 times. What is the probability that any of the numbers occurs at least 3 times?

Monte Carlo Solution
This will happen approximately 93% of the time given by the following code:

``````function pairsPerIndependentSampling()
{
console.log("<<---------------------------New Run--------------------------->>");
let numOptions = 6;
let numSamplesPerTrial = 10;
let grouping = 3;
let numTrials = 10000;
let slots = [];
let trialTotal = [];
let totals = [];
let pairResults = [];
let wasPairInTrialTotal = 0;

// initialize results
for (i = 0; i < numOptions; i++)
{
totals.push(0);
pairResults.push(0);
}

// run trials
for (i = 0; i < numTrials; i++)
{
// console.log("Trial number: " + i);
slots = []; // reset slots for new trial
let wasPairThisTrial = false;

// fill the slots for this trial
for (k = 0; k < numSamplesPerTrial; k++)
{
slots.push(randOption(numOptions));
}

// tally trial result
for (k = 0; k < slots.length; k++)
{
totals[slots[k]] += 1;
}

let countbyOption = _.countBy(slots);

for (var option in countbyOption)
{
if(countbyOption[option] >= grouping){
pairResults[option] += 1;
if (!wasPairThisTrial)
{
wasPairInTrialTotal++;
wasPairThisTrial = true;
}
}
}
}

console.log("Options: %i, Samples: %i, Grouping %i", numOptions, numSamplesPerTrial, grouping);
console.log("Number of trials: " + numTrials);
console.log("trials with at least one grouping: " + wasPairInTrialTotal);
console.log("Percentage with at least one grouping: %.2f %", (wasPairInTrialTotal/numTrials)*100);
/*
console.log("Totals: "); console.log(totals);
console.log("Pair Results"); console.log(pairResults);
*/

function randOption(numOptions)
{
return Math.floor((Math.random() * numOptions));
}
}

pairsPerIndependentSampling();
``````
LVL 2
###### Who is Participating?

Commented:
We were able to find the solution to your example case, but I don't think there will be a simple solution to the more general problem.

The  possible patterns for ten rolls of six-sided dice are related to a subset of the integer partitions of 10.
http://m.wolframalpha.com/input/?i=integer+partitions+of+10

There are 42 partitions, but only those with six or fewer elements are relevant here.
`````` Size | Count | Example
1 |   1   | 10 = 10
2 |   5   | 7 + 3 = 10
3 |   8   | 8 + 1 + 1 = 10
4 |   9   | 4 + 4 + 1 + 1 = 10
5 |   7   | 4 + 3 + 1 + 1 + 1 = 10
6 |   5   | 5 + 1 + 1 + 1 + 1 + 1 = 10
7 |   3   | 3 + 2 + 1 + 1 + 1 + 1 + 1 = 10
8 |   2   | 2 + 2 + 1 + 1 + 1 + 1 + 1 + 1 = 10
9 |   1   | 2 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 = 10
10 |   1   | 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 = 10
``````

That leaves 35 cases to consider.  We were very lucky that we only needed to evaluate two of them.

You might also want to look at the write up for the Unknown Formula in Murderous Maths.  They have a useful format for describing the outcome of multiple dice rolls and enumerating the possibilities.
http://www.murderousmaths.co.uk/books/unknownform.htm
1

Commented:
>> grouping of type n_i of size k
What is k? How do you determine k?
0

PrincipalCommented:
Roll a die 10 times. What is the probability that any of the numbers occurs at least 3 times?
Consider that there are 10 outcomes and that they can be represented as a 10-digit number that has no zeros in it.
So, the number of unique outcomes from 10 rolls is 9^10.
Now, just for visualization purposes, sort the outcomes from most-occurring digits on the left and least-occurring digits on the right.
(It should be clear that the *order* of them is unimportant because we only care about the number of occurrences).
Now assume a case where the first 3 digits in the sorted outcomes are the same.
There are 9 outcomes where 3 digits are the same: 111 .... 999
The probability of the same number appearing 3 times would appear to be:
9 outcomes out of the total number of possible outcomes = 9/9^10 = 1/9^9
0

Commented:
>> outcomes ... can be represented as a 10-digit number that has no zeros in it.

I think the die has six sides, so there aren't any 7's, 8's, or 9's either.
The total number of outcomes should be 6^10.

There is no simple relationship between total outcomes and sorting based on repetitions.

You can find the probability of any particular pattern occurring, but there are going to be a lot of patterns to account for:
``````10, 0, 0, 0, 0, 0     6 ways
9, 1, 0, 0, 0, 0    30 ways
8, 2, 0, 0, 0, 0    30 ways
8, 1, 1, 0, 0, 0    60 ways
. . .
``````
Fortunately, there are only two possible patterns that don't give you at least three of kind:
`````` 2, 2, 2, 2, 2, 0   ==>   (6*10!) / (2^5 * 6^10) = 1.13%
2, 2, 2, 2, 1, 1   ==>  (15*10!) / (2^4 * 6^10) = 5.63%
``````
That gives at least 3 matches in 10 rolls  93.24% of the time, which agrees with your Monte Carlo results.
0

Experts Exchange QAAuthor Commented:
@phoffric the size k is arbitrary and in the example it happens to be 3. It obviously has to less than or equal to the number of times you sample or else its zero.

@Fred and @d-glitch thanks for the conceptual framework! I'm going to start from there and see if I can codify it in the problem statement's variables.

I'll report back anything I come up with.
0

PrincipalCommented:
Problem Statement:
Given n distinct options with equal probability of occurring, what is the probability of at least one grouping of type ni of size k or greater appearing after s independent samplings?
The "after s independent samplings" is immaterial if, indeed, the "n distinct options" do have "equal probability of occurring".  That is, the sequence of outcomes is immaterial.   Perhaps you meant "within s independent outcomes".
I might state this slightly differently:
Given n distinct possible outcomes per trial, each with equal probability of occurring, what is the probability of at least one grouping of outcomes type ni of size k or greater appearing within s independent trials?
Example: Given 6 possible outcomes per trial, each with equal probability of occurring, what is the probability of at least one grouping of outcomes type [1,2,3,4,5,6] of size 3 or greater appearing within 10 independent trials?
The example you gave:
Roll a die 10 times. What is the probability that any of the numbers occurs at least 3 times?
appears different than the problem statement.  The problem statement appears to seek *groupings* of identical outcomes in s outcomes.  Whereas this example appears to seek mere occurrence count of identical outcomes in s outcomes.
Either that or you'd have to say:
Roll a die 10 times. What is the probability that any of the numbers occurs at least 3 times in succession?

Either:
1233345621 is an example for which to calculate probability of a grouping of length 3 of 3's.
or
1323435621 is an example for which to calculate probability of 3 3's without regard to grouping together.
Which is it?
0

Experts Exchange QAAuthor Commented:
@Fred I meant to pose the statement so that the ordering did not matter. The following outcomes of rolling a die 10 times would count as a positive case of a grouping of three or more:
5348037559
7291770367
7594444444

This would too even if it happened twice with two different numbers: 6667888993
0

Commented:
Fun fact:
Given n distinct options with equal probability of occurring:
Probability(at least one grouping of type ni of size k or greater appearing after s trials) = 1,
if s > (k-1)*n
0

Experts Exchange QAAuthor Commented:
Thanks for your insight guys. I'll let close request go through as d-glitch suggests.
0

Commented:
I think this is resolved.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.