Generate a table of biased "random" integer cells for a given range of numbers with 4 subsets to each have a different weight for insertion in the table

For background and context, please see the following EE Question: http://www.experts- where the Accepted Solution is =INT(RAND()*(122-65+1)+65)
I've now expanded the range to be from 48 to 122 as in =INT(RAND()*(122-48+1)+48).

How I would use the Accepted Solution to generate a table of integer random numbers in the range of 48 to 122, like in my previous EE question, but where there are (let's say
3 or 4) sub ranges within -- and that each sub range has a specified weight/bias (Bias) for appearing in the table.   I'm thinking of: (1) Using the Accepted Solution to generate
random integers in the range of 0 to 9, =INT(RAND()*(9-0+1)+0, for each column (in a new Exterior Row above the table) and for each row (in a new Exterior Column left of the table); AND (2) Then using for each cell in the main table an IF Function with (a) with its Test for the Bias being some range of combinations of the cell's Exterior row and Exterior cell (random integers), and (b) using the Accepted Solution in the resulting True/False results (for the 4 sub ranges in the main table).

The sub ranges could be 33 to 47, 48 to 57, 65 to 90 and 91 to 122.  The respective bias
are 10, 20, 30 and 40% approximately.

I only have Excel 2003 and I'd rather not use =RANDBETWEEN(65,122), because I don't want to bother digging up my Microsoft CD's (still unpacked from moving), in order to load the Analysis ToolPak add-in.

How might this be done easily?

LVL 18
Who is Participating?

Improve company productivity with a Business Account.Sign Up

Patrick MatthewsConnect With a Mentor Commented:

Actually, the widest band can be anywhere, as this version indicates:



Patrick MatthewsCommented:
Please see the attached file:


I use a random value in Col A, and a table of values in G2:I5 to hold the biases:

0      33      47
0.1      48      57
0.3      65      90
0.6      91      122

Based on that, the formula to draw a random integer is:

WaterStreetAuthor Commented:
Thanks mp,

I just got back and am starting to look at what you posted.
Free Tool: SSL Checker

Scans your site and returns information about your SSL implementation and certificate. Helpful for debugging and validating your SSL configuration.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

WaterStreetAuthor Commented:
I spent a lot of time tinkering with it and some fun too.  My sense is that I can get it to work for me if I understood the logic of the biases and how they are applied (they are not percentages of the distributions throughout the table ).

1.  How can the biases work to relate them to percentages?  Do they need to add up to 1.0?

2.  The High range values never appear in the distribution, so I fixed this by adding one to each.

3.  I understand VLOOKUP, how do I apply this solution when I only have three ranges instead of 4?


Patrick MatthewsCommented:
The attached file differs only in that I have more values being generated, and I added formulas in Columns K & L to show how many items there are in each band.

The numbers will fluctuate a bit on each recalc, but they are coming in to the targeted spread.


As for having three bands, or five, or nineteen, just adjust the lookup table accordingly, and the VLOOKUP expressions where the random integers are generated.

The biasing works because the random number in Col A is a value >= 0 and < 1.  So, the the first band is for (Col A) >=0 and (Col A) < 0.1.  That spread is 10%.
WaterStreetAuthor Commented:
Thanks for the expanded Bias table with the calculated distributions and percentages for each.

I'm tinkering with it now.  

1. When adjusting the biases, do I try to lock in the largest distribution first, 2nd largest, etc., or do it in reverse order?

2. If I add more ranges or delete one, is the last VLOOKUP always preceded with a minus sign?

Thanks in advance.
Patrick MatthewsCommented:
1) Which band you list first does not matter.  Just make sure that what you use in the first column of that table starts with zero, and that the index values are such that you get your (targeted) distributions correct.

2) The formula should be structured:

=INT(RAND()*( #vlookup to grab upper bound# - #vlookup to grab lower bound# ) + #vlookup to grab lower bound# )
WaterStreetAuthor Commented:
It looks like I cannot do what I wanted, because the sequences of biases need to be in ascending sequence and the corresponding ranges have to be in ascending sequence.  That constraint doesn't seem to fit my business case.  For example, for one business case I want numbers in the range 48 to 57 to have around 50%, range 65 to 90 to be around 32%, range 91 to 122 to be around 16% and range 33 to 47 to be around 2%.

It is a masterful technique but it seems to have constraints that can't even approximately fit what I want to do.

I hope I'm just doing this wrong and you can show how, or does the technique not work for what I want to do; is there another solution?

Ultimately, I need to populate a 64 X 64 table with the population distributions of my business case example, above.   Maybe one solution would be to show me how to easily fully populate the table first with my largest distribution.  Then successively populate it with the 2nd largest distribution, 3rd largest, etc.

In any case, I need to populate a 64 X 64 table with approximately the right percent distribution for each range -- and the ranges need to randomly distributed throughout the table.

Thanks in advance
WaterStreetAuthor Commented:

I can cut the number of bands from four to three and I can adjust the span of the bands so that the lowest (45 to 57) would have the largest biased distribution % in the main table, the next highest band (65 to 96) would have 2nd largest % distribution, and the highest band (97 to 122) would have smallest distribution % in the table.

The technique you provided requires the highest band to have the largest distribution percentage;  the mid-range band(s) would have successively lower distribution percentages; and the lowest band, the least distribution

In other words, the the current technique seems to require the largest bias be given to the highest band and the lowest bias be given to the lowest band.   That just happens to be the reverse of what I need.

So, (please refer to the last worksheet you provided, above in ID: 38860481) can you modify the following algorithm so that the highest bias be given to the lowest of my (now) three bands, and that the lowest Bias be given to the highest band?

=INT(RAND()*( #vlookup to grab upper bound# - #vlookup to grab lower bound# ) + #vlookup to grab lower bound# )

Or just show me and I'll do it.

WaterStreetAuthor Commented:
Well, I'll be ...

Tweaked it to get what I wanted.

Thanks a lot, and I learned a lot of good technique too.

Adjusting the Bias was tricky.  Then had to recalculate a number of times until I got the desired ratios.

When you get a chance, any theory you can pass on for adjusting the biases.  It seems to relate the proportions between the adjacent biases as well as their selected values.  Other that that, I am clueless.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.