Solved

# Generate a table of biased "random" integer cells for a given range of numbers with 4 subsets to each have a different weight for insertion in the table

Posted on 2013-02-05
Medium Priority
388 Views
For background and context, please see the following EE Question: http://www.experts-
exchange.com/Software/Office_Productivity/Office_Suites/MS_Office/Excel/Q_28020212.html#a38854251 where the Accepted Solution is =INT(RAND()*(122-65+1)+65)

I've now expanded the range to be from 48 to 122 as in =INT(RAND()*(122-48+1)+48).

How I would use the Accepted Solution to generate a table of integer random numbers in the range of 48 to 122, like in my previous EE question, but where there are (let's say
3 or 4) sub ranges within -- and that each sub range has a specified weight/bias (Bias) for appearing in the table.   I'm thinking of: (1) Using the Accepted Solution to generate
random integers in the range of 0 to 9, =INT(RAND()*(9-0+1)+0, for each column (in a new Exterior Row above the table) and for each row (in a new Exterior Column left of the table); AND (2) Then using for each cell in the main table an IF Function with (a) with its Test for the Bias being some range of combinations of the cell's Exterior row and Exterior cell (random integers), and (b) using the Accepted Solution in the resulting True/False results (for the 4 sub ranges in the main table).

The sub ranges could be 33 to 47, 48 to 57, 65 to 90 and 91 to 122.  The respective bias
are 10, 20, 30 and 40% approximately.

I only have Excel 2003 and I'd rather not use =RANDBETWEEN(65,122), because I don't want to bother digging up my Microsoft CD's (still unpacked from moving), in order to load the Analysis ToolPak add-in.

How might this be done easily?

Thanks
0
Question by:WaterStreet
[X]
###### Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

• Help others & share knowledge
• Earn cash & points
• 7
• 4

LVL 93

Expert Comment

ID: 38857891

Q-28021494.xls

I use a random value in Col A, and a table of values in G2:I5 to hold the biases:

0      33      47
0.1      48      57
0.3      65      90
0.6      91      122

Based on that, the formula to draw a random integer is:

=INT(RAND()*(VLOOKUP(A2,\$G\$2:\$I\$5,3)-VLOOKUP(A2,\$G\$2:\$I\$5,2))+VLOOKUP(A2,\$G\$2:\$I\$5,2))
0

LVL 18

Author Comment

ID: 38858094
Thanks mp,

I just got back and am starting to look at what you posted.
0

LVL 18

Author Comment

ID: 38860341
I spent a lot of time tinkering with it and some fun too.  My sense is that I can get it to work for me if I understood the logic of the biases and how they are applied (they are not percentages of the distributions throughout the table ).

1.  How can the biases work to relate them to percentages?  Do they need to add up to 1.0?

2.  The High range values never appear in the distribution, so I fixed this by adding one to each.

3.  I understand VLOOKUP, how do I apply this solution when I only have three ranges instead of 4?

Thanks

Thanks
0

LVL 93

Expert Comment

ID: 38860481
The attached file differs only in that I have more values being generated, and I added formulas in Columns K & L to show how many items there are in each band.

The numbers will fluctuate a bit on each recalc, but they are coming in to the targeted spread.

Q-28021494.xls

As for having three bands, or five, or nineteen, just adjust the lookup table accordingly, and the VLOOKUP expressions where the random integers are generated.

The biasing works because the random number in Col A is a value >= 0 and < 1.  So, the the first band is for (Col A) >=0 and (Col A) < 0.1.  That spread is 10%.
0

LVL 18

Author Comment

ID: 38860712
Thanks for the expanded Bias table with the calculated distributions and percentages for each.

I'm tinkering with it now.

1. When adjusting the biases, do I try to lock in the largest distribution first, 2nd largest, etc., or do it in reverse order?

2. If I add more ranges or delete one, is the last VLOOKUP always preceded with a minus sign?

0

LVL 93

Expert Comment

ID: 38860886
1) Which band you list first does not matter.  Just make sure that what you use in the first column of that table starts with zero, and that the index values are such that you get your (targeted) distributions correct.

2) The formula should be structured:

=INT(RAND()*( #vlookup to grab upper bound# - #vlookup to grab lower bound# ) + #vlookup to grab lower bound# )
0

LVL 18

Author Comment

ID: 38862197
It looks like I cannot do what I wanted, because the sequences of biases need to be in ascending sequence and the corresponding ranges have to be in ascending sequence.  That constraint doesn't seem to fit my business case.  For example, for one business case I want numbers in the range 48 to 57 to have around 50%, range 65 to 90 to be around 32%, range 91 to 122 to be around 16% and range 33 to 47 to be around 2%.

It is a masterful technique but it seems to have constraints that can't even approximately fit what I want to do.

I hope I'm just doing this wrong and you can show how, or does the technique not work for what I want to do; is there another solution?

Ultimately, I need to populate a 64 X 64 table with the population distributions of my business case example, above.   Maybe one solution would be to show me how to easily fully populate the table first with my largest distribution.  Then successively populate it with the 2nd largest distribution, 3rd largest, etc.

In any case, I need to populate a 64 X 64 table with approximately the right percent distribution for each range -- and the ranges need to randomly distributed throughout the table.

0

LVL 18

Author Comment

ID: 38866058
mp,

I can cut the number of bands from four to three and I can adjust the span of the bands so that the lowest (45 to 57) would have the largest biased distribution % in the main table, the next highest band (65 to 96) would have 2nd largest % distribution, and the highest band (97 to 122) would have smallest distribution % in the table.

The technique you provided requires the highest band to have the largest distribution percentage;  the mid-range band(s) would have successively lower distribution percentages; and the lowest band, the least distribution

In other words, the the current technique seems to require the largest bias be given to the highest band and the lowest bias be given to the lowest band.   That just happens to be the reverse of what I need.

So, (please refer to the last worksheet you provided, above in ID: 38860481) can you modify the following algorithm so that the highest bias be given to the lowest of my (now) three bands, and that the lowest Bias be given to the highest band?

=INT(RAND()*( #vlookup to grab upper bound# - #vlookup to grab lower bound# ) + #vlookup to grab lower bound# )

Or just show me and I'll do it.

Thanks
0

LVL 93

Accepted Solution

Patrick Matthews earned 2000 total points
ID: 38866312
WS,

Actually, the widest band can be anywhere, as this version indicates:

Q-28021494.xls

:)

Patrick
0

LVL 18

Author Closing Comment

ID: 38867083
Well, I'll be ...

Tweaked it to get what I wanted.

Thanks a lot, and I learned a lot of good technique too.

Adjusting the Bias was tricky.  Then had to recalculate a number of times until I got the desired ratios.

When you get a chance, any theory you can pass on for adjusting the biases.  It seems to relate the proportions between the adjacent biases as well as their selected values.  Other that that, I am clueless.

Michael
0

LVL 18

Author Comment

ID: 38870064
0

## Featured Post

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Access developers frequently have requirements to interact with Excel (import from or output to) in their applications.  You might be able to accomplish this with the TransferSpreadsheet and OutputTo methods, but in this series of articles I will diâ€¦
This article describes how to use a set of graphical playing cards to create a Draw Poker game in Excel or VB6.
This Micro Tutorial demonstrates in Microsoft Excel how to consolidate your marketing data by creating an interactive charts using form controls. This creates cool drop-downs for viewers of your chart to choose from.
Finds all prime numbers in a range requested and places them in a public primes() array. I've demostrated a template size of 30 (2 * 3 * 5) but larger templates can be built such 210  (2 * 3 * 5 * 7) or 2310  (2 * 3 * 5 * 7 * 11). The larger templaâ€¦
###### Suggested Courses
Course of the Month10 days, 21 hours left to enroll