Solved

Generate a table of biased "random" integer cells for a given range of numbers with 4 subsets to each have a different weight for insertion in the table

Posted on 2013-02-05
11
368 Views
Last Modified: 2013-02-08
For background and context, please see the following EE Question: http://www.experts-
exchange.com/Software/Office_Productivity/Office_Suites/MS_Office/Excel/Q_28020212.html#a38854251 where the Accepted Solution is =INT(RAND()*(122-65+1)+65)
 
I've now expanded the range to be from 48 to 122 as in =INT(RAND()*(122-48+1)+48).

How I would use the Accepted Solution to generate a table of integer random numbers in the range of 48 to 122, like in my previous EE question, but where there are (let's say
3 or 4) sub ranges within -- and that each sub range has a specified weight/bias (Bias) for appearing in the table.   I'm thinking of: (1) Using the Accepted Solution to generate
random integers in the range of 0 to 9, =INT(RAND()*(9-0+1)+0, for each column (in a new Exterior Row above the table) and for each row (in a new Exterior Column left of the table); AND (2) Then using for each cell in the main table an IF Function with (a) with its Test for the Bias being some range of combinations of the cell's Exterior row and Exterior cell (random integers), and (b) using the Accepted Solution in the resulting True/False results (for the 4 sub ranges in the main table).

The sub ranges could be 33 to 47, 48 to 57, 65 to 90 and 91 to 122.  The respective bias
are 10, 20, 30 and 40% approximately.

I only have Excel 2003 and I'd rather not use =RANDBETWEEN(65,122), because I don't want to bother digging up my Microsoft CD's (still unpacked from moving), in order to load the Analysis ToolPak add-in.

How might this be done easily?

Thanks
0
Comment
Question by:WaterStreet
  • 7
  • 4
11 Comments
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 38857891
Please see the attached file:

Q-28021494.xls

I use a random value in Col A, and a table of values in G2:I5 to hold the biases:

0      33      47
0.1      48      57
0.3      65      90
0.6      91      122

Based on that, the formula to draw a random integer is:

=INT(RAND()*(VLOOKUP(A2,$G$2:$I$5,3)-VLOOKUP(A2,$G$2:$I$5,2))+VLOOKUP(A2,$G$2:$I$5,2))
0
 
LVL 18

Author Comment

by:WaterStreet
ID: 38858094
Thanks mp,

I just got back and am starting to look at what you posted.
0
 
LVL 18

Author Comment

by:WaterStreet
ID: 38860341
I spent a lot of time tinkering with it and some fun too.  My sense is that I can get it to work for me if I understood the logic of the biases and how they are applied (they are not percentages of the distributions throughout the table ).

1.  How can the biases work to relate them to percentages?  Do they need to add up to 1.0?

2.  The High range values never appear in the distribution, so I fixed this by adding one to each.

3.  I understand VLOOKUP, how do I apply this solution when I only have three ranges instead of 4?


Thanks


Thanks
0
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 38860481
The attached file differs only in that I have more values being generated, and I added formulas in Columns K & L to show how many items there are in each band.

The numbers will fluctuate a bit on each recalc, but they are coming in to the targeted spread.

Q-28021494.xls

As for having three bands, or five, or nineteen, just adjust the lookup table accordingly, and the VLOOKUP expressions where the random integers are generated.

The biasing works because the random number in Col A is a value >= 0 and < 1.  So, the the first band is for (Col A) >=0 and (Col A) < 0.1.  That spread is 10%.
0
 
LVL 18

Author Comment

by:WaterStreet
ID: 38860712
Thanks for the expanded Bias table with the calculated distributions and percentages for each.

I'm tinkering with it now.  

1. When adjusting the biases, do I try to lock in the largest distribution first, 2nd largest, etc., or do it in reverse order?

2. If I add more ranges or delete one, is the last VLOOKUP always preceded with a minus sign?

Thanks in advance.
0
IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 38860886
1) Which band you list first does not matter.  Just make sure that what you use in the first column of that table starts with zero, and that the index values are such that you get your (targeted) distributions correct.

2) The formula should be structured:

=INT(RAND()*( #vlookup to grab upper bound# - #vlookup to grab lower bound# ) + #vlookup to grab lower bound# )
0
 
LVL 18

Author Comment

by:WaterStreet
ID: 38862197
It looks like I cannot do what I wanted, because the sequences of biases need to be in ascending sequence and the corresponding ranges have to be in ascending sequence.  That constraint doesn't seem to fit my business case.  For example, for one business case I want numbers in the range 48 to 57 to have around 50%, range 65 to 90 to be around 32%, range 91 to 122 to be around 16% and range 33 to 47 to be around 2%.

It is a masterful technique but it seems to have constraints that can't even approximately fit what I want to do.

I hope I'm just doing this wrong and you can show how, or does the technique not work for what I want to do; is there another solution?

Ultimately, I need to populate a 64 X 64 table with the population distributions of my business case example, above.   Maybe one solution would be to show me how to easily fully populate the table first with my largest distribution.  Then successively populate it with the 2nd largest distribution, 3rd largest, etc.

In any case, I need to populate a 64 X 64 table with approximately the right percent distribution for each range -- and the ranges need to randomly distributed throughout the table.


Thanks in advance
0
 
LVL 18

Author Comment

by:WaterStreet
ID: 38866058
mp,

I can cut the number of bands from four to three and I can adjust the span of the bands so that the lowest (45 to 57) would have the largest biased distribution % in the main table, the next highest band (65 to 96) would have 2nd largest % distribution, and the highest band (97 to 122) would have smallest distribution % in the table.

The technique you provided requires the highest band to have the largest distribution percentage;  the mid-range band(s) would have successively lower distribution percentages; and the lowest band, the least distribution

In other words, the the current technique seems to require the largest bias be given to the highest band and the lowest bias be given to the lowest band.   That just happens to be the reverse of what I need.

So, (please refer to the last worksheet you provided, above in ID: 38860481) can you modify the following algorithm so that the highest bias be given to the lowest of my (now) three bands, and that the lowest Bias be given to the highest band?

=INT(RAND()*( #vlookup to grab upper bound# - #vlookup to grab lower bound# ) + #vlookup to grab lower bound# )

Or just show me and I'll do it.

Thanks
0
 
LVL 92

Accepted Solution

by:
Patrick Matthews earned 500 total points
ID: 38866312
WS,

Actually, the widest band can be anywhere, as this version indicates:

Q-28021494.xls

:)

Patrick
0
 
LVL 18

Author Closing Comment

by:WaterStreet
ID: 38867083
Well, I'll be ...

Tweaked it to get what I wanted.

Thanks a lot, and I learned a lot of good technique too.

Adjusting the Bias was tricky.  Then had to recalculate a number of times until I got the desired ratios.

When you get a chance, any theory you can pass on for adjusting the biases.  It seems to relate the proportions between the adjacent biases as well as their selected values.  Other that that, I am clueless.

Michael
0
 
LVL 18

Author Comment

by:WaterStreet
ID: 38870064
0

Featured Post

IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

Suggested Solutions

Title # Comments Views Activity
Excel and Formulas 8 31
Excel - Page layout - Margins 7 32
Excel 6 18
How to compare data in excel 5 27
Drop Down List with Unique/Distinct Values (enhancing the Combo-Box with a few steps and a little code) David miller (dlmille) Intro Have you ever created a data validation list from a database field or spreadsheet column (e.g., Zip Codes or Co…
This code takes an Excel list of URL’s and adds a header titled “URL List”. It then searches through all URL’s in column “A”, looking for duplicates. When a duplicate is found, it is moved to the top of the list. The duplicate URL’s are then highlig…
This Micro Tutorial will demonstrate in Google Sheets how to use the HYPERLINK function to create live links inside your spreadsheet.
Excel styles will make formatting consistent and let you apply and change formatting faster. In this tutorial, you'll learn how to use Excel's built-in styles, how to modify styles, and how to create your own. You'll also learn how to use your custo…

758 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

21 Experts available now in Live!

Get 1:1 Help Now