Solved

Histogram in VBA

Posted on 2011-09-24
4
516 Views
Last Modified: 2012-05-12
I have a simulation which runs in VBA. The array i want to create a histogram has over 1 million different numbers inside it. All of the numbers are percentages. The lowest number is -100%, and the highest number is unknown. It can be well above 100%. I am guessing the best way to do this, is when the array is being filled it should also be put into the appropriate bins. The binsize is going to be determined by the user.

What is the best way to do this?

I dont need the actual histogram, but if I can get the bins and number of members inside the bins that will be perfect.


0
Comment
Question by:awesomejohn19
  • 2
4 Comments
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 36593986
awesomejohn19,

It would be useful to see some sample data and/or a sample file.  EE now allows you to directly upload files to your question.

The sample file should clearly illustrate both the input and the expected output given that input.  The sample file need not be very large, but it should have enough examples to cover the expected range of values/scenarios.

Please be advised that once you upload a file, it can be publicly accessed, and that it may not be possible to fully and permanently delete it.  The file may also be indexed by the major search engines.

Therefore, be very careful about posting proprietary, confidential, or other sensitive information.  If necessary, use "fake" and/or obfuscated data in your sample.

Please note that at present EE restricts uploads to certain file types.  If your file type does not match those in the list, you can use http://www.ee-stuff.com instead, which is not officially an EE site, but is run by people connected to EE.

Patrick
0
 

Author Comment

by:awesomejohn19
ID: 36594020
THanks Patrick.  I can just put sample data here.

So the sample output should be like this. I am using a bin size of 0.5% in here. But that needs to be determined by the user.

It will start at -100% and will go like this. The first part is the range (with bin size 0.5%), and the second part is the number of observations in that range
-1% - - 0.5% - 98
-0.5% - 0 - 143
0%-0.5% - 154
0.5% - 1% -130
1%-1.5% - 100
 and goes on like this
.
.

so a sample of data to be analyzed will be like the numbers below. there is going to be at least 500k observations and might go to millions. So, we are going to put the following sample data into the bins of the histogram. Please let me know if you have other questions.

0.14%
-0.05%
0.00%
-0.15%
0.09%
0.18%
-0.58%
-0.94%
-1.10%
-0.94%
-0.87%
-0.80%
0.56%
1.02%
1.45%
-0.14%
-0.29%
1.27%
2.25%
2.85%
3.21%
2.74%
2.90%
2.96%
2.64%
2.78%
2.80%
3.70%
3.52%
3.15%
14.90%
14.40%
13.73%
14.16%
13.75%
13.92%
14.61%
14.73%
14.73%
16.20%
17.87%
16.88%
16.88%
16.83%
15.97%
16.08%
15.82%
15.73%
16.84%
16.73%
16.72%
15.62%
16.52%
17.91%
20.79%
21.61%
20.57%
22.46%
27.99%
28.54%
29.05%
27.91%
28.06%
27.50%
29.68%
32.06%
32.50%
33.74%
28.21%
28.84%
28.86%
31.15%
31.61%
31.07%
31.68%
31.75%
27.17%
27.16%
27.84%
27.66%
29.32%
29.44%
29.08%
30.21%
30.56%
32.13%
32.67%
32.03%
26.58%
26.16%
27.01%
27.58%
30.52%
30.77%
30.44%
36.57%
36.95%
37.35%
38.78%
38.01%
38.03%
37.77%
38.15%
38.69%
38.65%
38.98%
38.92%
39.37%
40.74%
41.12%
41.56%
41.22%
40.78%
37.96%
37.78%
36.74%
36.59%
36.35%
36.29%
29.34%
30.06%
29.44%
28.40%
29.15%
28.86%
23.54%
24.03%
25.89%
24.77%
24.88%
26.10%
25.83%
25.95%
25.89%
25.97%
26.33%
36.67%
28.81%
28.90%
29.86%
28.28%
27.90%
27.97%
28.29%
28.73%
28.71%
29.37%
29.79%
29.56%
29.41%
29.65%
30.03%
29.58%
29.43%
30.13%
30.88%
29.84%
29.70%
29.51%
29.52%
35.13%
34.62%
35.08%
34.71%
34.70%
34.13%
34.42%
33.75%
32.34%
35.17%
34.05%
35.56%
36.17%
36.20%
37.25%
37.88%
33.54%
33.67%
33.73%
33.79%
34.59%
33.91%
34.55%
34.82%
34.03%
34.10%
36.58%
36.93%
37.09%
40.29%
40.97%
38.73%
37.86%
37.69%
37.45%
40.88%
40.97%
41.81%
41.82%
42.21%
42.19%
41.47%
40.51%
41.19%
40.83%
40.78%
40.15%
38.93%
42.14%
44.32%
44.58%
44.63%
42.90%
44.19%
44.14%
43.61%
43.14%
42.62%
42.89%
42.40%
38.88%
39.89%
39.85%
45.30%
45.00%
45.41%
46.86%
46.33%
43.32%
43.15%
43.47%
43.55%
43.55%
43.18%
43.58%
43.31%
47.27%
46.89%
45.72%
41.74%
41.49%
41.05%
39.91%
40.04%
39.78%
40.11%
39.46%
41.74%
41.55%
40.91%
41.70%
41.77%
42.29%
41.69%
44.16%
44.73%
44.49%
45.03%
44.75%
42.87%
42.23%
41.28%
41.50%
40.76%
40.62%
35.33%
35.93%
36.02%
33.88%
33.32%
32.05%
34.35%
36.97%
36.40%
32.95%
33.12%
-8.79%
-8.60%
-8.56%
-9.13%
-9.40%
-10.45%
-10.38%
-10.53%
-8.79%
-9.17%
-8.70%
-8.38%
1.07%
0.64%
1.04%
0.71%
0.56%
-0.02%
1.18%
-1.51%
-1.50%
-0.17%
-0.24%
-0.59%
-0.11%
-0.31%
-0.04%
-0.30%
-0.20%
-0.27%
-0.40%
-0.42%
-0.75%
4.30%
6.90%
6.47%
6.94%
6.91%
7.91%
8.24%
5.73%
2.05%
2.08%
1.80%
-2.91%
-2.36%
-3.13%
-2.57%
-2.00%
-1.78%
-2.47%
-2.51%
-3.87%
-3.50%
-3.19%
-3.30%
-3.31%
-2.63%
-2.76%
-2.76%
-1.42%
-1.33%
-1.68%
-2.27%
-2.33%
-1.47%
-2.50%
-2.32%
-2.20%
-3.12%
-2.44%
-2.57%
-3.08%
-2.68%
-2.47%
-1.27%
-1.76%
-2.39%
-2.40%
-2.26%
-1.75%
-1.37%
-0.97%
-0.02%
0.60%
-0.36%
-0.20%
0.70%
0.62%
1.42%
1.95%
2.97%
3.80%
3.45%
3.68%
3.72%
4.27%
4.24%
4.03%
4.39%
4.43%
4.53%
4.50%
4.34%
4.73%
4.88%
5.21%
8.12%
7.85%
8.78%
9.77%
9.81%
9.98%
10.14%
9.86%
10.30%
10.53%
10.73%
10.15%
10.34%
11.34%
11.46%
11.32%
13.09%
12.37%
11.84%
11.98%
11.95%
12.17%
13.25%
13.00%
14.34%
14.16%
14.27%
14.78%
14.89%
16.41%
16.78%
16.29%
17.24%
16.62%
16.95%
16.74%
16.74%
16.79%
16.61%
16.40%
16.28%
15.81%
-6.75%
-6.78%
-6.79%
-7.34%
-8.05%
-7.70%
-9.09%
-9.04%
-11.19%
-11.93%
-10.81%
-11.24%
-11.45%
-11.35%
-11.14%
-10.09%
-9.72%
-9.29%
-8.66%
-9.52%
-9.91%
-9.53%
-12.94%
-12.88%
-13.31%
-12.15%
-12.25%
-11.72%
-12.04%
-12.14%
-12.38%
-12.46%
-11.64%
-11.66%
-11.76%
-11.74%
-11.52%
-11.70%
-11.59%
-11.04%
-11.16%
-11.51%
-11.05%
-10.56%
-9.57%
-9.46%
-9.71%
-8.47%
-8.07%
-8.01%
-8.59%
-8.63%
-8.87%
-8.33%
-7.82%
-8.22%
-9.10%
-8.90%
-9.35%
-8.87%
-9.04%
-17.98%
-17.49%
-15.96%
-14.72%
-15.05%
-15.29%
-15.64%
-15.82%
-15.48%
-13.09%
-12.60%
-10.91%
-13.68%
-11.85%
-11.53%
-11.01%
-10.19%
-10.46%
-10.87%
-11.16%
-12.02%
-12.19%
-12.37%
-12.99%
-11.65%
-11.43%
-10.03%
-10.03%
-10.02%
-9.67%
-9.33%
-8.71%
-8.47%
-8.61%
-8.38%
-7.85%
-7.33%
-8.53%

Open in new window

0
 

Author Comment

by:awesomejohn19
ID: 36599034
Did I confuse you more? :)

0
 
LVL 13

Accepted Solution

by:
frankhelk earned 500 total points
ID: 36599424
Well - I assume that getting the counts into the bins would be no problem. The only basic problem would be to determine what value in the data is the highest without generating and storing all of the data first.

1: Brute force attempt ...
Just create a bin array big enough to fit for any percentage you deam of.  That attempt may be not sufficient due to lack of knowledge about the numbers to work on. Besides of theat it's neither size optimized nor elegant.

2: Dynamic attempt:
- On init, create an array for the bins used within (i.e.) -100% to 100%, store the maximum index for the highest Value into a variable.
- sort your data into the bins until you find a value whose array index exceeds the stored value
- Redimension your Array with
    REDIM PRESERVE variable(new size)
- store the new max bin index into the variable
- repeat until all data is categorized.
You might as well (generic) start the array with only one member that represents the leftmost bin at -100% ... that way the array would even be of the lowest possible size if there are only members within the lowest bin.
0

Featured Post

Top 6 Sources for Identifying Threat Actor TTPs

Understanding your enemy is essential. These six sources will help you identify the most popular threat actor tactics, techniques, and procedures (TTPs).

Join & Write a Comment

Background What I'm presenting in this article is the result of 2 conditions in my work area: We have a SQL Server production environment but no development or test environment; andWe have an MS Access front end using tables in SQL Server but we a…
This article descibes how to create a connection between Excel and SAP and how to move data from Excel to SAP or the other way around.
The viewer will learn how to simulate a series of sales calls dependent on a single skill level and learn how to simulate a series of sales calls dependent on two skill levels. Simulating Independent Sales Calls: Enter .75 into cell C2 – “skill leve…
Excel styles will make formatting consistent and let you apply and change formatting faster. In this tutorial, you'll learn how to use Excel's built-in styles, how to modify styles, and how to create your own. You'll also learn how to use your custo…

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now