Solved

Histogram in VBA

Posted on 2011-09-24
4
580 Views
Last Modified: 2012-05-12
I have a simulation which runs in VBA. The array i want to create a histogram has over 1 million different numbers inside it. All of the numbers are percentages. The lowest number is -100%, and the highest number is unknown. It can be well above 100%. I am guessing the best way to do this, is when the array is being filled it should also be put into the appropriate bins. The binsize is going to be determined by the user.

What is the best way to do this?

I dont need the actual histogram, but if I can get the bins and number of members inside the bins that will be perfect.


0
Comment
Question by:awesomejohn19
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
4 Comments
 
LVL 93

Expert Comment

by:Patrick Matthews
ID: 36593986
awesomejohn19,

It would be useful to see some sample data and/or a sample file.  EE now allows you to directly upload files to your question.

The sample file should clearly illustrate both the input and the expected output given that input.  The sample file need not be very large, but it should have enough examples to cover the expected range of values/scenarios.

Please be advised that once you upload a file, it can be publicly accessed, and that it may not be possible to fully and permanently delete it.  The file may also be indexed by the major search engines.

Therefore, be very careful about posting proprietary, confidential, or other sensitive information.  If necessary, use "fake" and/or obfuscated data in your sample.

Please note that at present EE restricts uploads to certain file types.  If your file type does not match those in the list, you can use http://www.ee-stuff.com instead, which is not officially an EE site, but is run by people connected to EE.

Patrick
0
 

Author Comment

by:awesomejohn19
ID: 36594020
THanks Patrick.  I can just put sample data here.

So the sample output should be like this. I am using a bin size of 0.5% in here. But that needs to be determined by the user.

It will start at -100% and will go like this. The first part is the range (with bin size 0.5%), and the second part is the number of observations in that range
-1% - - 0.5% - 98
-0.5% - 0 - 143
0%-0.5% - 154
0.5% - 1% -130
1%-1.5% - 100
 and goes on like this
.
.

so a sample of data to be analyzed will be like the numbers below. there is going to be at least 500k observations and might go to millions. So, we are going to put the following sample data into the bins of the histogram. Please let me know if you have other questions.

0.14%
-0.05%
0.00%
-0.15%
0.09%
0.18%
-0.58%
-0.94%
-1.10%
-0.94%
-0.87%
-0.80%
0.56%
1.02%
1.45%
-0.14%
-0.29%
1.27%
2.25%
2.85%
3.21%
2.74%
2.90%
2.96%
2.64%
2.78%
2.80%
3.70%
3.52%
3.15%
14.90%
14.40%
13.73%
14.16%
13.75%
13.92%
14.61%
14.73%
14.73%
16.20%
17.87%
16.88%
16.88%
16.83%
15.97%
16.08%
15.82%
15.73%
16.84%
16.73%
16.72%
15.62%
16.52%
17.91%
20.79%
21.61%
20.57%
22.46%
27.99%
28.54%
29.05%
27.91%
28.06%
27.50%
29.68%
32.06%
32.50%
33.74%
28.21%
28.84%
28.86%
31.15%
31.61%
31.07%
31.68%
31.75%
27.17%
27.16%
27.84%
27.66%
29.32%
29.44%
29.08%
30.21%
30.56%
32.13%
32.67%
32.03%
26.58%
26.16%
27.01%
27.58%
30.52%
30.77%
30.44%
36.57%
36.95%
37.35%
38.78%
38.01%
38.03%
37.77%
38.15%
38.69%
38.65%
38.98%
38.92%
39.37%
40.74%
41.12%
41.56%
41.22%
40.78%
37.96%
37.78%
36.74%
36.59%
36.35%
36.29%
29.34%
30.06%
29.44%
28.40%
29.15%
28.86%
23.54%
24.03%
25.89%
24.77%
24.88%
26.10%
25.83%
25.95%
25.89%
25.97%
26.33%
36.67%
28.81%
28.90%
29.86%
28.28%
27.90%
27.97%
28.29%
28.73%
28.71%
29.37%
29.79%
29.56%
29.41%
29.65%
30.03%
29.58%
29.43%
30.13%
30.88%
29.84%
29.70%
29.51%
29.52%
35.13%
34.62%
35.08%
34.71%
34.70%
34.13%
34.42%
33.75%
32.34%
35.17%
34.05%
35.56%
36.17%
36.20%
37.25%
37.88%
33.54%
33.67%
33.73%
33.79%
34.59%
33.91%
34.55%
34.82%
34.03%
34.10%
36.58%
36.93%
37.09%
40.29%
40.97%
38.73%
37.86%
37.69%
37.45%
40.88%
40.97%
41.81%
41.82%
42.21%
42.19%
41.47%
40.51%
41.19%
40.83%
40.78%
40.15%
38.93%
42.14%
44.32%
44.58%
44.63%
42.90%
44.19%
44.14%
43.61%
43.14%
42.62%
42.89%
42.40%
38.88%
39.89%
39.85%
45.30%
45.00%
45.41%
46.86%
46.33%
43.32%
43.15%
43.47%
43.55%
43.55%
43.18%
43.58%
43.31%
47.27%
46.89%
45.72%
41.74%
41.49%
41.05%
39.91%
40.04%
39.78%
40.11%
39.46%
41.74%
41.55%
40.91%
41.70%
41.77%
42.29%
41.69%
44.16%
44.73%
44.49%
45.03%
44.75%
42.87%
42.23%
41.28%
41.50%
40.76%
40.62%
35.33%
35.93%
36.02%
33.88%
33.32%
32.05%
34.35%
36.97%
36.40%
32.95%
33.12%
-8.79%
-8.60%
-8.56%
-9.13%
-9.40%
-10.45%
-10.38%
-10.53%
-8.79%
-9.17%
-8.70%
-8.38%
1.07%
0.64%
1.04%
0.71%
0.56%
-0.02%
1.18%
-1.51%
-1.50%
-0.17%
-0.24%
-0.59%
-0.11%
-0.31%
-0.04%
-0.30%
-0.20%
-0.27%
-0.40%
-0.42%
-0.75%
4.30%
6.90%
6.47%
6.94%
6.91%
7.91%
8.24%
5.73%
2.05%
2.08%
1.80%
-2.91%
-2.36%
-3.13%
-2.57%
-2.00%
-1.78%
-2.47%
-2.51%
-3.87%
-3.50%
-3.19%
-3.30%
-3.31%
-2.63%
-2.76%
-2.76%
-1.42%
-1.33%
-1.68%
-2.27%
-2.33%
-1.47%
-2.50%
-2.32%
-2.20%
-3.12%
-2.44%
-2.57%
-3.08%
-2.68%
-2.47%
-1.27%
-1.76%
-2.39%
-2.40%
-2.26%
-1.75%
-1.37%
-0.97%
-0.02%
0.60%
-0.36%
-0.20%
0.70%
0.62%
1.42%
1.95%
2.97%
3.80%
3.45%
3.68%
3.72%
4.27%
4.24%
4.03%
4.39%
4.43%
4.53%
4.50%
4.34%
4.73%
4.88%
5.21%
8.12%
7.85%
8.78%
9.77%
9.81%
9.98%
10.14%
9.86%
10.30%
10.53%
10.73%
10.15%
10.34%
11.34%
11.46%
11.32%
13.09%
12.37%
11.84%
11.98%
11.95%
12.17%
13.25%
13.00%
14.34%
14.16%
14.27%
14.78%
14.89%
16.41%
16.78%
16.29%
17.24%
16.62%
16.95%
16.74%
16.74%
16.79%
16.61%
16.40%
16.28%
15.81%
-6.75%
-6.78%
-6.79%
-7.34%
-8.05%
-7.70%
-9.09%
-9.04%
-11.19%
-11.93%
-10.81%
-11.24%
-11.45%
-11.35%
-11.14%
-10.09%
-9.72%
-9.29%
-8.66%
-9.52%
-9.91%
-9.53%
-12.94%
-12.88%
-13.31%
-12.15%
-12.25%
-11.72%
-12.04%
-12.14%
-12.38%
-12.46%
-11.64%
-11.66%
-11.76%
-11.74%
-11.52%
-11.70%
-11.59%
-11.04%
-11.16%
-11.51%
-11.05%
-10.56%
-9.57%
-9.46%
-9.71%
-8.47%
-8.07%
-8.01%
-8.59%
-8.63%
-8.87%
-8.33%
-7.82%
-8.22%
-9.10%
-8.90%
-9.35%
-8.87%
-9.04%
-17.98%
-17.49%
-15.96%
-14.72%
-15.05%
-15.29%
-15.64%
-15.82%
-15.48%
-13.09%
-12.60%
-10.91%
-13.68%
-11.85%
-11.53%
-11.01%
-10.19%
-10.46%
-10.87%
-11.16%
-12.02%
-12.19%
-12.37%
-12.99%
-11.65%
-11.43%
-10.03%
-10.03%
-10.02%
-9.67%
-9.33%
-8.71%
-8.47%
-8.61%
-8.38%
-7.85%
-7.33%
-8.53%

Open in new window

0
 

Author Comment

by:awesomejohn19
ID: 36599034
Did I confuse you more? :)

0
 
LVL 14

Accepted Solution

by:
frankhelk earned 500 total points
ID: 36599424
Well - I assume that getting the counts into the bins would be no problem. The only basic problem would be to determine what value in the data is the highest without generating and storing all of the data first.

1: Brute force attempt ...
Just create a bin array big enough to fit for any percentage you deam of.  That attempt may be not sufficient due to lack of knowledge about the numbers to work on. Besides of theat it's neither size optimized nor elegant.

2: Dynamic attempt:
- On init, create an array for the bins used within (i.e.) -100% to 100%, store the maximum index for the highest Value into a variable.
- sort your data into the bins until you find a value whose array index exceeds the stored value
- Redimension your Array with
    REDIM PRESERVE variable(new size)
- store the new max bin index into the variable
- repeat until all data is categorized.
You might as well (generic) start the array with only one member that represents the leftmost bin at -100% ... that way the array would even be of the lowest possible size if there are only members within the lowest bin.
0

Featured Post

Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

You can of course define an array to hold data that is of a particular type like an array of Strings to hold customer names or an array of Doubles to hold customer sales, but what do you do if you want to coordinate that data? This article describes…
Freeze panes is an option within all variants of Excel to enable parts of a sheet to remain stationary when the cursor is in another part of the sheet. This is a very useful feature which is overlooked or under used.
This Micro Tutorial demonstrate the bugs in Microsoft Excel for Mac with Pivot Charts.
This Micro Tutorial demonstrates in Microsoft Excel how to consolidate your marketing data by creating an interactive charts using form controls. This creates cool drop-downs for viewers of your chart to choose from.

636 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question