Expiring Today—Celebrate National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

Histogram in VBA

Posted on 2011-09-24
4
Medium Priority
?
617 Views
Last Modified: 2012-05-12
I have a simulation which runs in VBA. The array i want to create a histogram has over 1 million different numbers inside it. All of the numbers are percentages. The lowest number is -100%, and the highest number is unknown. It can be well above 100%. I am guessing the best way to do this, is when the array is being filled it should also be put into the appropriate bins. The binsize is going to be determined by the user.

What is the best way to do this?

I dont need the actual histogram, but if I can get the bins and number of members inside the bins that will be perfect.


0
Comment
Question by:awesomejohn19
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
4 Comments
 
LVL 93

Expert Comment

by:Patrick Matthews
ID: 36593986
awesomejohn19,

It would be useful to see some sample data and/or a sample file.  EE now allows you to directly upload files to your question.

The sample file should clearly illustrate both the input and the expected output given that input.  The sample file need not be very large, but it should have enough examples to cover the expected range of values/scenarios.

Please be advised that once you upload a file, it can be publicly accessed, and that it may not be possible to fully and permanently delete it.  The file may also be indexed by the major search engines.

Therefore, be very careful about posting proprietary, confidential, or other sensitive information.  If necessary, use "fake" and/or obfuscated data in your sample.

Please note that at present EE restricts uploads to certain file types.  If your file type does not match those in the list, you can use http://www.ee-stuff.com instead, which is not officially an EE site, but is run by people connected to EE.

Patrick
0
 

Author Comment

by:awesomejohn19
ID: 36594020
THanks Patrick.  I can just put sample data here.

So the sample output should be like this. I am using a bin size of 0.5% in here. But that needs to be determined by the user.

It will start at -100% and will go like this. The first part is the range (with bin size 0.5%), and the second part is the number of observations in that range
-1% - - 0.5% - 98
-0.5% - 0 - 143
0%-0.5% - 154
0.5% - 1% -130
1%-1.5% - 100
 and goes on like this
.
.

so a sample of data to be analyzed will be like the numbers below. there is going to be at least 500k observations and might go to millions. So, we are going to put the following sample data into the bins of the histogram. Please let me know if you have other questions.

0.14%
-0.05%
0.00%
-0.15%
0.09%
0.18%
-0.58%
-0.94%
-1.10%
-0.94%
-0.87%
-0.80%
0.56%
1.02%
1.45%
-0.14%
-0.29%
1.27%
2.25%
2.85%
3.21%
2.74%
2.90%
2.96%
2.64%
2.78%
2.80%
3.70%
3.52%
3.15%
14.90%
14.40%
13.73%
14.16%
13.75%
13.92%
14.61%
14.73%
14.73%
16.20%
17.87%
16.88%
16.88%
16.83%
15.97%
16.08%
15.82%
15.73%
16.84%
16.73%
16.72%
15.62%
16.52%
17.91%
20.79%
21.61%
20.57%
22.46%
27.99%
28.54%
29.05%
27.91%
28.06%
27.50%
29.68%
32.06%
32.50%
33.74%
28.21%
28.84%
28.86%
31.15%
31.61%
31.07%
31.68%
31.75%
27.17%
27.16%
27.84%
27.66%
29.32%
29.44%
29.08%
30.21%
30.56%
32.13%
32.67%
32.03%
26.58%
26.16%
27.01%
27.58%
30.52%
30.77%
30.44%
36.57%
36.95%
37.35%
38.78%
38.01%
38.03%
37.77%
38.15%
38.69%
38.65%
38.98%
38.92%
39.37%
40.74%
41.12%
41.56%
41.22%
40.78%
37.96%
37.78%
36.74%
36.59%
36.35%
36.29%
29.34%
30.06%
29.44%
28.40%
29.15%
28.86%
23.54%
24.03%
25.89%
24.77%
24.88%
26.10%
25.83%
25.95%
25.89%
25.97%
26.33%
36.67%
28.81%
28.90%
29.86%
28.28%
27.90%
27.97%
28.29%
28.73%
28.71%
29.37%
29.79%
29.56%
29.41%
29.65%
30.03%
29.58%
29.43%
30.13%
30.88%
29.84%
29.70%
29.51%
29.52%
35.13%
34.62%
35.08%
34.71%
34.70%
34.13%
34.42%
33.75%
32.34%
35.17%
34.05%
35.56%
36.17%
36.20%
37.25%
37.88%
33.54%
33.67%
33.73%
33.79%
34.59%
33.91%
34.55%
34.82%
34.03%
34.10%
36.58%
36.93%
37.09%
40.29%
40.97%
38.73%
37.86%
37.69%
37.45%
40.88%
40.97%
41.81%
41.82%
42.21%
42.19%
41.47%
40.51%
41.19%
40.83%
40.78%
40.15%
38.93%
42.14%
44.32%
44.58%
44.63%
42.90%
44.19%
44.14%
43.61%
43.14%
42.62%
42.89%
42.40%
38.88%
39.89%
39.85%
45.30%
45.00%
45.41%
46.86%
46.33%
43.32%
43.15%
43.47%
43.55%
43.55%
43.18%
43.58%
43.31%
47.27%
46.89%
45.72%
41.74%
41.49%
41.05%
39.91%
40.04%
39.78%
40.11%
39.46%
41.74%
41.55%
40.91%
41.70%
41.77%
42.29%
41.69%
44.16%
44.73%
44.49%
45.03%
44.75%
42.87%
42.23%
41.28%
41.50%
40.76%
40.62%
35.33%
35.93%
36.02%
33.88%
33.32%
32.05%
34.35%
36.97%
36.40%
32.95%
33.12%
-8.79%
-8.60%
-8.56%
-9.13%
-9.40%
-10.45%
-10.38%
-10.53%
-8.79%
-9.17%
-8.70%
-8.38%
1.07%
0.64%
1.04%
0.71%
0.56%
-0.02%
1.18%
-1.51%
-1.50%
-0.17%
-0.24%
-0.59%
-0.11%
-0.31%
-0.04%
-0.30%
-0.20%
-0.27%
-0.40%
-0.42%
-0.75%
4.30%
6.90%
6.47%
6.94%
6.91%
7.91%
8.24%
5.73%
2.05%
2.08%
1.80%
-2.91%
-2.36%
-3.13%
-2.57%
-2.00%
-1.78%
-2.47%
-2.51%
-3.87%
-3.50%
-3.19%
-3.30%
-3.31%
-2.63%
-2.76%
-2.76%
-1.42%
-1.33%
-1.68%
-2.27%
-2.33%
-1.47%
-2.50%
-2.32%
-2.20%
-3.12%
-2.44%
-2.57%
-3.08%
-2.68%
-2.47%
-1.27%
-1.76%
-2.39%
-2.40%
-2.26%
-1.75%
-1.37%
-0.97%
-0.02%
0.60%
-0.36%
-0.20%
0.70%
0.62%
1.42%
1.95%
2.97%
3.80%
3.45%
3.68%
3.72%
4.27%
4.24%
4.03%
4.39%
4.43%
4.53%
4.50%
4.34%
4.73%
4.88%
5.21%
8.12%
7.85%
8.78%
9.77%
9.81%
9.98%
10.14%
9.86%
10.30%
10.53%
10.73%
10.15%
10.34%
11.34%
11.46%
11.32%
13.09%
12.37%
11.84%
11.98%
11.95%
12.17%
13.25%
13.00%
14.34%
14.16%
14.27%
14.78%
14.89%
16.41%
16.78%
16.29%
17.24%
16.62%
16.95%
16.74%
16.74%
16.79%
16.61%
16.40%
16.28%
15.81%
-6.75%
-6.78%
-6.79%
-7.34%
-8.05%
-7.70%
-9.09%
-9.04%
-11.19%
-11.93%
-10.81%
-11.24%
-11.45%
-11.35%
-11.14%
-10.09%
-9.72%
-9.29%
-8.66%
-9.52%
-9.91%
-9.53%
-12.94%
-12.88%
-13.31%
-12.15%
-12.25%
-11.72%
-12.04%
-12.14%
-12.38%
-12.46%
-11.64%
-11.66%
-11.76%
-11.74%
-11.52%
-11.70%
-11.59%
-11.04%
-11.16%
-11.51%
-11.05%
-10.56%
-9.57%
-9.46%
-9.71%
-8.47%
-8.07%
-8.01%
-8.59%
-8.63%
-8.87%
-8.33%
-7.82%
-8.22%
-9.10%
-8.90%
-9.35%
-8.87%
-9.04%
-17.98%
-17.49%
-15.96%
-14.72%
-15.05%
-15.29%
-15.64%
-15.82%
-15.48%
-13.09%
-12.60%
-10.91%
-13.68%
-11.85%
-11.53%
-11.01%
-10.19%
-10.46%
-10.87%
-11.16%
-12.02%
-12.19%
-12.37%
-12.99%
-11.65%
-11.43%
-10.03%
-10.03%
-10.02%
-9.67%
-9.33%
-8.71%
-8.47%
-8.61%
-8.38%
-7.85%
-7.33%
-8.53%

Open in new window

0
 

Author Comment

by:awesomejohn19
ID: 36599034
Did I confuse you more? :)

0
 
LVL 14

Accepted Solution

by:
frankhelk earned 2000 total points
ID: 36599424
Well - I assume that getting the counts into the bins would be no problem. The only basic problem would be to determine what value in the data is the highest without generating and storing all of the data first.

1: Brute force attempt ...
Just create a bin array big enough to fit for any percentage you deam of.  That attempt may be not sufficient due to lack of knowledge about the numbers to work on. Besides of theat it's neither size optimized nor elegant.

2: Dynamic attempt:
- On init, create an array for the bins used within (i.e.) -100% to 100%, store the maximum index for the highest Value into a variable.
- sort your data into the bins until you find a value whose array index exceeds the stored value
- Redimension your Array with
    REDIM PRESERVE variable(new size)
- store the new max bin index into the variable
- repeat until all data is categorized.
You might as well (generic) start the array with only one member that represents the leftmost bin at -100% ... that way the array would even be of the lowest possible size if there are only members within the lowest bin.
0

Featured Post

Free Tool: ZipGrep

ZipGrep is a utility that can list and search zip (.war, .ear, .jar, etc) archives for text patterns, without the need to extract the archive's contents.

One of a set of tools we're offering as a way to say thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This code takes an Excel list of URL’s and adds a header titled “URL List”. It then searches through all URL’s in column “A”, looking for duplicates. When a duplicate is found, it is moved to the top of the list. The duplicate URL’s are then highlig…
How to get Spreadsheet Compare 2016 working with the 64 bit version of Office 2016
This Micro Tutorial will demonstrate on a Mac how to change the sort order for chart legend values and decrpyt the intimidating chart menu.
This lesson covers basic error handling code in Microsoft Excel using VBA. This is the first lesson in a 3-part series that uses code to loop through an Excel spreadsheet in VBA and then fix errors, taking advantage of error handling code. This l…

730 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question