Solved

Explanation of a histogram code required

Posted on 2009-04-11
3
474 Views
Last Modified: 2012-05-06
I found a histogram code online
is some one able to explain the code fully?
thanks
sub Hist3UK(M As Long, arr() As Double)  

    Dim i As Long, j As Long

    Dim Length As Single

    ReDim breaks(M) As Single

    ReDim freq(M) As Single

    

    For i = 1 To M

        freq(i) = 0

    Next i
 

    Length = (arr(UBound(arr)) - arr(1)) / M

    

    For i = 1 To M

        breaks(i) = arr(1) + Length * i

    Next i

    

    For i = 1 To UBound(arr)

        If (arr(i) <= breaks(1)) Then freq(1) = freq(1) + 1

        If (arr(i) >= breaks(M - 1)) Then freq(M) = freq(M) + 1

        For j = 2 To M - 1

            If (arr(i) > breaks(j - 1) And arr(i) <= breaks(j)) Then freq(j) = freq(j) + 1

        Next j

    Next i

    

    For i = 1 To M

        Sheet3.Cells(i + 9, 10) = breaks(i)

        Sheet3.Cells(i + 9, 11) = freq(i)

    Next i

End Sub

Open in new window

0
Comment
Question by:MarcusJiffy
  • 2
3 Comments
 
LVL 13

Accepted Solution

by:
newyuppie earned 250 total points
ID: 24143875
I'll give it a try... You probably need to have basic background in statistics to understand this:

For performing statistical analysis and graphing a series of data points, it is best if you present the data in an orderly fashion. The way to do this is to make a histogram, which is basically presenting the myriad data points you may have as a series of easy to handle categories.

Within those categories fall all your data points. So if you were building a scientific study to measure kids heights as they grow up, you would end up with hundreds of data points of ages and heights for every kid you measure. To make a histogram is to categorize for example Heights from 0 to 1 foot, "Heights from 1 foot to 2 feet", "Heights from 2 feet to 3 feet" and so on. You would probably only reach to about "Heights from 6 feet to 7 feet" because that's where the majority of the human race falls, right? So you categorized hundreds of data points into 6 or 7 categories, and all you have to do is add the data points that fall into each category and then graph the categories against the sum of the data points in them: what is called a Histogram.

Now, this method you present is a sub that takes as an input the variable M and the array arr(). As variable M, you need to pass the number categories that you want to have. This number is not by magic, there are a few methods to obtain the optimum number of categories for a given data set.

As variable arr() you pass the values of the ordered data points. It is important that they have to be in order, from lower number to higher number (that you achieve with a basic sort on the spreadsheet). If your ordered data points are (1, 1.2, 1.3, 1.4, 1.6, 2.1, etc) you pass that to the method as an array.

Now, the method does the following with those numbers:
 
    Dim i As Long, j As Long
    Dim Length As Single
    ReDim breaks(M) As Single
    ReDim freq(M) As Single

1) It initializes variables that will be used

For i = 1 To M
        freq(i) = 0
    Next i

2) It sets the array freq to zeroes (this is unnecesary as it already is initialized to zero when you Dim it)

Length = (arr(UBound(arr)) - arr(1)) / M

3) It takes the highest data point, substracts the lowest data point from it, and divides the result by the number of categories you want to have. So if your tallest kid is 5.4 and your lowest 1.4, and you want 4 categories, the result would be (5.4-1.4)/4 which is 1. That is the length of the 4 categories (from 1 to 2, from 2 to 3, from 3 to 4 and from 4 to 5, for example)

For i = 1 To M
        breaks(i) = arr(1) + Length * i
    Next i

4) It calculates the categories for your data points based on the previous result. It sets M number of categories (remember M was passed to the method). It begins with the lowest value (say 1.4 in our example) and it adds the length, so it ends up with 4 categories: 1.4 to 2.4, 2.4 to 3.4, 3.4 to 4.4 and 4.4 to 5.4). Notice that it ends, not by miracle, on 5.4 which is the tallest kid's measure. It stores all this in the breaks() array

For i = 1 To UBound(arr)
        If (arr(i) <= breaks(1)) Then freq(1) = freq(1) + 1
        If (arr(i) >= breaks(M - 1)) Then freq(M) = freq(M) + 1
        For j = 2 To M - 1
            If (arr(i) > breaks(j - 1) And arr(i) <= breaks(j)) Then freq(j) = freq(j) + 1
        Next j
    Next i

5) This is the frequency counter. Basically it begins to count, of all your data points, how many fall into the first category, how many to the second, and so on. It loops from 1 to ubound(arr), which in our case is 4, once per category. If the data point is below the category, it adds 1 to the freq() array, if it is above the last category it adds one to the final category. For everything in between the first and last, it loops from the second to the final minus one category and performs the counts.

For i = 1 To M
        Sheet3.Cells(i + 9, 10) = breaks(i)
        Sheet3.Cells(i + 9, 11) = freq(i)
    Next i

6) it copies the result to sheet3


Hope this was of some help, let me know if you need more detail...

NY


0
 
LVL 13

Expert Comment

by:newyuppie
ID: 24143897
Try it with this code
Private Sub test()
 

Dim arr() As Double

Dim t As Long

Dim cell As Range

Dim i As Integer
 

t = 4 'number of categories
 

ReDim arr(1 To 8) 'number of data points... if you have 8 points 1 to 8. if you have more, replace the 8 with number of points
 

i = 1

For Each cell In Range("THERANGE") 'name a range on the worksheet which has all the data points, SORTED from low to high

    arr(i) = cell.Value

    i = i + 1

Next
 

Hist3UK t, arr
 

End Sub

Open in new window

0

Featured Post

Enterprise Mobility and BYOD For Dummies

Like “For Dummies” books, you can read this in whatever order you choose and learn about mobility and BYOD; and how to put a competitive mobile infrastructure in place. Developed for SMBs and large enterprises alike, you will find helpful use cases, planning, and implementation.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

As with any other System Center product, the installation for the Authoring Tool can be quite a pain sometimes. This article serves to help you avoid making these mistakes and hopefully save you a ton of time on troubleshooting :)  Step 1: Make sur…
Technology opened people to different means of presenting information, but PowerPoint remains to be above competition. Know why PPT still works today.
The viewer will learn how to use the =DISCRINV command to create a discrete random variable, use this command to model a set of probabilities and outcomes in a Monte Carlo simulation, and learn how to find the standard deviation of a set of probabil…
Excel styles will make formatting consistent and let you apply and change formatting faster. In this tutorial, you'll learn how to use Excel's built-in styles, how to modify styles, and how to create your own. You'll also learn how to use your custo…

895 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

17 Experts available now in Live!

Get 1:1 Help Now