Solved

Sort SMALL numbers in bins

Posted on 2006-11-12
7
210 Views
Last Modified: 2010-04-15
I have alot extremely small numbers that I need to sort and add into bins to keep count, which is going to be used for a histogram.  I have a loop that is chosing 20 numbers from an array of 25 numbers (it can select the same number multiple times).  The array stores doubles ranging from 0.0 to 1.0.  The numbers i'm multiplying look somehting like this:
2.580645e-01
3.225806e-01
9.677419e-02
6.451613e-02
2.580645e-01

Multiplying these numbers are giving me results that look like this:
9.536743e-03
5.293956e-09
5.651273e-22
I think the smallest I saw was Xe-24.  I'm going to end up with a ton of these, about 5^20 worth of numbers!

As I am multiplying, how can I sort these results into MAXBIN number of bins to be used for a histogram?  I'm using 1000 for MAXBIN, for example here is part of my code (Not exactly the best ):

#define STEPS 20
#define MAXN 5
#define MAXBIN 1000

int counter[STEPS];
double pathTotal;
int c;
for (c0 = 0; c0 < MAXN; c0++) {
  counter[0] = c0;
  for (c1 = 0; c1 < MAXN; c1++) {
    counter[1] = c1;
...
    for (c19 = 0; c19 < MAXN; c19++) {
    counter[19] = c19;
    pathTotal = 1;
    for (c = 0; c < STEPS; c++) {
        pathTotal = pathTotal * arr[counter[c]][counter[c+1]];
        /* INSERT CODE TO ADD TO BIN FOR COUNT */
}



0
Comment
Question by:deadite
  • 3
  • 2
  • 2
7 Comments
 
LVL 16

Expert Comment

by:PaulCaswell
ID: 17926098
Hi deadite,

Sorry but I dont understand what you are trying to achieve here. Could you explain a little more please?

Paul
0
 
LVL 8

Author Comment

by:deadite
ID: 17926651
Here is some quick background info:
In C, I have created a 5x5 transition matrix that contains conditional probabilities of state changes (Markov Chain).  Each row is a state. Each column represents the probability of moving to that next state. Each row will add up to 1.0  For example, using the following matrix, starting from state S0, the transition probability of moving to state S2 then state S4 is (0.4 * 0.1)

       S0   S1 S2  S3  S4
S0{ 0.3 0.1 0.4 0.1 0.1 }
S1{ 0.6 0.1 0.0 0.2 0.1 }
S2{ 0.9 0.0 0.1 0.0 0.1 }
S3{ 0.2 0.3 0.2 0.1 0.2 }
S4{ 0.6 0.1 0.1 0.1 0.1 }

From the previous code, that is looping through every possible path in this matrix for 20 steps.  For instance, here are the first few possible states:
00000000000000000000  (this is .3^20)
00000000000000000001 (.3^19 * .1)
00000000000000000002 (.3^19 * .4)
00000000000000000003 (.3^19 * .1)
00000000000000000004 (.3^19 * .1)
00000000000000000010 (.3^18 *.1 *.6)
...
44444444444444444444  (this is .1^20)

I already have all of this code producing those numbers (which is in the loop i pasted above).  As you can tell, this is going to produce some tiny numbers from 0e0 to Xe-25 probably.  Anyways I need to sort these outputs into 1000 bins as I calculate them.  For instance, let's say I create the following bins:

0
.1
.2
.3
.....
.9
.01
.02
.03
...
.1e-25

Basically, something similar to this but for 1000 bins.  It will compare the pathTotal and add a count to the bin that is closest to it.  Better?

If needed, let me know if I need to up the points.... I'm fresh out and will have to buy them.

Thanks
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 17929461
Some questions :

1) Do you need to store the actual values in the bins, or do you just need the count ?
2) Will your "bin partitioning" be as in your example ? If not, what will the "bin borders" be ? Note that it's best to have some kind of logic behind this, because it makes the calculations a lot easier.


Note that I used "border" in the previous point. It will probably be handier to define borders for the bins instead of basing your bin choice on "closest to". Ie. for your example, the bins would be something like :

    ]0.9 .. 1.0]
    ]0.8 .. 0.9]
    ]0.7 .. 0.8]
    .....
    ]0.1 .. 0.2]
    ]0.09 .. 0.1]
    ]0.08 .. 0.09]
    ]0.07 .. 0.08]
    .....
    ]0.01 .. 0.02]
    ]0.009 .. 0.01]
    .....

But you don't get to 1000 bins this way, so maybe you'll need to split some bins up.

A suggestion to determine which bin to put a value in, is to use a binary tree with a bin on each node, ie. :

    if (value < current_node) {
      // go left
    }
    else if (value > current_node) {
      // go right
    }
    else {
      // found the bin
    }

Untill you arrive at the correct bin.


Just a question : what will this histogram be used for ?
0
Highfive + Dolby Voice = No More Audio Complaints!

Poor audio quality is one of the top reasons people don’t use video conferencing. Get the crispest, clearest audio powered by Dolby Voice in every meeting. Highfive and Dolby Voice deliver the best video conferencing and audio experience for every meeting and every room.

 
LVL 8

Author Comment

by:deadite
ID: 17931669
The bins just need to keep count, not store the value... but the bins will pretty much represent the same value.  It doesn't  have to be 1000 bins, but it needs to be sufficient enough to spread the values out a little bit.  The bins will just be output to a file then imported to excel for the histogram.  Basically, the program is just producing some data to be manipulated in Excel.

I can use C a little bit, but by no means am a C programmer.  The more code you can provide for adding to the the results to the bins the better.  I'm fine with your binary tree idea, but can you expand the code a bit more?  

Also, if you want me to raise the points, let me know what you think it should be raised to.

Thanks

0
 
LVL 16

Accepted Solution

by:
PaulCaswell earned 20 total points
ID: 17932623
Hi deadite,

This may not be correct but it should be close:

typedef struct Bin {
 double min; // Lowest number in the bin.
 int count;
} Bin;

// Use one extra so the last bin is always empty. Makes the code easier.
Bin bins[MAXBIN+1];

// Set up the bins.
double min = 0;
double step = 1.0 / MAXBIN;
for ( i = 0; i < MAXBIN + 1; i++ ) {
 bins[i].min = min;
 min += step;
}


// Place a number in a bin.
for ( i = 0; i < MAXBIN && n < bins[i+1].min; i++ );
bins[i].count += 1;

Paul
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 17936449
>> I'm fine with your binary tree idea, but can you expand the code a bit more?  
Maybe to get things working first, Paul's idea is more suitable.
You can always expand the code with an optimised search algorithm to get a better performance.
0
 
LVL 8

Author Comment

by:deadite
ID: 17945401
Thanks for the replies, I'm finally getting back to working on this code.  I actually came up with something similar to Paul's code... so I'm going to give that a try...
0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

An Outlet in Cocoa is a persistent reference to a GUI control; it connects a property (a variable) to a control.  For example, it is common to create an Outlet for the text field GUI control and change the text that appears in this field via that Ou…
Summary: This tutorial covers some basics of pointer, pointer arithmetic and function pointer. What is a pointer: A pointer is a variable which holds an address. This address might be address of another variable/address of devices/address of fu…
Video by: Grant
The goal of this video is to provide viewers with basic examples to understand and use while-loops in the C programming language.
The goal of this video is to provide viewers with basic examples to understand and use conditional statements in the C programming language.

758 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

17 Experts available now in Live!

Get 1:1 Help Now