Solved

HASHING

Posted on 2003-11-10
6
631 Views
Last Modified: 2010-04-15
how do i create a hash index and store say names and addresses in that array. how do i avoid generating the same hash index value for the same name. i have basically worked out how to put together most of the program but i am lost when it comes to this nothing i do seems to work..
0
Comment
Question by:CODY439
  • 2
6 Comments
 
LVL 16

Accepted Solution

by:
imladris earned 63 total points
ID: 9717490
How you create a hash index, and how you (attempt to) avoid generating duplicates depends a great deal on your data.

Given the reference to "avoiding generating the same has index value for the same name", I assume you have names as a key, and addresses as the data. This is then supposed to allow you to, given a name, find the address for that name. You are supposed to do that by getting a hash code for the name, and then retreiving the address at that location.

Hashing schemes can go from simple to elaborate. What you want is to "translate" every name into a position in the array. A simple tactic would be to add up all the letters in the name and use the result as the hashcode. You could get more elaborate by doubling or cubing it and taking the middle digits of the result. The objective is to be as sure as possible that a small change in the name, will result in a change in the hash code.

Note though, that no matter what you do, you usually can't guarantee that all names will come up with unique hash codes. Normally you must take the possibility of "collisions" into account.
0
 
LVL 45

Assisted Solution

by:Kent Olsen
Kent Olsen earned 62 total points
ID: 9717505


One of the "down sides" to hashing is that the algorithm may well produce collisions (two data items reduce to the same value).  Your program can allow for it in several different ways (or combinations of ways).

Often times, the value that the hash returns is an index.  This index is usually to a buffer (page) where the data is stored.  This buffer (page) is usually large enough to contain several items.  When the buffer (page) is full and the algorithm needs to store another item in the buffer, it must allocate another buffer, called an "overflow block".  A field in the original buffer's header indicates if an overflow block is present.  If enough items hash to the same buffer, the overflow block may also link to another overflow block in a linked-list effect.  Whenever the block is search for a particular item, all overflow blocks must also be searched.


The algorithm that you choose to generate the index is extremely important.  A really bad algorithm might be to return "even or odd", in which case all data will be stored in one of two buffers (and their overflow chains).  Another really bad algorithm is to produce an index so large that no collisions ever occur AND the returned indexed are sparse (not all numbers in the range are returned.)  You'll waste a bunch of space because all of the buffers won't be used.  (Actually, this is used in a few time-critical applications where the amount of data is small and "memory is cheap", but it's not suitable for "general purpose" needs.)


In short, you're going to have to come up with an algorithm that works well for your data, and you're very likely going to have to code overflow blocks.

Kent
0
 
LVL 6

Expert Comment

by:Ajar
ID: 9722398
Lets  define  a data structure to hold  key ,value pair as

struct hash_node
{
 void * key ;
 void * value ;
 struct hash_node * next
};// a link list node

a hash table will be an array of such hash nodes


struct hash_node * hash_table[10]  ; // array with ten nodes say
// initialize to null nodes

when ever we want to put a new entry into hash table  we
create  a hash_node type of object by malloc e.g

struct hash_node * new_node = (struct hash_node *) malloc(sizeof(struct hash_node));

Then you can attach the key and value like

new_node->key = KEY;
new_node->value = VALUE;


to calculate the position of the new node  in hash_table array

int pos =  ((int)new_node->key)%10

if this position is empty place the new entry there
if this position is not empty then this position will have a hash_node which
is actually a link list so append the new value to its end

thus you can place new items of same hashcode value at the same pos in the array

diagram is

0
|
1-121-191
|
72-3002
|
33-3-93-43
|
NULL
|
345-65
|

with each of these number representing the  (int) ((void *) key)



0
 
LVL 16

Expert Comment

by:imladris
ID: 9740784
Did any of those answers help?

If so, it is now time to select one and grade it to close the question.

If not, perhaps a clarifying question would help.
0

Featured Post

Does Powershell have you tied up in knots?

Managing Active Directory does not always have to be complicated.  If you are spending more time trying instead of doing, then it's time to look at something else. For nearly 20 years, AD admins around the world have used one tool for day-to-day AD management: Hyena. Discover why

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Why this code doesn't work? 8 106
Unable to start eclipse ? 17 153
nested if statement in excel help 4 37
Display a Float Variable in C without using the function printf. 2 25
An Outlet in Cocoa is a persistent reference to a GUI control; it connects a property (a variable) to a control.  For example, it is common to create an Outlet for the text field GUI control and change the text that appears in this field via that Ou…
Preface I don't like visual development tools that are supposed to write a program for me. Even if it is Xcode and I can use Interface Builder. Yes, it is a perfect tool and has helped me a lot, mainly, in the beginning, when my programs were small…
Video by: Grant
The goal of this video is to provide viewers with basic examples to understand and use nested-loops in the C programming language.
The goal of this video is to provide viewers with basic examples to understand opening and reading files in the C programming language.

809 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question