Link to home
Start Free TrialLog in
Avatar of aagoh
aagoh

asked on

hashing

what is hashing with separate chain? How does it works?
Avatar of Neelima
Neelima

In the hashing,the key is used to determine the actual position where the record /data is to be stored.These keys which are themselves unique(Generally) are supplied to a hash function to give the storage index.This index may be repetitive depending solely on whta kind of data is being stored as well as the hashing function chosen.
  In hashing with seperate chain,when the key hashes to an already previously hashed place,a chain is formed.
Hence the structure would be some thing like.

typedef struct node
{
  int key;
  struct node * next;
}NODE;

typedef struct bucket
{
  int key;//asumption that the key is of type int.
  NODE* link;
}BUCKET;
BUCKET bucket[BUCK_MAX];


Hence for array index, a seperate chain of the keys hashing on that index is formed.When searching for the keys,they are first hashed thru the same function that oyu used before storing them.If the element when compared to the array element on that index is not same,the linked list is traveresed .Hence the worst case for Key Search becomes O(N) where n is the length of the longest chain.
Avatar of aagoh

ASKER

It sounds confusing, is there a sample code to demostrate
hashing with separate chaining?
It's not that confusing, really, as long as you understand linked lists as well as basic hash tables - the 'BUCKET' is the parent node and will start with its 'NODE *' member set to NULL. Each time a given 'BUCKET' is hit you chain down through the linked list of nodes until you either hit the key you are searching for or you reach the NULL marker at the end of the list. If a new element has to be added you follow conventional linked list logic ie allocate a new 'NODE' (with its own 'next' member set to NULL) and point the 'next' which was set to NULL to point to it.
Avatar of aagoh

ASKER

So if I were to have an array of pointers (which will be
of my hash table size) to point to
the buckets, how do I declare it?

do I do it this way, as normal arrays do?

typedef struct mylist {
  int key;
  struct list *next;
} MYLIST;

MYLIST *hashTable[TABLESIZE];

Is this the right way?

How do I go about inserting the hashed values into my
hashTable?

assuming my hashFunction is a simple one

int hashFunction ( int hash_key ) {
  return( hash_key % HASH_VALUE );
}
> do I do it this way, as normal arrays do?
Yes. A hash table is a _table_ (== array in the usual implementation) and the 'next' member effectively allows the table to become two-dimensional. Since you have declared 'hashTable' to be of type 'MYLIST *' you will have to NULL the array before starting and 'malloc(sizeof(MYLIST))' for each element as you use it for the first time but otherwise your approach is fairly standard.

> How do I go about inserting the hashed values into my
hashTable?
You don't insert the hashed value itself into the table - you use the hashed value to locate the place where the data will be stored. The hash function should generate a value in the range 0 <= hash < TABLESIZE. Imagine a set of large filing cabinets each labelled with a letter from A to Z. Assuming that you file each document in the cabinet which matches the first letter of the title but that otherwise the cabinets are unsorted, you have a pretty good analogy of what hashing is all about ie

* the 'key' is the title of the document
* the hash function is to use the first letter of the title
* the table/array itself is the set of 26 filing cabinets
* each element in 'hashTable' is an individual cabinet
* each new element added to the linked list is akin to a new document being placed in the one cabinet

The point about hashing is that given the key you can quickly identify where it would have been stored. How well this works in practice depends on many factors (notably how well the hash function spreads the use of storage slots and the number of free slots available as the fewer the number of collisions, the faster searched will be)
ASKER CERTIFIED SOLUTION
Avatar of Neelima
Neelima

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Good to see you again, Neelima. Typical, mind, that I'm doing the waffling and you're doing the coding!
Avatar of aagoh

ASKER

lastly,

if I have already inserted the keys into the hash tables,
with buckets, but I want to initialize the hash table, all pointing to NULL again so that I can reuse the same hash table later in my program, do I
1) traverse to each individual node and free the individual buckets?
or
2) just simply
for( i = 0; i < tablesize; i++ )
free(hash_table[i]);
or
3) point the hashtable to NULL
for( i = 0; i < tablesize; i++ )
hashtable[i] = NULL;
or
????????
thanks
If you free the array(hash_table) elements, you wont be able to access them again for reuse.
Option 3 is the one that should be used for the reuse of the same hashtable for the same tablesize.
Hmm. The two things you want to avoid are 'orphaning' malloced memory blocks and picking up a dangling pointer and therby writing into memory which is no longer malloced. There's no problem with reusing elements which have already been been 'malloc'ed but you then need some way of distinguishing between three states ie

1) hashTable[i] not malloced space yet
2) hashTable[i] malloced and in use
3) hashTable[i] malloced but flagged for re-use

I would argue that the best possibilities for re-using the table are :

a) To free all blocks :
Loop through the hashTable, going down the chain, storing the 'next' pointer and then freeing the block. After this you should set the top level ie hashTable[i] to NULL

b) To retain the previously malloced blocks :
Establish some invalid key value and then set the key back to this when a block is allocated but free for reuse ie if you encounter a NULL pointer you malloc, otherwise you check the 'key' member to see whether the block is in use.
Avatar of aagoh

ASKER

so, with the declaration

typedef struct mylist {
int key;
struct mylist *next;
} MYLIST;
...

MYLIST hashtable[TABLESIZE];
...

/* to free the nodes, and reinitialize the table to NULL */
MYLIST *ptr, *temp;
int i;

for( i = 0; i < TABLESIZE; i++ )
 {
 for( ptr = hashtable[i]; ptr != NULL; ptr = ptr->next )
  {
  temp = ptr->next;
  ptr = ptr->next->next;
  free( temp->next );
  }
 hashtable[i] = NULL;
 }

will do the job?
Avatar of aagoh

ASKER

oh, mistake in typing,
the declaration,

MYLIST hashtable[TABLESIZE];

should be

MYLIST *hashtable[TABLESIZE];

apologies.
/* To free the nodes, and reinitialize the table to NULL */
MYLIST *ptr, *temp;
int i;

for (i = 0; i < TABLESIZE; i++ )
{
   ptr = hashtable[i];
   while (ptr != NULL)
   {
      temp = ptr->next;
      free(ptr);
      ptr = temp;
   }
   hashtable[i] = NULL;
}

It is a bit more complicated if you want to keep the top level node but free the remainder of the chain but not much more so - you would want to free the children and then set 'hashtable[i]->next = NULL;'. Of course this would require altering your insertion code so it is your call as whether it is worth the effort...

Note that I'm only bothering to set the top node to NULL but good practice when freeing memory if there is any chance of reuse and/or attempts to 'free' again is :

if (p)
{
   // Free the block
   free(p);
   // Set to NULL to flag as freed
   p = NULL;
}
typedef struct mylist {
int key;
struct mylist *next;
} MYLIST;
...

MYLIST * hashtable[TABLESIZE];
...

/* to free the nodes, and reinitialize the table to NULL */
MYLIST *ptr, *temp;
int i;

for( i = 0; i < TABLESIZE; i++ )
{
ptr=hashtable[i];
while(ptr->next)
{
temp=ptr->next;
ptr->next=temp->next;
free(temp);
}
/*----------Wrong......
for( ptr = hashtable[i]; ptr != NULL; ptr = ptr->next )
 {
 temp = ptr->next;
 ptr = ptr->next->next;//shd have written (ptr=temp->next;)
 free( temp->next );
//once you have freed this memory,you cant access ptr anymore for checking or other purposes since you are incrementing ptr to point to the next of temp and then freeing that next pointer in the next step..
 }
--------------------------*/
hashtable[i] = NULL;
}
Avatar of aagoh

ASKER

oh, seems that I have to split the points between Neelima and BlindSide as N gives a better code, and B gives a better explaination.

thanks to you both!!!
Thanks, aagoh. I hope that you've found EE helpful and fun - and that you carry on doing so! See you around.