Posted on 2002-03-28
Medium Priority
Last Modified: 2010-04-15
Hi, im having a bit of trouble. ok. a lot of trouble.

I am learning C  and i have been given an exercise on arrays and pointers.  

The exercise is to make 1 list to store student names, and another list to store the course they are enrolled on.

I know basically how it works, ie the ascii values are added together and then you use the mod operator to convert the total into a number which becomes the size of the array.

So can anybody give me a bit of help to start me on my way.  I must state that I am not asking anybody to do the work for me, but the more help the better.  Could you please answer asap. I look forward to hearing from youin the near future.


Question by:OutOvSyt
LVL 16

Expert Comment

ID: 6902829
How about I give a slightly more precise description:

Suppose you have 100 entries in your list. Then, for each name you would find the index for it by adding the ASCII values together, the applying mod 100. That is your hash function. So, for example, a student named "Li", consists of ASCII values 0x4c and 0x69. When you add them you get 0xb5 which is 181 decimal. If you take mod 100 then you get 81. So Li should be stored at index 81.

Does that help?

If you're having trouble coding it, please provide whatever attempt you have made so far, complete with compiler errors if you can't solve those yourself, and we'll take it one step at a time.
LVL 85

Expert Comment

ID: 6904602
/* a better hash could be */
int hash(char *s){
    int h=0;
    while( *s ){
     h += (h<<5) + *s++;
    return h;

Expert Comment

ID: 6904702
 For ozo: how is that better?!
  Try your function with the input "Johnny":
  This are the intermediaries at every step:
  This is the result:
  Final hash value:-1263449994
  Am I missing something?

 For imladris. Yes, you are right. And this is just for the sake of argument: isn't, statistically speeking, the hash table distribution better when the length of the entry list is a prime number.

 For OutOvSyt.
 How do you intend to solve conflicts in your implementation. There are at least two approaches, depending very much on the number of entries you estimate in your list.
 If the number is at most the length of the list, you can try adding the conflicting value at the imediate empty position.
 If you have more than that, you should consider your initial array as an array of pointers to some dynamic structure, like a linked list, which holds all the values with the same hash key, or better yet, a binary tree, sorted by name (in your case).

Easily manage email signatures in Office 365

Managing email signatures in Office 365 can be a challenging task if you don't have the right tool. CodeTwo Email Signatures for Office 365 will help you implement a unified email signature look, no matter what email client is used by users. Test it for free!


Expert Comment

ID: 6904705
 Sorry, imladris, forgot the question mark -- ?

Expert Comment

ID: 6905335
  For imladris - forget about my question, goes beyond the scope of this discussion, of course, the bigger the number of entries, the wider the dispersion, but then again, no point making this number too big (what's the point of having 10000 entries when the average number of records is about 100).
   What I had in mind was the way to decide on the optimum number of entries given an estimation of the number of records, but, as I said, I think this is not the place for such debate, unless, of course, OutOvSyt is interested.

  By, the way, OutOvSyt, have you been helped?
  Anything to ask, some code to post?

Author Comment

ID: 6915700
Thanks. You are all being a great help, but its not so much the code i am having trouble with, its more like the whole concept of hashing.

I have read your comments and i appreciate them, but can we try going a bit more basic?

Sorry but the only code i have is the algorithm to get the total and convert it to a range of 0 to MaxNum-1 where MaxNum of the size of the array.

This is basically add the ascii values together and divide them by 24. (I am using 24 as the size of the array).

I would be very grateful if you could post your revised comments after taking this comment into consideration.

Thanking you in advance.



Accepted Solution

Robalitoru earned 1200 total points
ID: 6917555
  Well, I see the others got bored, so I might as well tell you a thing or two.
   Now, "the whole concept of hashing"... Basically, is an alternative to the ordinary sort. Sometimes is preferable to use hash tables instead of sorted lists (arrays) or binary trees in order to retrieve items fast from a collection.
   Why is that? Well, just consider you have a sorted, index based,  array: searching for an element is quite fast with binary search O(lg2 n), but adding or deleting one is O(n).
   Let's say than that you have a binary search tree. Basically on these kind of strucutures the complexity of add/delete/retrieve is around O(lg2 n), but, depending on the order of all operations, the tree might get seriously unbalanced, then you will have to balance it in order to ensure the targeted complexity... and so on.
   So, depending on the size of data being processed, and on the presumed dynamics of the operations, one might choose a hashing table for manipulating the set of records. The discussion on when exactly each approach is preferable is really too involved...

   There are two, as I see it, main use cases for hashing:
1.  You don't have such a large amount of records, but your program acceses them extensively -- you need a way to retrieve the location of one record in almost no time -- let's say you have about 50 records -- making a hash table of about 200 records with pointers to such records -- resolving eventual conflicts with assigning to the imediate free position is the best approach, far better than a sorted array or a binary tree, or any other structure.

2. You have a large amount of data, and adding and deleting are very, very frequent. You don't want a sorted list from reasons of complexity. You don't want a binary tree, because frequent adds and deletes will most likely debalance it.
   You might use a hash table with a quite big number of entries, and then, for the records on the same key decide again on the kind of structure. Migth be just a list, a priority list, a sorted array, a binary tree, or even another hash table on another key...

  Again, if you think the answer is not specific enough, try making your question more specific.

Expert Comment

ID: 6933927
Slight clarify on the above:

Alternative to a sort and search. Data is often sorted so you can do the binary search (throws away half the candidates each time, like a human with a phonebook, open to middle, move 1/2 way in the correct direction each time (because its sorted)).  This search is fast, but not always fast enough...  

The best luck I've had for collision resolution is to have several hash functions, and to use each in turn until a space is found.  But I usually only use it on small known data sets with very carefully chosen functions, not on user entered data...


Author Comment

ID: 8240052
alot more information than i expected. much more detail than i thought i would need but very very helpful. thanx alot

Featured Post

Live webcast with Pinal Dave

Pinal Dave will teach you tricks to help identify the real root cause of database problems rather than red herrings. Attendees will learn scripts that they can use in their environment to immediately figure out their performance Blame Shifters and fix them quickly.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Have you thought about creating an iPhone application (app), but didn't even know where to get started? Here's how: ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ Important pre-programming comments: I’ve never tri…
Summary: This tutorial covers some basics of pointer, pointer arithmetic and function pointer. What is a pointer: A pointer is a variable which holds an address. This address might be address of another variable/address of devices/address of fu…
The goal of this video is to provide viewers with basic examples to understand and use pointers in the C programming language.
The goal of this video is to provide viewers with basic examples to understand opening and reading files in the C programming language.

600 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question