What is the best way to get the hash value for a String?

Here is Sun's way:

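    public int hashCode() {
        int h = hash;              // cached value; 0 means nothing is cached yet
        if (h == 0) {
            int off = offset;
            char val[] = value;
            int len = count;

            // Polynomial hash: h = val[0]*31^(len-1) + ... + val[len-1]
            for (int i = 0; i < len; i++) {
                h = 31*h + val[off++];
            }
            hash = h;
        }
        return h;
    }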
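The method above relies on private fields of `String` (`hash`, `offset`, `value`, `count`), so it cannot be run on its own. As a rough standalone sketch of the same formula, the class below recomputes the hash from the public API and compares it with `String.hashCode()`. The class name `PolyHash` and the sample strings are made up for illustration.

```java
// Standalone sketch of the 31-based polynomial hash; PolyHash is a hypothetical name.
class PolyHash {
    // Computes s[0]*31^(n-1) + ... + s[n-1]; int overflow simply wraps around.
    static int hash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // Arbitrary sample strings, chosen only for the demonstration.
        String[] samples = {"", "a", "hello", "Hash value for a String"};
        for (String s : samples) {
            boolean same = hash(s) == s.hashCode();
            System.out.println("\"" + s + "\" -> " + hash(s) + " (matches String.hashCode: " + same + ")");
        }
    }
}
```

Unlike the original, this sketch does not cache the result, so it walks the whole string on every call.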
sudhakar_koundinya Asked:
sciuriware Commented:
In general, a hashing algorithm should spread the keys of a collection evenly, giving a flat distribution.
In other words, it depends. If most of the strings in your initial collection are similar, it will be
hard to devise any algorithm that spreads them well. Sun's approach will work most of the time,
but in many cases it's easier to look at your data.
For instance, you might consider a ZIP code as part of the hash value.
;JOOP!
 
CEHJ Commented:
Hash algorithms are a trade-off between speed and the ability to avoid hash code collisions. Why are you questioning Sun's implementation, btw?
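To make the collision side of that trade-off concrete, here is a small sketch showing two different strings that get the same value from `String.hashCode()`. The class name `CollisionDemo` and the `bucket` helper are hypothetical, and the bucket count of 16 is just an example.

```java
// Sketch of a hash code collision; CollisionDemo and bucket() are illustrative names.
class CollisionDemo {
    // Maps a key to a bucket index in [0, buckets), the way a simple hash table might.
    static int bucket(String key, int buckets) {
        return Math.floorMod(key.hashCode(), buckets);
    }

    public static void main(String[] args) {
        // 'A'*31 + 'a' = 65*31 + 97 = 2112 and 'B'*31 + 'B' = 66*31 + 66 = 2112.
        System.out.println("Aa -> " + "Aa".hashCode());
        System.out.println("BB -> " + "BB".hashCode());

        // Equal hash codes always land in the same bucket, whatever the table size.
        System.out.println("same bucket: " + (bucket("Aa", 16) == bucket("BB", 16)));
    }
}
```

The loop in Sun's implementation is cheap (one multiply and one add per character), and the price is that collisions like this one are easy to construct.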
 
CEHJ Commented:
Sorry - but i don't really understand that accepted answer - perhaps someone can explain it to me? ;-)
 
sciuriware Commented:
CEHJ, I agree that a split would have been as acceptable as an A.
Haven't you seen that in many cases the award goes to the expert who (accidentally)
hits a perception? Now you can work out what the question would have been.
Sorry, next question please.
;JOOP!
 
CEHJ Commented:
>>CEHJ, I agree that a split was as acceptable ...

I think you misunderstood - i was actually saying i didn't understand your answer, not 'i don't understand why that answer has been accepted' (not that i'm ruling out a connection between the two ;-))
 
sciuriware Commented:
Sorry, well, the idea is that if a record contains a unique piece of data, that piece is eligible
for use as a hash (or any other) key.
When such a key is spread about evenly over the total collection (no accumulation in some spots),
it is ideal for hashing (few collisions).
So, before you apply a general algorithm, look at the nature of your data.
For instance, a phone number might be a unique part of a record, but if most of the numbers start with the same digits,
you would take a substring that doesn't always start the same.
That was my message.

;JOOP!
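A minimal sketch of the phone number idea, assuming the records carry phone numbers that share a leading prefix: only the trailing digits, which vary from record to record, are used as the key. The class `PhoneKey`, the helper `suffixKey`, and the sample numbers are all invented for illustration.

```java
// Sketch of a data-aware hash key; PhoneKey and suffixKey are hypothetical names.
class PhoneKey {
    // Uses the last `digits` digits of a phone number as the key.
    // Assumes digits <= 9 so the result fits in an int.
    static int suffixKey(String phone, int digits) {
        String onlyDigits = phone.replaceAll("\\D", "");   // drop spaces, dashes, '+'
        int start = Math.max(0, onlyDigits.length() - digits);
        String tail = onlyDigits.substring(start);
        return tail.isEmpty() ? 0 : Integer.parseInt(tail);
    }

    public static void main(String[] args) {
        // Made-up numbers that all begin with the same prefix.
        String[] phones = {"020-555-0123", "020-555-0199", "020-555-0456"};
        for (String p : phones) {
            System.out.println(p + " -> key " + suffixKey(p, 4));
        }
    }
}
```

Whether this beats the general-purpose `String.hashCode()` depends entirely on the data, which is the point being made above: the choice only pays off if the trailing digits really are spread evenly.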
 
CEHJ Commented:
I'm not sure it's an answer to the question asked, but .. OK ;-)
Question has a verified solution.