  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 147

What is the best way to get the hash value for a String?

Here is Sun's way:

    public int hashCode() {
        int h = hash;
        if (h == 0) {
            int off = offset;
            char val[] = value;
            int len = count;

            for (int i = 0; i < len; i++) {
                h = 31*h + val[off++];
            }
            hash = h;
        }
        return h;
    }
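
For anyone who wants to try the loop above outside of java.lang.String, here is a minimal standalone sketch. The class name SunStyleHash and the method polyHash are made up for illustration; the arithmetic is the same 31-based polynomial that the String.hashCode() Javadoc specifies, so the two values should agree for any input.

```java
// Standalone sketch of the 31-based polynomial hash quoted above.
// SunStyleHash and polyHash are hypothetical names, not part of the JDK.
class SunStyleHash {

    // Computes s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1],
    // relying on ordinary int overflow (wrap-around) like the original.
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String[] samples = {"", "a", "hello", "Hash value for a String"};
        for (String s : samples) {
            // Print both values side by side so they can be compared.
            System.out.println("\"" + s + "\" -> " + polyHash(s)
                    + " (String.hashCode: " + s.hashCode() + ")");
        }
    }
}
```

Unlike the original, this sketch does not cache the result in a field, so it recomputes the hash on every call.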
Asked: sudhakar_koundinya
1 Solution
 
sciuriware commented:
In general, a hashing algorithm should spread a collection evenly over the available hash values (a flat distribution).
In other words, it depends on your data. If most of the strings in your collection are similar, it will be
hard to find an algorithm that spreads them well. The Sun approach will work most of the time,
but in many cases it is easier to look at your data and pick the key from it.
For instance, you might consider using a ZIP code as part of the hash value.
;JOOP!
 
CEHJ commented:
Hash algorithms are a trade-off between speed and the ability to avoid hash code collisions. Why are you questioning Sun's implementation btw?
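
To make the collision side of that trade-off concrete, here is a small sketch. The class CollisionDemo and the helper collide are made-up names; the only real API used is String.hashCode(). "Aa" and "BB" are a commonly cited pair of different strings with the same hash code, and because of the polynomial form, concatenations built from such pairs collide as well.

```java
// Sketch: different strings can share a hash code under the 31-based scheme.
// CollisionDemo and collide are hypothetical names used for illustration.
class CollisionDemo {

    // True when a and b are different strings with the same hashCode().
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // 'A'*31 + 'a' and 'B'*31 + 'B' evaluate to the same int.
        System.out.println("Aa vs BB: " + collide("Aa", "BB"));
        // Equal-length blocks with equal hashes can be swapped freely.
        System.out.println("AaAa vs BBBB: " + collide("AaAa", "BBBB"));
        System.out.println("Aa vs Ab: " + collide("Aa", "Ab"));
    }
}
```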
 
CEHJ commented:
Sorry - but I don't really understand that accepted answer - perhaps someone can explain it to me? ;-)
 
sciuriware commented:
CEHJ, I agree that a split would have been as acceptable as an A.
Haven't you seen that in many cases the award goes to the expert who (accidentally)
   matches the asker's perception? From that you can work out what the question must have been.
Sorry, next question please.
;JOOP!
 
CEHJ commented:
>>CEHJ, I agree that a split was as acceptable ...

I think you misunderstood - I was actually saying I didn't understand your answer, not 'I don't understand why that answer has been accepted' (not that I'm ruling out a connection between the two ;-))
 
sciuriware commented:
Sorry. Well, the idea is that if a record contains a unique piece of data, that piece is eligible
for use as a hash (or any other) key.
When such a key is spread about evenly over the total collection (no accumulation in some spots),
it is ideal for hashing (few collisions).
So, before you apply a general algorithm, look at the nature of your data.
For instance, a phone number might be a unique part of a record, but if most of them start with the same digits,
you should take a substring from it that doesn't always start the same way.
That was my message.

;JOOP!
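
A rough sketch of the phone number idea, not taken from the thread: the class PhoneKeyDemo, its helpers, the bucket count, and the sample numbers are all invented for illustration. It compares how many distinct buckets are hit when the key is the shared leading digits versus the trailing digits.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: choose the part of a record that actually varies as the hash key.
// All names and sample data here are hypothetical.
class PhoneKeyDemo {

    // Key built from the first n digits (often identical across records).
    static String prefixKey(String phone, int n) {
        return phone.substring(0, n);
    }

    // Key built from the last n digits (usually the part that varies).
    static String suffixKey(String phone, int n) {
        return phone.substring(phone.length() - n);
    }

    // Counts how many of the available buckets the given keys land in.
    static int distinctBuckets(String[] keys, int buckets) {
        Set<Integer> used = new HashSet<>();
        for (String k : keys) {
            used.add(Math.floorMod(k.hashCode(), buckets));
        }
        return used.size();
    }

    public static void main(String[] args) {
        // Made-up numbers that all share the same leading digits.
        String[] phones = {"5550100123", "5550100456", "5550100789", "5550100999"};
        String[] byPrefix = new String[phones.length];
        String[] bySuffix = new String[phones.length];
        for (int i = 0; i < phones.length; i++) {
            byPrefix[i] = prefixKey(phones[i], 4);
            bySuffix[i] = suffixKey(phones[i], 4);
        }
        System.out.println("buckets used with prefix keys: " + distinctBuckets(byPrefix, 16));
        System.out.println("buckets used with suffix keys: " + distinctBuckets(bySuffix, 16));
    }
}
```

With the prefix keys every record lands in the same bucket, which is the accumulation described above; how well the suffix keys spread still depends on the data and the bucket count.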
 
CEHJ commented:
I'm not sure it's an answer to the question asked, but .. OK ;-)