implement hash table using only arrays (in JAVA)

(This is a potential interview question and I'm not looking for code, just some ideas.)

The only way I've implemented a hash table is using an array of LinkedLists - the lists being buckets for handling collisions.

At the moment the only way I can think of to implement one using Chaining with only arrays would be to basically do the same thing but use arrays for buckets duplicating the arrays and doubling the array sizes as necessary. I don't know if this is stupid - probably, since I can't see any benefit to that over LinkedLists or even just ArrayLists.

Or, what about open-addressing: Linear Probing? Quadratic Probing?

What are the tradeoffs?

Are there other less know techniques?

Thanks,
Kyle
LVL 25
Kyle HamiltonData ScientistAsked:
Who is Participating?
 
dpearsonConnect With a Mentor Commented:
I'd never heard of open-addressing until you mentioned it.  The joys of living in a world where we no longer implement these low level data structures any more :)  If we were all still writing C++ I'd probably be up on it!

Reading up on it a bit it certainly seems relevant to this sort of implementation.  I can't really comment intelligently on the specifics as they're new to me, but if you're looking at this from an interview perspective I think the key would be whether you could explain the different algorithms and their pros and cons (since they seem to form a pretty clear spectrum).  There's never a "right" answer in this sort of situation, just A is better for situation X, B for Y.

If it was me doing the interview what I'd really want to hear is that you'd measure first and then try different approaches and re-measure (if you're down at this level of implementation detail and looking for the last few cycles of performance).  Since the performance will vary a lot with the specific data set and load factor in the hash table, it would be pretty hard to pre-select the correct algorithm.  Better to implement several and get to know your profiler more intimately.

Doug
0
 
dpearsonCommented:
If you're allowed to use an array of arrays, then yes it seems perfectly doable:

Use any hash function and mod to the length of the first array (analogous to the buckets in a regular hashmap).

Then add the item being hashed into the array stored in that bucket.  For a trivial implementation you just copy the old array into the new, larger array.  For a smarter implementation you need to track how much of the array is in use, append until it's full and then copy to a new array that's twice the size.

Searching is just: hash value to locate bucket and then search the list of values in that bucket to see if any match.

Removing is basically the same as adding - either copying the old array into the smaller array or packing the items down into an existing array.

The follow up in an interview would no doubt be to determine the computation complexity (big O notation) for these operations.  Also be ready to talk about the tradeoffs around the size of the initial array for storing the hashes.

You can also talk about the trade-off between allocating more space initially vs doing more copying for the arrays within the map.

Will there be even faster implementations for this?  Sure, they'll exist.  But if you bring those up in an interview, a smart interviewer will know that you've just prepped for the question ahead of time...and tend to ignore the answer :)

Hope that helps,

Doug
0
 
Kyle HamiltonData ScientistAuthor Commented:
Thanks Doug,

It does help :)

Any thoughts on open-addressing?

K.
0
 
Kyle HamiltonData ScientistAuthor Commented:
I guess no amount of theory is going to change the fact that it's "Better to implement several..."

Thank you :)
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.