• C

Remove duplicate elements in a array

How to remove duplicate elements in a array using hash tables and also without using hash tables.

this was an interview question.
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

All duplicates will produce the same hash value .. simply search the  hash bucket and elminate it if it is a duplicate. A good hash function will distribute the values evenly in buckets so this should be fairly fast to perform.

Without using hash tables would depend on other constraints .. e.g. given sufficient memory you can allocate an array of range of value in that array .. simply keep marking the index value equal to data value and  consolidate the values in single scan ... This would bebad if range is huge but number of values is small ....
Or you can try sorting the array ... Or try to construct a binary search tree .. basically you need to gauge what is that interviewer is trying to have you optimize .. ask him/her questions to determine the constraints/performance they are looking for.


Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
without hash you can sort the array. Then you walk it and you copy all the unique elements to another array. You even can keep the same array you just have to remove the not unique elements, e.g while setting the to a known value which indicates it "not there".

The problem with hashes is that you can get the same hash for different values. That happens sooner or later, so you can not eliminate it that easy. The only way not to let happen if you can assure that no element in the field yields the same hash code.

So sunnycoder is not fully right on his first suggestion.


Perhaps I was not clear enough. I never said that two different values cannot produce same hash. What I said was duplicate values will have the same hash (unless ofcourse you are using some hash function like time()%value). That was the reason I said you need to search the hash bucket before eliminating an element.

It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.