Solved

Hashtable performance 1500 hits per second

Posted on 2006-06-13
15
284 Views
Last Modified: 2012-05-05
Hashtable gets hit 1500 times per second.

Each hit consists of a loop requiring adding/removing entries based upon a numbered index.

Hashtable must be synclocked since multiple threads will be hitting it simaltaneously.

How will the hashtable hold up if speed is of the most importance?

0
Comment
Question by:Hepen
  • 6
  • 5
  • 4
15 Comments
 
LVL 11

Expert Comment

by:anyoneis
ID: 16899490
Just a few ideas - no facts... How many entries? Part of the efficiency of a HashTable depends on the hash function - is the value you are going to be using for a key "well-hashable" (I'm sure there is a real word for that... :-)

What do you mean by "based on a numbered index." If that is really how you want to access it what you really want is an ArrayList? Or better yet, the List generic class.

David
0
 

Author Comment

by:Hepen
ID: 16899492
400000 entries.

The keys will be like:
2
3
4
5
6
7
19
20
21
22
23
28
29
402
403
404
405

I need to use the fastest object. I think it is a hashtable I could be wrong.  I need a "key" that is important.
0
 
LVL 11

Expert Comment

by:anyoneis
ID: 16899737
An integer hash should be pretty efficient. You ar etreating it like a sparse array. Sounds workable to me.
0
 

Author Comment

by:Hepen
ID: 16899829
sparse array? .. hmmm do u suggest something better and more efficient?

Key & Object  

0
 
LVL 4

Expert Comment

by:fffej78
ID: 16900141
So if you know exactly how many entries there are and the keys are known, why aren't you just declaring two arrays of the maximum size and implementing the map interface on top of that?

    class MyMap : IDictionary<int,Object>
    {
        private int[] keys = new int[400000];
        private Object[] values = new Object[400000];

        public void Add(int key, object value)
        {
            keys[key] = key;
            values[key] = value;
        }

       // And so on...
     }

In fact why do you have integer keys?  If you have that why don't you put things in an array?

public class NotReallyAMAp
{
  Object[] object = new Object[ 400000 ];

  Object get( int key )
  {
     return object[ key ];
     //or if you really start from the number 2
     // return object[ key - 2 ];
  }
}

"Hashtable must be synclocked since multiple threads will be hitting it simaltaneously"

Are you sure?  For example, if two threads are competing to remove simulataneously then that doesn't matter (whatever happens one will succeed, the other will do nothing because there is nothing to remove).  If two threads are competing to add the same object, then that doesn't matter either (one will add it, one will then overwrite with the same value).

Of course, if you have the situation where multiple threads may be writing and reading the same key at the same time, then of course you do need synchronisation!
0
 

Author Comment

by:Hepen
ID: 16900395
400,000 at most.

This entire implementation is so I can keep things around 400,000 instead of 64,000,000 preset in an array.  I can save a lot of memory and use it for other things.

This information however is very helpful.. the following..............................

"Hashtable must be synclocked since multiple threads will be hitting it simaltaneously"

Are you sure?  For example, if two threads are competing to remove simulataneously then that doesn't matter (whatever happens one will succeed, the other will do nothing because there is nothing to remove).  If two threads are competing to add the same object, then that doesn't matter either (one will add it, one will then overwrite with the same value).

Of course, if you have the situation where multiple threads may be writing and reading the same key at the same time, then of course you do need synchronisation!

0
 
LVL 4

Expert Comment

by:fffej78
ID: 16900425
To me it seems like a map / hashtable is the wrong structure.  If you have something indexed by integers, then an array is a good structure to use, especially if the size is fixed.  I'd go for the array approach unless there is a really important reason not too.  You can always implement the Map interface on top of a single array if you must have a map for interaction with something else.
0
How to improve team productivity

Quip adds documents, spreadsheets, and tasklists to your Slack experience
- Elevate ideas to Quip docs
- Share Quip docs in Slack
- Get notified of changes to your docs
- Available on iOS/Android/Desktop/Web
- Online/Offline

 

Author Comment

by:Hepen
ID: 16900431
so you are recommending a single array instead of a hashtable?

Array or arraylist?
0
 
LVL 4

Expert Comment

by:fffej78
ID: 16900454
Yeah, an array of fixed size instead of a hashtable if:

* You have a fixed size number of elements
* You index via an integer and nothing else

I'd just go for a plain old array.  You have no need for any of the ArrayList extra bits and pieces (unless there is anything else you want from the structure that you haven't specified in the question).

Object[] myObjects = new Object[ 400000 ];

And access it via

Object get( int index )
{
  return myObjects[ index ];
}

Object set( int index, Object value )
{
  assert index>=0 && index<=myObjects.length; //not sure on exact syntax here this is a Java assertion :)
  myObjects[ index ] = value; // remember you have to make sure index is from 0 to 400000
}
0
 
LVL 11

Expert Comment

by:anyoneis
ID: 16904136
fffej78's wrote:
>>Yeah, an array of fixed size instead of a hashtable if:
>>* You have a fixed size number of elements
>>* You index via an integer and nothing else

I think fffej78 is right on.

Hepen wrote:

>>Of course, if you have the situation where multiple threads may be writing and reading
>>the same key at the same time, then of course you do need synchronisation!

This is not quite correct! From online help:

"A Hashtable can support one writer and multiple readers concurrently. To support multiple writers, all operations must be done through this [Synchronized] wrapper only."

Enumeration is another issue:

"Enumerating through a collection is intrinsically not a thread-safe procedure. Even when a collection is synchronized, other threads could still modify the collection, which causes the enumerator to throw an exception. To guarantee thread safety during enumeration, you can either lock the collection [(using SyncRoot)] during the entire enumeration or catch the exceptions resulting from changes made by other threads."

Presumably, in a multiple writer situation not involving enumeration, you could use synclock, but you will get better concurrency using the Synchronized wrapper.

David




0
 
LVL 4

Assisted Solution

by:fffej78
fffej78 earned 250 total points
ID: 16904325
Good point David, my mistake!

I think the point is right though, think through your specific concurrency needs before deciding that something must be sync-locked.  Don't just make assumptions just because you are using multiple threads.
0
 

Author Comment

by:Hepen
ID: 16905139
so for reading i don't have to synclock.

But for writing I do?
0
 
LVL 4

Expert Comment

by:fffej78
ID: 16905165
Is it just static data?  What are you overwriting?  What do "multiple threads hitting it simultaneously" mean?  I think you need to clarify the problem with respect to what's actually in this huge array :)
0
 

Author Comment

by:Hepen
ID: 16905196
multiple threads will be reading & writing to the hashtable and there will be plenty of times where they may hit the hashtable at the same time.

From that description above it looks like a hashtable can be read without synclocking it and without having to worry about thread problems.

But for writing I have to use the syncroot. Is that correct?

The data isn't static its dynamic.  entries will be added & removed .. The size can fluxuate from 1 to 10 to 100 to 4000 to 8231 all the way to 400000 or even 500000 but it will stay between that range somewhat.
0
 
LVL 11

Accepted Solution

by:
anyoneis earned 250 total points
ID: 16908021
(Speaking of multiple threads...)

 On the HashTable approach, I was totally wrong about the Synchronized wrapper:
"Synchronized supports multiple writing threads, provided that no threads are reading the Hashtable. The synchronized wrapper does not provide thread-safe access in the case of one or more readers and one or more writers."

So, it appears to me that SyncRoot locking is required in any mixed reader/writer situation.

>> The size can fluxuate from 1 to 10 to 100 to 4000 to 8231 all the way to 400000 or even 500000 but it will stay between that range somewhat.

What is the range of the indices? The range of the count of entries is pretty large...
0

Featured Post

Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

Join & Write a Comment

Welcome my friends to the second instalment and follow-up to our Minify and Concatenate Your Scripts and Stylesheets (http://www.experts-exchange.com/Programming/Languages/.NET/ASP.NET/A_4334-Minify-and-Concatenate-Your-Scripts-and-Stylesheets.html)…
Many of us here at EE write code. Many of us write exceptional code; just as many of us write exception-prone code. As we all should know, exceptions are a mechanism for handling errors which are typically out of our control. From database errors, t…
This video discusses moving either the default database or any database to a new volume.
Access reports are powerful and flexible. Learn how to create a query and then a grouped report using the wizard. Modify the report design after the wizard is done to make it look better. There will be another video to explain how to put the final p…

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

17 Experts available now in Live!

Get 1:1 Help Now