?
Solved

HashTable vs .... !?

Posted on 2004-09-24
8
Medium Priority
?
301 Views
Last Modified: 2010-04-15
Hi,

I need to store a short list of strings, arround 10 to 20, that will be searched several hundreds of thousands times and retrieve an integer value correspoding to that string value, it will be most likely its ordinal position.

I first tried a secuential search on a TArrayList that result too slow.
Then, I tried HashTable but still being too much slow for what I am doing.
Which other methods would be faster ?

Thanks in advance,

0
Comment
Question by:fischermx
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
  • 2
  • 2
  • +2
8 Comments
 
LVL 18

Expert Comment

by:armoghan
ID: 12149295
HashTable  is the fastest, if it is not constructed again and again..
HashTable  provides fastest access, the addition of an element takes more time but retrival takes much lesser time that traversing the whole array.

0
 
LVL 18

Expert Comment

by:armoghan
ID: 12149309
You may try to use hashcode of string instead of string itself. as that is integer and can compare faster than String

0
 
LVL 1

Author Comment

by:fischermx
ID: 12149313
Sounds interesting, but I have no idea what exactly are you talking about.
Could you show some code, please ?

Thank you.
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 5

Expert Comment

by:tzxie2000
ID: 12150350
I have test the code below:

private void button4_Click(object sender, System.EventArgs e)
{

  Hashtable myHT = new Hashtable();
  int i,j,max;
  string[] sa=new string[20];
  Random rand=new Random();

  for(i=0;i<20;i++)
  {
    max=rand.Next(0,20);
    string s="";
    for(j=0;j<20+max;j++)
    {
      char ch;
      ch =(char)rand.Next('a','z'+1);
      s=s+ch;
    }
    sa[i]=s;
    myHT.Add(s,rand.Next(0,1000));
  }
  DateTime dt=DateTime.Now;
  for(i=0;i<1000000;i++)
  {
    int index=rand.Next(0,20);
    int res=(int)myHT[sa[index]];
  }
  TimeSpan ts=DateTime.Now-dt;
  MessageBox.Show(ts.Milliseconds.ToString());
}

and it shows only 406 milliseconds.
so whether you use Hashtable in wrong way?
0
 
LVL 3

Expert Comment

by:KeithWatson
ID: 12151006
Do you know what these 10-20 strings will be at compile time, if so, you'd be better declaring an array and using constants to access them. This is as fast as it will get. Assuming you don't know this, then a hash table will be the fastest; however, you would want to ensure that there are enough "buckets" allocated in the hash table. At best, a hash table performs with o(1) efficiency, at worst, if all the elements hash to the same bucket, then it performs with o(n) efficiency; no better than linearly searching a list. So if you are using a hash table, initialise it with a starting size a good bit bigger than your contents, and with a prime number, which helps the hashing function achieve a more even spread across the buckets (don't ask me why!), so for example:

Hashtable hashTable = new Hashtable(37);


0
 
LVL 10

Expert Comment

by:smegghead
ID: 12156081
You could just contruct a string containing all your strings and integers, separated with delimiters.

For example

~Purple~3~Blue~5~Red~15~Orange~123~

Then, when you want to find the number corresponding to Red, simple search the string using IndexOf..

      string str="~Purple~3~Blue~5~Red~15~Orange~123~";
      string find="Red";
      int foundat=str.IndexOf("~"+find+"~")+find.Length+2;
      int foundend=str.IndexOf("~",foundat);
      int result=Convert.ToInt32(str.Substring(foundat,foundend-foundat));

If there are only 10 or so items, this will be very quick, but if the string gets much bigger, it won't be so efficient.

Smg.
0
 
LVL 1

Author Comment

by:fischermx
ID: 12156522
Thanks,

That was the first try in the very begining. It is like 50% slower than a HashTable approach.

I think my best bet to improve the hash table is to provide my own hash function, but  I found no samples about how to do that.

Regards,

0
 
LVL 5

Accepted Solution

by:
tzxie2000 earned 2000 total points
ID: 12157587
primarycode below

private void button4_Click(object sender, System.EventArgs e)
{
  hashstring provider=new hashstring();<----------------------------------here implement a IHashCodeProvider

  Hashtable myHT = new Hashtable(provider,null);<-------------------------set the provider to hashtable
  int i,j,max;
  string[] sa=new string[20];
  Random rand=new Random();

  for(i=0;i<20;i++)
  {
    max=rand.Next(0,20);
    string s="";
    for(j=0;j<20+max;j++)
    {
      char ch;
      ch =(char)rand.Next('a','z'+1);
      s=s+ch;
    }
    sa[i]=s;
    myHT.Add(s,1);
  }
  DateTime dt=DateTime.Now;
  for(i=0;i<1000000;i++)
  {
    int index=rand.Next(0,20);
    int res=(int)myHT[sa[index]];
  }
  TimeSpan ts=DateTime.Now-dt;
  MessageBox.Show(ts.Milliseconds.ToString());
  // Displays the properties and values of the Hashtable.
}

//the implement part
public class hashstring :IHashCodeProvider  
  {
    int IHashCodeProvider.GetHashCode(object obj) <---write for gethashcode for your function,here I return the hashcode of string with the sum of all char

    {
      string s=obj.ToString();
      char []ca=s.ToCharArray();
      int res=0;
      int i;
      for(i=0;i<ca.Length;i++)
      {
        res+=ca[i];
      }
      return res;
    }
  }
0

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Introduction Although it is an old technology, serial ports are still being used by many hardware manufacturers. If you develop applications in C#, Microsoft .NET framework has SerialPort class to communicate with the serial ports.  I needed to…
This article aims to explain the working of CircularLogArchiver. This tool was designed to solve the buildup of log file in cases where systems do not support circular logging or where circular logging is not enabled
In this brief tutorial Pawel from AdRem Software explains how you can quickly find out which services are running on your network, or what are the IP addresses of servers responsible for each service. Software used is freeware NetCrunch Tools (https…
Visualize your data even better in Access queries. Given a date and a value, this lesson shows how to compare that value with the previous value, calculate the difference, and display a circle if the value is the same, an up triangle if it increased…
Suggested Courses

800 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question