Solved

Fastest Method of Searching

Posted on 1999-01-29
5
167 Views
Last Modified: 2010-04-15
Hi !

I have a text file about 3 Meg in size.  I have to search for a string of about 7 characters in length at the beginning of each line.  Each line can be well over 300 characters.  Currently I am opening the file and reading the whole line with fgets and comparing the first 7 characters.  What would be the best method for this as I need to increase the speed.

Marvin.
0
Comment
Question by:checkin
5 Comments
 
LVL 1

Expert Comment

by:newexpert
ID: 1258504
For 32bit apps obtain 3MB of memory

char *membuf = (char *)malloc(sizeof(char) * 3145728);
int filelen = fread(membuf, sizeof(char), 3145728, fileptr);

and search through membuf will be a lot faster.
0
 
LVL 84

Expert Comment

by:ozo
ID: 1258505
Build a hash index, and jump directly to the line.
0
 

Author Comment

by:checkin
ID: 1258506
Hi !

I do not want to play with malloc as the system we use could possible swap allot if under pressure.  This could cause problems.

I am interested in building a hash index, where can I get some info about this as I have never touched these before.

Regards,

Marvin.
0
 
LVL 10

Accepted Solution

by:
rbr earned 100 total points
ID: 1258507
Short example for a hash


typedef struct hash_entry {
      unsigned char key[8];
      struct hash_entry *pnext;
      long file_pos;
} HASH_ENTRY;

void init_hash (HASH_ENTRY ***ppphash, long anz)
{
      *ppphash= (HASH_ENTRY **)calloc (sizeof(HASH_ENTRY *),anz);
}
int hash_function (unsigned char *pkey)
{
      unsigned long hash=0;
      
      for (i=0;i<strlen(pkey);i++)
            hash+=pkey[i];
      return (hash);
}

void add_to_hash (HASH_ENTRY **phash, long anz, unsigned char *pkey,unsigned long file_pos)
{
      
      HASH_ENTRY *p,*plast=NULL,*pcurr;

      p= (HASH_ENTRY *)calloc(sizeof(HASH_ENTRY),1);

      pcurr = phash[hash_function (pkey)%anz];
      while (pcurr != NULL) {
            plast=pcurr;
            pcurr=pcurr->pnext;
      }

      if (plast==NULL) {
            phash[pos]=p;
      } else {
            plast->pnext=p;
      }
      p->filepos=filepos;
      strcpy (p->pkey,pkey);
}      

int find_in_hash (HASH_ENTRY **pphash,long anz,unsigned char *pkey, unsigned long *pfilepos)      
{

      HASH_ENTRY *pcurr;      

      pcurr = phash[hash_function (pkey)%anz];
                                    
      while (pcurr!=NULL) {
            if (strcmp (pcurr->key,pkey)==0) {
                  *pfilepos=pcurr->filepos;
                  return (0); /* Found */
            }
            pcurr=pcurr->pnext;
      }
      return(1); /* not found*/
}

/* Main routine */

HASH_ENTRY **pphash;
long anz=1000L; /* The greater the faster */

init_hash (&pphash,anz);

/* loop over all lines and for each line use */
add_to_hash (pphash,anz,key,filepos); /* Where key is your search key and filepos the line in the file or position of ftell */

/* Now you can search a key with find_in_hash */

            
0
 
LVL 10

Expert Comment

by:rbr
ID: 1258508
int hash_function should be unsigned long hash_function
0

Featured Post

Salesforce Made Easy to Use

On-screen guidance at the moment of need enables you & your employees to focus on the core, you can now boost your adoption rates swiftly and simply with one easy tool.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

An Outlet in Cocoa is a persistent reference to a GUI control; it connects a property (a variable) to a control.  For example, it is common to create an Outlet for the text field GUI control and change the text that appears in this field via that Ou…
Summary: This tutorial covers some basics of pointer, pointer arithmetic and function pointer. What is a pointer: A pointer is a variable which holds an address. This address might be address of another variable/address of devices/address of fu…
The goal of this video is to provide viewers with basic examples to understand recursion in the C programming language.
The goal of this video is to provide viewers with basic examples to understand and use conditional statements in the C programming language.

830 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question