Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win

x
?
Solved

Fastest Method of Searching

Posted on 1999-01-29
5
Medium Priority
?
173 Views
Last Modified: 2010-04-15
Hi !

I have a text file about 3 Meg in size.  I have to search for a string of about 7 characters in length at the beginning of each line.  Each line can be well over 300 characters.  Currently I am opening the file and reading the whole line with fgets and comparing the first 7 characters.  What would be the best method for this as I need to increase the speed.

Marvin.
0
Comment
Question by:checkin
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
5 Comments
 
LVL 1

Expert Comment

by:newexpert
ID: 1258504
For 32bit apps obtain 3MB of memory

char *membuf = (char *)malloc(sizeof(char) * 3145728);
int filelen = fread(membuf, sizeof(char), 3145728, fileptr);

and search through membuf will be a lot faster.
0
 
LVL 84

Expert Comment

by:ozo
ID: 1258505
Build a hash index, and jump directly to the line.
0
 

Author Comment

by:checkin
ID: 1258506
Hi !

I do not want to play with malloc as the system we use could possible swap allot if under pressure.  This could cause problems.

I am interested in building a hash index, where can I get some info about this as I have never touched these before.

Regards,

Marvin.
0
 
LVL 10

Accepted Solution

by:
rbr earned 400 total points
ID: 1258507
Short example for a hash


typedef struct hash_entry {
      unsigned char key[8];
      struct hash_entry *pnext;
      long file_pos;
} HASH_ENTRY;

void init_hash (HASH_ENTRY ***ppphash, long anz)
{
      *ppphash= (HASH_ENTRY **)calloc (sizeof(HASH_ENTRY *),anz);
}
int hash_function (unsigned char *pkey)
{
      unsigned long hash=0;
      
      for (i=0;i<strlen(pkey);i++)
            hash+=pkey[i];
      return (hash);
}

void add_to_hash (HASH_ENTRY **phash, long anz, unsigned char *pkey,unsigned long file_pos)
{
      
      HASH_ENTRY *p,*plast=NULL,*pcurr;

      p= (HASH_ENTRY *)calloc(sizeof(HASH_ENTRY),1);

      pcurr = phash[hash_function (pkey)%anz];
      while (pcurr != NULL) {
            plast=pcurr;
            pcurr=pcurr->pnext;
      }

      if (plast==NULL) {
            phash[pos]=p;
      } else {
            plast->pnext=p;
      }
      p->filepos=filepos;
      strcpy (p->pkey,pkey);
}      

int find_in_hash (HASH_ENTRY **pphash,long anz,unsigned char *pkey, unsigned long *pfilepos)      
{

      HASH_ENTRY *pcurr;      

      pcurr = phash[hash_function (pkey)%anz];
                                    
      while (pcurr!=NULL) {
            if (strcmp (pcurr->key,pkey)==0) {
                  *pfilepos=pcurr->filepos;
                  return (0); /* Found */
            }
            pcurr=pcurr->pnext;
      }
      return(1); /* not found*/
}

/* Main routine */

HASH_ENTRY **pphash;
long anz=1000L; /* The greater the faster */

init_hash (&pphash,anz);

/* loop over all lines and for each line use */
add_to_hash (pphash,anz,key,filepos); /* Where key is your search key and filepos the line in the file or position of ftell */

/* Now you can search a key with find_in_hash */

            
0
 
LVL 10

Expert Comment

by:rbr
ID: 1258508
int hash_function should be unsigned long hash_function
0

Featured Post

What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

An Outlet in Cocoa is a persistent reference to a GUI control; it connects a property (a variable) to a control.  For example, it is common to create an Outlet for the text field GUI control and change the text that appears in this field via that Ou…
Windows programmers of the C/C++ variety, how many of you realise that since Window 9x Microsoft has been lying to you about what constitutes Unicode (http://en.wikipedia.org/wiki/Unicode)? They will have you believe that Unicode requires you to use…
Video by: Grant
The goal of this video is to provide viewers with basic examples to understand and use for-loops in the C programming language.
Video by: Grant
The goal of this video is to provide viewers with basic examples to understand and use nested-loops in the C programming language.

604 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question