  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 298

Seeking Positions in a File

Hi!

I am trying to seek to positions in a file. The file is about 5 MB in size, and a single process needs to seek about 200 times. I am currently using fopen and fseek to do this, and it takes about 6-8 seconds to complete. I need to make this as fast as possible. Also, when about 10 people do this simultaneously, it can take longer than 20 seconds per process. I would be grateful if you could advise me on any faster ways of doing this.

Regards,

Marvin.
 
rbr Commented:
Can you post your code? An fseek should not take that much time!
 
checkin (Author) Commented:
Below is the chunk that records the elapsed time.

char tmpStr[4096];

a = time(&a);

idxFile = fopen("MainData.txt","r");
for(i=1;i<=posCounter;i++) {
  tmpPosition = atoi(read_record(inBuff,i,','));
  fseek(idxFile,tmpPosition, 0);
  fgets(tmpStr,sizeof(tmpStr),idxFile);
}
fclose(idxFile);

b = time(&b);
diff = b - a;
printf("Get Records Time = [%d] Seconds<br>\n",diff);



 
rbr Commented:
I guess posCounter goes up to 200. 5-6 seconds seems a long time to me for this loop. What does read_record do? What kind of computer (processor, OS) do you use?
 
rbr Commented:
Also, fseek on a file that is not opened in binary mode can be dangerous!
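
To illustrate the concern: the C standard only guarantees fseek on a text stream when the offset is zero or a value previously returned by ftell on that stream. On Solaris, text and binary streams behave the same, so this mainly matters for portability. Below is a minimal sketch, assuming the file is opened with "rb" and the offsets are collected with ftell. The helper names index_lines and read_line_at are hypothetical and not part of the posted code.

```c
#include <stdio.h>

/* Hypothetical helper: record the starting offset of each line in fp.
   Returns the number of offsets stored (at most max).
   Assumes no line is longer than 4095 characters. */
size_t index_lines(FILE *fp, long *offsets, size_t max)
{
    char line[4096];
    size_t n = 0;

    rewind(fp);
    while (n < max) {
        long pos = ftell(fp);            /* offset of the line about to be read */
        if (pos < 0 || fgets(line, sizeof line, fp) == NULL)
            break;
        offsets[n++] = pos;
    }
    return n;
}

/* Hypothetical helper: read the line that starts at offset into buf.
   Returns 0 on success, -1 on failure. */
int read_line_at(FILE *fp, long offset, char *buf, size_t size)
{
    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;
    return fgets(buf, (int)size, fp) != NULL ? 0 : -1;
}
```

In the posted loop this would mean opening MainData.txt with "rb" and passing SEEK_SET instead of the literal 0 to fseek.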
 
checkin (Author) Commented:
The OS is Solaris on a Sun Ultra 1 with 256 MB of RAM.

read_record is a function that returns a specific field from a delimited line. Here it is:

char* read_record(char *rec, int fieldNum, char delimin) {

  int a;
  char *chPtr1;
  char localrec[4096];
  char delimeter[40];
  char tmpbuf[40];

  memset(tmpbuf,'\0',sizeof(tmpbuf));
  memset(delimeter,'\0',sizeof(delimeter));
  memset(localrec,'\0',sizeof(localrec));

  sprintf(delimeter,"%c",delimin);

  for(a=0;rec[a]!='\0';a++) {
    if(a != 0) {
      if(rec[a-1]==delimin && rec[a]==delimin) {
        sprintf(localrec,"%s ",localrec);
      }
    }
    sprintf(localrec,"%s%c",localrec,rec[a]);
  }

  chPtr1 = localrec;
  chPtr1 = strtok((char*)localrec,delimeter);
  for(a=0;a<fieldNum;a++) {
    if(chPtr1+1==delimeter) { a++; }
    chPtr1 = strtok(NULL,delimeter);
  }
  if(chPtr1==NULL)
    return("NULL");
  return(chPtr1);
}

 
cpa802 Commented:
I don't know your situation, but I think your best bet would be to translate the ASCII file, which seems to contain only indices back into the same file, into a binary file.
That would make the file smaller (MUCH smaller). Then you could read the whole file into memory and scan through it. You wouldn't need to do any seeks, only calculate an offset into an array.
I know that's somewhat general, but it's the best I can do without knowing what the file contains.
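
The "read the whole file into memory" idea could look roughly like the sketch below. It assumes the file fits comfortably in RAM (about 5 MB here) and that records end with a newline. The names load_file and record_length are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: load the whole file into a malloc'd buffer that is
   NUL-terminated. Returns NULL on failure; the caller frees the result. */
char *load_file(const char *path, long *size_out)
{
    FILE *fp = fopen(path, "rb");
    char *data = NULL;
    long size;

    if (fp == NULL)
        return NULL;

    /* Find the file size by seeking to the end. */
    if (fseek(fp, 0L, SEEK_END) != 0 || (size = ftell(fp)) < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    data = malloc((size_t)size + 1);
    if (data != NULL && fread(data, 1, (size_t)size, fp) != (size_t)size) {
        free(data);
        data = NULL;
    }
    if (data != NULL) {
        data[size] = '\0';
        *size_out = size;
    }
    fclose(fp);
    return data;
}

/* Length of the record that starts at offset, excluding the newline.
   Returns 0 if the offset is outside the buffer. */
size_t record_length(const char *data, long size, long offset)
{
    const char *start, *nl;

    if (offset < 0 || offset >= size)
        return 0;
    start = data + offset;
    nl = memchr(start, '\n', (size_t)(size - offset));
    return nl != NULL ? (size_t)(nl - start) : (size_t)(size - offset);
}
```

A record is then just data + offset, with no seek at all. Whether this beats 200 seeks depends on how often the file is loaded, since every process reads the full file once.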
If the file contains data other than offsets back into the same file, you might want to consider an index file. Keep the data in one file with no offset information, just null-terminated strings. Then have a second, binary file that contains only offsets into the data file. You can read the index file into memory and then do one seek to the data you want in the data file.
Again, I don't know what the data file contains.
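
A separate binary index file, as described above, might be sketched like this. The on-disk layout (a plain array of long values) is an assumption for illustration and is only readable on the same kind of machine that wrote it. The names save_index and load_index are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write n offsets to a binary index file.
   Returns 0 on success, -1 on failure. */
int save_index(const char *path, const long *offsets, size_t n)
{
    FILE *fp = fopen(path, "wb");
    int ok;

    if (fp == NULL)
        return -1;
    ok = (fwrite(offsets, sizeof offsets[0], n, fp) == n);
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: load the whole index into memory.
   Returns NULL on failure; the caller frees the result. */
long *load_index(const char *path, size_t *n_out)
{
    FILE *fp = fopen(path, "rb");
    long *offsets = NULL;
    long bytes;
    size_t n;

    if (fp == NULL)
        return NULL;
    if (fseek(fp, 0L, SEEK_END) != 0 || (bytes = ftell(fp)) < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    n = (size_t)bytes / sizeof *offsets;
    offsets = malloc(n > 0 ? n * sizeof *offsets : 1);
    if (offsets != NULL && fread(offsets, sizeof *offsets, n, fp) != n) {
        free(offsets);
        offsets = NULL;
    }
    if (offsets != NULL)
        *n_out = n;
    fclose(fp);
    return offsets;
}
```

With the index in memory, fetching record k becomes one fseek to offsets[k] in the data file followed by one read.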
 
rbr Commented:
I think most of the time is spent in this function.
Remove all the memset calls. You do not need them in this function. You only need
localrec[0]='\0';

Replace
sprintf(delimeter,"%c",delimin);
with
delimeter[0]= delimin;
delimeter[1]= '\0';
because sprintf is slow.

sprintf(localrec,"%s ",localrec); could be replaced by
strcat(localrec," ");

sprintf(localrec,"%s%c",localrec,rec[a]); could be replaced by
strncat (localrec,&rec[a],1);
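
Two further points about the posted read_record: calling sprintf with localrec as both source and destination is undefined behavior in C, and the function returns a pointer into localrec, a local array that no longer exists after the return. As a rough sketch of an alternative (not the author's code), the field can be located with strchr and copied into a caller-supplied buffer, without copying the whole record or using strtok. The name get_field and its signature are hypothetical, and delim is assumed not to be '\0'.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical replacement for read_record: copy field number field_num
   (0-based) of rec into out. Returns 0 on success, -1 if the record has
   fewer fields. Empty fields yield an empty string. */
int get_field(const char *rec, int field_num, char delim,
              char *out, size_t out_size)
{
    const char *start = rec;
    const char *end;
    size_t len;

    if (out_size == 0 || field_num < 0)
        return -1;

    /* Skip field_num delimiters to reach the start of the wanted field. */
    while (field_num-- > 0) {
        start = strchr(start, delim);
        if (start == NULL)
            return -1;
        start++;
    }

    /* The field ends at the next delimiter or at the end of the string. */
    end = strchr(start, delim);
    len = (end != NULL) ? (size_t)(end - start) : strlen(start);
    if (len >= out_size)
        len = out_size - 1;              /* truncate rather than overflow */
    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

A call such as get_field(inBuff, i, ',', field, sizeof field) followed by atoi(field) would stand in for atoi(read_record(inBuff,i,',')) in the posted loop, where field is a caller-owned char array.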

       
 
           
 
rbr Commented:
I don't understand why your read_record is so complicated. Can you post the contents of inBuff and what you want to do with it? You loop over the whole buffer every time. This can be made faster.
 
checkin (Author) Commented:
Basically, any string that I have, for example:

field1,field2,field3,field4

I made this function so that I could pass it any delimited string, the number of the field I wanted to retrieve, and the delimiter used. So in the above example, to retrieve field3 I would call it like this:

read_record(string,2,',')

Marvin.
 
rbr Commented:
Change

for(i=1;i<=posCounter;i++) {
  tmpPosition = atoi(read_record(inBuff,i,','));
  fseek(idxFile,tmpPosition, 0);
  fgets(tmpStr,sizeof(tmpStr),idxFile);
}

to
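char *p = strchr(inBuff, ',');   /* skip field 0; the loop starts at field 1 */

for(i=1; i<=posCounter && p!=NULL; i++) {
  tmpPosition = atoi(p+1);       /* atoi stops at the next ',' */
  fseek(idxFile, tmpPosition, 0);
  fgets(tmpStr, sizeof(tmpStr), idxFile);
  p = strchr(p+1, ',');          /* step to the next field instead of rescanning */
}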

Why don't you use the first field?

