• C

how to open files in loops

Greetings,
The attached code snippet is a loop from a program I'm working on. To summarize:
- environment is Windows 2000.
- the value "ID" is read from a comma separated value (csv) input file
- it is used as input to program.exe
- program.exe is a program which takes ID and outputs name and dept (in name,dept format).
- output of program.exe is piped to another csv file (csvfile.csv)
- csvfile.csv is opened, the first line is read and parsed by comma to get name and dept
- csvfile.csv is closed
- repeat for next line in input file

The code works fine if the input file does not have too many lines. But when it grows large (>200 lines)  the program crashes at line 200 (as indicated by the loop counter ii) with an error stating that the file csvfile.csv can not be opened (the function "fileopen" in the snippet is just the ANSI fopen command with error checking added).

I know there is a limit on number of files that can be opened in a C program, but since I'm closing the file shortly after opening it I didn't think it would be a problem.

That being the case, why would the file not be opened after a certain point?

Thank you.

strcpy(command, "Program.exe ");
for (ii=0; fgets(line1, BUFFERSIZE, fp_input) != NULL ;ii++){
	printf("\nIndex:%d\n", ii+1);
 
	strcpy(ID, strtok(line1, ","));
	strcat(command, ID);
	strcat(command, " > csvfile.csv");
	system(command);
	strcpy(command, "Program.exe "); //
 
	fp_textfile = fileopen("csvfile.csv", "r");
		fgets(line2, BUFFERSIZE, fp_textfile);
		strcpy(name, strtok(line2, ","));
		strcpy(dept, strtok(NULL, ","));
	fclose(fp_textfile);
}

Open in new window

TAI-Asked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

evilrixSenior Software Engineer (Avast)Commented:
Ok, off the top of my head, I'm just wondering if what's happening is that the OS is caching the data being written by Program.exe and even though that has finished and the system() function has returned it may be the OS still has a lock placed on the file that prevents it from being opened whilst the cache is being flushed to disk.

This is just a hypothisis :)
0
evilrixSenior Software Engineer (Avast)Commented:
BTW: Can't you close fp_textfile as soon as the fgets has completed?
fp_textfile = fileopen("csvfile.csv", "r");
fgets(line2, BUFFERSIZE, fp_textfile);
fclose(fp_textfile);
 
strcpy(name, strtok(line2, ","));
strcpy(dept, strtok(NULL, ","));

Open in new window

0
Infinity08Commented:
What happens if you add a short pause between the system(command) and the fileopen ? Giving enough time to properly close the file before it's opened again.


Also, does this always happen at the 200-th line ? If so, then check that line, and see if there's something wrong with it. Try running Program.exe manually with the ID from the problematic line, and see if the output is correct.
0
The Ultimate Tool Kit for Technolgy Solution Provi

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy for valuable how-to assets including sample agreements, checklists, flowcharts, and more!

evilrixSenior Software Engineer (Avast)Commented:
The code below works with no problem for me. It's not completely the same as what you have but it does, ostensibly, rule out the possibility that it is the number of files being opened.

This would indicate to me that the problem is possibly is a timing issue as I've stated above. As I8 has suggested, you might want to try introducing an artificial pause for each loop just to see if it resolves the issue. If it does then at least we can be confident this is the problem. If it is I'm not really sure what you can do to get around it, except for the pause. Not really idea. Do you have the source code for Program.exe? If so maybe you could build that into your program?

-Rx.
#include <stdio.h>
#include <assert.h>
 
void test(char const * mode)
{
	printf("Testing with mode %s\n", mode);
	for(int cnt = 0 ; cnt < 5000 ; ++cnt)
	{
		FILE * fp = fopen("c:\\temp\\data.txt", mode);
		assert(fp);
		fclose(fp);
	}
	printf("Done\n", mode);
}
 
int main()
{
	test("r");
	test("w");
	test("a");
	test("r+");
	test("w+");
	test("a+");
 
	return 0;
}

Open in new window

0
TAI-Author Commented:
I've tried the following:
- Copied the known working lines from the beginning of input to lines after 190 (it is not consistently always line 200).
- I added the line:  if (ii>190) system("pause"); right after the system call to program.exe. Program.exe usually takes 1 second to finish. I manually waited about 5 seconds after each pause before letting the program continue.
- I added fflush(fp_textfile); to right before fclose();
- I moved the fflush-fclose duo to right after fgets.

Nevertheless the program still crashed in the 190-200 range.

I can still redirect the output of Program.exe with double pipe ,i.e., Program.exe >> csvfile.csv,  to get one large csvfile.csv file instead of the single line one I get now. Then process this file line by line  in another loop. But this would require me to convert some single variables in the original loop to arrays, and I want to avoid this.

0
evilrixSenior Software Engineer (Avast)Commented:
What about if you create all the files first, then process them and then, if necessary, delete them? You'd have to ensure each file was a unique name of course. cvsfile1,.cvs, cvsfile2.cvs ...

Again, not ideal as you'll end up with a lot of small files but at this time I think your options are limited.
0
Infinity08Commented:
>> Nevertheless the program still crashed in the 190-200 range.

I performed a test myself with these three programs :

        1) gentestfile.c : generates the input file with 500 records
        2) Program.c : creates a simple Program.exe that emulates its behavior
        3) testprog.c : is the actual application that will run ...

I've run it several times, and never had it crash like you said. I even tried increasing the size of the input file to 2000 (took 2000 seconds to process lol).

Can you point out whether I missed something obvious that is different in your setup ? Can you try this same test with this same code ?


/* ---- gentestfile.c ---- */
 
#include <stdio.h>
 
int main(void) {
  FILE *fp = fopen("test_in.txt", "w");
  int i = 0;
  for (i = 0; i < 500; ++i) {
    fprintf(fp, "%04d,blaaa\n", i + 1);
  }
  fclose(fp);
  return 0;
}
 
 
/* ---- Program.c ---- */
 
#include <stdio.h>
#include <windows.h>
 
int main(int argc, char *argv[]) {
  if (argc <= 1) {
    fprintf(stderr, "Program : ERROR : not enough arguments !!\n");
    return 1;
  }
  
  fprintf(stderr, "Program : getting info for ID %s ...\n", argv[1]);
  Sleep(1000); /* artificial delay ... */
  fprintf(stdout, "name%s,dept%s,blabla\n", argv[1], argv[1]);
  
  return 0;
}
 
 
/* ---- testprog.c ---- */
 
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
 
#define BUFFERSIZE 128
 
FILE *fileopen(const char *filename, const char *mode) {
  FILE *fp = fopen(filename, mode);
  if (!fp) {
    printf("ERROR : could not open file !!!\n");
    exit(1);
  }
  return fp;
}
 
int main(void) {
  char command[BUFFERSIZE] = { 0 };
  int ii = 0;
  char line1[BUFFERSIZE] = { 0 };
  char line2[BUFFERSIZE] = { 0 };
  char ID[BUFFERSIZE] = { 0 };
  char name[BUFFERSIZE] = { 0 };
  char dept[BUFFERSIZE] = { 0 };
 
  FILE *fp_input = fileopen("test_in.txt", "r");
  FILE *fp_textfile = 0;
 
  strcpy(command, "Program.exe ");
  for (ii=0; fgets(line1, BUFFERSIZE, fp_input) != NULL ;ii++){
        printf("Index:%d\n", ii+1);
 
        strcpy(ID, strtok(line1, ","));
        strcat(command, ID);
        strcat(command, " > csvfile.csv");
        system(command);
        strcpy(command, "Program.exe ");
 
        fp_textfile = fileopen("csvfile.csv", "r");
        fgets(line2, BUFFERSIZE, fp_textfile);
        strcpy(name, strtok(line2, ","));
        strcpy(dept, strtok(NULL, ","));
        fclose(fp_textfile);
        printf("  -> found name \"%s\" and dept \"%s\" for ID \"%s\"\n", name, dept, ID);
  }
  fclose(fp_input);
 
  return 0;
}

Open in new window

0
TAI-Author Commented:
Ok, here is the progress report :)

Both evilrx's code and Infinity's testprog.c executed on my system without errors. So I thought it's time to attach the whole code instead of  the section I think is faulty.

The input values are in order the input file name, output file name, and an .ini file name.

You'll notice a ReadIni function. It's a function I wrote and is in a separate c file. The code for this is also supplied in the snippet after main prog. It is basically Windows' GetPrivateProfileString function with a few behavioural differences. Given a Key, it reads the corresponding Value from a .ini file. But surprise! It has to open and close this .ini file. In main it is called in the loop twice in succession and it may well be the problem.

During my trials
-If I commented out one ReadIni 500+ lines processed without problems (but crashed before 600)
-If I commented out both ReadIni's 1500+ lines processed without problems

The .ini file is your standard ini file in the form of
[PROJECTS]
ID1=Project5
ID2=Project8
etc...
[DEPTS]
Dept1=Accounting
Dept2=Purchasing
etc...
#include <stdio.h>
#include <taicustom.h>
 
#define BUFFERSIZE 1024
 
int main (int argc, char **argv){
	
	int ii=0;
	char line1[BUFFERSIZE], line2[BUFFERSIZE];
	char ID[12], openno[12], avgtime[12], tottime[12], name[128];
         char deptno[12], deptname[128], project[16];
	char inifile[2048], command[2048];
 
	FILE *fp_input, *fp_output, *fp_csvfile;
	
	fp_input = _fileopen(argv[1], "r");
	fp_output= _fileopen(argv[2], "a");
	strcpy(inifile, argv[3]);
	strcpy(command, "Program.exe ");
 
	//Print header line to output
	fprintf(fp_output, "ID, name, deptno, deptname, project, openno, avgtime, tottime\n");
	
	//Process inout line by line
	for (ii=0; fgets(line1, BUFFERSIZE, fp_input) !=NULL ;ii++){
		printf("\nIndex:%d\n", ii+1);
 
		//Parse input file line
		strcpy(ID		, strtok(line1, ","));
		strcpy(openno	, strtok(NULL, ","));
		strcpy(avgtime	, strtok(NULL, ","));
		strcpy(tottime	, strtok(NULL, ","));
 
		//Run Program.exe to write name and dept no to csv
		strcat(command, ID);
		strcat(command, " > csvfile.csv");
		system(command);
		strcpy(command, "Program.exe ");
		
		//Open csv and read line
		fp_csvfile = _fileopen("csvfile.csv", "r");
		fgets(line2, BUFFERSIZE, fp_csvfile);
		fflush(fp_csvfile);
		fclose(fp_csvfile);
		
		//Parse csv line to get name and dept no
		strcpy(name	, strtok(line2, ","));
		
		if ( strcmp(name,"No user found.\n") == 0 ){
			name[strlen(name)-1]= '\0'; //erase eol character
			strcpy(deptno,"No dept number.");
		}
		else {
			strcpy(deptno, strtok(NULL, ","));
			deptno[strlen(deptno)-1]= '\0'; //erase eol character
		}
	
		//from ini file: using ID get project , using deptno get deptname
		ReadIni("PROJECTS", ID, "no default", BUFFERSIZE, inifile, project);
		ReadIni("DEPTS", deptno, "no default", BUFFERSIZE, inifile, deptname);
		
		//print everything to output(another csv file)
		fprintf(fp_output, "%s,%s,%s,%s, %s,%s,%s,%s\n", ID, name, deptno, deptname, project, openno, avgtime, tottime);
 
	}
 
	fclose(fp_output);
	fclose(fp_input);
	
	return 0;
 
}
 
 
==========================READINI FUNCTION============================
int ReadIni(char *SectionName, char *KeyName, char *DefaultKeyValue, int BufferSize, char *SettingsFile, char *ReturnValue){
	
	int ii, jj;
	FILE *fp_SettingsFile;
	char FileBuffer[BufferSize], SectionBuffer[BufferSize];
	char *SectionNameBraced, *Key, *Value;
 
	*ReturnValue = '\0';
	fp_SettingsFile 	= (FILE *)_fileopen(SettingsFile, "r");
	SectionNameBraced 	= _memalloc(BufferSize);
	Key					= _memalloc(BufferSize);
	Value				= _memalloc(BufferSize);
 
	SectionNameBraced = _concat("[", SectionName, "]", NULL);
	
	for (ii=0; fgets(FileBuffer, BufferSize, fp_SettingsFile) != NULL; ii++){
 
		//Skip lines beginning with # (comment lines)
		if (FileBuffer[0] == '#')
			continue;
			
		if ( strncmp(FileBuffer, SectionNameBraced, strlen(SectionNameBraced)) == 0 ) {
			
			//Section reached, start copying lines to section buffer
			for (jj=0; fgets(SectionBuffer, BufferSize, fp_SettingsFile) != NULL; jj++){
				
				if (SectionBuffer[0] == '#')
					continue;
				
				//if no keyname given return all lines until next section or EOF
				if ( KeyName == '\0' ) {
					Key = strtok(SectionBuffer, "=");
					if ( *Key != '['){
						strcat(ReturnValue, Key);
						continue;
					}
					else {
						break;
					}
				}
				//if keyname given search for corresponding value
				else  {
					 if( strstr(SectionBuffer, KeyName) != NULL) {	
						Key = strtok(SectionBuffer, "=");
						Value = strtok(NULL,  "\n");
						//if no value specified return default key
						if ( strcmp(Value, "\n") == 0 ){
							strcpy(ReturnValue, DefaultKeyValue);
							return 0;
						}
						else {
							strcpy(ReturnValue, Value);
							return 0;
						}
					}
				}
			}
		}
	}
	
	fflush(fp_SettingsFile);
	fclose(fp_SettingsFile);
 
	return 0;
}

Open in new window

0
TAI-Author Commented:
Oh, I should mention  I added pause statements around the ReadIni's to test for time issues but still crahshed at "expected" locations.
0
evilrixSenior Software Engineer (Avast)Commented:
TAI-

I'm not sure how this code could ever compile. I spotted the issue below and decided to check with you first.

int ReadIni(char *SectionName, char *KeyName, char *DefaultKeyValue, int BufferSize, char *SettingsFile, char *ReturnValue){
      ...
      char FileBuffer[BufferSize], SectionBuffer[BufferSize];

You can't use BufferSize to sie an array as it is not a known constant at compile time. Did you post the correct code?

NB. I didn't look at anything after spotting this.
0
evilrixSenior Software Engineer (Avast)Commented:
Do these need to be freed?

      SectionNameBraced       = _memalloc(BufferSize);
      Key                  = _memalloc(BufferSize);
      Value                  = _memalloc(BufferSize);

I can't see where this might happen.
0
evilrixSenior Software Engineer (Avast)Commented:
I've had a quick glance through the code and although there are a number of things I'd change to make it more robust I can't see any obvious reason for your problem --yet!

One simple way to try and track these kind of non-obvious issues down is to use the divide and conquer method. Basically, comment out as much of the code as you can and then start uncommenting logical chunks until the problem returns. Once you've managed to isolate a logical chunk that fails perform the same comment out and then start uncommenting on this specific chunk (but with finer granularity) until; hopefully, the problem will jump out at you. It doesn't always work but I've managed to track down some really nasty defects using this simple method, usually in a pretty quick time-scale. Think of it as binary chopping your code for the defect.

-Rx.
0
evilrixSenior Software Engineer (Avast)Commented:
BTW: I take it you are confident your input data is well formed? Otherwise, lines like the following are a crash waiting to happen...

strcpy(ID            , strtok(line1, ","));

strtok: -

Return Value
A pointer to the last token found in string.
A null pointer is returned if there are no tokens left to retrieve.

strcpy will likely crash if passed NULL for either parameter.

As a rule, assume all input is evil and validate it before use!
0
Infinity08Commented:
>> I can't see any obvious reason for your problem --yet!

If allocated memory is not freed like evilrix pointed out, then that might maybe be the cause of this behavior.
0
Infinity08Commented:
Also note that in your ReadIni function, you have a few locations where it returns WITHOUT closing the file !!!!

Fix that, and you'll likely see the problem go away ;)
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
evilrixSenior Software Engineer (Avast)Commented:
>> you have a few locations where it returns WITHOUT closing the file !!!!
Yes, that would probably explain it :)
0
evilrixSenior Software Engineer (Avast)Commented:
In ReadIni(), try changing...

if ( strcmp(Value, "\n") == 0 ){
      strcpy(ReturnValue, DefaultKeyValue);
      return 0;
}
else {
      strcpy(ReturnValue, Value);
      return 0;

to


if ( strcmp(Value, "\n") == 0 ){
      strcpy(ReturnValue, DefaultKeyValue);
      break;
}
else {
      strcpy(ReturnValue, Value);
      break;


And don't forget to free that heap allocated memory on exit either!
0
Infinity08Commented:
Just replacing the returns with breaks won't help, as there's a double loop around them (it can only break out of one loop).

So, you'll either have to introduce a "stop flag", or duplicate the file close code.
0
evilrixSenior Software Engineer (Avast)Commented:
>> Just replacing the returns with breaks won't help, as there's a double loop
Oops, so there is.
That's what I get for trying to debug my own code and review someone else's at the same time :(

0
TAI-Author Commented:
It indeed was the return statements that didn't close the file before returning. I added the close statements (as well as the memory free statements) and the code executed well beyond 1000 lines.

Other notes:
- The main reason for writing ReadIni was the initial part, i.e, send a NULL Key, get all Keys under a certain section. I never until now needed to use the latter part of the function and had done only minor testing of that section - none involving opening the ini file hundreds of times.

- The char FileBuffer[BufferSize], SectionBuffer[BufferSize]; compiling surprises me also. When I compile the same code on UNIX it immediately returns this as an error. This line also looks like an error waiting to happen, I might as well change this line too while I'm at it.

- I'm assuming for now the input is always correct. I wanted to get an initial version up and running before going into error checking detail. I can say now that the initial version is up and running. :)

Thank you both for your effort

Thank you both.
0
evilrixSenior Software Engineer (Avast)Commented:
You are very welcome TAI.
0
Infinity08Commented:
>> It indeed was the return statements that didn't close the file before returning. I added the close statements (as well as the memory free statements) and the code executed well beyond 1000 lines.

Nice :)


>> The char FileBuffer[BufferSize], SectionBuffer[BufferSize]; compiling surprises me also.

I think compiler optimization is the key here, because it ultimately refers to :

        #define BUFFERSIZE 1024

which is a compile time constant.


>> This line also looks like an error waiting to happen, I might as well change this line too while I'm at it.

I think that's best yes :)


>> Thank you both for your effort

No problem. Always glad to be of assistance !
0
evilrixSenior Software Engineer (Avast)Commented:
>> I think compiler optimization is the key here, because it ultimately refers to :
Frankly, it just should not compile! What compiler are you using?
0
Infinity08Commented:
>> Frankly, it just should not compile!

I agree.

But my guess is that the compiler removed the BufferSize argument from the function, and used BUFFERSIZE directly internally. So, I think it really interpreted the code something like this :

    int ReadIni(char *SectionName, char *KeyName, char *DefaultKeyValue, char *SettingsFile, char *ReturnValue){

        /* <SNIP> */

      char FileBuffer[BUFFERSIZE], SectionBuffer[BUFFERSIZE];

        /* <SNIP> */

    }

however wrong that is ...
0
TAI-Author Commented:
I'm using PellesC compiler.

But I would expect the compiler to interpret BUFFERSIZE and BufferSize differently due to uppercase and lowercase difference. At least it has for everything else :)

Maybe it has something to do with BufferSize actually being the 4th input parameter of ReadIni. When I change the name BufferSize in the parameter list, it starts giving an "undefined" error to the lower case BufferSize used as string size declaration.




int ReadIni(char *SectionName, char *KeyName, char *DefaultKeyValue, int BufferSize, char *SettingsFile, char *ReturnValue)

Open in new window

0
evilrixSenior Software Engineer (Avast)Commented:
I think you'll find it's probably a compiler bug since it really should not be able to compile :)
0
TAI-Author Commented:

Most likely. Neither cc compiler of UNIX nor MS Visual Studio compiles.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
C

From novice to tech pro — start learning today.