Solved

Problem to generate the file

Posted on 2014-12-17
Last Modified: 2015-01-11
Hi,
when I run the following code
//
#include "stdafx.h"
#include <set>
#include <stdio.h>
#include <string.h>
#include <fstream>
#include <string>
#include <ctype.h>
#include <time.h>
#include <process.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include "..\..\include\nameval2.h"   
#include <iomanip>
#include <algorithm>
using namespace std;
nameval binrec;
int main()
{
    std::set<nameval> records;
    int cnt;
	//
	for (cnt=0;cnt<100000000;cnt++)
	{
		try 
		{
			nameval val={0};
			int j;
			//
			for (j=0;j<20;j++)
			{
				//
				val.fld_nm[j] += (char)(rand () % 26 + ((rand()%2)? 65 : 97));
				//
			}
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			val.fld_val=cnt;
			//
			records.insert(val);
		}
		catch (exception& e)
		{
			cout << e.what() << '\n';
		}
	}
	//
	std::ofstream ostrm("c:\\dp4\\flout.bin", std::ios::binary | std::ios::out );
	if (ostrm.is_open())
	{
		//
		for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
		{
			//
			//
			//
			//
			try
			{
				ostrm.write((char *)&(*it), sizeof(nameval));
			}
			catch (exception& e)
			{
				cout << e.what() << '\n';
			}
		}
	}
	//
	//
    //
	std::wofstream ostrm2("c:\\dp4\\flout.ord", std::ios::out );
	if (ostrm2.is_open())
	{
		//
		for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
		{
			//
			//
			wchar_t wtemp[100] = { L'\0' };
			it->get_uni_nm(wtemp, 100);			
			//
			ostrm2 << "\"" << it->fld_nm << "\" " << it->get_len() << ' '  <<wtemp << ' '<< it->fld_val << '\0' << '\n';
			//
			//
		}
	}
 	//
	return 0;
}


Using VS 2010, it does not generate the expected binary file. Why?
Question by:HuaMinChen

Expert Comment
by:sarabande
using VS 2010, it cannot generate the relevant binary file. why?

can you describe what is going wrong? also post the nameval2.h please. is it still an x64 project or did you switch to win32?

Sara

Author Comment
by:HuaMinChen
Many thanks.
Here is nameval2.h

// nameval.h
#ifndef NAME_VAL_H
#define NAME_VAL_H

struct nameval
{
     char fld_nm[100];
     int    fld_val;

     int  get_len() const 
     {   
         int len = strlen(fld_nm);
         if (len < (int)sizeof(fld_nm))
             return len;
         return (int)sizeof(fld_nm);
     }
     void get_uni_nm(wchar_t nm_uni[], int sizfld) const
     {
         int len = get_len();
         if (len > sizfld)
             len = sizfld;
         mbstowcs(nm_uni, fld_nm, len);
     }
     bool operator< (const nameval & a2) const
     {
           if(strcmp(fld_nm, a2.fld_nm) < 0) return true;
           if(strcmp(fld_nm, a2.fld_nm) > 0) return false;
           if (fld_val < a2.fld_val) return true;
           return false;
     }
};

#endif



and I'm still using the same 64-bit project (from the other thread closed yesterday).

Have a great weekend and Christmas!

Expert Comment
by:sarabande
and what goes wrong? didn't you get a proper file at all or is the problem that you can't read it with the second program?

Sara

p.s. merry Christmas to you.

Author Comment
by:HuaMinChen
It has not touched the file "c:\dp4\..." even after some time, about 35 minutes. I don't know why it cannot generate the files as expected. Many thanks to you.

Expert Comment
by:sarabande
after some time, like 35 minutes


10 million names is a lot, so it might take some more time.

you may add the following statements at the end of the for loop (after the statement with records.insert ...):

if ((cnt+1)%10000 == 0)
{
        std::cout << val.fld_val << " | " << val.fld_nm << std::endl;
}



that should write a message every 10,000 names to the shell window where you can see whether the program still runs.

Sara

Author Comment
by:HuaMinChen
Many many thanks Sara.
I checked, and it is getting slower and slower (see the attachment) while it is creating the 100 million records for the file. Most probably the OS memory is becoming exhausted at that point. Is there a better way to cope with such a big list within a C++ project?
t891.png

Expert Comment
by:sarabande
yes. you may create smaller list files - say 1 million entries per file - and choose a different seed for the rand function in each run by calling srand(<some odd number like 765758513>). you may get an individual number for each run by calling DWORD dwseed = (DWORD)time(NULL); where time(NULL) returns the number of seconds since 1 January 1970.

after you have 10 files (each of them has 1 million sorted names) you need a new program (it can be 32-bit or 64-bit, it doesn't matter) which opens all files and reads the 1st record of each file into an array of 10 records. then you find out which of the 10 keys is the minimum (you can use the < operator of struct nameval for this) and write this minimum nameval struct to a new output file. after that you read one new record from the file where the minimum record came from. you repeat that until all records of all files have been read. you have to handle the case where files have run to eof, but finally you have one big sorted file and memory is not an issue (though it will take some time until all 10 million records are processed and written).

to speed up writing a 10 million records file you may create an empty file first, which you write in bigger chunks - say 1000 empty records at a time. after that you open the existing file again and start to write from the 1st record.

Sara

Expert Comment
by:sarabande
to speed up your current program you could reboot your system, then go into task manager and kill all processes which need a lot of memory (the process list can be sorted by that value). if you don't know whether a process is essential for the system you may google its name.

after that, start your creating program from the command line (not from visual studio, since it would also need much memory).

Sara

Author Comment
by:HuaMinChen
Thanks a lot.

Supposing it is fine to handle files that each have 1 million records, does that mean I then have to combine 10 such files if I expect the final output file to have 10 million records?

Expert Comment
by:sarabande
yes, and because each of them is sorted the final file will also be sorted. the only thing you have to take care of is duplicate keys, which should be avoided.

file1         file2          file3     ....
AAABBaX...    AABAAAx...     AAAAHbd.. ....
AAAfGVV...    AACaJJ...      AAAAiu... ....



in the sample the final file would take names from file3 two times, then from file1, and so on.

Sara

Author Comment
by:HuaMinChen
Many thanks.
It is actually creating a final file with 100 million records. Within the loop below
	for (cnt=0;cnt<100000000;cnt++)
	{
		try 
		{
			...

how can I write each bundle of 10 million records into its own file?

Expert Comment
by:sarabande
100 million keys would require using 64-bit integers everywhere.

with the current programs you should be safe for 40 million, and i would use 40 files.

40*1000000*104 == 4160000000

104 is the sizeof nameval. the file size is below the unsigned 32-bit integer maximum (4294967295), which makes sense because files > 4GB cannot be handled by all programs.

Sara
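
The size arithmetic in the comment above can be checked with a few lines of code. This is only a sketch: `nameval_layout`, `total_bytes` and `fits_in_32bit` are hypothetical names introduced for illustration, and the 104-byte record size assumes a 4-byte int, as in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the thread's nameval layout: 100 chars + one int.
// With a 4-byte int this is 104 bytes, the figure used in the comment above.
struct nameval_layout
{
    char fld_nm[100];
    int  fld_val;
};

// Total bytes for `files` files of `records_per_file` records each,
// computed in 64 bits so the product cannot wrap around.
inline std::uint64_t total_bytes(std::uint64_t files, std::uint64_t records_per_file)
{
    return files * records_per_file * sizeof(nameval_layout);
}

// True when the size can still be expressed as a 32-bit unsigned value.
inline bool fits_in_32bit(std::uint64_t bytes)
{
    return bytes <= std::numeric_limits<std::uint32_t>::max();
}
```

With these helpers, 40 files of 1 million records stay under the 32-bit limit, while 100 files of 1 million records (the 100 million records of the original loop) do not.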

Author Comment
by:HuaMinChen
Thanks.
Is there an example that produces 20 or more files and then combines them into one file of the expected size?

Expert Comment
by:sarabande
std::ofstream files[40];
for (int f = 0; f < 40; ++f)
{
     srand(int)time(NULL));
     std::ostringstream filename;
     filename << "c:\\dp4\\flout" << f << ".bin";
     if (!files[f].open(filename.str().c_str(), std::ios::binary| std::ios::out))
           return GetLastError();  // handle error somehow
     // here add you current code for creating a 1 million names file 
     // but use files[f] as the output file
     ...



Sara

Expert Comment
by:sarabande
the merge program would use std::ifstream and would open all 40 input files and the one output file in advance. then define an array of nameval structs with 40 elements, say 'namesarr'. have a loop from 0 to 39 where you read the first record from each file into the array. then find the minimum key, write namesarr[min_index] to the big output file and read again from files[min_index] to get a new nameval for that slot.

Sara
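
The merge procedure described in the comment above could look roughly like the following. It is a sketch under assumptions, not code from the thread: `read_record` and `merge_sorted_files` are hypothetical helper names, the struct repeats only the parts of nameval2.h that the merge needs, and the vector of streams assumes a C++11 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Mirrors the struct from nameval2.h (only the members the merge needs).
struct nameval
{
    char fld_nm[100];
    int  fld_val;

    bool operator<(const nameval& a2) const
    {
        int c = std::strcmp(fld_nm, a2.fld_nm);
        if (c != 0) return c < 0;
        return fld_val < a2.fld_val;
    }
};

// Reads one record; returns false at end of file or on a short read.
inline bool read_record(std::ifstream& in, nameval& rec)
{
    in.read(reinterpret_cast<char*>(&rec), sizeof(nameval));
    return in.gcount() == static_cast<std::streamsize>(sizeof(nameval));
}

// Merges already-sorted input files into one sorted output file.
// Returns the number of records written, or -1 if a file could not be opened.
inline long long merge_sorted_files(const std::vector<std::string>& inputs,
                                    const std::string& output)
{
    std::vector<std::ifstream> files(inputs.size());
    std::vector<nameval> namesarr(inputs.size());
    std::vector<bool> has_record(inputs.size(), false);

    std::ofstream out(output.c_str(), std::ios::binary | std::ios::out);
    if (!out.is_open()) return -1;

    // Prime the array with the first record of every file.
    for (std::size_t i = 0; i < inputs.size(); ++i)
    {
        files[i].open(inputs[i].c_str(), std::ios::binary | std::ios::in);
        if (!files[i].is_open()) return -1;
        has_record[i] = read_record(files[i], namesarr[i]);
    }

    long long written = 0;
    for (;;)
    {
        // Find the smallest record among the files that are not exhausted.
        std::size_t min_index = inputs.size();
        for (std::size_t i = 0; i < inputs.size(); ++i)
        {
            if (!has_record[i]) continue;
            if (min_index == inputs.size() || namesarr[i] < namesarr[min_index])
                min_index = i;
        }
        if (min_index == inputs.size()) break;   // every file has reached eof

        out.write(reinterpret_cast<const char*>(&namesarr[min_index]), sizeof(nameval));
        ++written;

        // Refill the slot from the file the minimum came from.
        has_record[min_index] = read_record(files[min_index], namesarr[min_index]);
    }
    return written;
}
```

Only one record per input file is held in memory at any time, so memory use does not grow with the total number of records. The linear search for the minimum is fine for 40 files; a heap would only pay off for many more inputs.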

Author Comment
by:HuaMinChen
sorry, I get these
1>------ Rebuild All started: Project: SaveBinaryFile, Configuration: Release x64 ------
1>  stdafx.cpp
1>  SaveBinaryFile.cpp
1>c:\dp4\savebinaryfile\savebinaryfile\..\..\include\nameval2.h(12): warning C4267: 'initializing' : conversion from 'size_t' to 'int', possible loss of data
1>c:\dp4\savebinaryfile\savebinaryfile\..\..\include\nameval2.h(22): warning C4996: 'mbstowcs': This function or variable may be unsafe. Consider using mbstowcs_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stdlib.h(498) : see declaration of 'mbstowcs'
1>SaveBinaryFile.cpp(63): error C2144: syntax error : 'int' should be preceded by ')'
1>SaveBinaryFile.cpp(63): error C2661: 'srand' : no overloaded function takes 0 arguments
1>SaveBinaryFile.cpp(63): error C2059: syntax error : ')'
1>SaveBinaryFile.cpp(63): error C2059: syntax error : ')'
1>SaveBinaryFile.cpp(64): error C2079: 'filename' uses undefined class 'std::basic_ostringstream<_Elem,_Traits,_Alloc>'
1>          with
1>          [
1>              _Elem=char,
1>              _Traits=std::char_traits<char>,
1>              _Alloc=std::allocator<char>
1>          ]
1>SaveBinaryFile.cpp(65): error C2297: '<<' : illegal, right operand has type 'const char [13]'
1>SaveBinaryFile.cpp(65): error C2297: '<<' : illegal, right operand has type 'const char [5]'
1>SaveBinaryFile.cpp(65): warning C4552: '<<' : operator has no effect; expected operator with side-effect
1>SaveBinaryFile.cpp(66): error C2228: left of '.str' must have class/struct/union
1>          type is 'int'
1>SaveBinaryFile.cpp(66): error C2228: left of '.c_str' must have class/struct/union
1>SaveBinaryFile.cpp(67): error C3861: 'GetLastError': identifier not found
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========

with this code
//
#include "stdafx.h"
#include <set>
#include <stdio.h>
#include <string.h>
#include <fstream>
#include <string>
#include <ctype.h>
#include <time.h>
#include <process.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include "..\..\include\nameval2.h"   
#include <iomanip>
#include <algorithm>
using namespace std;
nameval binrec;
int main()
{
    std::set<nameval> records;
    int cnt;
	//
	std::ofstream files[40];
	for (int f = 0; f < 40; ++f)
	{
		 srand(int)time(NULL));
		 std::ostringstream filename;
		 filename << "c:\\dp4\\flout" << f << ".bin";
		 if (!files[f].open(filename.str().c_str(), std::ios::binary| std::ios::out))
			   return GetLastError();  //
		 //
		 //
		for (cnt=0;cnt<100000000;cnt++)
		{
			try 
			{
				nameval val={0};
				int j;
				//
				for (j=0;j<20;j++)
				{
					//
					val.fld_nm[j] += (char)(rand () % 26 + ((rand()%2)? 65 : 97));
					//
				}
				//
				//
				//
				//
				//
				//
				//
				//
				//
				//
				//
				val.fld_val=cnt;
				//
				records.insert(val);
				if ((cnt+1)%10000 == 0)
				{
						std::cout << val.fld_val << " | " << val.fld_nm << std::endl;
				}
			}
			catch (exception& e)
			{
				cout << e.what() << '\n';
			}
		}
	}
	//
	std::ofstream ostrm("c:\\dp4\\flout.bin", std::ios::binary | std::ios::out );
	if (ostrm.is_open())
	{
		//
		for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
		{
			//
			//
			//
			//
			try
			{
				ostrm.write((char *)&(*it), sizeof(nameval));
			}
			catch (exception& e)
			{
				cout << e.what() << '\n';
			}
		}
	}
	//
	//
    //
	std::wofstream ostrm2("c:\\dp4\\flout.ord", std::ios::out );
	if (ostrm2.is_open())
	{
		//
		for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
		{
			//
			//
			wchar_t wtemp[100] = { L'\0' };
			it->get_uni_nm(wtemp, 100);			
			//
			ostrm2 << "\"" << it->fld_nm << "\" " << it->get_len() << ' '  <<wtemp << ' '<< it->fld_val << '\0' << '\n';
			//
			//
		}
	}
 	system("pause>null");
	return 0;
}

Expert Comment
by:sarabande
srand(int)time(NULL));

should be

srand((int)time(NULL));

Sara

Expert Comment
by:sarabande
error C2079: 'filename' uses undefined class 'std::basic_ostringstream

you have to include <sstream>

Sara

Expert Comment
by:sarabande
std::ofstream ostrm("c:\\dp4\\flout.bin", std::ios::binary | std::ios::out );
you have to omit that and use files[f] instead of ostrm.

   std::set<nameval> records;
should be moved into the f loop, as we need an empty std::set for each files[f].

for (cnt=0;cnt<100000000;cnt++)
you should read my comments more thoroughly. each of the files should have a maximum of 1 million records. 100,000,000 would exhaust your memory, and the resulting swapping would slow the program down so much that it may never finish.

Sara

Author Comment
by:HuaMinChen
Many thanks Sara.
I get these
1>------ Rebuild All started: Project: SaveBinaryFile, Configuration: Release x64 ------
1>  stdafx.cpp
1>  SaveBinaryFile.cpp
1>c:\dp4\savebinaryfile\savebinaryfile\..\..\include\nameval2.h(12): warning C4267: 'initializing' : conversion from 'size_t' to 'int', possible loss of data
1>c:\dp4\savebinaryfile\savebinaryfile\..\..\include\nameval2.h(22): warning C4996: 'mbstowcs': This function or variable may be unsafe. Consider using mbstowcs_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stdlib.h(498) : see declaration of 'mbstowcs'
1>SaveBinaryFile.cpp(67): error C2171: '!' : illegal on operands of type 'void'
1>SaveBinaryFile.cpp(67): error C2451: conditional expression of type 'void' is illegal
1>          Expressions of type void cannot be converted to other types
1>SaveBinaryFile.cpp(68): error C3861: 'GetLastError': identifier not found
1>SaveBinaryFile.cpp(117): error C2065: 'f' : undeclared identifier
1>SaveBinaryFile.cpp(117): error C2369: 'files' : redefinition; different subscripts
1>          SaveBinaryFile.cpp(60) : see declaration of 'files'
1>SaveBinaryFile.cpp(117): error C2075: 'files' : array initialization needs curly braces
1>SaveBinaryFile.cpp(118): error C2065: 'f' : undeclared identifier
1>SaveBinaryFile.cpp(118): error C2228: left of '.is_open' must have class/struct/union
1>SaveBinaryFile.cpp(121): error C2065: 'records' : undeclared identifier
1>SaveBinaryFile.cpp(121): error C2228: left of '.begin' must have class/struct/union
1>          type is ''unknown-type''
1>SaveBinaryFile.cpp(121): error C2065: 'records' : undeclared identifier
1>SaveBinaryFile.cpp(121): error C2228: left of '.end' must have class/struct/union
1>          type is ''unknown-type''
1>SaveBinaryFile.cpp(129): error C2065: 'f' : undeclared identifier
1>SaveBinaryFile.cpp(129): error C2228: left of '.write' must have class/struct/union
1>SaveBinaryFile.cpp(146): error C2065: 'records' : undeclared identifier
1>SaveBinaryFile.cpp(146): error C2228: left of '.begin' must have class/struct/union
1>          type is ''unknown-type''
1>SaveBinaryFile.cpp(146): error C2065: 'records' : undeclared identifier
1>SaveBinaryFile.cpp(146): error C2228: left of '.end' must have class/struct/union
1>          type is ''unknown-type''
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========

from this code
//
#include "stdafx.h"
#include <set>
#include <stdio.h>
#include <string.h>
#include <fstream>
#include <string>
#include <ctype.h>
#include <time.h>
#include <process.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include "..\..\include\nameval2.h"   
#include <iomanip>
#include <algorithm>
#include <sstream>
using namespace std;
nameval binrec;
int main()
{
    int cnt;
	//
	std::ofstream files[70];
	for (int f = 0; f < 70; ++f)
	{
	    std::set<nameval> records;
		srand((int)time(NULL));
		std::ostringstream filename;
		filename << "c:\\dp4\\flout" << f << ".bin";
		if (!files[f].open(filename.str().c_str(), std::ios::binary| std::ios::out))
			return GetLastError();  //
		 //
		 //
		for (cnt=0;cnt<1000000;cnt++)
		{
			try 
			{
				nameval val={0};
				int j;
				//
				for (j=0;j<20;j++)
				{
					//
					val.fld_nm[j] += (char)(rand () % 26 + ((rand()%2)? 65 : 97));
					//
				}
				//
				//
				//
				//
				//
				//
				//
				//
				//
				//
				//
				val.fld_val=cnt;
				//
				records.insert(val);
				if ((cnt+1)%10000 == 0)
				{
						std::cout << val.fld_val << " | " << val.fld_nm << std::endl;
				}
			}
			catch (exception& e)
			{
				cout << e.what() << '\n';
			}
		}
	}
	//
	std::ofstream files[f]("c:\\dp4\\flout.bin", std::ios::binary | std::ios::out );
	if (files[f].is_open())
	{
		//
		for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
		{
			//
			//
			//
			//
			try
			{
				files[f].write((char *)&(*it), sizeof(nameval));
			}
			catch (exception& e)
			{
				cout << e.what() << '\n';
			}
		}
	}
	//
	//
    //
	std::wofstream ostrm2("c:\\dp4\\flout.ord", std::ios::out );
	if (ostrm2.is_open())
	{
		//
		for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
		{
			//
			//
			wchar_t wtemp[100] = { L'\0' };
			it->get_uni_nm(wtemp, 100);			
			//
			ostrm2 << "\"" << it->fld_nm << "\" " << it->get_len() << ' '  <<wtemp << ' '<< it->fld_val << '\0' << '\n';
			//
			//
		}
	}
 	system("pause>null");
	return 0;
}

after I made some corrections.

Expert Comment
by:sarabande
error C2171: '!' : illegal on operands of type 'void'
the open member function returns void, so use this instead:

#include <errno.h>
...
files[f].open(filename.str().c_str(), std::ios::binary| std::ios::out);
if (!files[f].is_open()) return errno;

all code from

std::ofstream files[f]("c:\\dp4\\flout.bin", std::ios::binary | std::ios::out );
	if (files[f].is_open())

and the lines below it must be moved into the loop which begins with 'for (int f=0; ...', since we want to run the code that generates a file with 1 million entries 40 times.

Hua, i can only help. i cannot write all the code for you as it is against the rules of ee.

i know you have little experience with visual c++, but most of the errors here are really trivial and i am convinced you would have solved them yourself if you had read the error descriptions thoroughly and checked the statement which caused each error.

for example if the error says that 'f' is not defined, it should be possible for you to see that you have to move some statements into the above for loop where the f was valid.

note, merging 40 files to create one big file is not really necessary for your task if you can increase the virtual memory of your system and start your already working program on a freshly booted system where you have killed all other non-essential processes for a while.

Sara
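
For orientation, one iteration of the per-file loop discussed in this thread might be organised as below. This is a hedged sketch rather than the thread's final program: `chunk_file_name` and `write_sorted_chunk` are hypothetical helpers, the struct repeats only the relevant parts of nameval2.h, and the caller is assumed to have seeded rand() beforehand.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <set>
#include <sstream>
#include <string>

// Mirrors the struct from nameval2.h (only the members used here).
struct nameval
{
    char fld_nm[100];
    int  fld_val;

    bool operator<(const nameval& a2) const
    {
        int c = std::strcmp(fld_nm, a2.fld_nm);
        if (c != 0) return c < 0;
        return fld_val < a2.fld_val;
    }
};

// Builds a name such as "flout7.bin". An int cannot be appended to a
// std::string with '+', so the parts are streamed into an ostringstream.
inline std::string chunk_file_name(const std::string& prefix, int f, const std::string& ext)
{
    std::ostringstream filename;
    filename << prefix << f << ext;
    return filename.str();
}

// Generates `count` random 20-letter names, sorts them through a std::set
// and writes them to `path`. Returns the number of records written,
// or -1 if the file could not be opened.
inline long write_sorted_chunk(const std::string& path, int count)
{
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::out);
    if (!out.is_open()) return -1;

    std::set<nameval> records;               // a fresh, empty set for every file
    for (int cnt = 0; cnt < count; ++cnt)
    {
        nameval val = {};                    // zero-filled, so the name stays terminated
        for (int j = 0; j < 20; ++j)
            val.fld_nm[j] = static_cast<char>(std::rand() % 26 + ((std::rand() % 2) ? 65 : 97));
        val.fld_val = cnt;
        records.insert(val);
    }

    long written = 0;
    for (std::set<nameval>::const_iterator it = records.begin(); it != records.end(); ++it)
    {
        out.write(reinterpret_cast<const char*>(&*it), sizeof(nameval));
        ++written;
    }
    return written;
}
```

A caller would loop over f, build each name with chunk_file_name and pass it to write_sorted_chunk with a count of 1000000. Seeding once before that loop, for example with srand((unsigned)time(NULL)), avoids reusing the same seed when two iterations start within the same second.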

Author Comment
by:HuaMinChen
Many thanks.
I corrected the code and now have this
//
#include "stdafx.h"
#include <set>
#include <stdio.h>
#include <string.h>
#include <fstream>
#include <string>
#include <ctype.h>
#include <time.h>
#include <process.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include "..\..\include\nameval2.h"   
#include <iomanip>
#include <algorithm>
#include <sstream>
#include <errno.h>
using namespace std;
nameval binrec;
int main()
{
    int cnt;
	//
	std::ofstream files[70];
	for (int f = 0; f < 70; ++f)
	{
	    std::set<nameval> records;
		srand((int)time(NULL));
		std::ostringstream filename;
		filename << "c:\\dp4\\flout" << f << ".bin";
		files[f].open(filename.str().c_str(), std::ios::binary| std::ios::out);
		if (!files[f].is_open()) return errno;
		 //
		 //
		for (cnt=0;cnt<1000000;cnt++)
		{
			try 
			{
				nameval val={0};
				int j;
				//
				for (j=0;j<20;j++)
				{
					//
					val.fld_nm[j] += (char)(rand () % 26 + ((rand()%2)? 65 : 97));
					//
				}
				//
				//
				//
				//
				//
				//
				//
				//
				//
				//
				//
				val.fld_val=cnt;
				//
				records.insert(val);
				if ((cnt+1)%10000 == 0)
				{
						std::cout << val.fld_val << " | " << val.fld_nm << std::endl;
				}
			}
			catch (exception& e)
			{
				cout << e.what() << '\n';
			}
		}
		//
		std::ofstream files[f](std::string(std::string("c:\\dp4\\flout")+f)+".bin", std::ios::binary | std::ios::out );
		if (files[f].is_open())
		{
			//
			for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
			{
				//
				//
				//
				//
				try
				{
					files[f].write((char *)&(*it), sizeof(nameval));
				}
				catch (exception& e)
				{
					cout << e.what() << '\n';
				}
			}
		}
		//
		//
		//
		std::wofstream files[f]("c:\\dp4\\flout"+f+".ord", std::ios::out );
		if (files[f].is_open())
		{
			//
			for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
			{
				//
				//
				wchar_t wtemp[100] = { L'\0' };
				it->get_uni_nm(wtemp, 100);			
				//
				ostrm2 << "\"" << it->fld_nm << "\" " << it->get_len() << ' '  <<wtemp << ' '<< it->fld_val << '\0' << '\n';
				//
				//
			}
		}
	}
	system("pause>null");
	return 0;
}

How can I correct the following errors?

1>------ Rebuild All started: Project: SaveBinaryFile, Configuration: Release x64 ------
1>  stdafx.cpp
1>  SaveBinaryFile.cpp
1>c:\dp4\savebinaryfile\savebinaryfile\..\..\include\nameval2.h(12): warning C4267: 'initializing' : conversion from 'size_t' to 'int', possible loss of data
1>c:\dp4\savebinaryfile\savebinaryfile\..\..\include\nameval2.h(22): warning C4996: 'mbstowcs': This function or variable may be unsafe. Consider using mbstowcs_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stdlib.h(498) : see declaration of 'mbstowcs'
1>SaveBinaryFile.cpp(117): error C2057: expected constant expression
1>SaveBinaryFile.cpp(117): error C2466: cannot allocate an array of constant size 0
1>SaveBinaryFile.cpp(117): error C2784: 'std::_Vb_iterator<_Alloc> std::operator +(_Alloc::difference_type,std::_Vb_iterator<_Alloc>)' : could not deduce template argument for 'std::_Vb_iterator<_Alloc>' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\vector(1985) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::_Vb_const_iterator<_Alloc> std::operator +(_Alloc::difference_type,std::_Vb_const_iterator<_Alloc>)' : could not deduce template argument for 'std::_Vb_const_iterator<_Alloc>' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\vector(1878) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::_Vector_iterator<_Myvec> std::operator +(_Vector_iterator<_Myvec>::difference_type,std::_Vector_iterator<_Myvec>)' : could not deduce template argument for 'std::_Vector_iterator<_Myvec>' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\vector(407) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::_Vector_const_iterator<_Myvec> std::operator +(_Vector_const_iterator<_Myvec>::difference_type,std::_Vector_const_iterator<_Myvec>)' : could not deduce template argument for 'std::_Vector_const_iterator<_Myvec>' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\vector(276) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2782: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(std::basic_string<_Elem,_Traits,_Alloc> &&,const _Elem)' : template parameter '_Elem' is ambiguous
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(153) : see declaration of 'std::operator +'
1>          could be 'int'
1>          or       'char'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(std::basic_string<_Elem,_Traits,_Alloc> &&,const _Elem *)' : could not deduce template argument for 'const _Elem *' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(143) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(const _Elem,std::basic_string<_Elem,_Traits,_Alloc> &&)' : could not deduce template argument for 'std::basic_string<_Elem,_Traits,_Alloc> &&' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(133) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(const _Elem *,std::basic_string<_Elem,_Traits,_Alloc> &&)' : could not deduce template argument for 'const _Elem *' from 'std::basic_string<_Elem,_Traits,_Ax>'
1>          with
1>          [
1>              _Elem=char,
1>              _Traits=std::char_traits<char>,
1>              _Ax=std::allocator<char>
1>          ]
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(123) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(std::basic_string<_Elem,_Traits,_Alloc> &&,std::basic_string<_Elem,_Traits,_Alloc> &&)' : could not deduce template argument for 'std::basic_string<_Elem,_Traits,_Alloc> &&' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(109) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(std::basic_string<_Elem,_Traits,_Alloc> &&,const std::basic_string<_Elem,_Traits,_Alloc> &)' : could not deduce template argument for 'const std::basic_string<_Elem,_Traits,_Alloc> &' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(99) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(const std::basic_string<_Elem,_Traits,_Alloc> &,std::basic_string<_Elem,_Traits,_Alloc> &&)' : could not deduce template argument for 'std::basic_string<_Elem,_Traits,_Alloc> &&' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(89) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2782: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(const std::basic_string<_Elem,_Traits,_Alloc> &,const _Elem)' : template parameter '_Elem' is ambiguous
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(75) : see declaration of 'std::operator +'
1>          could be 'int'
1>          or       'char'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(const std::basic_string<_Elem,_Traits,_Alloc> &,const _Elem *)' : could not deduce template argument for 'const _Elem *' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(61) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(const _Elem,const std::basic_string<_Elem,_Traits,_Alloc> &)' : could not deduce template argument for 'const std::basic_string<_Elem,_Traits,_Alloc> &' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(47) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(const _Elem *,const std::basic_string<_Elem,_Traits,_Alloc> &)' : could not deduce template argument for 'const _Elem *' from 'std::basic_string<_Elem,_Traits,_Ax>'
1>          with
1>          [
1>              _Elem=char,
1>              _Traits=std::char_traits<char>,
1>              _Ax=std::allocator<char>
1>          ]
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(33) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(const std::basic_string<_Elem,_Traits,_Alloc> &,const std::basic_string<_Elem,_Traits,_Alloc> &)' : could not deduce template argument for 'const std::basic_string<_Elem,_Traits,_Alloc> &' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(19) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::_String_iterator<_Elem,_Traits,_Alloc> std::operator +(_String_iterator<_Elem,_Traits,_Alloc>::difference_type,std::_String_iterator<_Elem,_Traits,_Alloc>)' : could not deduce template argument for 'std::_String_iterator<_Elem,_Traits,_Alloc>' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xstring(434) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::_String_const_iterator<_Elem,_Traits,_Alloc> std::operator +(_String_const_iterator<_Elem,_Traits,_Alloc>::difference_type,std::_String_const_iterator<_Elem,_Traits,_Alloc>)' : could not deduce template argument for 'std::_String_const_iterator<_Elem,_Traits,_Alloc>' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xstring(293) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::_Array_iterator<_Ty,_Size> std::operator +(_Array_iterator<_Ty,_Size>::difference_type,std::_Array_iterator<_Ty,_Size>)' : could not deduce template argument for 'std::_Array_iterator<_Ty,_Size>' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xutility(2068) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::_Array_const_iterator<_Ty,_Size> std::operator +(_Array_const_iterator<_Ty,_Size>::difference_type,std::_Array_const_iterator<_Ty,_Size>)' : could not deduce template argument for 'std::_Array_const_iterator<_Ty,_Size>' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xutility(1929) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::reverse_iterator<_RanIt> std::operator +(_Diff,const std::reverse_iterator<_RanIt> &)' : could not deduce template argument for 'const std::reverse_iterator<_RanIt> &' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xutility(1323) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::_Revranit<_RanIt,_Base> std::operator +(_Diff,const std::_Revranit<_RanIt,_Base> &)' : could not deduce template argument for 'const std::_Revranit<_RanIt,_Base> &' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xutility(1136) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2676: binary '+' : 'std::basic_string<_Elem,_Traits,_Ax>' does not define this operator or a conversion to a type acceptable to the predefined operator
1>          with
1>          [
1>              _Elem=char,
1>              _Traits=std::char_traits<char>,
1>              _Ax=std::allocator<char>
1>          ]
1>SaveBinaryFile.cpp(117): error C2075: 'files' : array initialization needs curly braces
1>SaveBinaryFile.cpp(142): error C2057: expected constant expression
1>SaveBinaryFile.cpp(142): error C2466: cannot allocate an array of constant size 0
1>SaveBinaryFile.cpp(142): error C2110: '+' : cannot add two pointers
1>SaveBinaryFile.cpp(142): error C2371: 'files' : redefinition; different basic types
1>          SaveBinaryFile.cpp(117) : see declaration of 'files'
1>SaveBinaryFile.cpp(142): error C2075: 'files' : array initialization needs curly braces
1>SaveBinaryFile.cpp(153): error C2065: 'ostrm2' : undeclared identifier
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========

Expert Comment

by:sarabande
std::ofstream files[f](std::string(std::string("c:\\dp4\\flout")+f)+".bin", std::ios::binary | std::ios::out );

This statement is wrong. std::ofstream files is already defined above the f loop, and you can't redefine a variable in C++. That's why I posted code that uses std::ostringstream to build a file name like c:\dp4\floutxx.bin, where xx is the value of the loop variable f, and that calls the open function on files[f] rather than using the constructor to open the file(s).

std::ofstream files[70];
I explained that 40 files is the maximum you should use, because otherwise the merged file would go beyond the 32-bit boundary (4 GB), which would add difficulties for the create program and also for the merge and query programs. 40 million names is already a very large set to search. Why do you want to search 70 million names? Note that if you decrease fld_nm in nameval to 20 bytes, you could even process 70 million keys without any other change to your current programs. But again I am asking: why? Your programs aren't better if you use millions of keys rather than only 100. You were only fighting issues that are not due to the code but come from exceeding the limits of your environment.
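
As a rough check of these numbers, here is a minimal sketch of the arithmetic. It assumes a 104-byte nameval (char fld_nm[100] plus a 4-byte int) and a hypothetical 24-byte variant; the names merged_size, fits_in_32bit and LIMIT_32BIT are made up for illustration and are not part of the posted program.

```cpp
#include <iostream>

// 2^32 bytes (4 GB): the first file size that no longer fits a 32-bit offset.
const unsigned long long LIMIT_32BIT = 4294967296ULL;

// Size in bytes of one merged file holding `records` records of `record_size` bytes.
unsigned long long merged_size(unsigned long long records, unsigned long long record_size)
{
    return records * record_size;
}

// True if every byte offset of such a file can be stored in a 32-bit unsigned integer.
bool fits_in_32bit(unsigned long long records, unsigned long long record_size)
{
    return merged_size(records, record_size) < LIMIT_32BIT;
}

// Prints the three cases discussed in this thread.
void print_sizes()
{
    std::cout << "40 million x 104 bytes: " << merged_size(40000000ULL, 104) << '\n';
    std::cout << "70 million x 104 bytes: " << merged_size(70000000ULL, 104) << '\n';
    std::cout << "70 million x  24 bytes: " << merged_size(70000000ULL, 24) << '\n';
}
```

With 104-byte records, 40 files of 1 million records stay just below the boundary, while 70 files do not.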

Sara

Author Comment

by:HuaMinChen
Thanks. I adjusted it to use 40 files, as shown below.
//
#include "stdafx.h"
#include <set>
#include <stdio.h>
#include <string.h>
#include <fstream>
#include <string>
#include <ctype.h>
#include <time.h>
#include <process.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include "..\..\include\nameval2.h"   
#include <iomanip>
#include <algorithm>
#include <sstream>
#include <errno.h>
using namespace std;
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
nameval binrec;
int main()
{
    int cnt;
	//
	std::ofstream files[40];
	for (int f = 0; f < 40; ++f)
	{
	    std::set<nameval> records;
		srand((int)time(NULL));
		std::ostringstream filename;
		filename << "c:\\dp4\\flout" << f << ".bin";
		files[f].open(filename.str().c_str(), std::ios::binary| std::ios::out);
		if (!files[f].is_open()) return errno;
		 //
		 //
		for (cnt=0;cnt<1000000;cnt++)
		{
			try 
			{
				nameval val={0};
				int j;
				//
				for (j=0;j<20;j++)
				{
					//
					val.fld_nm[j] += (char)(rand () % 26 + ((rand()%2)? 65 : 97));
					//
				}
				//
				//
				//
				//
				//
				//
				//
				//
				//
				//
				//
				val.fld_val=cnt;
				//
				records.insert(val);
				if ((cnt+1)%10000 == 0)
				{
						std::cout << val.fld_val << " | " << val.fld_nm << std::endl;
				}
			}
			catch (exception& e)
			{
				cout << e.what() << '\n';
			}
		}
		//
		std::ofstream files[f](std::string(std::string("c:\\dp4\\flout")+f)+".bin", std::ios::binary | std::ios::out );
		if (files[f].is_open())
		{
			//
			for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
			{
				//
				//
				//
				//
				try
				{
					files[f].write((char *)&(*it), sizeof(nameval));
				}
				catch (exception& e)
				{
					cout << e.what() << '\n';
				}
			}
		}
		//
		//
		//
		std::wofstream files[f]("c:\\dp4\\flout"+f+".ord", std::ios::out );
		if (files[f].is_open())
		{
			//
			for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
			{
				//
				//
				wchar_t wtemp[100] = { L'\0' };
				it->get_uni_nm(wtemp, 100);			
				//
				ostrm2 << "\"" << it->fld_nm << "\" " << it->get_len() << ' '  <<wtemp << ' '<< it->fld_val << '\0' << '\n';
				//
				//
			}
		}
	}
	system("pause>null");
	return 0;
}



Can you please advise how to correct these errors?
1>------ Rebuild All started: Project: SaveBinaryFile, Configuration: Release x64 ------
1>  stdafx.cpp
1>  SaveBinaryFile.cpp
1>c:\dp4\savebinaryfile\savebinaryfile\..\..\include\nameval2.h(12): warning C4267: 'initializing' : conversion from 'size_t' to 'int', possible loss of data
1>c:\dp4\savebinaryfile\savebinaryfile\..\..\include\nameval2.h(22): warning C4996: 'mbstowcs': This function or variable may be unsafe. Consider using mbstowcs_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stdlib.h(498) : see declaration of 'mbstowcs'
1>SaveBinaryFile.cpp(117): error C2057: expected constant expression
1>SaveBinaryFile.cpp(117): error C2466: cannot allocate an array of constant size 0
1>SaveBinaryFile.cpp(117): error C2784: 'std::_Vb_iterator<_Alloc> std::operator +(_Alloc::difference_type,std::_Vb_iterator<_Alloc>)' : could not deduce template argument for 'std::_Vb_iterator<_Alloc>' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\vector(1985) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::_Vb_const_iterator<_Alloc> std::operator +(_Alloc::difference_type,std::_Vb_const_iterator<_Alloc>)' : could not deduce template argument for 'std::_Vb_const_iterator<_Alloc>' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\vector(1878) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::_Vector_iterator<_Myvec> std::operator +(_Vector_iterator<_Myvec>::difference_type,std::_Vector_iterator<_Myvec>)' : could not deduce template argument for 'std::_Vector_iterator<_Myvec>' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\vector(407) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::_Vector_const_iterator<_Myvec> std::operator +(_Vector_const_iterator<_Myvec>::difference_type,std::_Vector_const_iterator<_Myvec>)' : could not deduce template argument for 'std::_Vector_const_iterator<_Myvec>' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\vector(276) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2782: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(std::basic_string<_Elem,_Traits,_Alloc> &&,const _Elem)' : template parameter '_Elem' is ambiguous
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(153) : see declaration of 'std::operator +'
1>          could be 'int'
1>          or       'char'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(std::basic_string<_Elem,_Traits,_Alloc> &&,const _Elem *)' : could not deduce template argument for 'const _Elem *' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(143) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(const _Elem,std::basic_string<_Elem,_Traits,_Alloc> &&)' : could not deduce template argument for 'std::basic_string<_Elem,_Traits,_Alloc> &&' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(133) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(const _Elem *,std::basic_string<_Elem,_Traits,_Alloc> &&)' : could not deduce template argument for 'const _Elem *' from 'std::basic_string<_Elem,_Traits,_Ax>'
1>          with
1>          [
1>              _Elem=char,
1>              _Traits=std::char_traits<char>,
1>              _Ax=std::allocator<char>
1>          ]
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(123) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(std::basic_string<_Elem,_Traits,_Alloc> &&,std::basic_string<_Elem,_Traits,_Alloc> &&)' : could not deduce template argument for 'std::basic_string<_Elem,_Traits,_Alloc> &&' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(109) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(std::basic_string<_Elem,_Traits,_Alloc> &&,const std::basic_string<_Elem,_Traits,_Alloc> &)' : could not deduce template argument for 'const std::basic_string<_Elem,_Traits,_Alloc> &' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(99) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(const std::basic_string<_Elem,_Traits,_Alloc> &,std::basic_string<_Elem,_Traits,_Alloc> &&)' : could not deduce template argument for 'std::basic_string<_Elem,_Traits,_Alloc> &&' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(89) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2782: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(const std::basic_string<_Elem,_Traits,_Alloc> &,const _Elem)' : template parameter '_Elem' is ambiguous
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(75) : see declaration of 'std::operator +'
1>          could be 'int'
1>          or       'char'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(const std::basic_string<_Elem,_Traits,_Alloc> &,const _Elem *)' : could not deduce template argument for 'const _Elem *' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(61) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(const _Elem,const std::basic_string<_Elem,_Traits,_Alloc> &)' : could not deduce template argument for 'const std::basic_string<_Elem,_Traits,_Alloc> &' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(47) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(const _Elem *,const std::basic_string<_Elem,_Traits,_Alloc> &)' : could not deduce template argument for 'const _Elem *' from 'std::basic_string<_Elem,_Traits,_Ax>'
1>          with
1>          [
1>              _Elem=char,
1>              _Traits=std::char_traits<char>,
1>              _Ax=std::allocator<char>
1>          ]
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(33) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::basic_string<_Elem,_Traits,_Alloc> std::operator +(const std::basic_string<_Elem,_Traits,_Alloc> &,const std::basic_string<_Elem,_Traits,_Alloc> &)' : could not deduce template argument for 'const std::basic_string<_Elem,_Traits,_Alloc> &' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string(19) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::_String_iterator<_Elem,_Traits,_Alloc> std::operator +(_String_iterator<_Elem,_Traits,_Alloc>::difference_type,std::_String_iterator<_Elem,_Traits,_Alloc>)' : could not deduce template argument for 'std::_String_iterator<_Elem,_Traits,_Alloc>' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xstring(434) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::_String_const_iterator<_Elem,_Traits,_Alloc> std::operator +(_String_const_iterator<_Elem,_Traits,_Alloc>::difference_type,std::_String_const_iterator<_Elem,_Traits,_Alloc>)' : could not deduce template argument for 'std::_String_const_iterator<_Elem,_Traits,_Alloc>' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xstring(293) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::_Array_iterator<_Ty,_Size> std::operator +(_Array_iterator<_Ty,_Size>::difference_type,std::_Array_iterator<_Ty,_Size>)' : could not deduce template argument for 'std::_Array_iterator<_Ty,_Size>' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xutility(2068) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::_Array_const_iterator<_Ty,_Size> std::operator +(_Array_const_iterator<_Ty,_Size>::difference_type,std::_Array_const_iterator<_Ty,_Size>)' : could not deduce template argument for 'std::_Array_const_iterator<_Ty,_Size>' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xutility(1929) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::reverse_iterator<_RanIt> std::operator +(_Diff,const std::reverse_iterator<_RanIt> &)' : could not deduce template argument for 'const std::reverse_iterator<_RanIt> &' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xutility(1323) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2784: 'std::_Revranit<_RanIt,_Base> std::operator +(_Diff,const std::_Revranit<_RanIt,_Base> &)' : could not deduce template argument for 'const std::_Revranit<_RanIt,_Base> &' from 'int'
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\xutility(1136) : see declaration of 'std::operator +'
1>SaveBinaryFile.cpp(117): error C2676: binary '+' : 'std::basic_string<_Elem,_Traits,_Ax>' does not define this operator or a conversion to a type acceptable to the predefined operator
1>          with
1>          [
1>              _Elem=char,
1>              _Traits=std::char_traits<char>,
1>              _Ax=std::allocator<char>
1>          ]
1>SaveBinaryFile.cpp(117): error C2075: 'files' : array initialization needs curly braces
1>SaveBinaryFile.cpp(142): error C2057: expected constant expression
1>SaveBinaryFile.cpp(142): error C2466: cannot allocate an array of constant size 0
1>SaveBinaryFile.cpp(142): error C2110: '+' : cannot add two pointers
1>SaveBinaryFile.cpp(142): error C2371: 'files' : redefinition; different basic types
1>          SaveBinaryFile.cpp(117) : see declaration of 'files'
1>SaveBinaryFile.cpp(142): error C2075: 'files' : array initialization needs curly braces
1>SaveBinaryFile.cpp(153): error C2065: 'ostrm2' : undeclared identifier
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========

Expert Comment

by:sarabande
Here again is the code you should use at the beginning of the f loop to create the output file:

for (int f = 0; f < 40; ++f)
{
     // seed the rand function with a new start value
     srand((int)time(NULL));
     // create a string stream for output
     std::ostringstream filename;
     // string streams accept strings, integers and any other type for which operator<< is implemented
     filename << "c:\\dp4\\flout" << f << ".bin";
     // std::ofstream::open creates the output file using a file stream that is already constructed;
     // that is necessary since we have an array of 40 file streams which were already default-constructed
     std::string strfilename = filename.str();
     files[f].open(strfilename.c_str(), std::ios::binary| std::ios::out);
     if (!files[f].is_open())
     {
           // error 


The code you posted is not the code that produced these compiler errors. The statement in line 117 cannot be corrected; you have to delete it. It is not necessary if you use the code above at the beginning of the for loop with loop variable f.

If you post code again, please remove all empty lines and all lines that contain only //.

Compile the source again after you have cleaned it up, so that the line numbers in the errors match the code you post.

Sara

Author Comment

by:HuaMinChen
Many thanks, Sara.
I removed the original line 117 and now have this:

//
#include "stdafx.h"
#include <set>
#include <stdio.h>
#include <string.h>
#include <fstream>
#include <string>
#include <ctype.h>
#include <time.h>
#include <process.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include "..\..\include\nameval2.h"   
#include <iomanip>
#include <algorithm>
#include <sstream>
#include <errno.h>
using namespace std;
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
nameval binrec;
int main()
{
    int cnt;
	//
	std::ofstream files[40];
	for (int f = 0; f < 40; ++f)
	{
	    std::set<nameval> records;
		srand((int)time(NULL));
		std::ostringstream filename;
		filename << "c:\\dp4\\flout" << f << ".bin";
		std::string strfilename = filename.str();
		files[f].open(strfilename.c_str(), std::ios::binary| std::ios::out);
		if (!files[f].is_open()) return errno;
		 //
		 //
		for (cnt=0;cnt<1000000;cnt++)
		{
			try 
			{
				nameval val={0};
				int j;
				//
				for (j=0;j<20;j++)
				{
					//
					val.fld_nm[j] += (char)(rand () % 26 + ((rand()%2)? 65 : 97));
					//
				}
				//
				//
				//
				//
				//
				//
				//
				//
				//
				//
				//
				val.fld_val=cnt;
				//
				records.insert(val);
				if ((cnt+1)%10000 == 0)
				{
						std::cout << val.fld_val << " | " << val.fld_nm << std::endl;
				}
			}
			catch (exception& e)
			{
				cout << e.what() << '\n';
			}
		}
		//
		//
		if (files[f].is_open())
		{
			//
			for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
			{
				//
				//
				//
				//
				try
				{
					files[f].write((char *)&(*it), sizeof(nameval));
				}
				catch (exception& e)
				{
					cout << e.what() << '\n';
				}
			}
		}
		//
		//
		//
		std::wofstream files[f]("c:\\dp4\\flout"+f+".ord", std::ios::out );
		if (files[f].is_open())
		{
			//
			for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
			{
				//
				//
				wchar_t wtemp[100] = { L'\0' };
				it->get_uni_nm(wtemp, 100);			
				//
				ostrm2 << "\"" << it->fld_nm << "\" " << it->get_len() << ' '  <<wtemp << ' '<< it->fld_val << '\0' << '\n';
				//
				//
			}
		}
	}
	system("pause>null");
	return 0;
}



How can I correct the above, given these errors?

1>------ Rebuild All started: Project: SaveBinaryFile, Configuration: Release x64 ------
1>  stdafx.cpp
1>  SaveBinaryFile.cpp
1>c:\dp4\savebinaryfile\savebinaryfile\..\..\include\nameval2.h(12): warning C4267: 'initializing' : conversion from 'size_t' to 'int', possible loss of data
1>c:\dp4\savebinaryfile\savebinaryfile\..\..\include\nameval2.h(22): warning C4996: 'mbstowcs': This function or variable may be unsafe. Consider using mbstowcs_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stdlib.h(498) : see declaration of 'mbstowcs'
1>SaveBinaryFile.cpp(143): error C2057: expected constant expression
1>SaveBinaryFile.cpp(143): error C2466: cannot allocate an array of constant size 0
1>SaveBinaryFile.cpp(143): error C2110: '+' : cannot add two pointers
1>SaveBinaryFile.cpp(143): error C2075: 'files' : array initialization needs curly braces
1>SaveBinaryFile.cpp(154): error C2065: 'ostrm2' : undeclared identifier
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========



(I had some old code after each "//" which I removed before posting, and that is why there are extra "//" lines above.)

Expert Comment

by:sarabande
std::wofstream files[f]("c:\\dp4\\flout"+f+".ord", std::ios::out );
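This statement has the same error as the one you removed. You cannot 'add' an integer f to a string literal without converting it to a string first. "c:\\dp4\\flout"+f is pointer arithmetic on the literal, and adding ".ord" to the result then tries to add two pointers, which is what error C2110 reports.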

You may replace it with:

size_t len = strfilename.length(); // strfilename now contains c:\dp4\floutxx.bin
strfilename.resize(len - 3);       // remove the "bin" from the file name
strfilename += "ord";              // now add the new file extension
// std::ios::out is the default for output file streams; a wide stream is used, as in
// your original code, so that wtemp is written as text and not as a pointer value
std::wofstream ostrm2(strfilename.c_str());
if (ostrm2.is_open())
{
     ...
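
The conversion from loop variable to file name can also be wrapped in small helpers. This is only a sketch: make_filename and replace_extension are hypothetical names that do not appear in the posted program, and replace_extension assumes that the last '.' in the name starts the extension.

```cpp
#include <sstream>
#include <string>

// Builds "<prefix><index>.<ext>", for example c:\dp4\flout7.bin.
// The string stream converts the integer to text, which operator+ on a literal cannot do.
std::string make_filename(const std::string& prefix, int index, const std::string& ext)
{
    std::ostringstream name;
    name << prefix << index << '.' << ext;
    return name.str();
}

// Replaces everything after the last '.' with new_ext.
// If the name has no '.', the new extension is appended instead.
std::string replace_extension(const std::string& filename, const std::string& new_ext)
{
    std::string::size_type dot = filename.rfind('.');
    if (dot == std::string::npos)
        return filename + "." + new_ext;
    return filename.substr(0, dot + 1) + new_ext;
}
```

Inside the f loop this would be used as make_filename("c:\\dp4\\flout", f, "bin") for the binary file and replace_extension(strfilename, "ord") for the text file.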



Sara

Author Comment

by:HuaMinChen
Many thanks, Sara.
I want to know whether we must use files of 1 million records each as the base files when handling a very big list. Apart from this, is there any other way to handle a list that may contain billions of records?

Expert Comment

by:sarabande
Actually, your programs could also handle files with a billion names. You would need to change to a 64-bit integer type, for example size_t in an x64 build, wherever you now use an int or unsigned int.

But it is not so easy to get a billion names which have some meaning. Even the Encyclopaedia Britannica has "only" about 40 million words on half a million topics.

So if your goal is to have 1 billion names, you may ask yourself: why not go for 10 billion or 100 billion names?

You should also be aware that the nameval struct still has 80 unused bytes per record. Instead of 104 bytes you could use 24 bytes and so be able to store about 4 times more names (about 160 million) than now.

I think it would be fine if you finish the current project and end up with a working file of 40 million names. That would be a good result; it is not easy to find "exercises" as ambitious as this.

Sara

Author Comment

by:HuaMinChen
Thanks a lot, Sara.
I know it is a problem to handle files with a few million records. What if I need to store a big list of 100 million records in alphabetical order? Is there a better way to divide it into smaller files?

Merry Christmas to you!

Author Comment

by:HuaMinChen
Is there any further help on this? Thanks.

Expert Comment

by:sarabande
Is there a better way to divide it into smaller files?
No. The limitation is the virtual memory needed to create one file. You can merge 40 files, 70 files or even 1000 files without limitation, provided that you use 64-bit integers everywhere. Note that fld_val should also be changed to a 64-bit integer.

The problem with your requirement is not the code, which could handle 100 million keys. The problem is that it will take hours, if not days, to create all the files and finally merge them into the one huge file with 100 million keys.

You would need a lot of experience and patience to make it work, and no one can really help you with that because of the huge dimensions.

So my suggestion is to decrease fld_nm to a size of 21, which leaves room for a terminating zero character, and to change fld_val to type 'long long'. Then you can use 100 files with 1 million keys each and still keep the output file below 4 GB.
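
A minimal sketch of such a compact record follows. The names nameval_small, set_name and total_bytes are assumptions made up for illustration. On common compilers the struct is padded to 32 bytes, but the exact size depends on the compiler's alignment rules, so the code uses sizeof rather than a fixed number.

```cpp
#include <cstring>

// Hypothetical compact variant of nameval:
// 20 name characters plus the terminating zero, and a 64-bit value.
struct nameval_small
{
    char      fld_nm[21];
    long long fld_val;
};

// Copies at most 20 characters of name into rec and always zero-terminates.
void set_name(nameval_small& rec, const char* name)
{
    std::strncpy(rec.fld_nm, name, sizeof(rec.fld_nm) - 1);
    rec.fld_nm[sizeof(rec.fld_nm) - 1] = '\0';
}

// Total number of bytes needed to store `count` records in one file.
unsigned long long total_bytes(unsigned long long count)
{
    return count * sizeof(nameval_small);
}
```

With a 32-byte record, total_bytes(100000000ULL) is 3.2 billion bytes, which is below the 4 GB boundary.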

Sara

Author Comment

by:HuaMinChen
Thanks a lot, Sara.

When handling a very big list, is it fine to use "size_t" for fld_val? And the point is that we expect to keep only 1 million records in each "basic" file, and there won't be any other mechanism to combine all these "basic" files for further processing, right?

Expert Comment

by:sarabande
is it fine that we use "size_t" to fld_val?
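size_t is of variable size, the same as int: its width depends on the platform you build for. In Visual C++ it is a 64-bit unsigned integer (like 'unsigned long long' or 'unsigned __int64') in an x64 build and a 32-bit unsigned integer in a Win32 build. You should verify this by viewing the expression 'sizeof(size_t)' in the debugger.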

fld_val should be turned from 32-bit int to 64-bit int if you intend to use integers greater than 2.1 billion.
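
Instead of the debugger, the widths can also be printed by the program itself. This is a small sketch; bits_of and print_widths are hypothetical helper names.

```cpp
#include <climits>
#include <cstddef>
#include <iostream>

// Number of bits of type T on the platform the program was built for.
template <typename T>
int bits_of()
{
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// Prints the widths of the integer types discussed here.
void print_widths()
{
    std::cout << "int:       " << bits_of<int>() << " bits\n";
    std::cout << "size_t:    " << bits_of<std::size_t>() << " bits\n";
    std::cout << "long long: " << bits_of<long long>() << " bits\n";
}
```

The output for size_t differs between a Win32 and an x64 build of the same source, while long long is at least 64 bits in both.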

So, as long as you "only" have 100 million keys, fld_val could be an 'int'. However, your program should use a counter which is initialized above all loops and keeps counting across the iterations of the outer for loop:

int val_cnt = 1;   // used for fld_val
for (int f = 0; f < 40; ++f ) 
{
     ...
     for (cnt=0;cnt<1000000;cnt++, ++val_cnt) // increments both cnt and val_cnt
     {
          ...
          val.fld_val=val_cnt;  // now the fld_val is unique for all files.



To combine the files into one, you do not necessarily need a separate program. You may do the merge after all files have been created.

You now have
 
           ...
	}
	system("pause>null");
	return 0;



at the end of main. Replace that with:

             files[f].close();  // that closes the output file 
	}

        std::ifstream inputfiles[40];
        nameval names[40] = { 0 };
        bool eof_reached[40] = { false };
        int num = 40;   // if an inputfile has run to eof, decrement the num
	for (int f = 0; f < 40; ++f)
	{
		 std::ostringstream filename;
		 filename << "c:\\dp4\\flout" << f << ".bin";
		 if (!inputfiles[f].open(filename.str().c_str(), std::ios::binary| std::ios::in))
			   return -3; // error opening the inputfile. add a better error handling
                 if (!inputfiles.read((char*)&names[f], sizeof(nameval)))
                           return -4; // error reading from file. add a better error handling

        }
        // here, all the input files are open and the first record of each was read into the names array
        // open a new output file flout.bin
        // then start a loop while (num > 0)
        // where you determine the minimum names[m].fld_nm out of all names
        //     for which the eof_reached flag is not set
        // write the minimum record to the output file
        // read a new nameval record from the file the minimum came from
        // if the read fails and the reason is eof
        //     decrement num,
        //     then do either of
        //        set names[m].fld_nm[0] to '~' (which is greater than 'z'), or
        //        set eof_reached[m] to true
        //     close inputfiles[m]
        //     continue the while loop (which should determine the next minimum)
 
        ...	
        system("pause>null");
	return 0;



there won't be any other mechanism to combine all these "basic" files for further processing, right?
Why do you need another mechanism if you already have one that works?

Another mechanism would be to not merge the files and to recode the find program to operate on multiple files instead of one huge file.

Sara

Author Comment

by:HuaMinChen
Many thanks, Sara.
I get these errors:
1>------ Rebuild All started: Project: SaveBinaryFile, Configuration: Release x64 ------
1>  stdafx.cpp
1>  SaveBinaryFile.cpp
1>c:\dev_proj_old\visual studio 2010\projects\savebinaryfile (binary search fine using vector)\savebinaryfile\..\..\include\nameval2.h(12): warning C4267: 'initializing' : conversion from 'size_t' to 'int', possible loss of data
1>c:\dev_proj_old\visual studio 2010\projects\savebinaryfile (binary search fine using vector)\savebinaryfile\..\..\include\nameval2.h(22): warning C4996: 'mbstowcs': This function or variable may be unsafe. Consider using mbstowcs_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stdlib.h(498) : see declaration of 'mbstowcs'
1>SaveBinaryFile.cpp(174): error C2171: '!' : illegal on operands of type 'void'
1>SaveBinaryFile.cpp(174): error C2451: conditional expression of type 'void' is illegal
1>          Expressions of type void cannot be converted to other types
1>SaveBinaryFile.cpp(176): error C2228: left of '.read' must have class/struct/union
1>          type is 'std::ifstream [40]'
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========



with this code:
//
#include "stdafx.h"
#include <set>
#include <stdio.h>
#include <string.h>
#include <fstream>
#include <string>
#include <ctype.h>
#include <time.h>
#include <process.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include "..\..\include\nameval2.h"   
#include <iomanip>
#include <algorithm>
#include <sstream>
#include <errno.h>
using namespace std;
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
nameval binrec;
int main()
{
    int cnt;
	//
	std::ofstream files[40];
	for (int f = 0; f < 40; ++f)
	{
	    std::set<nameval> records;
		srand((int)time(NULL));
		std::ostringstream filename;
		filename << "c:\\dp4\\flout" << f << ".bin";
		std::string strfilename = filename.str();
		files[f].open(strfilename.c_str(), std::ios::binary| std::ios::out);
		if (!files[f].is_open()) return errno;
		 //
		 //
		for (cnt=0;cnt<1000000;cnt++)
		{
			try 
			{
				nameval val={0};
				int j;
				//
				for (j=0;j<20;j++)
				{
					//
					val.fld_nm[j] += (char)(rand () % 26 + ((rand()%2)? 65 : 97));
					//
				}
				//
				//
				//
				//
				//
				//
				//
				//
				//
				//
				//
				val.fld_val=cnt;
				//
				records.insert(val);
				if ((cnt+1)%10000 == 0)
				{
						std::cout << val.fld_val << " | " << val.fld_nm << std::endl;
				}
			}
			catch (exception& e)
			{
				cout << e.what() << '\n';
			}
		}
		//
		//
		if (files[f].is_open())
		{
			//
			for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
			{
				//
				//
				//
				//
				try
				{
					files[f].write((char *)&(*it), sizeof(nameval));
				}
				catch (exception& e)
				{
					cout << e.what() << '\n';
				}
			}
		}
		//
		//
		//
		size_t len = strfilename.length(); //
		strfilename.resize(len-3); //
		strfilename += "ord";  //
		std::ofstream ostrm2(strfilename.c_str());  //
		if (ostrm2.is_open())
		{
			//
			for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
			{
				//
				//
				wchar_t wtemp[100] = { L'\0' };
				it->get_uni_nm(wtemp, 100);			
				//
				ostrm2 << "\"" << it->fld_nm << "\" " << it->get_len() << ' '  <<wtemp << ' '<< it->fld_val << '\0' << '\n';
				//
				//
			}
		}
		files[f].close();  //
	}
        std::ifstream inputfiles[40];
        nameval names[40] = { 0 };
        bool eof_reached[40] = { false };
        int num = 40;   //
	for (int f = 0; f < 40; ++f)
	{
		 std::ostringstream filename;
		 filename << "c:\\dp4\\flout" << f << ".bin";
		 if (!inputfiles[f].open(filename.str().c_str(), std::ios::binary| std::ios::in))
			   return -3; //
                 if (!inputfiles.read((char*)&names[f], sizeof(nameval)))
                           return -4; //
        }
        //
        //
        //
        //
        //
        //
        //
        //
        //
        //
        //
        //
        //
        //
	system("pause>null");
	return 0;
}



and here is the .h file

// nameval.h
#ifndef NAME_VAL_H
#define NAME_VAL_H

struct nameval
{
     char fld_nm[100];
     size_t    fld_val;

     int  get_len() const 
     {   
         int len = strlen(fld_nm);
         if (len < (int)sizeof(fld_nm))
             return len;
         return (int)sizeof(fld_nm);
     }
     void get_uni_nm(wchar_t nm_uni[], int sizfld) const
     {
         int len = get_len();
         if (len > sizfld)
             len = sizfld;
         mbstowcs(nm_uni, fld_nm, len);
     }
     bool operator< (const nameval & a2) const
     {
           if(strcmp(fld_nm, a2.fld_nm) < 0) return true;
           if(strcmp(fld_nm, a2.fld_nm) > 0) return false;
           if (fld_val < a2.fld_val) return true;
           return false;
     }
};

#endif


0
 
LVL 32

Expert Comment

by:sarabande
Comment Utility
use the following:

            
inputfiles[f].open(filename.str().c_str(), std::ios::binary| std::ios::in);
if (!inputfiles[f].is_open())
    return -3; //
if (!inputfiles[f].read((char*)&names[f], sizeof(nameval)))
    return -4; //



note, it was my fault, but i can only post untested code snippets here. you should try to solve simple compile errors yourself.

always check the error messages thoroughly. if you click on the first error you are directed to the following statement:

if (!inputfiles[f].open(filename.str().c_str(), std::ios::binary| std::ios::in))


where the error message is "error C2171: '!' : illegal on operands of type 'void' ". the mistake is that ifstream::open has a void return type, while i assumed it would return a 'bool'.

if you click on the second error the following statement shows up:

if (!inputfiles.read((char*)&names[f], sizeof(nameval)))


and the message is "error C2228: left of '.read' must have class/struct/union "

you see that "left of .read" there is 'inputfiles'. but inputfiles is an array of ifstream while we want to read from a single ifstream element at index 'f'. so, the mistake is that [f] was missing.

ok?

Sara
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
Many thanks Sara.
Does it mean I can further use the mechanism you showed above 2 days ago, to search against multiple files?
0
 
LVL 32

Expert Comment

by:sarabande
Comment Utility
searching against multiple files can be done in two ways. one is to have n files, each sorted on its own. the query then has to search up to n files before it can say whether the key was found or not. a binary search of one file with 1 million keys reads at most 20 records, so with 100 files one search would need to read up to 2000 records.

a 2nd way is to have n files with the keys sorted across all files. that is really one huge file, which would be hard to handle, split into n units. with 100 files of 1 million keys each, a query gets its result by reading at most 26 or 27 records, because the 100 files can be used the same way as one huge file with little to no penalty. however, creating the sorted files requires a merge process, the same as for the huge file.
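
The read counts above can be checked with a small sketch. It assumes a record layout like the one in nameval.h and a file that is already sorted by fld_nm; `max_probes`, `find_record` and the `reads` counter are illustrative names, not code from this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fstream>

// Assumed to mirror the fixed-size record layout of nameval.h in this thread.
struct nameval
{
    char        fld_nm[100];
    std::size_t fld_val;
};

// Worst-case number of records a binary search reads for n sorted keys:
// floor(log2(n)) + 1, computed by repeated halving.
int max_probes(unsigned long n)
{
    int probes = 0;
    while (n > 0) { n /= 2; ++probes; }
    return probes;
}

// Binary search by name in an open binary file of records sorted by fld_nm.
// Returns the record index, or -1 if the name is absent or a read fails.
// 'reads' is a hypothetical out-parameter that counts the records read.
long find_record(std::ifstream& in, const char* name, int* reads)
{
    assert(in.is_open());
    in.clear();
    in.seekg(0, std::ios::end);
    std::streamoff bytes = in.tellg();
    long lo = 0;
    long hi = (long)(bytes / (std::streamoff)sizeof(nameval));  // one past the last record
    while (lo < hi)
    {
        long mid = lo + (hi - lo) / 2;
        nameval rec;
        in.seekg((std::streamoff)mid * (std::streamoff)sizeof(nameval), std::ios::beg);
        if (!in.read((char*)&rec, sizeof(nameval)))
            return -1;
        if (reads) ++*reads;
        int cmp = std::strcmp(rec.fld_nm, name);
        if (cmp == 0) return mid;
        if (cmp < 0)  lo = mid + 1;   // key is in the upper half
        else          hi = mid;       // key is in the lower half
    }
    return -1;
}
```

With these definitions, max_probes(1000000) gives 20 and max_probes(100000000) gives 27, which matches the figures quoted for one file and for 100 files treated as one sorted sequence.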

Sara
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
Many thanks Sara.

        // here, all the input files are opened and the first record was read into names array
        // open a new output file flout.bin 
        // then open a new loop while (num > 0)  
        // where you determine the minimum names[m].fld_nm out of all names 
        //     where the eof_reached flag was not set
        // write the minimum record to output file    
        // read a new nameval record from the file where the minimum came from
        // if read returns with error and error is eof
        //    decrement num, 
        //     do either
        //        set current names[m].fld_nm[0] to '~'  (what is greater 'z')
        //        set the eof_reached[m] to true.
        //     close the inputfiles[m]
        //     continue while loop (which should determine next minimum)
 



In above, what mechanism are you showing there?
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
Sara,
Big appreciations.

a 2nd way is to have n files with the keys sorted across all files. that is really one huge file, which would be hard to handle, split into n units. with 100 files of 1 million keys each, a query gets its result by reading at most 26 or 27 records, because the 100 files can be used the same way as one huge file with little to no penalty. however, creating the sorted files requires a merge process, the same as for the huge file.

It seems the 2nd way is quicker, right? Is there any example or demonstration of the merging process?
0
 
LVL 32

Expert Comment

by:sarabande
Comment Utility
the merging process would be the same as described above for one huge file. the only difference is that after 1 million entries have been written to the output file you would close it and open a new file.

the query program would open all output files and do a binary search by directly reading records from the files, the same as in the current query program. but it now has to compute two parameters: the file number and the relative record position in that file.
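
Computing those two parameters could look like the sketch below. It assumes every output file holds exactly the same number of records; `RECORDS_PER_FILE`, `record_pos`, `locate` and `byte_offset` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: every output file holds the same number of records (1 million in the thread).
const std::size_t RECORDS_PER_FILE = 1000000;

struct record_pos          // hypothetical helper type
{
    std::size_t file_no;   // N in flout<N>.bin
    std::size_t rec_no;    // record index inside that file
};

// Maps a global record index (0 .. total-1) to a file and a position in it.
record_pos locate(std::size_t global_index)
{
    record_pos p;
    p.file_no = global_index / RECORDS_PER_FILE;
    p.rec_no  = global_index % RECORDS_PER_FILE;
    assert(p.rec_no < RECORDS_PER_FILE);
    return p;
}

// Byte offset of a record inside its file, for use with seekg().
std::size_t byte_offset(const record_pos& p, std::size_t record_size)
{
    return p.rec_no * record_size;
}
```

The binary search itself then works on global indexes from 0 to (number of files x RECORDS_PER_FILE) and calls locate() for each probe to decide which stream to seek in.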

Sara
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
Many thanks Sara.
I already adjusted the code to create files having 1M records each, generating 40M records in total, and the process did slow down the whole machine. Any advice on this?
0
 
LVL 32

Expert Comment

by:sarabande
Comment Utility
the process did slow down the whole machine
did you clear the std::set after each file?

Sara
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
Many thanks Sara.
I have these
		if ((cnt+1)%1000000==0)
		{
			files[fl_cnt].close();  // that closes the output file 
			fl_cnt++;
			std::set<nameval> records;
			srand((int)time(NULL));
			std::ostringstream filename;
			filename << "c:\\dp4\\flout" << fl_cnt << ".bin";
			...


to close the relevant file having 1 M records inside. Is there anything missed in above?
0
 
LVL 32

Expert Comment

by:sarabande
Comment Utility
Is there anything missed in above?
hmm. the if statement is strange. you should not use the counter to determine when a file is closed. instead you should have an outer loop for the files and an inner loop for the records per file. the counter from 0 to 100 million, which is incremented only in the inner loop, should be used for fld_val and nothing else.

can you post the full code?

Sara
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
Many thanks Sara.
Please see the attached file and advise what to adjust.
SaveBinaryFile.cpp
0
 
LVL 32

Expert Comment

by:sarabande
Comment Utility
for (cnt=0;cnt<40000000;cnt++)
the problem is that you ignored my suggestions and the code I posted, and are using only one big loop to create the output file. because of that the std::set grows bigger and bigger and finally your program does nothing but swapping.

you definitely need to go back to the code where two loops are used and where the std::set is created in the outer loop:

// define the counter above loops:
int cnt_val = 0;
// outer loop for files
for (int f = 0; f < 40; ++f)
{
        // here open a new file flout<f>.bin
        ...
        // here create the std::set 
        ... 
        // the inner loop writes 1 million records
        // note, both counters are incremented, but only the record counter is re-initialized per file

        for (int cnt_rec = 0; cnt_rec < 1000000; ++cnt_rec, ++cnt_val)
        {
             ...
             val.fld_val=cnt_val;   // here we use the cnt_val
             ...
        }
        // close the file after 1 million records
        ...
}  // here the std::set was cleared and memory was released



Sara
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
Many thanks Sara.
Can you please advise the way to resolve this
1>------ Rebuild All started: Project: SaveBinaryFile, Configuration: Release x64 ------
1>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\Microsoft.CppClean.targets(74,5): warning : The process cannot access the file 'C:\dev_proj_old\Visual Studio 2010\Projects\SaveBinaryFile (Binary search using Vector' because it is being used by another process.
1>  stdafx.cpp
1>  SaveBinaryFile.cpp
1>c:\dev_proj_old\visual studio 2010\projects\savebinaryfile\savebinaryfile\..\..\include\nameval2.h(12): warning C4267: 'initializing' : conversion from 'size_t' to 'int', possible loss of data
1>c:\dev_proj_old\visual studio 2010\projects\savebinaryfile\savebinaryfile\..\..\include\nameval2.h(22): warning C4996: 'mbstowcs': This function or variable may be unsafe. Consider using mbstowcs_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stdlib.h(498) : see declaration of 'mbstowcs'
1>  Generating code
1>  Finished generating code
1>  SaveBinaryFile.vcxproj -> C:\dev_proj_old\Visual Studio 2010\Projects\SaveBinaryFile\SaveBinaryFile\x64\Release\SaveBinaryFile.exe
1>  
1>mt.exe : command line error c1010010: -inputresource or -outputresource or -updateresource specified with multiple semicolons.
1>  
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========


due to these?
//
#include "stdafx.h"
#include <set>
#include <stdio.h>
#include <string.h>
#include <fstream>
#include <string>
#include <ctype.h>
#include <time.h>
#include <process.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include "..\..\include\nameval2.h"   
#include <iomanip>
#include <algorithm>
#include <sstream>
#include <errno.h>
using namespace std;
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
nameval binrec;
int main()
{
    int cnt;
	int cnt_val = 0;
	//
	std::ofstream files[40];
	for (int f = 0; f < 40; ++f)
	{
	    std::set<nameval> records;
		srand((int)time(NULL));
		std::ostringstream filename;
		filename << "c:\\dp4\\flout" << f << ".bin";
		std::string strfilename = filename.str();
		files[f].open(strfilename.c_str(), std::ios::binary| std::ios::out);
		if (!files[f].is_open()) return errno;
		 //
		 //
		for (cnt=0;cnt<1000000;cnt++)
		{
			++cnt_val;
			try 
			{
				nameval val={0};
				int j;
				//
				for (j=0;j<20;j++)
				{
					//
					val.fld_nm[j] += (char)(rand () % 26 + ((rand()%2)? 65 : 97));
					//
				}
				//
				//
				//
				//
				//
				//
				//
				//
				//
				//
				//
				val.fld_val=cnt_val;
				//
				records.insert(val);
				if ((cnt+1)%10000 == 0)
				{
						std::cout << val.fld_val << " | " << val.fld_nm << std::endl;
				}
			}
			catch (exception& e)
			{
				cout << e.what() << '\n';
			}
		}
		//
		//
		if (files[f].is_open())
		{
			//
			for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
			{
				//
				//
				//
				//
				try
				{
					files[f].write((char *)&(*it), sizeof(nameval));
				}
				catch (exception& e)
				{
					cout << e.what() << '\n';
				}
			}
		}
		//
		//
		//
		size_t len = strfilename.length(); //
		strfilename.resize(len-3); //
		strfilename += "ord";  //
		std::ofstream ostrm2(strfilename.c_str());  //
		if (ostrm2.is_open())
		{
			//
			for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
			{
				//
				//
				wchar_t wtemp[100] = { L'\0' };
				it->get_uni_nm(wtemp, 100);			
				//
				ostrm2 << "\"" << it->fld_nm << "\" " << it->get_len() << ' '  <<wtemp << ' '<< it->fld_val << '\0' << '\n';
				//
				//
			}
		}
		files[f].close();  //
	}
        std::ifstream inputfiles[40];
        nameval names[40] = { 0 };
        bool eof_reached[40] = { false };
        int num = 40;   //
	for (int f = 0; f < 40; ++f)
	{
		 std::ostringstream filename;
		 filename << "c:\\dp4\\flout" << f << ".bin";
		 inputfiles[f].open(filename.str().c_str(), std::ios::binary| std::ios::in);
		 if(!inputfiles[f].is_open())
			   return -3; //
                 if (!inputfiles[f].read((char*)&names[f], sizeof(nameval)))
                           return -4; //
        }
        //
        //
        //
        //
        //
        //
        //
        //
        //
        //
        //
        //
        //
        //
	system("pause>null");
	return 0;
}


0
 
LVL 32

Expert Comment

by:sarabande
Comment Utility
mt.exe is the 'manifest tool' for creating and using a manifest.

I don't think that the error is caused by the source code you posted.

I researched the error and found the following solution:

Check your settings: if your code is native C++ and, in the property window, General -> Common Language Runtime Support is set to (/clr), this error may occur. Either change /clr to "No Common Language Runtime Support", or, if you want to keep (/clr), change C/C++ -> General -> Debug Information Format to Program Database (/Zi).

or maybe this link can help you understand the error
http://msdn.microsoft.com/en-us/library/windows/desktop/aa375649%28v=vs.85%29.aspx

if your program configuration has clr support, you might remove that since you don't use managed code as far as I see.

Sara
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
Thanks a lot Sara.
I can see the relevant setting is "No Common language runtime support" shown as attached.
t925.png
0
 
LVL 32

Expert Comment

by:sarabande
Comment Utility
yes, it looks fine.

you may try to delete debug and release folder of your project and do a rebuild after that.

if you still have the error with mt.exe try to use a backup of your project if available.

last resort is to create a new project and only copy the source code from current project.

Sara
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
Sara,
Appreciate a lot to you.
I am generating the files now and, judging by the file creation times, it needs about 20 minutes to handle 1M records. Do you think such speed is slow?
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
I see that it is now fine with the system resources/memory. Is there any other way to quicken the process?
0
 
LVL 32

Assisted Solution

by:sarabande
sarabande earned 500 total points
Comment Utility
Do you think such speed is slow?
yes. you could speed it up by using files on an ssd drive. or you create 1 file, then make 39 copies with the correct names using windows explorer.

then define the files as std::fstream (which is both ifstream and ofstream) and open them using the flag value 'std::ios::binary | std::ios::in | std::ios::out'.

if you do so, the files do not need to be created empty and then grow (which adds overhead). instead the program opens existing files.
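
A minimal sketch of that open mode follows. One detail worth knowing: opening with in | out and without trunc fails when the file does not exist yet, so the sketch falls back to creating it. `open_for_rewrite` and its `created` flag are hypothetical names, and whether reusing existing files really saves time on a given disk is something to measure.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Opens 'path' for in-place binary rewriting if it already exists,
// otherwise creates it. Returns true on success; 'created' (optional)
// reports whether the file had to be created.
bool open_for_rewrite(std::fstream& fs, const std::string& path, bool* created)
{
    assert(!fs.is_open());   // the stream must not already be attached to a file

    // in | out without trunc keeps the existing contents and size.
    fs.open(path.c_str(), std::ios::binary | std::ios::in | std::ios::out);
    if (fs.is_open())
    {
        if (created) *created = false;
        return true;
    }

    // The file does not exist (or cannot be opened): reset the error
    // state and create an empty file instead.
    fs.clear();
    fs.open(path.c_str(),
            std::ios::binary | std::ios::in | std::ios::out | std::ios::trunc);
    if (created) *created = fs.is_open();
    return fs.is_open();
}
```

Writing then starts at offset 0 and overwrites the old records in place. The old file must have at least the size of the new data, otherwise it still has to grow.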

Sara
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
Thanks a lot Sara. Do you mean to adjust this line?

		files[f].open(strfilename.c_str(), std::ios::binary| std::ios::out);
		...


0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
Please disregard my reply from a few minutes ago above. Thanks.
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
Many thanks Sara.
I adjusted the code to be
	std::fstream files[40];
	for (int f = 0; f < 40; ++f)
	{
	    std::set<nameval> records;
		srand((int)time(NULL));
		std::ostringstream filename;
		filename << "c:\\dp4\\flout" << f << ".bin";
		std::string strfilename = filename.str();
		files[f].open(strfilename.c_str(), std::ios::binary|std::ios::in | std::ios::out);
		if (!files[f].is_open()) return errno;
		 ...


but I see it still needs around 19 minutes to create one .bin and one .ord file, each with 1M records. Do you think the speed is good or not?
0
 
LVL 32

Expert Comment

by:sarabande
Comment Utility
why do you need the .ord file? it is a sequential text file, which makes the file system slowly add file extents whenever the current extent is full. it contains no more information than the .bin file. at the least you should also use existing files for it, which should speed up the writing even more than it did for the .bin file.

Sara
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
Thanks a lot Sara.
I've already removed the code that generates the .ord file, but it still needs around 17 minutes to generate one .bin file with 1M records inside.
0
 
LVL 32

Expert Comment

by:sarabande
Comment Utility
you should check this. the .ord file should need at least double the time the .bin file needs. if you open existing files which do not grow, you should be able to write the 100 MB files within a few minutes.

you may write a message at the beginning and end of the loop to be able to verify that.

as long as you don't know where the time is lost, it makes little sense to suggest further improvements (which could be made by writing blocks of more than 1 record).
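
Both ideas can be sketched as below: a begin/end message with elapsed seconds, and a write of many records in one call. `phase_timer` and `write_block` are hypothetical helpers, the record struct is assumed to mirror nameval.h, and time() only resolves whole seconds, which is enough for phases that take minutes.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <fstream>
#include <iostream>
#include <vector>

// Assumed to mirror the fixed-size record layout of nameval.h in this thread.
struct nameval
{
    char        fld_nm[100];
    std::size_t fld_val;
};

// Hypothetical helper: prints a message when a phase begins and, on done(),
// how many wall-clock seconds it took (1-second resolution).
struct phase_timer
{
    const char* label;
    std::time_t start;

    explicit phase_timer(const char* l) : label(l), start(std::time(NULL))
    {
        std::cout << "begin " << label << std::endl;
    }

    double done() const
    {
        double secs = std::difftime(std::time(NULL), start);
        std::cout << "end   " << label << ": " << secs << " s" << std::endl;
        return secs;
    }
};

// Writes all records with one write() call instead of one call per record.
bool write_block(std::ofstream& out, const std::vector<nameval>& recs)
{
    assert(out.is_open());
    if (recs.empty()) return true;
    out.write((const char*)&recs[0],
              (std::streamsize)(recs.size() * sizeof(nameval)));
    return out.good();
}
```

Putting one timer around the fill loop, one around the write loop and one around the end of each outer-loop pass (where the std::set is destroyed) shows which of the three phases takes the minutes.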

Sara
0
 
LVL 32

Expert Comment

by:sarabande
Comment Utility
do you have enough space on the disk? is it an ssd drive?

Sara
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
Thanks a lot Sara.
There is lots of space on the disk, which is a normal hard disk. Can you please advise where I can improve the code
//
#include "stdafx.h"
#include <set>
#include <stdio.h>
#include <string.h>
#include <fstream>
#include <string>
#include <ctype.h>
#include <time.h>
#include <process.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include "..\..\include\nameval2.h"   
#include <iomanip>
#include <algorithm>
#include <sstream>
#include <errno.h>
using namespace std;
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
//
nameval binrec;
int main()
{
    int cnt;
	int cnt_val;
	cnt_val=0;
	//
	//
	std::fstream files[40];
	for (int f = 0; f < 40; ++f)
	{
	    std::set<nameval> records;
		srand((int)time(NULL));
		std::ostringstream filename;
		filename << "c:\\dp4\\flout" << f << ".bin";
		std::string strfilename = filename.str();
		files[f].open(strfilename.c_str(), std::ios::binary|std::ios::in | std::ios::out);
		if (!files[f].is_open()) return errno;
		 //
		 //
		for (cnt=0;cnt<1000000;cnt++)
		{
			cnt_val++;
			try 
			{
				nameval val={0};
				int j;
				//
				for (j=0;j<20;j++)
				{
					//
					val.fld_nm[j] += (char)(rand () % 26 + ((rand()%2)? 65 : 97));
					//
				}
				//
				//
				//
				//
				//
				//
				//
				//
				//
				//
				//
				val.fld_val=cnt_val;
				//
				records.insert(val);
				if ((cnt+1)%10000 == 0)
				{
						std::cout << val.fld_val << " | " << val.fld_nm << std::endl;
				}
			}
			catch (exception& e)
			{
				cout << e.what() << '\n';
			}
		}
		//
		//
		if (files[f].is_open())
		{
			//
			for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
			{
				//
				//
				//
				//
				try
				{
					files[f].write((char *)&(*it), sizeof(nameval));
				}
				catch (exception& e)
				{
					cout << e.what() << '\n';
				}
			}
		}
		//
		//
		//
		//
		//
		//
		//
		//
		//
			//
			//
			//
				//
				//
				//
				//
				//
				//
				//
				//
			//
		//
		files[f].close();  //
	}
        std::ifstream inputfiles[40];
        nameval names[40] = { 0 };
        bool eof_reached[40] = { false };
        int num = 40;   //
	for (int f = 0; f < 40; ++f)
	{
		 std::ostringstream filename;
		 filename << "c:\\dp4\\flout" << f << ".bin";
		 inputfiles[f].open(filename.str().c_str(), std::ios::binary| std::ios::in);
		 if(!inputfiles[f].is_open())
			   return -3; //
                 if (!inputfiles[f].read((char*)&names[f], sizeof(nameval)))
                           return -4; //
        }
        //
        //
        //
        //
        //
        //
        //
        //
        //
        //
        //
        //
        //
        //
	system("pause>null");
	return 0;
}



as it does need around 17 minutes to have one 1M records file generated?
0
 
LVL 32

Accepted Solution

by:
sarabande earned 500 total points
Comment Utility
i debugged the code and it also needed more than 15 minutes (though i was writing to an ssd) before the next file was written. i found out that 99 percent of the time was lost when the std::set was cleared between two files in debug mode. i made a release build and now it needs 7 seconds for one file. that is because in debug mode each key is deleted individually, which means that every one of the 1 million keys requires a rearrangement of the binary tree on delete.

i did some "cleanup" on your code without changing statements.

//
#include <set>
#include <fstream>
#include <string>
#include <time.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include <iomanip>
#include <sstream>
#include <errno.h>
#include <stdlib.h>   // rand, srand, system
#include <string.h>   // strlen, used by nameval2.h

#include "..\..\include\nameval2.h"   

int main()
{
    int cnt_val = 0;
    std::ofstream files[40];
    for (int f = 0; f < 40; ++f)
    {
        std::set<nameval> records;
        srand((int)time(NULL));
        std::ostringstream filename;
        filename << "c:\\dp4\\flout" << f << ".bin";
        std::string strfilename = filename.str();
        files[f].open(strfilename.c_str(), std::ios::binary|/*std::ios::in |*/ std::ios::out);
        if (!files[f].is_open()) 
            return errno;
        for (int cnt=0;cnt<1000000;cnt++)
        {
            cnt_val++;
            try 
            {
                nameval val={0};
                int j;
                //
                for (j=0;j<20;j++)
                {
                    //
                    val.fld_nm[j] += (char)(rand () % 26 + ((rand()%2)? 65 : 97));
                    //
                }
                val.fld_val=cnt_val;
                //
                records.insert(val);
                if ((cnt+1)%10000 == 0)
                {
                    std::cout << val.fld_val << " | " << val.fld_nm << std::endl;
                }
            }
            catch (std::exception& e)
            {
                std::cout << e.what() << '\n';
            }
        }

        for (std::set<nameval>::iterator it = records.begin(); it != records.end(); ++it)
        {
            try
            {
                files[f].write((char *)&(*it), sizeof(nameval));
            }
            catch (std::exception& e)
            {
                std::cout << e.what() << '\n';
            }
        }
        records.clear();
        files[f].close();  //
    }
    std::ifstream inputfiles[40];
    nameval names[40] = { 0 };
    bool eof_reached[40] = { false };
    int num = 40;   //
    for (int f = 0; f < 40; ++f)
    {
        std::ostringstream filename;
        filename << "c:\\dp4\\flout" << f << ".bin";
        inputfiles[f].open(filename.str().c_str(), std::ios::binary| std::ios::in);
        if(!inputfiles[f].is_open())
            return -3; //
        if (!inputfiles[f].read((char*)&names[f], sizeof(nameval)))
            return -4; //

        // here the merge logic still needs to be implemented

    }
    system("pause>null");
    return 0;
}



Sara
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
Many many thanks Sara.

I changed the code to be the same as yours. I can see the first file grow to completion, 1M records in total with a size of 101,563 KB, within 20 seconds. But then it waits around 13 to 14 minutes before it creates the 2nd file, which also has 1M records and the same size. Does it mean the 13 to 14 minutes are really consumed in getting the physical file "completely" written to the disk?
0
 
LVL 32

Assisted Solution

by:sarabande
sarabande earned 500 total points
Comment Utility
no. the time is consumed in the debugger by deleting and reordering the std::set of 1M keys. if you build a release version the issue is solved. i get all 40 files within 3 minutes.
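
A different technique from the one used in this thread, shown only as a sketch: collect the records in a std::vector and sort once with std::sort instead of inserting into a std::set. The vector is one contiguous allocation rather than a million tree nodes, so releasing it is a single deallocation. `make_sorted_records` is a hypothetical name, the struct is assumed to mirror nameval.h, and any speed-up in a debug build would have to be measured.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Assumed to mirror nameval.h in this thread, including its ordering.
struct nameval
{
    char        fld_nm[100];
    std::size_t fld_val;

    bool operator<(const nameval& other) const
    {
        int c = std::strcmp(fld_nm, other.fld_nm);
        if (c != 0) return c < 0;
        return fld_val < other.fld_val;
    }
};

// Generates 'count' records with random 20-letter names and values
// first_val, first_val + 1, ... and returns them sorted by name, then value.
std::vector<nameval> make_sorted_records(std::size_t count, std::size_t first_val)
{
    std::vector<nameval> recs;
    recs.reserve(count);                       // one allocation up front
    for (std::size_t i = 0; i < count; ++i)
    {
        nameval val;
        std::memset(&val, 0, sizeof(val));
        for (int j = 0; j < 20; ++j)
            val.fld_nm[j] = (char)(std::rand() % 26 + ((std::rand() % 2) ? 65 : 97));
        val.fld_val = first_val + i;
        recs.push_back(val);
    }
    std::sort(recs.begin(), recs.end());       // same order a std::set would give
    assert(recs.size() == count);
    return recs;
}
```

Because every fld_val is unique, no record is dropped as a duplicate, so the sorted vector holds the same sequence the std::set iteration would produce.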

Sara
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
And the speed is nice, right?

Many many thanks Sara!
0
 
LVL 10

Author Comment

by:HuaMinChen
Comment Utility
Many thanks Sara.
I've missed one thing, as I see each file is sorted inside (from A to Z), like
https://dl.dropboxusercontent.com/u/40211031/flout0.zip
https://dl.dropboxusercontent.com/u/40211031/flout1.zip
Does it mean I have to search each file, one after another, when I'm to find out one specific record?
0
 
LVL 32

Expert Comment

by:sarabande
Comment Utility
Does it mean I have to search each file, one after another, when I'm to find out one specific record?
that would be one of the solutions.

the other one is to merge the files into one huge file at the end of file generation. the current source already has the code that opens the input files, with a comment where i noted that the merge logic is still open.
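
The open merge step could be sketched as follows, based on the steps outlined earlier in this thread (keep one current record per input file, repeatedly write the minimum, refill from the file it came from). `merge_files`, `less_rec` and `MAX_FILES` are illustrative names, and the struct is assumed to mirror nameval.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Assumed to mirror the fixed-size record layout of nameval.h in this thread.
struct nameval
{
    char        fld_nm[100];
    std::size_t fld_val;
};

const int MAX_FILES = 40;   // as in the thread; hypothetical upper limit

// Same ordering as operator< in nameval.h: by name, then by value.
bool less_rec(const nameval& a, const nameval& b)
{
    int c = std::strcmp(a.fld_nm, b.fld_nm);
    if (c != 0) return c < 0;
    return a.fld_val < b.fld_val;
}

// Merges sorted input files into one sorted output file.
// Returns the number of records written, or -1 if a file cannot be opened.
long merge_files(const std::vector<std::string>& inputs, const std::string& output)
{
    int n = (int)inputs.size();
    assert(n <= MAX_FILES);

    std::ifstream in[MAX_FILES];
    nameval       cur[MAX_FILES];          // current record of each input
    bool          eof_reached[MAX_FILES];
    int           num = 0;                 // inputs that still hold a record

    for (int f = 0; f < n; ++f)
    {
        in[f].open(inputs[f].c_str(), std::ios::binary | std::ios::in);
        if (!in[f].is_open()) return -1;
        eof_reached[f] = !in[f].read((char*)&cur[f], sizeof(nameval));
        if (!eof_reached[f]) ++num;
    }

    std::ofstream out(output.c_str(), std::ios::binary | std::ios::out);
    if (!out.is_open()) return -1;

    long written = 0;
    while (num > 0)
    {
        // find the minimum among the inputs that are not exhausted
        int m = -1;
        for (int f = 0; f < n; ++f)
            if (!eof_reached[f] && (m < 0 || less_rec(cur[f], cur[m])))
                m = f;

        out.write((const char*)&cur[m], sizeof(nameval));
        ++written;

        // refill from the file the minimum came from
        if (!in[m].read((char*)&cur[m], sizeof(nameval)))
        {
            eof_reached[m] = true;
            in[m].close();
            --num;
        }
    }
    return written;
}
```

The linear scan for the minimum is fine for 40 inputs. To produce several 1M-record output files instead of one huge file, the output stream would be closed and reopened under a new name after every million records written.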

Sara
0
