Peter Chan

asked on

Way to scan the path

Hi,
I want a loop that scans a folder such as

c:\dp4

for files whose names follow a pattern like

flout0.bin
flout1.bin
...


Is there an example of this?
Zoppo

Hi HuaMinChen,

The common (Win32 API) way is to use FindFirstFile and FindNextFile to iterate through files and folders. Since these functions only work on a single folder, you have to implement a function that calls itself recursively for each directory found (the entries '.' and '..' have to be skipped). Such a loop isn't difficult to write, and you can find samples on the internet, e.g. http://www.codeproject.com/Articles/93141/Iterative-Implementation-of-Recursively-Enumeratin

If you already use Boost, you could use its filesystem classes as an alternative, e.g. boost::filesystem::recursive_directory_iterator. You can find samples on the internet too, e.g. point 3 at www.technical-recipes.com/2014/using-boostfilesystem/

Hope that helps,

ZOPPO

ASKER

Thanks Zoppo.
Sorry, in your example I do not see how to check whether the file name matches a pattern like

"fl_out*.bin"

There can be other files with different name formats in the same path.
Sorry, I thought this would be no real problem once you understand FindFirstFile/FindNextFile.

Generally you have two options for implementing the recursively called function:

1. Loop through all files/folders in the passed path with a call to FindFirstFile with a '*' placeholder at the end of the passed path. For found files, compare the file names (returned in the WIN32_FIND_DATA filled by FindFirstFile and by every FindNextFile call) with the wanted pattern. For folders except '.' and '..', call the function recursively.

2. First search the passed path for all wanted files using FindFirstFile with a 'fl_out*.bin' placeholder at the end, then in a second loop iterate through all entries again in a similar way as in 1., but only handle folders, in order to call the function recursively.

It's up to you which way to choose. The first one might be a bit more complicated, depending on how the comparison between your pattern and the file names has to be done. The second one adds some overhead, since behind the scenes each folder is scanned twice.
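For the first option, the name comparison could look roughly like the sketch below. It assumes the names from the question (flout0.bin, flout1.bin, ...), i.e. the prefix "flout", one or more digits, and the extension ".bin". is_flout_name is a hypothetical helper; in a Win32 loop it would be applied to the cFileName member of WIN32_FIND_DATA. The comparison is case-sensitive, which is stricter than the file system itself.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns true for names of the form flout<digits>.bin, e.g. "flout0.bin".
// is_flout_name is a hypothetical helper name.
bool is_flout_name(const std::string& name)
{
    const std::string prefix = "flout";
    const std::string suffix = ".bin";
    if (name.size() <= prefix.size() + suffix.size())
        return false;                                   // no room for a digit
    if (name.compare(0, prefix.size(), prefix) != 0)
        return false;                                   // wrong prefix
    if (name.compare(name.size() - suffix.size(), suffix.size(), suffix) != 0)
        return false;                                   // wrong extension
    // Everything between prefix and suffix must be a decimal digit.
    for (std::size_t i = prefix.size(); i < name.size() - suffix.size(); ++i)
        if (!std::isdigit(static_cast<unsigned char>(name[i])))
            return false;
    return true;
}
```

Files such as "other.bin" or "flout1.txt" in the same folder are rejected, which addresses the concern about differently named files in the same path.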

ZOPPO
I want to validate that the given argument is a valid file path, and also to check whether there are existing files named "flout*.bin", using the following:
	if (INVALID_FILE_ATTRIBUTES == GetFileAttributes( argv[1] ))
	{
		cout << "File does not exist!\n";
		exit(EXIT_FAILURE);
	}

    WIN32_FIND_DATA ffd;
    TCHAR szDir[MAX_PATH];
    HANDLE hFind = INVALID_HANDLE_VALUE;

    StringCchCopy(szDir, MAX_PATH, lpcszFolder);
    StringCchCat(szDir, MAX_PATH, TEXT("\\flout*.bin"));
    ...



How should I adjust the code above to correct these errors?

1>ReadBinaryFile.cpp(80): error C2065: 'lpcszFolder' : undeclared identifier
1>ReadBinaryFile.cpp(80): error C3861: 'StringCchCopy': identifier not found
1>ReadBinaryFile.cpp(81): error C3861: 'StringCchCat': identifier not found
1>ReadBinaryFile.cpp(157): warning C4996: 'strcpy': This function or variable may be unsafe. Consider using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string.h(105) : see declaration of 'strcpy'
1>ReadBinaryFile.cpp(203): warning C4244: '=' : conversion from 'double' to 'float', possible loss of data


Try adding the header Strsafe.h.
You also need a variable of type LPCTSTR named lpcszFolder.

HRESULT StringCchCopy(
  _Out_  LPTSTR pszDest,
  _In_   size_t cchDest,
  _In_   LPCTSTR pszSrc
);


I adjusted the code to include the .h file, like this:
//
//
#include "stdafx.h"
#include <set>
#include <stdio.h>
#include <comutil.h>
#include <sys/stat.h>
#include <string>
#include <string.h>
#include <fstream>
#include <atlbase.h>
#include <ctype.h>
#include <process.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include "..\..\include\nameval.h"
#include "..\..\include\Strsafe.h"
#include <iomanip>
#include <Windows.h>
using namespace std;
//
struct stat fs = { 0 };
int ret; //
int numRecords;
//
//
	//
	//
	//
     //
     //
     //
     //
	//
     //
     //
     //
     //
     //
     //
	//
	//
//
nameval binrec;
//
//
  //
  //
  //
  //
//
bool LessComp(const nameval& a1, const nameval& a2)
{
  if(strcmp(a1.fld_nm, a2.fld_nm) < 0) return true;
  if(strcmp(a1.fld_nm, a2.fld_nm) > 0) return false;
  if(a1.fld_val < a2.fld_val) return true;
  return false;
}
int _tmain(int argc,_TCHAR* argv[])
{
	if (argc < 2)
	{
		return ERROR;
	}
	if (INVALID_FILE_ATTRIBUTES == GetFileAttributes( argv[1] ))
	{
		cout << "File does not exist!\n";
		exit(EXIT_FAILURE);
	}
    WIN32_FIND_DATA ffd;
    TCHAR szDir[MAX_PATH];
    HANDLE hFind = INVALID_HANDLE_VALUE;
    StringCchCopy(szDir, MAX_PATH, lpcszFolder);
    StringCchCat(szDir, MAX_PATH, TEXT("\\flout*"));
    ...


and the .h file is

HRESULT StringCchCopy(
  _Out_  LPTSTR pszDest,
  _In_   size_t cchDest,
  _In_   LPCTSTR pszSrc
);



but I still get these

1>ReadBinaryFile.cpp(81): error C2065: 'lpcszFolder' : undeclared identifier
1>ReadBinaryFile.cpp(82): error C3861: 'StringCchCat': identifier not found
1>ReadBinaryFile.cpp(158): warning C4996: 'strcpy': This function or variable may be unsafe. Consider using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string.h(105) : see declaration of 'strcpy'
1>ReadBinaryFile.cpp(204): warning C4244: '=' : conversion from 'double' to 'float', possible loss of data


Why do you want to search for the files? You know that the folder c:\dp4 contains the files flout0.bin to flout39.bin, don't you?

Look at the code in the savebinary project. At the end, after the file creation, there is already a loop which opens all .bin files for reading.

error C2065: 'lpcszFolder' : undeclared identifier
You are trying to copy a string from a variable that was neither declared nor defined.

Try either

StringCchCopy(szDir, MAX_PATH, TEXT("C:\\dp4\\"));

or simpler

TCHAR szDir[] = TEXT("C:\\dp4\\");


 
error C3861: 'StringCchCat': identifier not found
As dm0000 said, try including strsafe.h.

Sara
Many thanks Sara.
I now want to search all the files for one given string using this code:
//
//
#include "stdafx.h"
#include <set>
#include <stdio.h>
#include <comutil.h>
#include <sys/stat.h>
#include <string>
#include <string.h>
#include <fstream>
#include <atlbase.h>
#include <ctype.h>
#include <process.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include "..\..\include\nameval.h"
//
#include <iomanip>
#include <Windows.h>
using namespace std;
//
struct stat fs = { 0 };
int ret; //
int numRecords;
//
//
	//
	//
	//
     //
     //
     //
     //
	//
     //
     //
     //
     //
     //
     //
	//
	//
//
nameval binrec;
//
//
  //
  //
  //
  //
//
bool LessComp(const nameval& a1, const nameval& a2)
{
  if(strcmp(a1.fld_nm, a2.fld_nm) < 0) return true;
  if(strcmp(a1.fld_nm, a2.fld_nm) > 0) return false;
  if(a1.fld_val < a2.fld_val) return true;
  return false;
}
int _tmain(int argc,_TCHAR* argv[])
{
	if (argc < 1)
	{
		return ERROR;
	}
	//
	//
	//
	//
	//
    //
    //
    //
    //
    //
	unsigned int nbegin=0;
	unsigned int nend=numRecords-1;
	unsigned int nmid;
	unsigned int nstop=0;
	char nm_got[100];
	unsigned int val_got;
    //
    //
    //
    //
    //
    //
	time_t timev,currtime;
	//
	float sec;
	timev=time(0);
        std::ifstream inputfiles[40];
        nameval names[40] = { 0 };
        bool eof_reached[40] = { false };
        int num = 40;   //
		for (int f = 0; f < 40; ++f)
		{
			 std::ostringstream filename;
			 filename << "c:\\dp4\\flout" << f << ".bin";
			std::set<nameval> records;
			std::set<nameval>::iterator iter;
			//
			//
			//
			 inputfiles[f].open(filename.str().c_str(), std::ios::binary| std::ios::in);
			 if(inputfiles[f].is_open())
				   return -3; //
					 if (!inputfiles[f].read((char*)&names[f], sizeof(nameval)))
                           return -4; //
			ret = stat(CT2A(argv[1]), &fs);
			numRecords = (int)(fs.st_size/sizeof(nameval));
			nbegin=0;
			nend=numRecords-1;
			nstop=0;
			//
			//
			//
			//
			//
				//
				//
				//
				//
				//
				//
				//
				//
				//
			//
			//
			//
			//
			//
			//
			char szArgv2[512] = { 0 };
			wcstombs(szArgv2, argv[1], sizeof(szArgv2));
			while (nbegin<=nend && nstop!=-1)
			{
				nmid= (nbegin + nend)/2;
				nameval rec = { 0 };
				istrm.seekg(nmid* sizeof(nameval));
				istrm.read((char*)&rec, sizeof(nameval)); 
				if (strcmp(szArgv2,rec.fld_nm)<0)
				{
					nend=nmid-1;
					nmid = (nbegin+nend)/2;
				}
				else
				{
					if (strcmp(szArgv2,rec.fld_nm)>0)
					{
						nbegin=nmid+1;
						nmid = (nbegin+nend)/2;
					}
					else
					{
						nstop=-1;
						strcpy(rec.fld_nm,nm_got);
						val_got=rec.fld_val;
					}
				}
			}
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
				//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			if (nstop==-1)
			  {
				  cout << "\nFound it!\n";
				  //
					//
				  //
					//
				  cout << "(From vector record: " << nm_got
					  << ' ' << val_got << ")\n";
			  }
			else cout << "\nDidn't find it!\n";
		}
	time(&currtime);
	sec=difftime(currtime,timev);
	cout << "Search finishes with only " << sec << " seconds";
	system("pause>null");
	//
    return 0;
}


Can you please advise how to resolve these errors?
1>------ Rebuild All started: Project: ReadBinaryFile, Configuration: Release x64 ------
1>  stdafx.cpp
1>  ReadBinaryFile.cpp
1>c:\readbinaryfile\readbinaryfile\..\..\include\nameval.h(10): warning C4267: 'return' : conversion from 'size_t' to 'int', possible loss of data
1>c:\readbinaryfile\readbinaryfile\..\..\include\nameval.h(13): warning C4996: 'mbstowcs': This function or variable may be unsafe. Consider using mbstowcs_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stdlib.h(498) : see declaration of 'mbstowcs'
1>ReadBinaryFile.cpp(110): error C2079: 'filename' uses undefined class 'std::basic_ostringstream<_Elem,_Traits,_Alloc>'
1>          with
1>          [
1>              _Elem=char,
1>              _Traits=std::char_traits<char>,
1>              _Alloc=std::allocator<char>
1>          ]
1>ReadBinaryFile.cpp(111): error C2297: '<<' : illegal, right operand has type 'const char [13]'
1>ReadBinaryFile.cpp(111): error C2297: '<<' : illegal, right operand has type 'const char [5]'
1>ReadBinaryFile.cpp(111): warning C4552: '<<' : operator has no effect; expected operator with side-effect
1>ReadBinaryFile.cpp(118): error C2228: left of '.str' must have class/struct/union
1>          type is 'int'
1>ReadBinaryFile.cpp(118): error C2228: left of '.c_str' must have class/struct/union
1>ReadBinaryFile.cpp(158): error C2065: 'istrm' : undeclared identifier
1>ReadBinaryFile.cpp(158): error C2228: left of '.seekg' must have class/struct/union
1>          type is ''unknown-type''
1>ReadBinaryFile.cpp(159): error C2065: 'istrm' : undeclared identifier
1>ReadBinaryFile.cpp(159): error C2228: left of '.read' must have class/struct/union
1>          type is ''unknown-type''
1>ReadBinaryFile.cpp(176): warning C4996: 'strcpy': This function or variable may be unsafe. Consider using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string.h(105) : see declaration of 'strcpy'
1>ReadBinaryFile.cpp(225): warning C4244: '=' : conversion from 'double' to 'float', possible loss of data
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========



Here is nameval.h:
// nameval.h
#ifndef NAME_VAL_H
#define NAME_VAL_H

struct nameval
{
     char fld_nm[100];
     int    fld_val;

     int  get_len() { return min(strlen(fld_nm), sizeof(fld_nm) ) ; }
     void get_uni_nm(wchar_t nm_uni[], int sizfld)
     {
            mbstowcs(nm_uni, fld_nm, min(sizfld-1, strlen(fld_nm)));
     }
     bool operator< (const nameval & a2) const
     {
           if(strcmp(fld_nm, a2.fld_nm) < 0) return true;
           if(strcmp(fld_nm, a2.fld_nm) > 0) return false;
           if (fld_val < a2.fld_val) return true;
           return false;
     }
};

#endif


error C2079: 'filename' uses undefined class 'std::basic_ostringstream<_Elem,_Traits,_Alloc>'
You should take the errors literally. If the compiler complains that something is not defined, you have either made a typing error or forgotten to include the appropriate header file. Here it is the second reason. For std::ostringstream, which expands to std::basic_ostringstream<_Elem,_Traits,_Alloc>, you need to include the header file <sstream>. For std::ifstream or std::ofstream, the header <fstream> needs to be included. 'seekg' is a member of std::basic_istream, which is a base class of std::ifstream. You can't use it on a std::ostream or std::ofstream.

All other errors seem to be follow-on errors.
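To make the header requirement concrete, here is a small sketch that builds the file name the same way the posted loop does. make_filename is a hypothetical helper name; the only point is that <sstream> has to be included before std::ostringstream can be used.

```cpp
#include <sstream>   // declares std::ostringstream; <fstream> and <iostream> do not guarantee it
#include <string>

// Builds the name of the f-th input file, e.g. c:\dp4\flout3.bin.
// make_filename is a hypothetical helper name.
std::string make_filename(int f)
{
    std::ostringstream filename;
    filename << "c:\\dp4\\flout" << f << ".bin";
    return filename.str();
}
```

With the header in place, the C2079, C2297 and C2228 errors reported for the 'filename' lines should no longer appear, since they all stem from the incomplete type.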

Note that the warnings could also be fixed. If a 'size_t' (a 64-bit integer in an x64 build) has to be truncated to an 'int' (32-bit), the compiler complains. You can get rid of the warning with an explicit cast to 'int', for example:

return (int)sizeof(fld_nm);


The 'size_t' returned by sizeof is explicitly cast to 'int', which satisfies the compiler.
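Applied to a length function like get_len, the cast could be sketched as follows. name_field is a hypothetical stand-in for the nameval struct, and std::min from <algorithm> is used here in place of the min macro from the Windows headers, which is an assumption of this sketch rather than what the posted header does.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// name_field is a hypothetical stand-in for the nameval struct in nameval.h.
struct name_field
{
    char fld_nm[100];

    int get_len() const
    {
        // strlen and sizeof both yield size_t, so std::min needs no cast;
        // the single static_cast documents the intended narrowing to int.
        const std::size_t len = std::min(std::strlen(fld_nm), sizeof(fld_nm));
        return static_cast<int>(len);
    }
};
```

The cast is applied once, to the final result, so the comparison itself is still done on matching types.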

To get rid of the warnings that strcpy or mbstowcs are not safe, you could use strcpy_s or mbstowcs_s instead.

Sara
Many thanks Sara.
Where should I put this line?

return (int)sizeof(fld_nm);
the compiler has told you:
c:\readbinaryfile\readbinaryfile\..\..\include\nameval.h(10): warning C4267: 'return' : conversion from 'size_t' to 'int', possible loss of data

it is in nameval.h at line 10.

Sara
Many thanks Sara.
Sorry, I still get these
1>------ Rebuild All started: Project: ReadBinaryFile, Configuration: Release x64 ------
1>  stdafx.cpp
1>  ReadBinaryFile.cpp
1>c:\dev_proj_old\visual studio 2010\projects\readbinaryfile\readbinaryfile\..\..\include\nameval.h(10): warning C4267: 'return' : conversion from 'size_t' to 'int', possible loss of data
1>c:\dev_proj_old\visual studio 2010\projects\readbinaryfile\readbinaryfile\..\..\include\nameval.h(13): warning C4996: 'mbstowcs': This function or variable may be unsafe. Consider using mbstowcs_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stdlib.h(498) : see declaration of 'mbstowcs'
1>ReadBinaryFile.cpp(153): warning C4996: 'wcstombs': This function or variable may be unsafe. Consider using wcstombs_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\stdlib.h(533) : see declaration of 'wcstombs'
1>ReadBinaryFile.cpp(159): error C2065: 'istrm' : undeclared identifier
1>ReadBinaryFile.cpp(159): error C2228: left of '.seekg' must have class/struct/union
1>          type is ''unknown-type''
1>ReadBinaryFile.cpp(160): error C2065: 'istrm' : undeclared identifier
1>ReadBinaryFile.cpp(160): error C2228: left of '.read' must have class/struct/union
1>          type is ''unknown-type''
1>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\Platforms\x64\Microsoft.Cpp.x64.Targets(152,5): error MSB6006: "CL.exe" exited with code 2.
1>ReadBinaryFile.cpp(177): warning C4996: 'strcpy': This function or variable may be unsafe. Consider using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string.h(105) : see declaration of 'strcpy'
1>ReadBinaryFile.cpp(226): warning C4244: '=' : conversion from 'double' to 'float', possible loss of data
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========


with this code:
//
//
#include "stdafx.h"
#include <set>
#include <stdio.h>
#include <comutil.h>
#include <sys/stat.h>
#include <string>
#include <string.h>
#include <fstream>
#include <sstream>
#include <atlbase.h>
#include <ctype.h>
#include <process.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include "..\..\include\nameval.h"
//
#include <iomanip>
#include <Windows.h>
using namespace std;
//
struct stat fs = { 0 };
int ret; //
int numRecords;
//
//
	//
	//
	//
     //
     //
     //
     //
	//
     //
     //
     //
     //
     //
     //
	//
	//
//
nameval binrec;
//
//
  //
  //
  //
  //
//
bool LessComp(const nameval& a1, const nameval& a2)
{
  if(strcmp(a1.fld_nm, a2.fld_nm) < 0) return true;
  if(strcmp(a1.fld_nm, a2.fld_nm) > 0) return false;
  if(a1.fld_val < a2.fld_val) return true;
  return false;
}
int _tmain(int argc,_TCHAR* argv[])
{
	if (argc < 1)
	{
		return ERROR;
	}
	//
	//
	//
	//
	//
    //
    //
    //
    //
    //
	unsigned int nbegin=0;
	unsigned int nend=numRecords-1;
	unsigned int nmid;
	unsigned int nstop=0;
	char nm_got[100];
	unsigned int val_got;
    //
    //
    //
    //
    //
    //
	time_t timev,currtime;
	//
	float sec;
	timev=time(0);
        std::ifstream inputfiles[40];
        nameval names[40] = { 0 };
        bool eof_reached[40] = { false };
        int num = 40;   //
		for (int f = 0; f < 40; ++f)
		{
			 std::ostringstream filename;
			 filename << "c:\\dp4\\flout" << f << ".bin";
			std::set<nameval> records;
			std::set<nameval>::iterator iter;
			//
			//
			//
			 inputfiles[f].open(filename.str().c_str(), std::ios::binary| std::ios::in);
			 if(inputfiles[f].is_open())
				   return -3; //
					 if (!inputfiles[f].read((char*)&names[f], sizeof(nameval)))
                           return -4; //
			ret = stat(CT2A(argv[1]), &fs);
			numRecords = (int)(fs.st_size/sizeof(nameval));
			nbegin=0;
			nend=numRecords-1;
			nstop=0;
			//
			//
			//
			//
			//
				//
				//
				//
				//
				//
				//
				//
				//
				//
			//
			//
			//
			//
			//
			//
			char szArgv2[512] = { 0 };
			wcstombs(szArgv2, argv[1], sizeof(szArgv2));
			while (nbegin<=nend && nstop!=-1)
			{
				nmid= (nbegin + nend)/2;
				nameval rec = { 0 };
				istrm.seekg(nmid* sizeof(nameval));
				istrm.read((char*)&rec, sizeof(nameval)); 
				if (strcmp(szArgv2,rec.fld_nm)<0)
				{
					nend=nmid-1;
					nmid = (nbegin+nend)/2;
				}
				else
				{
					if (strcmp(szArgv2,rec.fld_nm)>0)
					{
						nbegin=nmid+1;
						nmid = (nbegin+nend)/2;
					}
					else
					{
						nstop=-1;
						strcpy(rec.fld_nm,nm_got);
						val_got=rec.fld_val;
					}
				}
			}
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
				//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			if (nstop==-1)
			  {
				  cout << "\nFound it!\n";
				  //
					//
				  //
					//
				  cout << "(From vector record: " << nm_got
					  << ' ' << val_got << ")\n";
			  }
			else cout << "\nDidn't find it!\n";
		}
	time(&currtime);
	sec=difftime(currtime,timev);
	cout << "Search finishes with only " << sec << " seconds";
	system("pause>null");
	//
    return 0;
}


and here is the .h file:
// nameval.h
#ifndef NAME_VAL_H
#define NAME_VAL_H

struct nameval
{
     char fld_nm[100];
     int    fld_val;

     int  get_len() { return min(strlen(fld_nm), (int)sizeof(fld_nm) ) ; }
     void get_uni_nm(wchar_t nm_uni[], int sizfld)
     {
            mbstowcs(nm_uni, fld_nm, min(sizfld-1, strlen(fld_nm)));
     }
     bool operator< (const nameval & a2) const
     {
           if(strcmp(fld_nm, a2.fld_nm) < 0) return true;
           if(strcmp(fld_nm, a2.fld_nm) > 0) return false;
           if (fld_val < a2.fld_val) return true;
           return false;
     }
};

#endif


I already told you that you need to cast the return value to int to get rid of the warning:
int  get_len() { return (int)min(strlen(fld_nm), sizeof(fld_nm) ) ; }



I also told you to replace mbstowcs by mbstowcs_s
size_t ncharsConverted = 0;
mbstowcs_s(&ncharsConverted, nm_uni, sizfld, fld_nm, min(sizfld-1, (int)strlen(fld_nm)));



The same applies to wcstombs, which becomes wcstombs_s:
size_t ncharsConverted = 0;
wcstombs_s(&ncharsConverted, szArgv2, sizeof(szArgv2), argv[1], min(sizeof(szArgv2)-1, wcslen(argv[1])));



The istrm variable is not defined, which is why you got a lot of errors. If you want to read the keys of 40 files instead of one, you need a loop varying f from 0 to 39 and have to use inputfiles[f] instead of istrm.

the statement 'if(inputfiles[f].is_open()) return -3;' is wrong. it must be

 if(!inputfiles[f].is_open()) return -3;





Note that the source code of readbinaryfile.cpp is still very ugly, with a number of unneeded include files, unused global variables, empty comment lines and more. Also, the errors and warnings do not correspond to the code you are posting. All that makes it very difficult to help you with your questions, especially as I mostly have to repeat what I have already explained.

Reading 40 files instead of one is surely not the best solution. If a key cannot be found, you need about 20 reads for each file, which sums up to 800 reads in total. If you merged the keys into one big file of 40 million records, you would need at most 26 reads for one search. In savebinaryfile.cpp you already have code which opens all 40 files for reading. It would be a rather simple loop to merge the keys of those files into one big file. I already cleaned up savebinaryfile.cpp, so you can easily find out what I did and do the same to readbinaryfile.cpp.
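The read counts can be checked with a few lines of code. A standard binary search over n sorted records needs at most floor(log2(n)) + 1 reads. Assuming each of the 40 files holds 1,000,000 records (an assumption inferred from the 40 million total), that gives 20 reads per file and 26 for a merged file of 40,000,000 records, close to the figures quoted above. max_probes is a hypothetical helper name.

```cpp
// Worst-case number of reads for a binary search over n sorted records:
// floor(log2(n)) + 1, computed by counting how often n can be halved.
int max_probes(unsigned long long n)
{
    int probes = 0;
    while (n > 0)
    {
        n /= 2;
        ++probes;
    }
    return probes;
}
```

So 40 separate files cost up to 40 * max_probes(1000000) = 800 reads for a key that is not present, while one merged file costs max_probes(40000000) = 26.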

Sara
Many thanks Sara.

size_t ncharsConverted = 0;
wcstombs_s(&ncharsConverted, szArgv2, sizeof(szArgv2), argv[1], min(sizeof(szArgv2)-1, wcslen(argv[1])));

Sorry, do you mean that I should replace the 2 lines below?

			char szArgv2[512] = { 0 };
			wcstombs(szArgv2, argv[1], sizeof(szArgv2));


Sorry, please disregard my last reply above.
Sorry Sara.

Can you please advise which unnecessary "include" lines I should remove?

I did post the exact code; I only removed the old, commented-out code starting with "//".

How can I resolve these errors?
1>------ Rebuild All started: Project: ReadBinaryFile, Configuration: Release x64 ------
1>  stdafx.cpp
1>  ReadBinaryFile.cpp
1>c:\dev_proj_old\visual studio 2010\projects\readbinaryfile\readbinaryfile\..\..\include\nameval.h(13): error C2660: 'mbstowcs_s' : function does not take 3 arguments
1>ReadBinaryFile.cpp(154): error C2065: 'nm_uni' : undeclared identifier
1>ReadBinaryFile.cpp(154): error C2065: 'sizfld' : undeclared identifier
1>ReadBinaryFile.cpp(154): error C2065: 'fld_nm' : undeclared identifier
1>ReadBinaryFile.cpp(154): error C2065: 'sizfld' : undeclared identifier
1>ReadBinaryFile.cpp(154): error C2065: 'fld_nm' : undeclared identifier
1>ReadBinaryFile.cpp(154): error C2065: 'sizfld' : undeclared identifier
1>ReadBinaryFile.cpp(154): error C2065: 'fld_nm' : undeclared identifier
1>ReadBinaryFile.cpp(161): error C2065: 'istrm' : undeclared identifier
1>ReadBinaryFile.cpp(161): error C2228: left of '.seekg' must have class/struct/union
1>          type is ''unknown-type''
1>ReadBinaryFile.cpp(162): error C2065: 'istrm' : undeclared identifier
1>ReadBinaryFile.cpp(162): error C2228: left of '.read' must have class/struct/union
1>          type is ''unknown-type''
1>ReadBinaryFile.cpp(179): warning C4996: 'strcpy': This function or variable may be unsafe. Consider using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string.h(105) : see declaration of 'strcpy'
1>ReadBinaryFile.cpp(228): warning C4244: '=' : conversion from 'double' to 'float', possible loss of data
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========



Are they due to this code?
//
//
#include "stdafx.h"
#include <set>
#include <stdio.h>
#include <comutil.h>
#include <sys/stat.h>
#include <string>
#include <string.h>
#include <fstream>
#include <sstream>
#include <atlbase.h>
#include <ctype.h>
#include <process.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include "..\..\include\nameval.h"
//
#include <iomanip>
#include <Windows.h>
using namespace std;
//
struct stat fs = { 0 };
int ret; //
int numRecords;
//
//
	//
	//
	//
     //
     //
     //
     //
	//
     //
     //
     //
     //
     //
     //
	//
	//
//
nameval binrec;
//
//
  //
  //
  //
  //
//
bool LessComp(const nameval& a1, const nameval& a2)
{
  if(strcmp(a1.fld_nm, a2.fld_nm) < 0) return true;
  if(strcmp(a1.fld_nm, a2.fld_nm) > 0) return false;
  if(a1.fld_val < a2.fld_val) return true;
  return false;
}
int _tmain(int argc,_TCHAR* argv[])
{
	if (argc < 1)
	{
		return ERROR;
	}
	//
	//
	//
	//
	//
    //
    //
    //
    //
    //
	unsigned int nbegin=0;
	unsigned int nend=numRecords-1;
	unsigned int nmid;
	unsigned int nstop=0;
	char nm_got[100];
	unsigned int val_got;
    //
    //
    //
    //
    //
    //
	time_t timev,currtime;
	//
	float sec;
	timev=time(0);
        std::ifstream inputfiles[40];
        nameval names[40] = { 0 };
        bool eof_reached[40] = { false };
        int num = 40;   //
		for (int f = 0; f < 40; ++f)
		{
			 std::ostringstream filename;
			 filename << "c:\\dp4\\flout" << f << ".bin";
			std::set<nameval> records;
			std::set<nameval>::iterator iter;
			//
			//
			//
			 inputfiles[f].open(filename.str().c_str(), std::ios::binary| std::ios::in);
			 if(!inputfiles[f].is_open())
				   return -3; //
					 if (!inputfiles[f].read((char*)&names[f], sizeof(nameval)))
                           return -4; //
			ret = stat(CT2A(argv[1]), &fs);
			numRecords = (int)(fs.st_size/sizeof(nameval));
			nbegin=0;
			nend=numRecords-1;
			nstop=0;
			//
			//
			//
			//
			//
				//
				//
				//
				//
				//
				//
				//
				//
				//
			//
			//
			//
			//
			//
			//
			char szArgv2[512] = { 0 };
			size_t ncharsConverted = 0;
			mbstowcs_s(&ncharsConverted, nm_uni, sizfld, fld_nm, min(sizfld-1, (int)strlen(fld_nm)));
			//
			while (nbegin<=nend && nstop!=-1)
			{
				nmid= (nbegin + nend)/2;
				nameval rec = { 0 };
				istrm.seekg(nmid* sizeof(nameval));
				istrm.read((char*)&rec, sizeof(nameval)); 
				if (strcmp(szArgv2,rec.fld_nm)<0)
				{
					nend=nmid-1;
					nmid = (nbegin+nend)/2;
				}
				else
				{
					if (strcmp(szArgv2,rec.fld_nm)>0)
					{
						nbegin=nmid+1;
						nmid = (nbegin+nend)/2;
					}
					else
					{
						nstop=-1;
						strcpy(rec.fld_nm,nm_got);
						val_got=rec.fld_val;
					}
				}
			}
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
				//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			if (nstop==-1)
			  {
				  cout << "\nFound it!\n";
				  //
					//
				  //
					//
				  cout << "(From vector record: " << nm_got
					  << ' ' << val_got << ")\n";
			  }
			else cout << "\nDidn't find it!\n";
		}
	time(&currtime);
	sec=difftime(currtime,timev);
	cout << "Search finishes with only " << sec << " seconds";
	system("pause>null");
	//
    return 0;
}



Here is the .h file:
// nameval.h
#ifndef NAME_VAL_H
#define NAME_VAL_H

struct nameval
{
     char fld_nm[100];
     int    fld_val;

     int  get_len() { return (int)min(strlen(fld_nm), sizeof(fld_nm) ) ; }
     void get_uni_nm(wchar_t nm_uni[], int sizfld)
     {
            mbstowcs_s(nm_uni, fld_nm, min(sizfld-1, strlen(fld_nm)));
     }
     bool operator< (const nameval & a2) const
     {
           if(strcmp(fld_nm, a2.fld_nm) < 0) return true;
           if(strcmp(fld_nm, a2.fld_nm) > 0) return false;
           if (fld_val < a2.fld_val) return true;
           return false;
     }
};

#endif



Once this problem is corrected, I'll make the further changes you advised.
Can you please advise which unnecessary "include" lines I should remove?
You may compare it with the savebinaryfile.cpp I posted. Generally, you don't need any C headers like stdio.h, comutil.h, string.h, ... apart from sys/stat.h. To find out whether a header is needed, comment out the include statement and compile. If you don't get any new 'undeclared identifier' error from a statement which is really necessary, the header can go.

error C2660: 'mbstowcs_s'

I posted code with a correct call of mbstowcs_s. All other errors occur because nameval.h could not be compiled (as far as I can see).

You can also comment out the 'using namespace std;' clause. However, you will then have to add the std:: prefix to 'cout' to get it compiled.

You would proceed much faster if you tried to solve the compiler errors yourself before posting them. Most of them are pretty simple, and the error message tells you exactly what is wrong.

Sara
Many thanks Sara.

I posted code with a correct call of mbstowcs_s. all other errors are because the nameval.h could not be compiled (as far as I see).

How can I make sure the .h file has been compiled?
All included .h files are compiled as part of the .cpp file that includes them. If there is a compile error within a header file, the class type it defines may be incomplete, and that can lead to follow-on errors in the .cpp file where the header was included.

The solution is to correct the errors in the order in which they were reported.

Sara
Many thanks Sara.
I made the changes like this:
//
//
#include "stdafx.h"
#include <set>
//
//
#include <sys/stat.h>
#include <string>
//
#include <fstream>
#include <sstream>
#include <atlbase.h>
#include <ctype.h>
#include <process.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include "..\..\include\nameval.h"
//
#include <iomanip>
#include <Windows.h>
//
//
struct stat fs = { 0 };
int ret; //
int numRecords;
//
//
	//
	//
	//
     //
     //
     //
     //
	//
     //
     //
     //
     //
     //
     //
	//
	//
//
nameval binrec;
//
//
  //
  //
  //
  //
//
bool LessComp(const nameval& a1, const nameval& a2)
{
  if(strcmp(a1.fld_nm, a2.fld_nm) < 0) return true;
  if(strcmp(a1.fld_nm, a2.fld_nm) > 0) return false;
  if(a1.fld_val < a2.fld_val) return true;
  return false;
}
int _tmain(int argc,_TCHAR* argv[])
{
	if (argc < 1)
	{
		return ERROR;
	}
	//
	//
	//
	//
	//
    //
    //
    //
    //
    //
	unsigned int nbegin=0;
	unsigned int nend=numRecords-1;
	unsigned int nmid;
	unsigned int nstop=0;
	char nm_got[100];
	unsigned int val_got;
    //
    //
    //
    //
    //
    //
	time_t timev,currtime;
	//
	float sec;
	timev=time(0);
        std::ifstream inputfiles[40];
        nameval names[40] = { 0 };
        bool eof_reached[40] = { false };
        int num = 40;   //
		for (int f = 0; f < 40; ++f)
		{
			 std::ostringstream filename;
			 filename << "c:\\dp4\\flout" << f << ".bin";
			std::set<nameval> records;
			std::set<nameval>::iterator iter;
			//
			//
			//
			 inputfiles[f].open(filename.str().c_str(), std::ios::binary| std::ios::in);
			 if(!inputfiles[f].is_open())
				   return -3; //
					 if (!inputfiles[f].read((char*)&names[f], sizeof(nameval)))
                           return -4; //
			ret = stat(CT2A(argv[1]), &fs);
			numRecords = (int)(fs.st_size/sizeof(nameval));
			nbegin=0;
			nend=numRecords-1;
			nstop=0;
			//
			//
			//
			//
			//
				//
				//
				//
				//
				//
				//
				//
				//
				//
			//
			//
			//
			//
			//
			//
			char szArgv2[512] = { 0 };
			size_t ncharsConverted = 0;
			mbstowcs_s(&ncharsConverted, nm_uni, sizfld, fld_nm, min(sizfld-1, (int)strlen(fld_nm)));
			//
			while (nbegin<=nend && nstop!=-1)
			{
				nmid= (nbegin + nend)/2;
				nameval rec = { 0 };
				istrm.seekg(nmid* sizeof(nameval));
				istrm.read((char*)&rec, sizeof(nameval)); 
				if (strcmp(szArgv2,rec.fld_nm)<0)
				{
					nend=nmid-1;
					nmid = (nbegin+nend)/2;
				}
				else
				{
					if (strcmp(szArgv2,rec.fld_nm)>0)
					{
						nbegin=nmid+1;
						nmid = (nbegin+nend)/2;
					}
					else
					{
						nstop=-1;
						strcpy(rec.fld_nm,nm_got);
						val_got=rec.fld_val;
					}
				}
			}
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
				//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			if (nstop==-1)
			  {
				  std::cout << "\nFound it!\n";
				  //
					//
				  //
					//
				  std::cout << "(From vector record: " << nm_got
					  << ' ' << val_got << ")\n";
			  }
			else std::cout << "\nDidn't find it!\n";
		}
	time(&currtime);
	sec=difftime(currtime,timev);
	std::cout << "Search finishes with only " << sec << " seconds";
	system("pause>null");
	//
    return 0;
}



how to correct these?
1>------ Rebuild All started: Project: ReadBinaryFile, Configuration: Release x64 ------
1>  stdafx.cpp
1>  ReadBinaryFile.cpp
1>c:\dev_proj_old\visual studio 2010\projects\readbinaryfile\readbinaryfile\..\..\include\nameval.h(13): error C2660: 'mbstowcs_s' : function does not take 3 arguments
1>ReadBinaryFile.cpp(154): error C2065: 'nm_uni' : undeclared identifier
1>ReadBinaryFile.cpp(154): error C2065: 'sizfld' : undeclared identifier
1>ReadBinaryFile.cpp(154): error C2065: 'fld_nm' : undeclared identifier
1>ReadBinaryFile.cpp(154): error C2065: 'sizfld' : undeclared identifier
1>ReadBinaryFile.cpp(154): error C2065: 'fld_nm' : undeclared identifier
1>ReadBinaryFile.cpp(154): error C2065: 'sizfld' : undeclared identifier
1>ReadBinaryFile.cpp(154): error C2065: 'fld_nm' : undeclared identifier
1>ReadBinaryFile.cpp(161): error C2065: 'istrm' : undeclared identifier
1>ReadBinaryFile.cpp(161): error C2228: left of '.seekg' must have class/struct/union
1>          type is ''unknown-type''
1>ReadBinaryFile.cpp(162): error C2065: 'istrm' : undeclared identifier
1>ReadBinaryFile.cpp(162): error C2228: left of '.read' must have class/struct/union
1>          type is ''unknown-type''
1>ReadBinaryFile.cpp(179): warning C4996: 'strcpy': This function or variable may be unsafe. Consider using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string.h(105) : see declaration of 'strcpy'
1>ReadBinaryFile.cpp(228): warning C4244: '=' : conversion from 'double' to 'float', possible loss of data
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========


and here is .h file
// nameval.h
#ifndef NAME_VAL_H
#define NAME_VAL_H

struct nameval
{
     char fld_nm[100];
     int    fld_val;

     int  get_len() { return (int)min(strlen(fld_nm), sizeof(fld_nm) ) ; }
     void get_uni_nm(wchar_t nm_uni[], int sizfld)
     {
            mbstowcs_s(nm_uni, fld_nm, min(sizfld-1, strlen(fld_nm)));
     }
     bool operator< (const nameval & a2) const
     {
           if(strcmp(fld_nm, a2.fld_nm) < 0) return true;
           if(strcmp(fld_nm, a2.fld_nm) > 0) return false;
           if (fld_val < a2.fld_val) return true;
           return false;
     }
};

#endif


Hi HuaMinChen,

Sorry that I have to say this, but IMO it's getting more and more tedious how persistently you refuse to follow Sara's suggestions/explanations.

You're asking the same thing again and again; e.g. in your last post this error occurred for the fifth time:
error C2065: 'istrm' : undeclared identifier


This is one of the easiest errors that can happen in C/C++, and although it is completely self-descriptive, Sara told you what its cause is, how to solve it, and that you should first fix all those errors.

And to be honest: I don't want to offend you in any way but IMO if you want to write real programs it is a must to be able to find and fix such trivial errors.

Have a nice day,

best regards,

ZOPPO

PS @Sara: respect for your patience ...
ZOPPO, thanks for your support.

in my opinion the problem is that Hua is not a programmer and therefore has trouble solving even basic issues ...

nevertheless, the goal to write two programs, where one writes a dictionary with 40 million keys and the other is able to search for keys in the dictionary, is an ambitious and worthwhile project which could be accomplished with some effort. the lack of c++ knowledge is a problem but perhaps could be compensated by willingness.

the speed of this process can only be increased by Hua himself, as i can only give pointers and will neither repeat complete explanations nor do the work for him.

so at the moment we are stuck, because i already explained that the errors must be solved in the order they were reported by the compiler, and i already posted how the first compiler error of the error list could/should be solved. by reading all my posts thoroughly it should be possible to get out of this.

Sara
Sara,
I really appreciate your help, and I hope to give you as little trouble as I can.

To the problem of this line

			mbstowcs_s(&ncharsConverted, nm_uni, sizfld, fld_nm, min(sizfld-1, (int)strlen(fld_nm)));
			...



What should I adjust for "nm_uni" within the .h file?
What should I adjust for "nm_uni" within the .h file?
nm_uni is an argument passed to the get_uni_nm function. it is not the reason for the compiler error, which states that the number of arguments to the mbstowcs_s function is wrong. you may recall the "history" of this error. the former function was mbstowcs without the _s suffix. this function has 3 arguments and produces a compiler warning stating that writing to a buffer of unknown size is possibly unsafe, which is accepted by some experts and denied by others. if you go with the others, you could simply switch off the message by defining at the top of the source:

#pragma warning (disable: 4996) 



or you define _CRT_SECURE_NO_WARNINGS in the preprocessor macros.

the solution i suggested was to replace mbstowcs by mbstowcs_s, and i already posted code showing how to call that function. you replaced the function name but not the arguments, hence the compiler complained that the call does not have enough arguments. in your last post you used the correct call but did not define a variable for 'ncharsConverted', which is used for the first argument. if you look at the code i posted, you will surely see the difference and can solve the error.
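
To make the difference in argument counts concrete, here is a minimal sketch of a conversion helper. The name to_wide is hypothetical and not from this thread; on Visual C++ it uses the five-argument mbstowcs_s form discussed above, and elsewhere it falls back to the three-argument mbstowcs so that the snippet also builds with other compilers. It is only an illustration, not the code posted earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cwchar>

// Hypothetical helper (not from the thread): converts a narrow string to a
// wide string, writing at most dstCount wchar_t including the terminator.
// Returns true on success.
inline bool to_wide(const char* src, wchar_t* dst, std::size_t dstCount)
{
    assert(src != nullptr && dst != nullptr);
    if (dstCount == 0) return false;
#ifdef _MSC_VER
    // Five arguments: converted count (out), destination, destination size in
    // wide characters, source, maximum number of characters to convert.
    std::size_t converted = 0;
    return mbstowcs_s(&converted, dst, dstCount, src, dstCount - 1) == 0;
#else
    // Three-argument fallback for compilers without mbstowcs_s.
    const std::size_t n = std::mbstowcs(dst, src, dstCount - 1);
    if (n == static_cast<std::size_t>(-1)) { dst[0] = L'\0'; return false; }
    dst[n] = L'\0';  // mbstowcs does not terminate when the buffer fills up
    return true;
#endif
}
```

Inside get_uni_nm the call would then look like to_wide(fld_nm, nm_uni, sizfld), assuming sizfld is the size of nm_uni in wide characters.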

Sara
Hi Sara,
I have these 2 lines
			size_t ncharsConverted = 0;
			mbstowcs_s(&ncharsConverted, nm_uni, sizfld, fld_nm, min(sizfld-1, (int)strlen(fld_nm)));
			...



to my source. Are you talking about these 2 lines? Do I need to adjust

mbstowcs_s



to .h file?
yes. i mean those 2 lines and if you want to replace mbstowcs by mbstowcs_s you have to do the adjustment in the h file and recompile.

Sara
Thanks Sara.
I see .h file now using
mbstowcs_s



like

            mbstowcs_s(nm_uni, fld_nm, min(sizfld-1, strlen(fld_nm)));
            ...


what change should be applied to the above line?
the change you posted in https://www.experts-exchange.com/questions/28594715/Way-to-scan-the-path.html?anchorAnswerId=40557069#a40557069

there is really no difficulty and i have explained any detail more than once.

Hua, it makes little sense to proceed if you are not able to replace one statement by two other statements, then compile and check whether the error has gone.

Sara
Sorry, I adjusted the .h file to be

// nameval.h
#ifndef NAME_VAL_H
#define NAME_VAL_H

struct nameval
{
     char fld_nm[100];
     int    fld_val;

     int  get_len() { return (int)min(strlen(fld_nm), sizeof(fld_nm) ) ; }
     void get_uni_nm(wchar_t nm_uni[], int sizfld)
     {
            //mbstowcs_s(nm_uni, fld_nm, min(sizfld-1, strlen(fld_nm)));
			size_t ncharsConverted = 0;
			mbstowcs_s(&ncharsConverted, nm_uni, sizfld, fld_nm, min(sizfld-1, (int)strlen(fld_nm)));
     }
     bool operator< (const nameval & a2) const
     {
           if(strcmp(fld_nm, a2.fld_nm) < 0) return true;
           if(strcmp(fld_nm, a2.fld_nm) > 0) return false;
           if (fld_val < a2.fld_val) return true;
           return false;
     }
};

#endif



but I still get these
1>------ Rebuild All started: Project: ReadBinaryFile, Configuration: Release x64 ------
1>  stdafx.cpp
1>  ReadBinaryFile.cpp
1>ReadBinaryFile.cpp(155): error C2065: 'nm_uni' : undeclared identifier
1>ReadBinaryFile.cpp(155): error C2065: 'sizfld' : undeclared identifier
1>ReadBinaryFile.cpp(155): error C2065: 'fld_nm' : undeclared identifier
1>ReadBinaryFile.cpp(155): error C2065: 'sizfld' : undeclared identifier
1>ReadBinaryFile.cpp(155): error C2065: 'fld_nm' : undeclared identifier
1>ReadBinaryFile.cpp(155): error C2065: 'sizfld' : undeclared identifier
1>ReadBinaryFile.cpp(155): error C2065: 'fld_nm' : undeclared identifier
1>ReadBinaryFile.cpp(162): error C2065: 'istrm' : undeclared identifier
1>ReadBinaryFile.cpp(162): error C2228: left of '.seekg' must have class/struct/union
1>          type is ''unknown-type''
1>ReadBinaryFile.cpp(163): error C2065: 'istrm' : undeclared identifier
1>ReadBinaryFile.cpp(163): error C2228: left of '.read' must have class/struct/union
1>          type is ''unknown-type''
1>ReadBinaryFile.cpp(180): warning C4996: 'strcpy': This function or variable may be unsafe. Consider using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
1>          c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\string.h(105) : see declaration of 'strcpy'
1>ReadBinaryFile.cpp(229): warning C4244: '=' : conversion from 'double' to 'float', possible loss of data
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========


Thanks a lot Sara.

Can I still use "mbstowcs_s" on line 155? If yes, how do I declare these

nm_uni, sizfld, fld_nm



well, as they are unrecognized.

Or should I change the line to use "wcstombs_s"?
Thanks a lot.
I did generate the .exe file.

when I do one search like
ReadBinaryFile "zzwhRnVMJBSAInPFMrcz"



Although the above string does exist in one file within "c:\dp4", no result is returned. Why?
I mean the above string does exist in one file within the "c:\dp4" folder.
Although the above string does exist in one file within "c:\dp4"
how do you know? in which file? how long does the query run? did you add a message after each file where you didn't find the key? did you reset the nstop flag to 0 for each file? otherwise only the first file would be searched and the loop would break immediately for all other files.

can you post the final code (purified)?

did you apply my suggestions regarding searching thru all 40 files as opposed to the former code where you had only one file?
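
As a rough illustration of what "search thru all files with fresh state per file" can look like, here is a minimal sketch. The names Record, find_in_file and find_in_files are hypothetical; Record only mirrors the layout of nameval. The sketch takes the record count from the size of the data file itself (not from the search key), keeps all search variables local to one file, and copies the matching record out on a hit. It is an assumption-laden sketch, not the code from the hidden solutions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <sstream>
#include <string>

// Same layout as the nameval record in the thread, without the helper methods.
struct Record
{
    char fld_nm[100];
    int  fld_val;
};

// Hypothetical helper: binary search for `key` in one sorted record file.
// All search state is local, so nothing carries over between files.
inline bool find_in_file(const std::string& path, const char* key, Record& found)
{
    assert(key != nullptr);
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) return false;

    // The record count comes from the size of the data file itself.
    in.seekg(0, std::ios::end);
    const std::streamoff bytes = in.tellg();
    const std::int64_t count = bytes / static_cast<std::streamoff>(sizeof(Record));

    // Signed 64-bit bounds: hi = mid - 1 cannot wrap around when mid is 0.
    std::int64_t lo = 0, hi = count - 1;
    while (lo <= hi)
    {
        const std::int64_t mid = lo + (hi - lo) / 2;
        Record rec = {};
        in.seekg(static_cast<std::streamoff>(mid) * static_cast<std::streamoff>(sizeof(Record)));
        in.read(reinterpret_cast<char*>(&rec), sizeof(Record));
        if (!in) return false;

        const int cmp = std::strncmp(key, rec.fld_nm, sizeof(rec.fld_nm));
        if (cmp < 0)      hi = mid - 1;
        else if (cmp > 0) lo = mid + 1;
        else { found = rec; return true; }
    }
    return false;
}

// Hypothetical driver: tries <prefix>0.bin, <prefix>1.bin, ... in turn.
// Returns the index of the file holding the key, or -1 if no file has it.
inline int find_in_files(const std::string& prefix, int fileCount, const char* key, Record& found)
{
    for (int f = 0; f < fileCount; ++f)
    {
        std::ostringstream path;
        path << prefix << f << ".bin";
        if (find_in_file(path.str(), key, found)) return f;
    }
    return -1;
}
```

A call such as find_in_files("c:\\dp4\\flout", 40, key, rec) would then report which file, if any, contains the key, which also answers the "in which file?" question above.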

Sara
Many thanks Sara.
Here is the file
https://dl.dropboxusercontent.com/u/40211031/flout5.bin
inside which the string does exist and such file does exist within c:\dp4.

Here is the current code
//
//
#pragma warning (disable: 4996) 
#include "stdafx.h"
#include <set>
//
//
#include <sys/stat.h>
#include <string>
//
#include <fstream>
#include <sstream>
#include <atlbase.h>
#include <ctype.h>
#include <process.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include "..\..\include\nameval.h"
//
#include <iomanip>
#include <Windows.h>
//
//
struct stat fs = { 0 };
int ret; //
int numRecords;
//
//
	//
	//
	//
     //
     //
     //
     //
	//
     //
     //
     //
     //
     //
     //
	//
	//
//
nameval binrec;
//
//
  //
  //
  //
  //
//
bool LessComp(const nameval& a1, const nameval& a2)
{
  if(strcmp(a1.fld_nm, a2.fld_nm) < 0) return true;
  if(strcmp(a1.fld_nm, a2.fld_nm) > 0) return false;
  if(a1.fld_val < a2.fld_val) return true;
  return false;
}
int _tmain(int argc,_TCHAR* argv[])
{
	if (argc < 1)
	{
		return ERROR;
	}
	//
	//
	//
	//
	//
    //
    //
    //
    //
    //
	unsigned int nbegin=0;
	unsigned int nend=numRecords-1;
	unsigned int nmid;
	unsigned int nstop=0;
	char nm_got[100];
	unsigned int val_got;
    //
    //
    //
    //
    //
    //
	time_t timev,currtime;
	//
	float sec;
	timev=time(0);
        std::ifstream inputfiles[40];
        nameval names[40] = { 0 };
        bool eof_reached[40] = { false };
        int num = 40;   //
		for (int f = 0; f < 40; ++f)
		{
			 std::ostringstream filename;
			 filename << "c:\\dp4\\flout" << f << ".bin";
			std::set<nameval> records;
			std::set<nameval>::iterator iter;
			//
			//
			//
			 inputfiles[f].open(filename.str().c_str(), std::ios::binary| std::ios::in);
			 if(!inputfiles[f].is_open())
				   return -3; //
					 if (!inputfiles[f].read((char*)&names[f], sizeof(nameval)))
                           return -4; //
			ret = stat(CT2A(argv[1]), &fs);
			numRecords = (int)(fs.st_size/sizeof(nameval));
			nbegin=0;
			nend=numRecords-1;
			nstop=0;
			//
			//
			//
			//
			//
				//
				//
				//
				//
				//
				//
				//
				//
				//
			//
			//
			//
			//
			//
			//
			char szArgv2[512] = { 0 };
			size_t ncharsConverted = 0;
			//
			wcstombs(szArgv2, argv[1], sizeof(szArgv2));
			while (nbegin<=nend && nstop!=-1)
			{
				nmid= (nbegin + nend)/2;
				nameval rec = { 0 };
				inputfiles[f].seekg(nmid* sizeof(nameval));
				inputfiles[f].read((char*)&rec, sizeof(nameval)); 
				if (strcmp(szArgv2,rec.fld_nm)<0)
				{
					nend=nmid-1;
					nmid = (nbegin+nend)/2;
				}
				else
				{
					if (strcmp(szArgv2,rec.fld_nm)>0)
					{
						nbegin=nmid+1;
						nmid = (nbegin+nend)/2;
					}
					else
					{
						nstop=-1;
						strcpy(rec.fld_nm,nm_got);
						val_got=rec.fld_val;
					}
				}
			}
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
				//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			//
			if (nstop==-1)
			  {
				  std::cout << "\nFound it!\n";
				  //
					//
				  //
					//
				  std::cout << "(From vector record: " << nm_got
					  << ' ' << val_got << ")\n";
			  }
			else std::cout << "\nDidn't find it!\n";
		}
	time(&currtime);
	sec=difftime(currtime,timev);
	std::cout << "Search finishes with only " << sec << " seconds";
	system("pause>null");
	//
    return 0;
}



Since I just removed the old code after each "//", please let me know if there is a problem compiling the above.
reading 40 files instead of one is surely not the best solution. if a key cannot be found, you would have 20 reads for each file, which sums up to 800 reads in total. if you merged the keys into one big file of 40 million records, you would have a maximum of about 26 reads for one search. in savebinaryfile.cpp you already have code which opens all 40 files for reading. it would be a rather simple loop to merge the keys of the files into one big file. savebinaryfile.cpp was already cleaned up by me, and you could easily find out what I have done and do something similar for readbinaryfile.cpp.
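
The read counts can be checked with a short calculation. The sketch below assumes the 40 million keys are split evenly, i.e. 1,000,000 records per file (an assumption, the thread does not state the exact split). The function name worst_case_reads is hypothetical. A binary search over n sorted records needs at most floor(log2(n)) + 1 record reads.

```cpp
#include <cassert>
#include <cstdint>

// Worst-case number of record reads for a binary search over n sorted
// records: floor(log2(n)) + 1 for n >= 1, computed by repeated halving.
inline int worst_case_reads(std::uint64_t n)
{
    int reads = 0;
    while (n > 0) { ++reads; n /= 2; }
    return reads;
}
```

Under that assumption, worst_case_reads(1000000) gives 20 per file, so 800 for a miss across 40 files, while worst_case_reads(40000000) gives 26 for the single merged file.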
Sara,
Really appreciate you a lot!

Do you think creating one big file would slow down the whole machine?
I mean, compared with using only the current way.
Do you think creating one big file would slow down the whole machine?
no, not at all. if you have enough space on your disk, the ntfs file system is able to handle a 4 GB file within reasonable times. you could help by creating a file of the right size in advance (for example by using a little helper program which opens a new file and then uses a seekp call to write record 40.000.000).
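
A minimal sketch of such a helper is shown here. The names Record and preallocate are hypothetical, with Record mirroring the nameval layout. It seeks to the slot of the last record and writes one empty record there, so the file gets its final size in one step.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Same layout as the nameval record in the thread.
struct Record
{
    char fld_nm[100];
    int  fld_val;
};

// Hypothetical helper: creates `path` sized for recordCount records by
// seeking to the last record slot and writing one empty record there.
inline bool preallocate(const std::string& path, std::uint64_t recordCount)
{
    if (recordCount == 0) return false;
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out) return false;

    // Offset of the last record, computed in 64 bits before the cast.
    const std::uint64_t last = (recordCount - 1) * sizeof(Record);
    out.seekp(static_cast<std::streamoff>(last));

    const Record empty = {};
    out.write(reinterpret_cast<const char*>(&empty), sizeof(Record));
    return out.good();
}
```

For the big file this would be called as preallocate(path, 40000000). A later merge step would then have to open that file for update rather than truncating it.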

you may open a new question where you ask how to merge 40 sorted files into one big one with best speed.
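
For orientation, one common way to merge several sorted files is a k-way merge with a min-heap, sketched below. This is a swapped-in standard technique, not the loop from savebinaryfile.cpp; Record, HeapItem, Greater and merge_sorted are hypothetical names, and the ordering matches the nameval operator< (name first, then value).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <queue>
#include <string>
#include <vector>

// Same layout as the nameval record in the thread.
struct Record { char fld_nm[100]; int fld_val; };

// Heap entry: the current record of one input file plus that file's index.
struct HeapItem { Record rec; std::size_t src; };

// Orders the priority_queue so the smallest (name, value) pair is on top.
struct Greater
{
    bool operator()(const HeapItem& a, const HeapItem& b) const
    {
        const int c = std::strncmp(a.rec.fld_nm, b.rec.fld_nm, sizeof(a.rec.fld_nm));
        return c != 0 ? c > 0 : a.rec.fld_val > b.rec.fld_val;
    }
};

// Hypothetical helper: merges already-sorted record files into one sorted
// file. Returns the number of records written, or -1 on an I/O error.
inline long long merge_sorted(const std::vector<std::string>& inputs, const std::string& output)
{
    std::vector<std::ifstream> in(inputs.size());
    std::priority_queue<HeapItem, std::vector<HeapItem>, Greater> heap;

    // Prime the heap with the first record of every input file.
    for (std::size_t i = 0; i < inputs.size(); ++i)
    {
        in[i].open(inputs[i].c_str(), std::ios::binary);
        if (!in[i]) return -1;
        HeapItem item = {};
        item.src = i;
        if (in[i].read(reinterpret_cast<char*>(&item.rec), sizeof(Record))) heap.push(item);
    }

    std::ofstream out(output.c_str(), std::ios::binary | std::ios::trunc);
    if (!out) return -1;

    long long written = 0;
    while (!heap.empty())
    {
        HeapItem item = heap.top();
        heap.pop();
        out.write(reinterpret_cast<const char*>(&item.rec), sizeof(Record));
        ++written;
        // Refill from the file the written record came from.
        if (in[item.src].read(reinterpret_cast<char*>(&item.rec), sizeof(Record))) heap.push(item);
    }
    return out.good() ? written : -1;
}
```

Each input file is read once from start to end, and only one record per file is held in memory at a time, so the heap stays at 40 entries for 40 files.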

note, if you have the big file, you must adapt readbinaryfile to use only unsigned int variables for positioning, because 4 GB requires all 32 bits of a 32-bit integer, so the sign bit must be freed up by using only 'unsigned int'.
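
The numbers behind this note can be checked with a small sketch. It assumes sizeof(nameval) is 104 bytes (char[100] plus a 4-byte int); record_offset and fits_in are hypothetical names. Computing offsets in a 64-bit type is shown here as an alternative that leaves more headroom than unsigned int.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Assumed record size: char[100] + 4-byte int, as in nameval.
const std::uint64_t kRecordSize = 104;

// Byte offset of record `index`, computed in 64 bits so it cannot wrap.
inline std::uint64_t record_offset(std::uint64_t index)
{
    return index * kRecordSize;
}

// True if `offset` is representable in the integer type T.
// The parentheses around max avoid a clash with the max macro from Windows.h.
template <typename T>
inline bool fits_in(std::uint64_t offset)
{
    return offset <= static_cast<std::uint64_t>((std::numeric_limits<T>::max)());
}
```

With 40,000,000 records the file size is 4,160,000,000 bytes, which does not fit in a signed 32-bit int and fits in an unsigned 32-bit int with little room to spare. It may also be worth verifying how the file size is obtained, since as far as I know the st_size field of plain stat is a 32-bit type with Visual C++.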

Sara