Way to improve it

Hi,
Further to this thread
http://www.experts-exchange.com/Programming/Languages/CPP/Q_28629782.html#a40686952

I try the way to have 10 items to the struct below

struct rec_struc
{
      item items[10];
      rec_struc(bool bfill = false);

};

but the speed of the project is rather slow. Is there a way to improve it?
LVL 12
HuaMin ChenProblem resolverAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

sarabandeCommented:
you may try with

const unsigned int NUM_FILES = 100;
const unsigned int NUM_RECORDS = 80000;

Open in new window


what produces the same output.

the speed mostly is determined by the size of the std::set containers, so the above should increase speed dramatically.

a second thing you could do is to move the descriptions of dynamic length to a separate data file. that way you would store less than the half of the current data to disk.

Sara
0
David Johnson, CD, MVPOwnerCommented:
I would use something like Telerik JustTrace to analyse my code to determine where the maximum time is spent. I'd combine Just trace with perfmon or some other system performance monitoring tool and look for disk and memory usage.
0
HooKooDooKuCommented:
Each time you create an instance of 'rec_struct', you are creating 10 instances of 'item'.

Do you ALWAYS need 10 instances of 'item'?

If not, you can improve performance by changing your structure from an array of 'item' objects to an array of pointers to 'item' objects.

struct rec_struc
{
      item* pItems[10];
      rec_struc(bool bfill = false);

};

Even better should be to use a dynamic array object.  That way you're not locked into only 10 pointers.  For example, I use MFC that includes a CPtrArray.  So I might have code that looks like this:

struct rec_struc
{
      CPtrArray Items;
      ~rec_struc();    //Destructor

};

//This destructor will make sure that all 'Item' objects placed into Items array are deleted when the structure goes out of scope
rec_struc::~rec_struc()
{
    for(int i=0; i<Items.GetCount(); i++
        delete Items.GetAt(i);
}

...

rec_struc r;
item* pItem;
pItem = new Item();
r.Items.Add( pItem );    //The CPtrArray will auto-grow the array as needed

...

pItem = (Item) r.Items.GetAt(i);
0
Angular Fundamentals

Learn the fundamentals of Angular 2, a JavaScript framework for developing dynamic single page applications.

sarabandeCommented:
HooKooDooKu, the structures of the program were used for storing binary data to records of file. so, it is not possible to using pointers or classes/structures which cannot made persistent by a flat copy.

in the previous thread (see the link in the original post), we used a structure fouritems (which has an array of 4 items and not 10 like it is here) and stored 80.000.000 records (320 million items) to a binary file. an item contains of a name (char[21]), a 64-bit number, and a description (wchar_t[100]). for name and number there are additional index files which refer to the record numbers of the data file (means any item name and item number - which are supposed to be unique - point to a 4-item-record of the data file). the index files are sorted and allow a binary search in a second program for either name or number.

I don't know why the number of items per record now should be increased to 10. my suggestion was to use 1 item per record. but anyhow, the most negative influence to the performance are due to std::set containers which are used to get the name and number indices sorted. with the current program (it is a 64-bit program because it needs more than 4 gb virtual memory), there are 2 std::set containers, one would store 10 million of name indices (a 21 byte name + unsigned int record number)  and the other 10 million of number indices (a 64-bit integer + unsigned int record number). after the std::set containers were filled, they were stored to a temporary file. currently, 8 index files of each index were built and finally merged to a big index file which contains all 80 million of indices of each kind. my suggestion to improve performance is to decrease the number of entries for one temporary index file and increase the number of temporary files to merge since currently the virtual memory used for the std::set containers is more than available what leads to extreme swapping and a cillion of page faults. on the other hand the merge operation of the index files doesn't need a relevant amount of memory and the speed to read 100 k records from 100 files is the same as reading 1 million records from 10 files.  so, if the speed would improve dramatically - as I would expect - it might be worth a try to go to 1000 files of 10000 records each.

Hua, would you please post the latest code of savebinaryfile here in a code window such that other experts have a fair chance to participate?

Sara
0
HuaMin ChenProblem resolverAuthor Commented:
Many thanks all.
Sara,
Using these
const unsigned int NUM_FILES = 100;
const unsigned int NUM_RECORDS = 80000;
...

Open in new window


it took 7 hours to generate 100 name files and 100 number files. Can the speed of this be improved? Have a great weekend!
0
sarabandeCommented:
it might be worth a try to go to 1000 files of 10000 records each.
did you try this?

of course you should run a release version.

then please measure time for creating the std::set containers by adding

clock_t start = clock();
for (n = 0; n < NUM_RECORDS;  ++n)
{
    ...
}
clock_t end = clock();
std::cout << "time elapsed for building std::set " <<  (end-start) << std::endl;

into the f-loop.

Sara
0
HuaMin ChenProblem resolverAuthor Commented:
Many thanks Sara.

When running "ReadBinaryFile", do we put

Name=...

with the name and then press Enter to start the search? Is this correct?
0
sarabandeCommented:
yes. to find valid names you ,ay open any of the name index files within visual studio.

Sara
0
HuaMin ChenProblem resolverAuthor Commented:
Good day Sara,
I encounter one open error like

Enter your Search Criteria: (CTRL-Z to stop)
Name=PvPHLtNDJAJMeENgbGZ
open error
Enter your Search Criteria: (CTRL-Z to stop)

Open in new window


when running ReadBinaryFile
0
sarabandeCommented:
you may output 'errno' or GetLastError() with the "open error".

error code 2: file or path not found
error code 3: wrong directory
error code 4: too many open files
error code 5: access denied

for 2 and 3 you may output (or debug) the path passed to the open call.
for 4 (less likely) you would have to reduce the number of files opened (or look why the files no longer in use are not closed)
if error is 5 it is likely that another program opened the files and either still was running (look into task manager) or did not close the file properly. if you opened a file in visual studio for to look into, this also may prevent from opening in readbinaryfile.

Sara
0
HuaMin ChenProblem resolverAuthor Commented:
Many thanks Sara.
Where to put "GetLastError() "?
0
sarabandeCommented:
you do it like

std::cout << strpath << " open error " << GetLastError() << std::endl;

Open in new window


where strpath is the std::string containing the path to the file which couldn't be opened.

alternatively include <errno.h>  (if not already included) and do

std::cout << strpath << " open error " << errno << std::endl;

Open in new window


the output would report the same with both.

Sara
0
HuaMin ChenProblem resolverAuthor Commented:
Many thanks Sara. How to get full error message? I now only get this

Enter your Search Criteria: (CTRL-Z to stop)
Name=PvPHLtNDJAJMeENgbGZ
 open error 2

Enter your Search Criteria: (CTRL-Z to stop)

Open in new window


after I've added the line like
	if (!indexfile.is_open())
	{
		std::cout << " open error " << GetLastError() << std::endl;
		return false;
	}
	...

Open in new window

0
sarabandeCommented:
as told, error==2 means that the path passed to indexfile.open(...) doesn't exist.

you may post the contents of the string variable that was passed to indexfile.open and the code how you build the path.

did you use the 'createFileName' function?

if so, you must not pass a file number to be added cause you want the merged index file to be used in readbinaryfile.

Sara
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
HuaMin ChenProblem resolverAuthor Commented:
Yes, I do use createFilename. As it is showing the full message, per the change I applied to the codes, what to further adjust?
0
HuaMin ChenProblem resolverAuthor Commented:
Good day Sara,
I recently did no change to ReadBinaryFile but have adjust the number of files, and number of records, within SaveBinaryFile. Many thanks.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
C++

From novice to tech pro — start learning today.