Solved

Writing to files using VC++ - Too Slow?

Posted on 1998-10-08
Last Modified: 2008-03-17
Hello Guyz.
I'm trying to write to files using the WriteFile function.
The thing is, it is very slow!
I have to write a file that contains about 200-300 integers.
I use a loop, like
for (i=0;i<200;i++)
    WriteFile(FileHandle, &Int[i] ...)
and it takes a VERY LONG TIME to finish (at least on my P166).
How can I overcome this? Should I write larger 'blocks' to the file, so that I make fewer WriteFile calls? Is that what makes it so slow?

Maybe there is a better way to read and write files? Maybe memory-mapping? (I don't know what it means, I just heard about it.) Please help...
Question by:ShadowHawk071998
16 Comments
 
LVL 3

Accepted Solution

by:
plaroche earned 90 total points
ID: 1174669
I don't know how much overhead this causes, but you could try a binary dump. Something like:

int   arr[200];

WriteFile(FileHandle, arr, 200*sizeof(int), ...);

That does it in just one call, and I suspect it'd be a LOT faster.
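The single-call idea can be sketched in portable C++ as follows. This is only an illustration: `write_ints` and `read_ints` are hypothetical helper names, and `std::fwrite` stands in for the WriteFile call discussed in the thread.

```cpp
#include <cstdio>
#include <vector>

// Write the whole array with one library call instead of one call per int.
bool write_ints(const char* path, const std::vector<int>& values) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    std::size_t n = std::fwrite(values.data(), sizeof(int), values.size(), f);
    bool closed = std::fclose(f) == 0;
    return closed && n == values.size();
}

// Read `count` ints back with one call.
bool read_ints(const char* path, std::vector<int>& values, std::size_t count) {
    values.resize(count);
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return false;
    std::size_t n = std::fread(values.data(), sizeof(int), count, f);
    std::fclose(f);
    return n == count;
}
```

With the Win32 API the shape would be the same: one WriteFile call that is given the array and its size in bytes.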
 
LVL 3

Expert Comment

by:plaroche
ID: 1174670
Just to add a tidbit and explain memory-mapping.

Let's say you create a file that's 10k in size and you "map it". This means that you receive a pointer to a memory address, and that pointer is valid from that address through the next 10k.

When you write data to those addresses, the OS (NT) saves it to the file automatically.

I use this with huge structures and it's very nice: you cast the pointer you got to a pointer to your struct and write to the struct. It is written to the file automatically.

This is the same mechanism NT uses to load a DLL once and "map it" into different processes.
 
LVL 22

Expert Comment

by:nietod
ID: 1174671
There is a lot of overhead in starting a file write operation.  Thus you are best off calling the write operation a few times for large amounts of data rather than many times for small amounts of data.  If you write out the integers in one or a few large chunks, it will be much faster.
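One way to get "a few large writes" without restructuring the calling code is to collect the small writes in a memory buffer and hand the buffer over only when it fills up. The sketch below assumes a hypothetical `ChunkedWriter` class; `std::fwrite` stands in for the real write call, and `flushes()` counts how many times that underlying call was made.

```cpp
#include <cstdio>
#include <cstring>
#include <vector>

class ChunkedWriter {
public:
    ChunkedWriter(std::FILE* file, std::size_t capacity)
        : file_(file), buffer_(capacity ? capacity : 1), used_(0), flushes_(0) {}

    // Append the raw bytes of one value (an int, a char, a plain struct).
    template <typename T>
    bool put(const T& value) { return put_bytes(&value, sizeof value); }

    // Append a block of bytes, flushing whenever the buffer is full.
    bool put_bytes(const void* data, std::size_t size) {
        const char* src = static_cast<const char*>(data);
        while (size > 0) {
            if (used_ == buffer_.size() && !flush()) return false;
            std::size_t n = buffer_.size() - used_;
            if (n > size) n = size;
            std::memcpy(&buffer_[used_], src, n);
            used_ += n;
            src += n;
            size -= n;
        }
        return true;
    }

    // Hand everything collected so far to the file in one call.
    bool flush() {
        std::size_t n = used_;
        used_ = 0;
        if (n == 0) return true;
        ++flushes_;
        return std::fwrite(buffer_.data(), 1, n, file_) == n;
    }

    unsigned flushes() const { return flushes_; }

private:
    std::FILE* file_;
    std::vector<char> buffer_;
    std::size_t used_;
    unsigned flushes_;
};
```

Writing 300 ints through a 4096-byte buffer and then calling `flush()` once results in a single underlying write. The caller still has to call `flush()` before closing the file.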
 
LVL 8

Expert Comment

by:Answers2000
ID: 1174672
>> That would do it in just one call and I suspect it'd be a LOT faster.
It is a lot faster.

By making equivalent changes to the character-reading code for a WP file (about 5000 chars), I took it from more than 60 s to less than 1 s.
 

Author Comment

by:ShadowHawk071998
ID: 1174673
Okay. I understand this will help (I can't check it till next week, when I'm back at work...).
Can anyone explain this 'memory-mapping' a bit more precisely?
I mean, can you give me an example?
And if I use it, can I write single integers or chars to the file and have it work fast? Or am I better off using big 'dumps' to the file?
My main concern is speed, not ease of writing. I'm willing to use complicated code if it proves to be faster.
thanks!
 
LVL 3

Expert Comment

by:plaroche
ID: 1174674
Memory-mapping won't be faster for the use you need. Here's a bit of code off the top of my head:

struct sExample {
  int   array[20];
  char  name[32];
};

// Then let's say you have the code that creates a file of size
// equal to sizeof(sExample), or more.
    m_hFile = ::CreateFile(m_fileName,
                        GENERIC_READ|GENERIC_WRITE,
                        FILE_SHARE_READ|FILE_SHARE_WRITE,
                        0,
                        OPEN_ALWAYS,
                        FILE_ATTRIBUTE_NORMAL,
                        NULL);

    if( m_hFile == INVALID_HANDLE_VALUE ) {
        return FALSE;
    }

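    m_hFileMap = ::CreateFileMapping(m_hFile,
                                NULL,
                                PAGE_READWRITE,
                                0,
                                (DWORD)sizeof(sExample),
                                NULL);

    // CreateFileMapping returns NULL (not INVALID_HANDLE_VALUE) on failure.
    if( m_hFileMap == NULL ) {
        ::CloseHandle(m_hFile);
        return FALSE;
    }

// NOW you "map" the file in memory. A pointer is returned.
    m_pMemMap = ::MapViewOfFile(m_hFileMap,
                                FILE_MAP_READ|FILE_MAP_WRITE,
                                0,
                                0,
                                0 );   // 0 bytes = map the whole mapping

    if( m_pMemMap == NULL ) {
        ::CloseHandle(m_hFileMap);
        ::CloseHandle(m_hFile);
        return FALSE;
    }

// Now you can treat this memory zone as the struct, WITHOUT allocating
// the struct, because you already got the memory with the file-mapping.
// MapViewOfFile returns a void pointer, so C++ needs a cast here.
  sExample*   pEx = (sExample*)m_pMemMap;

// And now you can write to the struct. Everything you modify in
// the struct (which is really the memory-mapped file) will be
// written to disk by the OS.
  pEx->array[0] = 1;
  // etc...
// When done: ::UnmapViewOfFile(m_pMemMap), then ::CloseHandle on both handles.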
 

Author Comment

by:ShadowHawk071998
ID: 1174675
So what exactly is the benefit of memory mapping to me?
I mean, I don't need to use a 'struct'; it is basically a file with many integers in it.
I think it would be better for me to read it all in one chunk and work on that chunk in memory.
And when I want to save to a file, do it in one chunk again.
Am I right?
 
LVL 3

Expert Comment

by:plaroche
ID: 1174676
This is precisely what my first line said:

"Memory-mapping won't be faster for the use you need. "
 

Author Comment

by:ShadowHawk071998
ID: 1174677
Hmmm. Sorry to be dumb:
If I use memory mapping and then write to it one integer at a time, then as far as speed is concerned, is it the same as writing to a file one integer at a time? Or is it better optimized?
Anyway, I guess that I'll use the idea of big chunks of data, and read/write my files myself. One question still remains:
I'm using not just integers, but also chars and all sorts of arrays. It was very simple to use WriteFile directly, because I could write a char, or an integer, or an array of integers in one simple command. If I allocate an array, can I do the same? I mean, if I allocate an array of char, can I insert a whole array at a specific location, without doing all sorts of silly math (like putting the LSB of an integer into the first byte, the MSB into the second byte, etc.)? I want it to be simple, see...

thanks!
 
LVL 22

Expert Comment

by:nietod
ID: 1174678
>>if I use memory mapping, and then write to it one integer at a time.
>> Then essentially, when speed is involved, then is it the same like
>> writing to a file integer at a time? Or is it better optimized?
It is probably faster than writing out one integer at a time.  This is because you are just moving the integer into a memory location, and at some later time the OS will write out the whole chunk of memory in one shot.  However, in most cases it is probably better and safer not to use memory mapping and to write the information yourself instead.  This yields code that is easier to read and maintain and less likely to have weird bugs.  Memory mapping is best reserved for really complex cases that are almost impossible to manage without it.

>> I could write a char, or an integer, or an array of integers in one simple
>>command. If I allocate an array, can I do the same?
There aren't many (any?) cases where it makes sense to write data of different types to a file without some sort of underlying structure to help interpret the data.  (Otherwise, when you go to read the data, you can't tell if a byte you are reading is a character or part of an integer.)  This structure (by structure I mean organization, not struct { } ) usually helps determine how you will write the data.

If, for example, the file has a fixed format that is a mixture of chars and ints at specific locations, then you can create a structure (here I mean struct {}) that has the same format, fill it in with the chars and ints, and write the whole thing.
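A minimal sketch of that fixed-format approach, assuming a made-up `Record` layout (the thread never gives the real file format) and using `std::fwrite`/`std::fread` as portable stand-ins for the Win32 calls:

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical fixed-format record: every field has a fixed size, so the
// struct in memory can serve as the record layout on disk.
struct Record {
    int  id;
    int  values[4];
    char name[32];
};

// Write the whole record in one call.
bool save_record(const char* path, const Record& r) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    bool ok = std::fwrite(&r, sizeof r, 1, f) == 1;
    return std::fclose(f) == 0 && ok;
}

// Read the whole record back in one call.
bool load_record(const char* path, Record& r) {
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return false;
    bool ok = std::fread(&r, sizeof r, 1, f) == 1;
    std::fclose(f);
    return ok;
}
```

A file written this way depends on the compiler's padding and the machine's byte order, so it is only safe to read back with a program built the same way. Pointers such as `char*` cannot be stored like this, because only the address would be written.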

If you can describe the file format better, we may be able to help more.
 

Author Comment

by:ShadowHawk071998
ID: 1174679
Hi Nietod, thanks for the help.
Actually, my app involves writing to 4 files. 3 of them are pretty basic, and I could use the 'structure' approach you mention (I figured that I would have to do it like that).
The remaining file is more problematic, because I have strings in it. That means (if I were using the Pascal 'file of' way) it's a file of, for example,
struct a {
  int a;
  my_array b;
  char *name;
};

What do I do with the strings in that solution? How can I use records then? I don't want to 'limit' the strings by saving, say, 100 chars each time. I would like to use null-terminated strings in the file. How can I do it?
When I was using WriteFile calls it was no problem, because I was writing one char at a time. I wasn't aware of the great overhead of that solution. But now, how can I still use null-terminated strings?
 
LVL 22

Expert Comment

by:nietod
ID: 1174680
There are two ways to record strings in a file.  You can use a NUL-terminated string or you can store the length first.  Of the two, the initial-length system is much more efficient and easier to program, although it generates a slightly larger file.  It also has the advantage that it allows you to store NUL characters in the strings, which can be nice at times.  (To make use of this you have to switch to using a string class at run time rather than C's NUL-terminated strings, a change that I highly recommend.)

The problem with this is that it is a bit harder to write out the strings in one operation.  When you have NUL-terminated strings, you can copy several of them to a byte array, with each one terminated by a NUL character before the next one begins, and then write the whole array out.  You can't do it quite so easily with the initial-length strings, but I think it is still worth it.  What you do in this case is create a byte array for holding the strings and their lengths, and then copy the raw bytes of each length into the byte array ahead of its string.  An example follows.
 
LVL 22

Expert Comment

by:nietod
ID: 1174681
For example, to write out the two strings

// Needs <cstring> (or <string.h>) for strlen and memcpy.
const char *Str1 = "this is string 1.";
const char *Str2  = "This is a second string.";
int Len1 = (int)strlen(Str1);
int Len2 = (int)strlen(Str2);
char *Array = new char [Len1+Len2+2*sizeof(int)]; // Note extra storage for two ints.
char *Dst = Array;
memcpy(Dst,&Len1,sizeof(int)); // Store the first length.
Dst+= sizeof(int); // Point past first length.
memcpy(Dst,Str1,Len1); // Copy in first string.
Dst+= Len1; // Point past end of first string.

// Dst is probably not int-aligned at this point, so copy the bytes of
// the length rather than storing through a cast pointer.
memcpy(Dst,&Len2,sizeof(int)); // Store the second length.
Dst+= sizeof(int); // Point past second length.
memcpy(Dst,Str2,Len2); // Copy in second string.

Now Array contains an integer length, followed by a string of that length, followed by another integer length, followed by another string of that length.  This can be written to the file in "one shot" (remember to delete [] Array afterwards).

In general, that is the way I recommend you do it.
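For completeness, here is a hedged sketch of the same length-prefixed layout using a string class, including the reading side. `pack_strings` and `unpack_strings` are hypothetical names, and a 32-bit length is an assumption; the thread only says "an integer length".

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Build one byte array: [length][characters][length][characters]...
std::vector<char> pack_strings(const std::vector<std::string>& strings) {
    std::vector<char> out;
    for (const std::string& s : strings) {
        std::uint32_t len = static_cast<std::uint32_t>(s.size());
        const char* p = reinterpret_cast<const char*>(&len);
        out.insert(out.end(), p, p + sizeof len);   // raw bytes of the length
        out.insert(out.end(), s.begin(), s.end());  // characters, no NUL added
    }
    return out;
}

// Walk the byte array and rebuild the strings; stops at truncated data.
std::vector<std::string> unpack_strings(const std::vector<char>& in) {
    std::vector<std::string> strings;
    std::size_t pos = 0;
    while (in.size() - pos >= sizeof(std::uint32_t)) {
        std::uint32_t len;
        std::memcpy(&len, in.data() + pos, sizeof len);
        pos += sizeof len;
        if (len > in.size() - pos) break;           // length runs past the end
        strings.emplace_back(in.data() + pos, len);
        pos += len;
    }
    return strings;
}
```

The packed vector can be written to the file with a single call, and because the length is stored explicitly, strings containing NUL characters survive the round trip.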
 

Author Comment

by:ShadowHawk071998
ID: 1174682
Hello Guyz. I hope you're still here.
Well, my problem now is that while I have strings inside my files, there is also this:
1. The files are big (thousands of integers and strings, about a 300K file)
2. The strings and integers may be changed. Changing the integers is not a big problem, because they always occupy the same length. But how do I change the strings efficiently? If the file gets bigger/smaller, how do I do it?
I thought about allocating a new block of memory and transferring the old file into it, making the needed changes. It is not very good, because I'll have to traverse the file first to find the old string, get its size, determine the new block's size, allocate it, and then transfer the old one and make the changes.
Is there something I'm missing?
 
LVL 3

Expert Comment

by:plaroche
ID: 1174683
Allocate a fixed amount of space for each string, and make sure that amount is enough for any string you have.
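A rough sketch of why fixed-size slots make updates easy: slot number `index` always starts at `index * slot size`, so a changed string can be overwritten in place without moving anything else. The 64-byte slot size and the helper names below are assumptions for illustration only.

```cpp
#include <cstdio>
#include <cstring>

const std::size_t kSlotSize = 64;  // hypothetical maximum, including the NUL

// Overwrite slot `index` in place; text longer than the slot is truncated.
bool write_slot(std::FILE* f, long index, const char* text) {
    char slot[kSlotSize] = {0};
    std::strncpy(slot, text, kSlotSize - 1);
    if (std::fseek(f, index * static_cast<long>(kSlotSize), SEEK_SET) != 0)
        return false;
    return std::fwrite(slot, 1, kSlotSize, f) == kSlotSize;
}

// Read slot `index` back into a caller-supplied buffer.
bool read_slot(std::FILE* f, long index, char (&slot)[kSlotSize]) {
    if (std::fseek(f, index * static_cast<long>(kSlotSize), SEEK_SET) != 0)
        return false;
    return std::fread(slot, 1, kSlotSize, f) == kSlotSize;
}
```

Replacing a short string with a longer one leaves the file size and every other slot untouched, at the cost of unused bytes in each slot.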
 
LVL 22

Expert Comment

by:nietod
ID: 1174684
That is the easiest way, but it has some obvious limitations: it potentially wastes space and potentially limits the string length too much.

Another method is to write the string using a format that allows it to expand.  For example, when a string expands you could write the part that fits in place and then write the additional part elsewhere in the file (append it to the end).  You would need to output the string information in a format that allows you to find the other locations where the string data may be stored.  For example, start the string using the format

unsigned int StringLength // Total string length
unsigned int NextOffset  // Offset of the next section.  If 0, the string follows entirely.
unsigned int SectionLength // Length of the current section.  This is StringLength if NextOffset is 0.
char [] // The characters in the current section.  This should be SectionLength characters.

If there is a next section, it could have a format like

unsigned int NextOffset  // Offset of the next section.  If 0, the rest of the string follows entirely.
unsigned int SectionLength // Length of the current section.
char [] // The characters in the current section.  This should be SectionLength characters.

This allows strings to expand as needed.  As they shrink you will lose space, however; that is, there will be wasted space in the file.  There are ways around that as well.
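The sectioned layout above can be tried out in memory before committing to a file format. The sketch below treats a `std::vector<char>` as the file, assumes 32-bit fields, and uses hypothetical function names (`add_string`, `grow_string`, `read_string`); it covers growing only, not shrinking or reclaiming space.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

typedef std::vector<char> Bytes;  // stands in for the file contents

static std::uint32_t get_u32(const Bytes& f, std::size_t at) {
    std::uint32_t v;
    std::memcpy(&v, f.data() + at, sizeof v);
    return v;
}
static void put_u32(Bytes& f, std::size_t at, std::uint32_t v) {
    std::memcpy(f.data() + at, &v, sizeof v);
}
static void append_u32(Bytes& f, std::uint32_t v) {
    const char* p = reinterpret_cast<const char*>(&v);
    f.insert(f.end(), p, p + sizeof v);
}

// Append a new string as one section: StringLength, NextOffset, SectionLength, chars.
std::uint32_t add_string(Bytes& f, const std::string& s) {
    std::uint32_t at = static_cast<std::uint32_t>(f.size());
    append_u32(f, static_cast<std::uint32_t>(s.size()));  // StringLength
    append_u32(f, 0);                                     // NextOffset: none yet
    append_u32(f, static_cast<std::uint32_t>(s.size()));  // SectionLength
    f.insert(f.end(), s.begin(), s.end());
    return at;
}

// Grow the string whose first header is at `first` by appending a
// continuation section (NextOffset, SectionLength, chars) at the end.
void grow_string(Bytes& f, std::uint32_t first, const std::string& extra) {
    std::size_t link = first + 4;          // NextOffset field of the first header
    while (get_u32(f, link) != 0)
        link = get_u32(f, link);           // a continuation starts with NextOffset
    std::uint32_t at = static_cast<std::uint32_t>(f.size());
    append_u32(f, 0);
    append_u32(f, static_cast<std::uint32_t>(extra.size()));
    f.insert(f.end(), extra.begin(), extra.end());
    put_u32(f, link, at);                  // chain the last section to the new one
    put_u32(f, first, get_u32(f, first) + static_cast<std::uint32_t>(extra.size()));
}

// Reassemble the string by following the NextOffset chain.
std::string read_string(const Bytes& f, std::uint32_t first) {
    std::string out;
    std::size_t hdr = first + 4;           // position of the NextOffset field
    for (;;) {
        std::uint32_t next = get_u32(f, hdr);
        std::uint32_t len = get_u32(f, hdr + 4);
        out.append(f.data() + hdr + 8, len);
        if (next == 0) break;
        hdr = next;
    }
    return out;
}
```

Because continuation sections are always appended after at least one first header, offset 0 never refers to a continuation and can safely mean "no next section".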