jlsjls

asked on

Increase performance when writing to a file

Hi,

I'm creating a component (C++) which is intended to simulate non-volatile RAM (32K) by writing information to a file on disk (data is flushed to disk after each write operation).
I notice that each write to this NVRAM takes up to about 94 ms. However, the application which uses this component writes to the NVRAM frequently (e.g. after scanning an article it writes 10 settings to NVRAM by calling the component's write method 10 times). So for each action by the application I notice an extra delay of approx. 1 s or more, which is intolerable for the application.
Could you give me some advice on increasing performance when writing to a file?

Current way of working in write method of component :
* read content of file
* write new content to file
* flush data to disk (takes most of the time)

Is multithreading an option and why?
Overlapped I/O?

BTW: using other hardware (hard disk) is not an option.
BTW: the application may not be changed.

jlsjls
wayside

Couple of ideas:

- Keep the file in memory the whole time, and write it out as needed. That way you save the cost of the read every time through. Write the entire file into a buffer in memory, and then write the entire buffer out at once; this way you are doing only one write operation.

- What file I/O functions are you using? If you are using iostreams from the standard library, switch to C library functions such as fopen, fwrite, etc.; they are much faster. You can fiddle with buffer sizes using setvbuf() to maximize your performance. Or, even better, switch to native Win32 API calls if you are programming on Windows.

- Switch to using a memory-mapped file; this will probably get you the best performance of all. This lets you open the file and treat it as a memory array: to write to the file you simply change that memory location.
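For what it's worth, a minimal sketch of the memory-mapped approach with the Win32 API, assuming a fixed 32K image; the file name and the offset/value written below are made up purely for illustration:

#include <windows.h>

int main()
{
    const DWORD kImageSize = 32 * 1024;

    HANDLE hFile = CreateFileA("nvram.img", GENERIC_READ | GENERIC_WRITE, 0,
                               NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    HANDLE hMap  = CreateFileMapping(hFile, NULL, PAGE_READWRITE, 0, kImageSize, NULL);
    BYTE*  pView = (BYTE*)MapViewOfFile(hMap, FILE_MAP_WRITE, 0, 0, kImageSize);

    // A "write" to the simulated NVRAM is now a plain memory store...
    DWORD offset = 128;      // hypothetical setting location
    pView[offset] = 0x42;    // hypothetical new value

    // ...and committing it flushes just the touched range. FlushViewOfFile
    // hands the dirty pages to the file system; for a hard guarantee that
    // they reached the disk you may still need FlushFileBuffers(hFile) too.
    FlushViewOfFile(pView + offset, 1);

    UnmapViewOfFile(pView);  // error checking omitted throughout
    CloseHandle(hMap);
    CloseHandle(hFile);
    return 0;
}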

Do you really need to flush the data to the disk every time? If you are concerned that there might be a power or system failure at any moment and need to have the most up-to-date data, then you will have to do this, which will cause a delay whilst the physical storage medium writes the data. This will be the case regardless of whether you have a conventional file or a memory-mapped file.

The comments by wayside are correct and will improve performance, but the only real way to dramatically improve matters is to not call the flush function so often.
jlsjls

ASKER

The purpose of the component is to simulate a Non-volatile RAM. So it must be able to cope with power/system failures at any moment (most important goal).
So I agree with you that flushing to disk (accessing slow media) on each write request is the only solution.
For a complete write cycle (max. 94 ms) I notice:
1/3 time -> reading + writing
2/3 time -> flushing

Maybe by using memory-mapped files (my file is only 32k in size) I can improve the read/write operation a bit.

MSDN states that asynchronous I/O should be avoided for relatively fast I/O operations:
"In situations where an I/O request is expected to take a large amount of time, such as a refresh or backup of a large database, asynchronous I/O is generally a good way to optimize processing efficiency.
However, for relatively fast I/O operations, the overhead of processing kernel I/O requests and kernel signals may make asynchronous I/O less beneficial, particularly if many fast I/O operations need to be made. In this case, synchronous I/O would be better."

It sounds like you don't want buffered I/O, because you want to commit changes all the time. Use open (UN*X) / CreateFile (Windows) rather than fopen. Load the image into RAM. Update, seek and write modified parts of the image only.
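A minimal sketch of that scheme on Windows, assuming the fixed 32K image; the class and member names here are invented for illustration, not taken from the component:

#include <windows.h>
#include <cstring>

class NvRamFile {
public:
    bool Open(const char* path)
    {
        m_hFile = CreateFileA(path, GENERIC_READ | GENERIC_WRITE, 0, NULL,
                              OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        if (m_hFile == INVALID_HANDLE_VALUE) return false;
        DWORD read = 0;
        ReadFile(m_hFile, m_image, sizeof(m_image), &read, NULL);  // load once
        return true;
    }

    // Update the RAM copy, then seek and write only the changed bytes.
    bool Write(DWORD offset, const void* data, DWORD length)
    {
        memcpy(m_image + offset, data, length);
        SetFilePointer(m_hFile, offset, NULL, FILE_BEGIN);
        DWORD written = 0;
        if (!WriteFile(m_hFile, m_image + offset, length, &written, NULL))
            return false;
        return FlushFileBuffers(m_hFile) != 0;  // still committed on every call
    }

private:
    HANDLE m_hFile;
    BYTE   m_image[32 * 1024];
};

This removes the per-call read of the file; the flush still dominates, but nothing else is wasted.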
jlsjls

ASKER

I'm using the Windows API to create (with FILE_FLAG_NO_BUFFERING), read and write to the file.
The file contains plain text data.

What I'd do is copy the data to a local array and set a timer for, say, 100 msec.

If you get called before the timer fires, just copy the new data to the array.

If the timer fires, then you can write to the file and close it.  You've saved a bunch of writes and flushes.

No need to call flush(); most systems will do so within a second or so.

In other words, cache the data in memory until the flurry of updates subsides, THEN write the whole mess to disk.
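A rough sketch of that coalescing idea, assuming a background worker thread does the delayed write (a SetTimer or waitable timer would work just as well); all the names here are invented, and note that this deliberately trades durability for speed, so it conflicts with strict NVRAM semantics:

#include <windows.h>
#include <cstring>

static BYTE   g_image[32 * 1024];
static bool   g_dirty = false;
static DWORD  g_lastWriteTick = 0;
static HANDLE g_hFile;               // opened elsewhere with CreateFile

void NvWrite(DWORD offset, const void* data, DWORD length)
{
    memcpy(g_image + offset, data, length);   // just update the RAM copy
    g_dirty = true;
    g_lastWriteTick = GetTickCount();
}

// Started once, e.g. with CreateThread(NULL, 0, FlushWorker, NULL, 0, NULL).
// Real code would need synchronization around g_image / g_dirty.
DWORD WINAPI FlushWorker(LPVOID)
{
    for (;;) {
        Sleep(20);
        // Once the burst of writes has been quiet for ~100 ms, write the
        // whole image and flush once instead of once per setting.
        if (g_dirty && GetTickCount() - g_lastWriteTick > 100) {
            SetFilePointer(g_hFile, 0, NULL, FILE_BEGIN);
            DWORD written = 0;
            WriteFile(g_hFile, g_image, sizeof(g_image), &written, NULL);
            FlushFileBuffers(g_hFile);
            g_dirty = false;
        }
    }
}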

A few gotchas though:

(1)  There's no way to fully simulate NVRAM.  If the power fails during the disk write, the disk block might get half-written, which means the next time you go to read it, it will be unreadable.  Much better idea: write to a different file each time, say NV1 thru NV5.  That way if one file goes bad you can go to the previous one.  (See the sketch after this list.)

(2)  The power might fail while writing the directory.  That's REALLY bad news.

(3)  Calling Flush() isn't a secure way to ensure anything.  Modern file systems have so many layers of buffering (in the app, in the OS, in the disk cache, in the disk controller, in the disk drive) that calling flush() from the app is like the president shouting "Private Jones, go to bed!" and expecting the order to be carried out.
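A rough sketch of the rotation idea in (1), assuming Win32 and made-up file names NV1.bin..NV5.bin; the reader side (picking the newest copy that validates, e.g. via a sequence number and checksum stored inside the image) is left out:

#include <windows.h>
#include <cstdio>

bool WriteRotated(const BYTE* image, DWORD size)
{
    static int slot = 0;
    char name[16];
    slot = (slot % 5) + 1;
    sprintf(name, "NV%d.bin", slot);                // NV1.bin .. NV5.bin in turn

    HANDLE h = CreateFileA(name, GENERIC_WRITE, 0, NULL,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE) return false;

    DWORD written = 0;
    BOOL ok = WriteFile(h, image, size, &written, NULL) && written == size;
    ok = ok && FlushFileBuffers(h);                 // commit before trusting this copy
    CloseHandle(h);
    return ok != 0;
}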

> The file contains plain text data.

If that means your entire 32K is liable to be altered with each write, I guess you are stuck with having to write all 32K with each update. If, however, you can get away with writing no more than a few disk sectors each time, you could SetFilePointer to the relevant sector offset and write only the changed sectors with FILE_FLAG_NO_BUFFERING. The file should be a fixed size for this approach, and after its initial creation there should then be no worries about trashing the directory entry on a power failure; but take grg99's advice on this point, as I'm not sure of my ground.
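If it helps, a sketch of that sector-wise update; the 512-byte sector size is an assumption (query the volume, e.g. with GetDiskFreeSpace, in real code), and the buffer comes from VirtualAlloc because FILE_FLAG_NO_BUFFERING requires the buffer address, file offset and length to all be sector-aligned:

#include <windows.h>
#include <cstring>

const DWORD kSector = 512;   // assumed; ask the volume for the real value

// data must point to exactly kSector bytes.
bool WriteSector(HANDLE hFile, DWORD sectorIndex, const BYTE* data)
{
    // Page-aligned scratch buffer satisfies the no-buffering alignment rules.
    BYTE* buf = (BYTE*)VirtualAlloc(NULL, kSector, MEM_COMMIT | MEM_RESERVE,
                                    PAGE_READWRITE);
    if (!buf) return false;
    memcpy(buf, data, kSector);

    SetFilePointer(hFile, sectorIndex * kSector, NULL, FILE_BEGIN);
    DWORD written = 0;
    BOOL ok = WriteFile(hFile, buf, kSector, &written, NULL);

    VirtualFree(buf, 0, MEM_RELEASE);
    return ok && written == kSector;
}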
> be no worries about trashing the directory entry on a power failure,

On most OS's, every time you open and close the file the last access time gets updated in the directory.

And I should have mentioned, there are some really clever file systems, specially designed so that a bad directory write does no major harm.  You can lose the last file changes, but at least the previous file contents are readable.  This isn't true for the FAT file systems.  Probably not true for NTFS either, but I'm not 100% sure.  You need one of those file systems with "logging" in the name.

jlsjls

ASKER

After carefully reading the MSDN documentation, I've decided to use CreateFile with the FILE_FLAG_NO_BUFFERING attribute and no longer call FlushFileBuffers, which caused the delays. That way it's possible that the file's metadata isn't flushed to disk (per MSDN) on a power failure, but that's the least of my concerns.
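For reference, the open call for that final approach might look something like this (a sketch only, with an assumed file name); with FILE_FLAG_NO_BUFFERING each WriteFile goes straight to the device, so there is no FlushFileBuffers call, and only the file metadata can still be stale after a power failure:

#include <windows.h>

// Sketch only: open once, keep the handle, and let unbuffered (sector-aligned)
// writes land on the device directly. Adding FILE_FLAG_WRITE_THROUGH as well
// would also push the metadata through, at some extra cost per write.
HANDLE OpenNvRam()
{
    return CreateFileA("nvram.img",                  // assumed file name
                       GENERIC_READ | GENERIC_WRITE,
                       0, NULL, OPEN_EXISTING,
                       FILE_FLAG_NO_BUFFERING,
                       NULL);
}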