How do you flush the directory in .NET (Windows)?

I need to flush the contents of the file system's directory to disk in much the same way that I
am flushing the contents of a file. Any help on how to get a file handle to the directory (preferably in .NET) would be appreciated.
To flush the file I do the following:
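// Requires: using System.IO; using System.Runtime.InteropServices;
void MyFileFlush(FileStream fs)
{
    // Ask Windows to write the OS buffers for this handle to disk.
    FlushFileBuffers(fs.SafeFileHandle.DangerousGetHandle());
}
[DllImport("kernel32.dll", SetLastError = true)]
static extern bool FlushFileBuffers(IntPtr hFile);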

I am persisting data that absolutely must reach the media in the event of a power failure. I save
multiple (rolling) copies of the files for redundancy. However, I had a situation where a power failure
left me with an empty directory, presumably because the cached directory information was still
waiting to be written. Writing to multiple directories won't solve the problem, as all of the directories
could be cached. Writing to another PC is possible, but not currently permitted.
We supposedly have write caching turned off on the drive, but this is unconfirmed. (Is there an API
that can be used to confirm this programmatically?)
Also, if someone knows how to tell the disk drive itself not to hold on to the data in its RAM, that would
further increase the reliability.

marcdotnet

ASKER

Additionally, there is a UPS on the system, but that does not guard against the UPS itself faulting (which is what brought down the system instantly last night). Likewise, if the OS blue screens, we need the data safely on the media as soon as it is available.
The quantity of data is on the order of 1M every 10 seconds, which is not much. It just can't get lost!

ASKER CERTIFIED SOLUTION

jkr

JimBeveridge

I think you mostly have a hardware problem. I expect that you are having trouble with the drive caching, even though you think it's turned off. Many consumer drives won't let you turn it off.

Also remember that NTFS is transaction based. That "empty directory" may have been caused by NTFS rolling back a failed transaction.

I have several recommendations to solve the problem.

#1. Create the file and immediately set the size to extend it to the desired size. Now you can write to it without worrying about the directory being updated. You can also use write-through I/O (FILE_FLAG_WRITE_THROUGH) to keep the data from sitting in the OS write cache. If you still lose data after this, then it's a virtual certainty that you are having a problem with drive caching.

#2. Bypass the filesystem completely. Use a raw partition (without NTFS) and write the blocks yourself using raw low-level I/O. Now you can be absolutely certain that no filesystem or software cache is getting in the way.

#3. Use SQL Server. It's designed to do guaranteed data delivery. That's the point of an ACID database. You can buy hardware that's certified to run SQL Server, which means that you can be guaranteed that you won't have hard drive caching issues. See the article at http://support.microsoft.com/kb/234656.

#4. If the data is that important, why are you using consumer hardware whose behavior isn't guaranteed? Buy hard drives that are documented to have the cache disabled, then buy a real battery-backed RAID card that will correctly save and restore the data flawlessly in the event of a complete power failure.

Finally, you still have a race condition between the time you tell the operating system to write the data and the time (potentially dozens of milliseconds later) that the data is actually written. Lots of time for failure in there.
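
For what it's worth, recommendation #1 could look roughly like the following in C#. This is only a sketch under assumptions: the class name DurableFile, the method WritePreallocated, and the reservedSize parameter are made up for illustration and are not from this thread. It relies on FileOptions.WriteThrough (which corresponds to FILE_FLAG_WRITE_THROUGH on Windows) and on FileStream.Flush(true), which is meant to flush the OS buffers as well as the managed ones. None of this can override a drive that caches writes on its own.

```csharp
using System;
using System.IO;

static class DurableFile
{
    // Hypothetical helper: reserve the file's full size first, then write
    // the payload into the already-allocated space with write-through set.
    public static void WritePreallocated(string path, byte[] payload, long reservedSize)
    {
        if (payload.Length > reservedSize)
            throw new ArgumentException("payload is larger than the reserved size");

        using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write,
                                       FileShare.None, 4096, FileOptions.WriteThrough))
        {
            fs.SetLength(reservedSize);            // extend the file before any data is written
            fs.Flush(true);                        // ask the OS to commit the new size
            fs.Write(payload, 0, payload.Length);  // position is still 0 after SetLength
            fs.Flush(true);                        // flush managed buffers and OS buffers
        }
    }
}
```

A call such as DurableFile.WritePreallocated(path, data, 2 * 1024 * 1024) leaves a file of the reserved size with the payload at the start and zeros after it, so the reader needs some way to know the real payload length (for example a length prefix, which is again an assumption and not something the thread specifies).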

marcdotnet

ASKER

This is on a server-class machine with a RAID array. It also has a UPS, but the technician was doing a 'UPS test' and pressed the battery test button, which killed power to the system instantly! In this case, it was the UPS that interrupted us :<
There are 2 rolling older versions of the data file, which is moved using the OS move (which should be a directory-only operation), but all of these files were gone as well. [If we can't get the latest data, then the next-to-latest is better than nothing, etc.]
We are using write-through, but that doesn't help me if the directory doesn't show the files.

It almost seems that having RAID increased our vulnerability. We were told that RAID will make our system more robust, but that seems only to be in the case of a disk failure, not the case here. I'll inquire about it being separately battery backed.

I'll check to see if the caching in the disk is disabled (any idea as to how to check this?)

JimBeveridge

What do you mean "a RAID array?" If you tried to build the box "on the cheap", then it's probably just using the motherboard's RAID feature, which is software-only and is garbage. You need a battery-backed, hardware RAID, with certified drives.

In fact, RAID can get you into a LOT of trouble if the person who selected and installed it isn't an expert. Western Digital consumer drives (*especially* the Green drives) are specifically documented as "never use in RAID."

Again, what you are doing is what SQL Server does. You need hardware that's certified to run SQL Server. In such situations, the system is certified as an entire unit (motherboard, drives, RAID card, etc.) It may even be certified to particular revision levels for each drive and all of the drives have to match.

A RAID card is still a single point of failure. I say that from experience, because I've had a RAID card die and take my database with it. If you need reliability at the multiple nines level, then you need redundancy at every level. Multiple UPSs, multiple power supplies, multiple controllers, multiple hard drives... All of this is well understood technology, you just need a company that knows how to build it for you. (I'm not trying to sell anything here. My company has nothing to do with this.)

I still believe the solutions I gave earlier are correct as a way to narrow down the problem, but you need to do due diligence with your hardware platform.

Guy Hengel [angelIII / a3]

I've requested that this question be deleted for the following reason:

This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.

jkr

Well, since "you cannot do that" is still a valid answer on EE, I'd like to object in relation to my first comment in this Q.

jkr

OK, in this case: http:#a35495537 - it can't be done for the reasons stated in this comment.