puckkk

Fastest way to read files

Hello,

I have to read a lot of files in a directory (500 or more, all text files, each one small, about 8 to 10 KB).
For each file I have to open it and read chars 15 to 19. If they are == "F24A0", I need to compute the hash of the file and store it in a database. If they are != "F24A0", the file is skipped. What is the fastest way to do this? Any suggestions? I need a very fast way, because I have a lot of files to read...
Here is what I have so far (pseudo code):
foreach (FileInfo file in dir.GetFiles()) // dir is a System.IO.DirectoryInfo
{
    using (System.IO.FileStream fs = file.OpenRead())
    {
        StreamReader sr = new StreamReader(fs);
        char[] buff = new char[5]; // exactly the 5 chars to compare
        fs.Position = 15; // must be set before the reader's first read
        int read = sr.ReadBlock(buff, 0, 5);
        string t = new String(buff, 0, read);
        if (t == "F24A0")
        {
            fs.Position = 0; // rewind so the hash covers the whole file
            using (System.Security.Cryptography.MD5 sscMD5 = System.Security.Cryptography.MD5.Create())
            {
                byte[] mHash = sscMD5.ComputeHash(fs);
                retValue = Convert.ToBase64String(mHash); // becomes a 24-character string
            }
        }
        // otherwise the file is skipped
    }
}
SweatCoder

I doubt you can improve much on the way you're doing it without rewriting it in VC++, which could improve performance. But I suspect your bottleneck will be far more on the file I/O side than on code execution speed.
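One way to check whether I/O really dominates is to time the two phases separately. This is only a rough sketch: the class name IoVsHashTimer, the method Measure, and the dirPath parameter are made up for illustration. It reads each file fully, which is not what the original loop does. Repeated runs will also be skewed by the OS file cache, so the first run is the most telling.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Security.Cryptography;

static class IoVsHashTimer
{
    // Returns the total time spent reading and the total time spent hashing
    // for every file in dirPath (hypothetical helper, for measurement only).
    public static (TimeSpan Read, TimeSpan Hash) Measure(string dirPath)
    {
        var read = new Stopwatch();
        var hash = new Stopwatch();
        using (MD5 md5 = MD5.Create())
        {
            foreach (string path in Directory.GetFiles(dirPath))
            {
                read.Start();                       // resumes; elapsed time accumulates
                byte[] data = File.ReadAllBytes(path);
                read.Stop();

                hash.Start();
                md5.ComputeHash(data);
                hash.Stop();
            }
        }
        return (read.Elapsed, hash.Elapsed);
    }
}
```

If the read time is much larger than the hash time on a cold run, rewriting the hashing in another language is unlikely to help much.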
ASKER CERTIFIED SOLUTION
JimBrandley
One more thought: the expected ratio of files you need to hash to files you skip may help determine the best way to attack the problem. Reading the entire file into memory and then checking the 5 bytes will be, I think, much faster for the files you hash and a bit slower for the ones you do not. It will depend partly on how the OS decides to cache buffers for the files you are reading. As much as possible, you want to avoid waiting for the disk to make another revolution before the next block is loaded into a buffer.
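A minimal sketch of that read-everything-first idea might look like the following. The names MarkerHasher, HasMarker and HashIfMarked are invented for the example. It also assumes the files are single-byte (ASCII) text with no byte order mark, so that char offset 15 is also byte offset 15.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class MarkerHasher
{
    // ASCII bytes of "F24A0", expected at byte offsets 15..19 (assumption: no BOM).
    static readonly byte[] Marker = { 0x46, 0x32, 0x34, 0x41, 0x30 };
    const int MarkerOffset = 15;

    // True if the buffer holds the marker at the fixed offset.
    public static bool HasMarker(byte[] data)
    {
        if (data.Length < MarkerOffset + Marker.Length) return false;
        for (int i = 0; i < Marker.Length; i++)
            if (data[MarkerOffset + i] != Marker[i]) return false;
        return true;
    }

    // Returns the Base64 MD5 of the file, or null when the file is skipped.
    public static string HashIfMarked(string path)
    {
        byte[] data = File.ReadAllBytes(path); // one read for the whole file
        if (!HasMarker(data)) return null;
        using (MD5 md5 = MD5.Create())
            return Convert.ToBase64String(md5.ComputeHash(data));
    }
}
```

With 8 to 10 KB files the whole file fits in a couple of disk blocks, so the extra bytes read for skipped files should cost little. Whether this beats seek-and-peek in practice depends on the hash/skip ratio mentioned above.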

Jim