I am working with some rather large files (over 250 GB in some cases) and I need to work out the best way to generate a signature for them. My initial plan was to do something like this:
' specify the path to a file and this routine will calculate its MD5 hash
Public Function calcFileHash(ByVal filepath As String) As String
    ' open file (as read-only)
    Using reader As New System.IO.FileStream(filepath, IO.FileMode.Open, IO.FileAccess.Read)
        Using md5 As New System.Security.Cryptography.MD5CryptoServiceProvider
            ' hash contents of this stream
            Dim hash() As Byte = md5.ComputeHash(reader)
            ' return formatted hash (a plain hex string is assumed here)
            Return System.BitConverter.ToString(hash).Replace("-", "")
        End Using
    End Using
End Function
The problem is that this is VERY slow for files this large. I am using MD5 because, in my (limited) understanding, it carries much less collision risk than other methods; however, the speed is a very real problem. Does anyone know how I could better generate a unique signature for a file of this size? The only tools I have at my disposal are in the .NET realm, but I could use a third-party component if it fit the bill.
The goal: Generate a signature that is unique to the contents of a file that can be in excess of 250GB.
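One variation on the routine above, as a minimal sketch only: it still hashes the whole file with MD5, but opens the stream with a larger read buffer and the sequential-scan hint. The module name, the function name `CalcFileHashBuffered`, and the 1 MB buffer size are assumptions made for illustration. Whether this is measurably faster depends on the disk and has not been verified here.

```vbnet
Imports System
Imports System.IO
Imports System.Security.Cryptography

Module FileHashSketch
    ' Hypothetical variant of calcFileHash: same MD5 digest, different stream settings.
    Public Function CalcFileHashBuffered(ByVal filepath As String) As String
        ' Assumed buffer size; tune it for the storage actually in use.
        Const bufferSize As Integer = 1024 * 1024

        ' SequentialScan hints to the OS that the file is read front to back.
        Using reader As New FileStream(filepath, FileMode.Open, FileAccess.Read,
                                       FileShare.Read, bufferSize, FileOptions.SequentialScan)
            Using hasher As MD5 = MD5.Create()
                ' ComputeHash reads the stream to the end and returns the 16-byte digest.
                Dim hash() As Byte = hasher.ComputeHash(reader)
                ' Format as a hex string without separators.
                Return BitConverter.ToString(hash).Replace("-", "")
            End Using
        End Using
    End Function
End Module
```

Calling `CalcFileHashBuffered("C:\path\to\file.bin")` (a placeholder path) should return the same 32-character hex string as the original routine for the same file, since only the way the stream is opened differs.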
Thanks for your thoughts and direction here....