Link to home
Start Free TrialLog in
Avatar of VK
VKFlag for Germany

asked on

Need fastest method to read a binary file

Hello Experts !

I have to do a cyclic redundancy check 32-bit (CRC32) for given files to ensure (not strong algorithm but sufficient and faster than SHA,MD5,Blowfish) it's authenticity.
System:

PII,MMX
450 MHz
256MB RAM
Win2000

This are the benchmark values for the current algorithm:

FileName        1MB.TXT     10MB.TXT
Size(Bytes)     1.048.576   10.485.760
Duration (s)    3,19        32,02
MB/s            0,314       0,312

Code:

'------------------------------------------------------'FORM
'------------------------------------------------------

Private Sub cmdCompute_Click()
    Dim FSO As FileSystemObject
    Dim Start As Single
    Dim Duration As Double
    Dim lCrc32Value As Long
    Dim FileStr As String
   
   
    Me.MousePointer = vbHourglass
    Start = Timer
       
    Set FSO = New FileSystemObject
    lCrc32Value = InitCrc32()
    FileStr = FSO.GetFile(cmbPfade.Text).OpenAsTextStream.ReadAll
    lCrc32Value = AddCrc32(FileStr, lCrc32Value)
    txtCRC32.Text = Hex$(GetCrc32(lCrc32Value))
       
       
    Duration = CDbl(Timer) - CDbl(Start)
    If Duration > 0 Then
        txtMBProSec.Text = Format(CDbl(txtDateiGröße.Text) / (Duration * CDbl(1048576)), "0.000")
    Else
        txtMBProSec.Text = "Duration too short!"
    End If
    txtDuration.Text = Format(Duration, "0.00")
    Me.MousePointer = vbDefault
End Sub
'------------------------------------------------------'MODULE
'------------------------------------------------------
Private Crc32Table(255) As Long

Public Function InitCrc32(Optional ByVal Seed As Long = &HEDB88320, Optional ByVal Precondition As Long = &HFFFFFFFF) As Long
    Dim iBytes As Integer 'counter
    Dim iBits As Integer 'counter
    Dim lCrc32 As Long 'value
    Dim lTmpCrc32 As Long 'value

    For iBytes = 0 To 255
        lCrc32 = iBytes
        'Now iterate through each bit in counter byte
        For iBits = 0 To 7
            'Right shift unsigned long 1 bit
            lTmpCrc32 = lCrc32 And &HFFFFFFFE
            lTmpCrc32 = lTmpCrc32 \ &H2
            lTmpCrc32 = lTmpCrc32 And &H7FFFFFFF
            If (lCrc32 And &H1) <> 0 Then 'Now check if temporary is less than zero and then mix Crc32 checksum with Seed value
                lCrc32 = lTmpCrc32 Xor Seed
            Else
                lCrc32 = lTmpCrc32
            End If
        Next
        Crc32Table(iBytes) = lCrc32 'Put Crc32 checksum value in the holding array
    Next
    'After this is done, set function value to the precondition value
    InitCrc32 = Precondition
End Function

Public Function AddCrc32(Item As String, ByVal Crc32 As Long) As Long
    'ByVal Item As String   <= string that is to be checksum-computed
    Dim bCharValue As Byte
    Dim iCounter As Long 'V.K. Aufgrund großer Dateien geändert zu long
    Dim lIndex As Long
    Dim lAccValue As Long
    Dim lTableValue As Long
   
    For iCounter = 1 To Len(Item)
        bCharValue = Asc(Mid$(Item, iCounter, 1)) 'Get ASCII value for the current character
        'Right shift an Unsigned Long 8 bits
        lAccValue = Crc32 And &HFFFFFF00
        lAccValue = lAccValue \ &H100
        lAccValue = lAccValue And &HFFFFFF
        'Now select the right adding value from the holding table
        lIndex = Crc32 And &HFF
        lIndex = lIndex Xor bCharValue
        lTableValue = Crc32Table(lIndex)
        'Then mix new Crc32 value with previous accumulated Crc32 value
        Crc32 = lAccValue Xor lTableValue
    Next
    AddCrc32 = Crc32 'Set function value the the new Crc32 checksum
End Function

Public Function GetCrc32(ByVal Crc32 As Long) As Long
    GetCrc32 = Crc32 Xor &HFFFFFFFF 'Set function to the current Crc32 value
End Function

Does any expert know a way to increase noticeable the time performance by changing the code ?


Regards

V.K.
Avatar of VK
VK
Flag of Germany image

ASKER

PS:

Add this to the declaration Section of the module.

Option Explicit
Option Compare Text

Private Crc32Table(255) As Long

The files hve an average size of 300k, but files up to 10MB are possible.

V.K.
I don't know what the fastest way to get data from a file in VB is, I read binary files like this..

    Dim x As String
    Dim iFile As Long
   
    Dim times
   
    iFile = FreeFile
   
    times = Timer
   
    Open "c:\download\msaoe.exe" For Binary As iFile
    x = String(LOF(iFile), 0)
    Get iFile, , x
   
    Close iFile
   
   
    MsgBox Timer - times



Have you considered c?
Avatar of VK

ASKER

Hello deighton !

I've now tested your suggested code:

1. It's about 16% faster (noticeable ?).
2. I'm getting a different values for CRC32 !
   a. FileStr = FSO.GetFile(cmbPfade.Text).OpenAsTextStream.ReadAll
      CRC32 = "40E22B13"
   b. Get #DateiNr, , FileStr
      CRC32 = "B225DEA7"

V.K.

Avatar of VK

ASKER

I've made some changes:

Replaced

Set FSO = New FileSystemObject
FileStr = FSO.GetFile(cmbPfade.Text).OpenAsTextStream.ReadAll


to

Dim DateiNr As Integer
DateiNr = FreeFile
Open cmbPfade.Text For Binary As DateiNr
FileStr = String(LOF(DateiNr), 0)
Get DateiNr, , FileStr
Close DateiNr

And

bCharValue = Asc(Mid$(Item, iCounter, 1)) 'Get ASCII value for the current character

to

bCharValue = AscB(Mid$(Item, iCounter, 1)) 'Get ASCII value for the current character

Performace:

0,430 MB/s

CRC32 ="44134EDB"

I will inrease points to 100 if somebody can explain me the different CRC32-values.

V.K.
Avatar of VK

ASKER

My next approach:

Private Sub cmdCompute_Click()
    Dim Start As Single
    Dim Duration As Double
    Dim lCrc32Value As Long
    Dim ByteArray() As Byte
    Dim FileNr As Integer
   
    Start = Timer
    lCrc32Value = InitCrc32()
                       
    FileNr = FreeFile
    Open cmbPfade.Text For Binary As FileNr
    ReDim ByteArray(LOF(FileNr) - 1)
    Get FileNr, , ByteArray
    Close FileNr
       
    lCrc32Value = AddCrc32(ByteArray, lCrc32Value)
    txtCRC32.Text = Hex$(GetCrc32(lCrc32Value))
       
    Duration = CDbl(Timer) - CDbl(Start)
    If Duration > 0 Then
        txtMBProSec.Text = Format(CDbl(txtDateiGröße.Text) / (Duration * CDbl(1048576)), "0.000")
    Else
        txtMBProSec.Text = "Duration too short!"
    End If
    txtDuration.Text = Format(Duration, "0.00")
End Sub

Public Function AddCrc32(Item() As Byte, ByVal Crc32 As Long) As Long
    Dim bCharValue As Byte
    Dim iCounter As Long
    Dim lIndex As Long
    Dim lAccValue As Long
    Dim lTableValue As Long
   
    For iCounter = 0 To UBound(Item)
        bCharValue = Item(iCounter)
        lAccValue = Crc32 And &HFFFFFF00
        lAccValue = lAccValue \ &H100
        lAccValue = lAccValue And &HFFFFFF
        lIndex = Crc32 And &HFF
        lIndex = lIndex Xor bCharValue
        lTableValue = Crc32Table(lIndex)
        Crc32 = lAccValue Xor lTableValue
    Next
    AddCrc32 = Crc32
End Function


Performace:

0,725 MB/s

CRC32 ="B225DEA7"


V.K.
Avatar of VK

ASKER

PS:

All performance Data is for the execution in the vb-ide.
Compiled to an exe:

10,751 MB/s !! -> thats OK

But what sould be the correct CRC32 ?
Is it only important to have different CRC's for different files and vica versa ?

Example:

When i create a Text file (63Byte) with the following content (of course without quotes):

'https://www.experts-exchange.com/jsp/qManageQuestion.jsp?qid=20287363'

--> CRC32 = "FC69DB59"

50 Pts for anyone who can explain me the correct CRC32.

V.K.
ASKER CERTIFIED SOLUTION
Avatar of deighton
deighton
Flag of United Kingdom of Great Britain and Northern Ireland image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of VK

ASKER

Thank you for your assistance in realizing the difference between ASCII and Unicode.

Because it isn't relevant what CRC32 (but unique to a file) i'm getting i don't have to convert the string.

Regards

V.K.