larrysy
asked on
Is there a better way to search for a particular record in a binary file?
Is there a better way to look for a particular record in a binary file aside from looping through each record? If the # of records is very large and the particular record is near the end of the file, it wil take a lot of time doing the search. I just need the record number so that on the next access I can go to the record right away.
Example code below.
Option Explicit
Private Type Header
Name As String * 14
NotUsed As Integer
NotUsed1 As Long
Reserve As String * 32
End Type
Private Type Record
Date As Integer
Time As Integer
A As Single
B As Single
C As Single
D As Single
E As Long
End Type
Private fileName As String
Private Sub Form_Load()
fileName = "c:\someFile.txt"
End Sub
Private Sub Command1_Click()
' create a test file with header and five records
Dim h As Header
Dim r As Record
Dim i As Integer
h.Name = "Mike"
Open fileName For Binary As #1
Put #1, , h
For i = 1 To 5
r.Date = i
r.A = i
r.B = i
r.C = i
r.D = i
r.E = i
r.Time = i
Put #1, , r
Next i
Close #1
End Sub
Private Sub Command2_Click()
Dim h As Header
Dim r1 As Record
Dim i As Integer
Open fileName For Binary As #1
Get #1, , h
i = 0
Do While Not EOF(1)
i = i + 1
Get #1, , r1
If r1.A = 4 Then
Call MsgBox("A=4 is in record " & i)
Exit Do
End If
Loop
Close #1
End Sub
Example code below.
Option Explicit
Private Type Header
Name As String * 14
NotUsed As Integer
NotUsed1 As Long
Reserve As String * 32
End Type
Private Type Record
Date As Integer
Time As Integer
A As Single
B As Single
C As Single
D As Single
E As Long
End Type
Private fileName As String
Private Sub Form_Load()
fileName = "c:\someFile.txt"
End Sub
Private Sub Command1_Click()
' create a test file with header and five records
Dim h As Header
Dim r As Record
Dim i As Integer
h.Name = "Mike"
Open fileName For Binary As #1
Put #1, , h
For i = 1 To 5
r.Date = i
r.A = i
r.B = i
r.C = i
r.D = i
r.E = i
r.Time = i
Put #1, , r
Next i
Close #1
End Sub
Private Sub Command2_Click()
Dim h As Header
Dim r1 As Record
Dim i As Integer
Open fileName For Binary As #1
Get #1, , h
i = 0
Do While Not EOF(1)
i = i + 1
Get #1, , r1
If r1.A = 4 Then
Call MsgBox("A=4 is in record " & i)
Exit Do
End If
Loop
Close #1
End Sub
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
1. Going to a record directly in a text/binary file is not possible.
2. It becomes possible only by maintaining data in databases.
3. Even if you have the record number, you cannot directly jump to the particular record if you are maintaining a text file .
4. I am not very sure about what dbrckovi has written. You have to loop thru the file till you get your particular record.
Try to post a sample/format of your binary file.
Write also, how and when you append your file
and how you read your file....
2. It becomes possible only by maintaining data in databases.
3. Even if you have the record number, you cannot directly jump to the particular record if you are maintaining a text file .
4. I am not very sure about what dbrckovi has written. You have to loop thru the file till you get your particular record.
Try to post a sample/format of your binary file.
Write also, how and when you append your file
and how you read your file....
sgayatri:
>> I am not very sure about what dbrckovi has written. You have to loop thru the file till you get your particular record.
When you use Random access mode, you don't have to loop. You can read any record from any position at any time.
Try this example: It reads first, then 12564'th, then 1000'th second record without looping.
'------------------------- ---------- ---------- ----
Option Explicit
Private Type Header
Name As String * 14
NotUsed As Integer
NotUsed1 As Long
Reserve As String * 32
End Type
Private Type Record
Date As Integer
Time As Integer
A As Single
B As Single
C As Single
D As Single
E As Long
End Type
Private Sub Command1_Click()
' create a test file with header and five records
Dim h As Header
Dim r As Record
Dim i As Integer
h.Name = "Mike"
Open "c:\test.txt" For Random As #1
Put #1, , h
For i = 1 To 30000
r.Date = i
r.A = i
r.B = i
r.C = i
r.D = i
r.E = i
r.Time = i
Put #1, , r
Next i
Close #1
MsgBox "Done!"
End Sub
Private Sub Command2_Click()
Dim h As Header
Dim r1 As Record
Dim i As Integer
Open "c:\test.txt" For Random As #1
Get #1, 1, h 'get first record
Get #1, 12564, r1 'get 12564'th record
Print r1.A
Get #1, 1000, r1 'get 1000'th record
Print r1.A
Close #1
End Sub
'------------------------- ---------- ---------- ---------- ----------
>> I am not very sure about what dbrckovi has written. You have to loop thru the file till you get your particular record.
When you use Random access mode, you don't have to loop. You can read any record from any position at any time.
Try this example: It reads first, then 12564'th, then 1000'th second record without looping.
'-------------------------
Option Explicit
Private Type Header
Name As String * 14
NotUsed As Integer
NotUsed1 As Long
Reserve As String * 32
End Type
Private Type Record
Date As Integer
Time As Integer
A As Single
B As Single
C As Single
D As Single
E As Long
End Type
Private Sub Command1_Click()
' create a test file with header and five records
Dim h As Header
Dim r As Record
Dim i As Integer
h.Name = "Mike"
Open "c:\test.txt" For Random As #1
Put #1, , h
For i = 1 To 30000
r.Date = i
r.A = i
r.B = i
r.C = i
r.D = i
r.E = i
r.Time = i
Put #1, , r
Next i
Close #1
MsgBox "Done!"
End Sub
Private Sub Command2_Click()
Dim h As Header
Dim r1 As Record
Dim i As Integer
Open "c:\test.txt" For Random As #1
Get #1, 1, h 'get first record
Get #1, 12564, r1 'get 12564'th record
Print r1.A
Get #1, 1000, r1 'get 1000'th record
Print r1.A
Close #1
End Sub
'-------------------------
ASKER
I have to add another condition. I have no control over the creation of the binary file. The Command 1 is just to create an example for use in Command 2. In actual use, I'll just be doing the Command 2 function to an existing file. But I'll test if I can use the methods suggested. Thanks.
(1000'th second record without looping. should be 1000'th record without looping)
Going directly to Record number using text/binary access mode is also possible, but then you have to specify exactly at which byte is the beginning of your record.
To know that, you would have to know length of each record (in bytes), calculate the position of the record which you need and use Get #1,PositionInBytes,r1.
But this is not required when you use Random Access mode becouse in this case all calculations and indexing is done automaticaly.
Going directly to Record number using text/binary access mode is also possible, but then you have to specify exactly at which byte is the beginning of your record.
To know that, you would have to know length of each record (in bytes), calculate the position of the record which you need and use Get #1,PositionInBytes,r1.
But this is not required when you use Random Access mode becouse in this case all calculations and indexing is done automaticaly.
larrysy:
I just want to make sure there is no confusion or missunderstanding.
Are you trying to open a record under specific number, or are you trying to find and open a record based on the value it contains?
In first case all you have to know is record number which you need and specify it in Get statement, but if you are trying to search through records
and open the one which contains some value, then you'll either have to loop through all of them or you'll have to use some kind of a database like
sgayatri suggested.
If you woud have some control on how is the file created, then you might create some Index record which would be saved at the end or the beginning of the.
It could contain values which you need and record number which contains them. This way you wouldn't have to loop through the whole file.
I just want to make sure there is no confusion or missunderstanding.
Are you trying to open a record under specific number, or are you trying to find and open a record based on the value it contains?
In first case all you have to know is record number which you need and specify it in Get statement, but if you are trying to search through records
and open the one which contains some value, then you'll either have to loop through all of them or you'll have to use some kind of a database like
sgayatri suggested.
If you woud have some control on how is the file created, then you might create some Index record which would be saved at the end or the beginning of the.
It could contain values which you need and record number which contains them. This way you wouldn't have to loop through the whole file.
ASKER
dbrckovi
1. I have no control in the creation of the original file (so its a plain binary file in the format of the file created in Command 1)
2. The file contains dates in the first column
3. There are variable number of records for each date ( so cannot extrapolate from the first record)
4. I need to extract the records from a date range to another file, so I have to look for the record with the first instance of the start date and start processing
I can go to a specific record using the Seek function, so I can test the record at 50% of the file length, then move to the next one in the center of the 50% range and so on until I come to within a few record of the start date. However, can I specify the starting point of a Do Loop? Havent tried this yet, but just came out of my head.
Thanks
1. I have no control in the creation of the original file (so its a plain binary file in the format of the file created in Command 1)
2. The file contains dates in the first column
3. There are variable number of records for each date ( so cannot extrapolate from the first record)
4. I need to extract the records from a date range to another file, so I have to look for the record with the first instance of the start date and start processing
I can go to a specific record using the Seek function, so I can test the record at 50% of the file length, then move to the next one in the center of the 50% range and so on until I come to within a few record of the start date. However, can I specify the starting point of a Do Loop? Havent tried this yet, but just came out of my head.
Thanks
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Hi idle_mind,
Thanks again. This is my solution based on your ideas (and your previous soultion). I can get to the Target Value really fast.
Private Sub getStart()
Dim H As Header
Dim r1 As Record
Dim GotIt As Boolean
Dim i As Long
GotIt = False
TargetValue = some value
factor = 2
nextRecPos = Len(H) + 1 + ((Round(numRecs / factor, 0) - 1) * Len(r1))
Open binfile For Binary As #1
Get #1, , H
Do
i = i + 1
Debug.Print i
Seek #1, nextRecPos
Get #1, , r1
With r1
Select Case .A
Case Is < TargetValue
factor = factor * 2
nextRecPos = nextRecPos + ((Round(numRecs / factor - 1, 0) - 1) * Len(r1))
Case Is > TargetValue
factor = factor * 2
nextRecPos = nextRecPos - ((Round(numRecs / factor - 1, 0) - 1) * Len(r1))
Case Is = TargetValue
GotIt = True
End Select
End With
Loop Until GotIt = True
Close #1
End Sub
Private Sub Process()
Seek #1, nextRecPos
Do until EOF(1)
Loop
End Sub
(I'll have to give some points to dbrckovi)
Thanks again. This is my solution based on your ideas (and your previous soultion). I can get to the Target Value really fast.
Private Sub getStart()
Dim H As Header
Dim r1 As Record
Dim GotIt As Boolean
Dim i As Long
GotIt = False
TargetValue = some value
factor = 2
nextRecPos = Len(H) + 1 + ((Round(numRecs / factor, 0) - 1) * Len(r1))
Open binfile For Binary As #1
Get #1, , H
Do
i = i + 1
Debug.Print i
Seek #1, nextRecPos
Get #1, , r1
With r1
Select Case .A
Case Is < TargetValue
factor = factor * 2
nextRecPos = nextRecPos + ((Round(numRecs / factor - 1, 0) - 1) * Len(r1))
Case Is > TargetValue
factor = factor * 2
nextRecPos = nextRecPos - ((Round(numRecs / factor - 1, 0) - 1) * Len(r1))
Case Is = TargetValue
GotIt = True
End Select
End With
Loop Until GotIt = True
Close #1
End Sub
Private Sub Process()
Seek #1, nextRecPos
Do until EOF(1)
Loop
End Sub
(I'll have to give some points to dbrckovi)
ASKER
Ooops, I got the split points wrong - the accepted answer should be to idle_mind.
ASKER
Thanks, it's now corrected.
File is saved differently and with more additional data when u use Random.
For example, file in your case is 172 bytes long when you use Binary, and 664 bytes when you use Random.
I'm not sure but I assume that in files created using Random access mode, application saves some kind of index or table of contents, so Get statement
can jump directly to specific record number, without looping through file.