Link to home
Start Free TrialLog in
Avatar of CroetOne
CroetOneFlag for Netherlands

asked on

reading unknown file format

I try to read the file I have attached to this post. Using binary or normal file handling with VBA, this file seems to be empty.
My question is, does someone know how to read this format? Does someone recognize this format?
Any help would be very welcome.
JK

081607.txt
Avatar of VBClassicGuy
VBClassicGuy
Flag of United States of America image

It's just a five-field CSV file, with commas as delimiters. The first few lines are:
aaau,646945,6.7700000,,2
aaau,85830,0.0250000,,2
aaau,922974,2.4000000,,2
aaau,945925,0.7800000,,2
aaau,GRR8,2.2200000,,2
aaau,TZ,2.7500000,,2
aaau,TZL,2.1000000,,2
aaau,alk,0.2800000,,2
aaau,bsl,6.9800000,,2
aaau,ced,0.1500000,,2
aaau,chb,42.5300000,,2
aaau,ghg,1.2250000,,2
aaau,ral,0.2700000,,2
aaau,sbd,0.0900000,,2
aaau,ses,0.0800000,,2
aaau,swt,0.2100000,,2
aaau,tls,3.6700000,,2
aaau,wmc,9.4600000,,2

Fields are:
text, text or LONG number, FLOAT number, null, and  the number 2.
Don't know what it's used for...
VBClassicguy is correct, this is a simple CSV (Comma Separated Value) File.
A very good program to 'open' this file is excel; however a text editor (such as notepad) will open it nonetheless.

A quick-and-dirty way of making this file open with excel is to change the .txt to .csv

then:

1. Open a Blank Workbook in Excel.
2. Data, Get External Data, Import Data. (Excel 2007 is Data, Get External Data, Data from Text)
3. Browse to your .csv file and Select "Import".
4. Import Wizard should appear.
5. Page 1 Select "Delimited"
6. Select the row which you want to start the import.
7. click "Next"
8. In the Delimiters, select "comma"
9. click "Next"
10. highlight each column of your data in the window below. For each column you can specify "General", "Text", "Data", or "do not import column" using the radio buttons in the top left of the Wizard box. This is an optional step.
11. Click Finish.

source : http://www.fixya.com/support/t266670-csv_file_import_correctly_into_excel
Or heck, if you have Excel installed, just change the file extension to CSV, then double-click it and Excel will open it for a quick look. By the way, it has 9896 lines in it!
Avatar of CroetOne

ASKER

Sorry guys, wrong file :-)
This was indeed a csv file and should be no trouble, even for me..
The file I supposed to attach is now attached, sorry. But we all had a big laugh. But never the less yoy doun't earn those points so easy.
JK

020710.txt
Avatar of Dana Seaman
The file appears to be fixed binary records.
Header is 16 bytes
There are 734 Data records of 33 bytes each.
The last record appears to be truncated and has only 25 bytes.

For each record:
The first 16 bytes are ANSI padded with null characters. Example "aaauTZL"
The next 16 bytes is some type of binary data.
The final byte of each record is &hOF.
 
Here is sample code to read file:



Private Sub Command1_Click()
   Dim FF         As Integer
   Dim lLenB      As Long
   Dim bHdr(15)   As Byte
   Dim bDat(32)   As Byte
   Dim i          As Long
   
   FF = FreeFile
   Open App.Path & "\020710.txt" For Binary Access Read As FF
   lLenB = LOF(FF) - 16
   Debug.Print "lLenB\33", lLenB \ 33
   Get FF, , bHdr
   Debug.Print "bHdr", StrConv(bHdr, vbUnicode)
   Debug.Print "-----"
   For i = 0 To lLenB \ 33
      Get FF, , bDat
      Debug.Print "bDat", StrConv(bDat, vbUnicode)
   Next

   Close FF

End Sub

Open in new window

Hi Danaseaman, That looks good. There are indeed 734 records.
The last 16 bytes are a floating point number. The first 14 lines in readable data:
aaau      TZL      0.960000000            16
aaca      36070      6.880000000            16
aaca      61081      63.100000000            16
aaca      82095      0.280000000            16
aaca      85519      0.080000000            16
aaca      91059      1.400000000            16
aaca      L1067      19.240000000            16
aaca      v1013      0.790000000            16
aaca      v1031      3.340000000            16
aach      12005      57.150000000            16
aach      12032      169.800000000            16
aach      1213860      55.100000000            16
aach      1222171      18.790000000            16
aach      24899      13.880000000            16

Do you have any suggestion how to decode the binairy value to the decimal value?
JK
ASKER CERTIFIED SOLUTION
Avatar of Dana Seaman
Dana Seaman
Flag of Brazil image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Removed Val function sicne 2nd field may contain letters:

Debug.Print "bDat", Left$(strDat, 4), Mid$(strDat, 5, 12), ByteArrayToDouble(bDat), Asc(Right$(strDat, 1)) + 1
@daneseaman.
Perfect solution. I try to understand it and provide some commets in the code. When I do understand completly, I'll close the case, otherwise I'll wil ask you some additional questions. If I may. TnX!
JK
With comments:

Option Explicit

Private Declare Sub CopyMemory Lib "KERNEL32" Alias "RtlMoveMemory" (hpvDest As Any, hpvSource As Any, ByVal cbCopy As Long)

Private Sub Command1_Click()
   Dim FF         As Integer
   Dim lLenB      As Long
   Dim bHdr(15)   As Byte
   Dim bDat(32)   As Byte
   Dim strDat     As String
   Dim dblDat     As String
   Dim i          As Long
   
   FF = FreeFile
   Open App.Path & "\020710.txt" For Binary Access Read As FF
   'Get length of file in bytes less header length.
   lLenB = LOF(FF) - 16
   'Calculate how many 33 bytes records are in file
   Debug.Print "lLenB\33", lLenB \ 33
   Get FF, , bHdr
   Debug.Print "bHdr", StrConv(bHdr, vbUnicode)
   Debug.Print "-----"
   'Step through records
   For i = 0 To lLenB \ 33
      Get FF, , bDat 'Read 33 byte record
      strDat = StrConv(bDat, vbUnicode) 'Convert to string.
      'First 4 characters is field 1.
      'Characters 5 to 16 is field 2
      'Field 3 is double (8 bytes) extracted starting at byte 17.
      'Field 4 is last character which is &H0F. Adjusting for base 0 it is 16.
      Debug.Print "bDat", Left$(strDat, 4), Mid$(strDat, 5, 12), ByteArrayToDouble(bDat), Asc(Right$(strDat, 1)) + 1
   Next

   Close FF

End Sub

Function ByteArrayToDouble(ByRef aByte() As Byte) As Double
   'Each record is 33 bytes. Floating point Double starts at byte 17 and i 8 bytes inlength.
   'Adjusting for base 0 we copy bytes to a double variable.
   CopyMemory ByteArrayToDouble, aByte(16), 8
End Function

Open in new window

@daneseaman.
Sorry, didn't notice you already did the same. I'm having two questions however:
1. Is there a function for dbldat?
2. why do you use $-string functions when strdat is already converted?
Regards,
JK
   Dim FF         As Integer
   'to retrieve file
   Dim lLenB      As Long
   'byte array to retrieve first record
   Dim bHdr(15)   As Byte
   'byte array to retrieve body records
   Dim bDat(32)   As Byte
   'to save the body records in unicode
   Dim strDat     As String
   'no function
   'Dim dblDat     As String
   Dim i          As Long
    
   FF = FreeFile
   Open FileName For Binary Access Read As FF
   'LOF= length of file in bytes minus the header (which is 16 bytes)
   lLenB = LOF(FF) - 16
   'recordlength is LOF divided by 33 (2*16+1) by use of the integer division operator
   'Integer division is carried out using the \ Operator. Integer division
   'returns the quotient, that is, the integer that represents the number of
   'times the divisor can divide into the dividend without consideration of any remainder.
   Debug.Print "lLenB\33", lLenB \ 33
   Get FF, , bHdr
   'converting a byte array to unicode (readable English)
   Debug.Print "bHdr", StrConv(bHdr, vbUnicode)
   Debug.Print "-----"
   'Stepping through the file in blocks of 33 bytes
   For i = 0 To lLenB \ 33
      Get FF, , bDat
      'building data record in readable English
      strDat = StrConv(bDat, vbUnicode)
        'printing the bytes using $-functions ($= to handle bytes instead of characters?)
        'but strDat is already converted to string. Is there another reason to use the
        '$-function.
        Debug.Print "bDat", Left$(strDat, 4), Mid$(strDat, 5, 12), ByteArrayToDouble(bDat), Asc(right$(strDat, 1)) + 1
   Next
 
   Close FF
 
End Sub
 
Function ByteArrayToDouble(ByRef aByte() As Byte) As Double
   'convert starting byte 16 with a length of 8 bytes.
   CopyMemory ByteArrayToDouble, aByte(16), 8
End Function

Open in new window

1. Is there a function for dbldat?
   This is no longer needed since we get the double direct from byte array via Function ByteArrayToDouble
2. why do you use $-string functions when strdat is already converted?
   The first 16 bytes are ANSI. We are assuming that the first 4 characters are Field1 so we use Left$
   Likewise we assume that characters 5 to 16 are Field 2 so we use Mid$ to extact this from strDat.
   End of record &H0F is the last character so we use Asc + Right$ to get this from strDat.
Perfect solution.
Thanks for points.
Originally I did not know that Field 3 was Floating point Double.
With this knowledge we can use a udt Type to simplify reading this file.
Note that Field 4 is declared also as double but is all zeros in the file.

Option Explicit

Private Type udtRecord
   sField1     As String * 4
   sField2     As String * 12
   dField3     As Double '8 bytes
   dField4     As Double '8 bytes
   sField5     As Byte
End Type

Private Sub Command2_Click()

   Dim FF         As Integer
   Dim lRows      As Long
   Dim bHdr(15)   As Byte
   Dim bDat       As udtRecord
   Dim i          As Long
   
   FF = FreeFile
   Open App.Path & "\020710.txt" For Binary Access Read As FF
   lRows = (LOF(FF) - 16) \ Len(bDat) 'Calculate how many 33 bytes records are in file
   Debug.Print "lRows", lRows
   Get FF, , bHdr
   Debug.Print "bHdr", StrConv(bHdr, vbUnicode)
   Debug.Print "-----"
   
   For i = 0 To lRows 'Step through records
      Get FF, , bDat 'Read record
      Debug.Print bDat.sField1, bDat.sField2, bDat.dField3, bDat.sField5 + 1
   Next

   Close FF
End Sub

Open in new window

Great, thanks. That simplies further use of this code. Btw, if have tried to read the header, using bytetodouble, but came up with a very small or very large number.
First 8 bytes of header is ANSi "pri1.001". There is a trailing &HOF just as there is for all data records. That only leaves 7 bytes so it can't be a double.
I tried single but that looks like garbage.  

We are seeing a character sequence of "þúûü" in header and all data records after the double field. I have no idea what this is.

Okay, Thanks again.