Solved

problems reading a text file

Posted on 2006-10-27
30
190 Views
Last Modified: 2010-04-30
I have a text fiile coming through each day from a supllier containg orders, is there a way to 'read' certain parts of this file for information?

Here is an example of the data contained in the file:

----------------------------------------------------------------------------------------------------------------
BGM+230::9:+NORI04%C8776+1+AE'
DTM+137:20061020:102'
UNS+S'
UNT+5+EMP0001'
UNZ+1+NORI04004383'
UNB+UNOA:2+GBMDH.MDH555:ZZ+GBMDH.MDH923:ZZ+061021:0057+NORI04004384++EMPORD+++EANCOM'
UNH+EMP0001+ORDERS:D:93A:UN:EAN007+01.13'
BGM+220::9:+NORI04%C8882++AE'
DTM+137:20061020:102'
DTM+ZZZ:20061021:102'
RFF+CR:000-876-283'
NAD+DP+::+123 Made Up Street :Made Up Town::Made Up City:+Mr Made Up:01614833974:::++++AA12 1AA' >>>This Line has info on customer name/address
TDT+20++   :+31:P%REAL04+:::REALITY'
LIN+1'
PIA+5+Y3392%22%:IN::92+YD3392%22%:ZZZ::92+CHAIR BED:SA::91+1422:ON::92'
IMD+++DU:::NAVY BED' >>> This line contains info on product ordered
QTY+21:1.00' >>>This line contains info on quantity ordered
DTM+11:20061031:102'
DTM+75:7:804'
PRI+AAA:24.000'
FTX+ZZ1+++:::'
PAC+1'
PCI++91502586000:1:    :H:   :EW06::N:  %0:0.000:8405150258601066'
GIN+AT+BURS:66+55B :    +6 :666'
UNS+S'
CNT+2:1'
UNT+21+EMP0001'
----------------------------------------------------------------------------------------------------------------

This block of text contains info on just one order. There anything upto 100 of these in one text file (100 Orders) and they are all formated as above.

My target is to some how import the marked 'fields' (lines with <<< after) in to a database, but I cannot think of a way at the moment because strange layout of the file (lots of +, : and '), for example the quantity field (QTY+21:1.00' ) the only information I am after is the 1.00 part.

Does any one know any ways of achieving this?

Thank Alot.
0
Comment
Question by:sam_norian
  • 17
  • 10
  • 2
  • +1
30 Comments
 
LVL 92

Expert Comment

by:Patrick Matthews
Comment Utility
Hi sam_norian,

As long as the file uses consistent rules, this can be done.  What we need you to do is to codify those rules
in as much detail as you can.  E.g.:

"The quantity field always starts:

QTY+21:

then has the value I want, sometimes [or always, as the case may be] followed by a decimal.  The field terminates
with an apostrophe."

Regards,

Patrick
0
 
LVL 2

Expert Comment

by:quarkphoton
Comment Utility
Alot,

please elaborate what information you want to extract so that i can suggest an appropriate solution.

if i assume you are just interested in the line containing <<<<< or >>>>>, just process line by line and then make use of the "split" function of VB
0
 
LVL 2

Expert Comment

by:quarkphoton
Comment Utility
ooops....Sam Norian not Alot :)
0
 
LVL 14

Expert Comment

by:Shiju Sasidharan
Comment Utility
@sam_norian

>>import the marked 'fields' (lines with <<< after) in to a database

in the given sample content, you didnt mention the portion you want to select
Please specify the portion you want to extract
0
 
LVL 3

Author Comment

by:sam_norian
Comment Utility
Hi,

Sorry these are the parts I need to import:

IMD+++DU:::NAVY BED' >>> i just want 'navy bed'

QTY+21:1.00' >>>i just want '1.00'

NAD+DP+::+123 Made Up Street :Made Up Town::Made Up City:+Mr Made Up:01614833974:::++++AA12 1AA' >>> From this line i need '123 Made Up Street' as (db field) Address 1, 'Made Up Town' as (db field) Address 2, Made Up City as (db field) Address 3, Mr Made Up as (db field) CustomerName, 01614833974 as (db field) PhoneNumber and finally 'AA12 1AA' as (db field) Postcode.

DTM+11:20061031:102' >>> I need the 20061031 (if possible reformatted to English date (31/10/2006).

I hope I have explained this well enough :s

0
 
LVL 14

Expert Comment

by:Shiju Sasidharan
Comment Utility
As far as i understood following are the lines u want to process

IMD+++DU:::NAVY BED' >>>
QTY+21:1.00' >>>
NAD+DP+::+123 Made Up Street :Made Up Town::Made Up City:+Mr Made Up:01614833974:::++++AA12 1AA' >>>
DTM+11:20061031:102' >>>

some more questions
1. following prefixes will always be constant right ?
IMD+++
QTY+
DTM+

2. File content will carry only one item(line) for each entries ?
3. There wont be any character after ">>>" in the same line ?

0
 
LVL 3

Author Comment

by:sam_norian
Comment Utility
>>1. following prefixes will always be constant right ? Yes they are.
>>2. File content will carry only one item(line) for each entries ? I'm not sure what you mean by this?
>>3. There wont be any character after ">>>" in the same line ? I added the >>> to comment the text, the ' (apostrophe) signals the end of the line.
0
 
LVL 14

Expert Comment

by:Shiju Sasidharan
Comment Utility
in second question i mean to know that
there can be two entries like the following in one file?

IMD+++DU:::NAVY BED1'>>>
IMD+++DU:::NAVY BED2'>>>
0
 
LVL 3

Author Comment

by:sam_norian
Comment Utility
Yes, there is a very good chance there will be, each chunk of text (like the shown above) is 1 order, there are likely to situations where that product is order more than once in one file but that line wont appear in an order more than once.
0
 
LVL 14

Expert Comment

by:Shiju Sasidharan
Comment Utility
where do u need all data to be populated
Shall i put message boxes, so that you can do the rest from there
0
 
LVL 3

Author Comment

by:sam_norian
Comment Utility
The data will end up in a database table. Message boxes will be fine, I should be able to work it out from there.

Thanks alot,

Sam
0
 
LVL 14

Expert Comment

by:Shiju Sasidharan
Comment Utility
Private Sub Command1_Click()
Dim oReg, oMat, oMCol, Content, sFile
    Set oReg = CreateObject("VBScript.RegExp")
    oReg.IgnoreCase = True
    oReg.Global = True
    sFile = "C:\Your_File.txt" '<----- "Modify your file path
    Content = GetFileContent(sFile)
   
    'Getting IMD+++DU:::NAVY BED' >>>
     oReg.Pattern = "IMD\+{3}\w+\:{3}(.*?)'[ \t]*>>>"
     Set oMCol = oReg.Execute(Content)
     For Each oMat In oMCol
        MsgBox "IMD : " & oReg.Replace(oMat.Value, "$1")
     Next
     'Getting QTY+21:1.00' >>>i just want '1.00'
     oReg.Pattern = "QTY\+\d+\:(.+?)'[ \t]*>>>"
     Set oMCol = oReg.Execute(Content)
     For Each oMat In oMCol
        MsgBox "QTY : " & oReg.Replace(oMat.Value, "$1")
     Next
     'DTM+11:20061031:102' >>>
     oReg.Pattern = "DTM\+\d+\:(\d{4})(\d{2})(\d{2})'[ \t]*>>>"
     Set oMCol = oReg.Execute(Content)
     For Each oMat In oMCol
        MsgBox "DTM : " & oReg.Replace(oMat.Value, "$3") & "/" & oReg.Replace(oMat.Value, "$2") & "/" & oReg.Replace(oMat.Value, "$1")
     Next
End Sub
Private Function GetFileContent(ByVal sFile As String) As String
Dim oFso, oText
    Set oFso = CreateObject("Scripting.FileSystemObject")
    If Not oFso.FileExists(sFile) Then
        MsgBox "File does not exist", vbExclamation
        Exit Function
    End If
    Set oText = oFso.OpenTextFile(sFile)
    GetFileContent = oText.ReadAll
    oText.Close
    Set oFso = Nothing
End Function
0
 
LVL 14

Expert Comment

by:Shiju Sasidharan
Comment Utility
i need some more clarification in the line
please let me know what all things to be extracted from this line

NAD+DP+::+123 Made Up Street :Made Up Town::Made Up City:+Mr Made Up:01614833974:::++++AA12 1AA' >>>
0
 
LVL 14

Expert Comment

by:Shiju Sasidharan
Comment Utility
one change

replace this line   --------------> oReg.Pattern = "DTM\+\d+\:(\d{4})(\d{2})(\d{2})'[ \t]*>>>"
with
       oReg.Pattern = "DTM\+\w+\:(\d{4})(\d{2})(\d{2})\:\d+'[ \t]*>>>"

0
 
LVL 3

Author Comment

by:sam_norian
Comment Utility
123 Made Up Street , Made Up Town , Made Up City , Mr Made Up , 01614833974 , AA12 1AA

Thanks alot
0
How to improve team productivity

Quip adds documents, spreadsheets, and tasklists to your Slack experience
- Elevate ideas to Quip docs
- Share Quip docs in Slack
- Get notified of changes to your docs
- Available on iOS/Android/Desktop/Web
- Online/Offline

 
LVL 14

Expert Comment

by:Shiju Sasidharan
Comment Utility
Still having doubt in that line
i am listing some possible entries, are they valid ?

NAD+DP+::+123 Made Up Street1 :Made Up Town1::Made Up City1:+Mr Made Up1:01614833974:::++++AA12 1AA' >>>

NAD+DP+::+123 Made Up Street2 :Made Up Town2::Made Up City2:+Mr Made Up2:01614833975:::++++AA12 1AB' >>>

NAD+DP+::+123 Made Up Street3 :Made Up Town3::Made Up City3:+Mr Made Up3:01614833976:::++++AA12 1AC' >>>


That is only the values
  123 Made Up Street
  Made Up Town
  Made Up City
  Mr Made Up
  01614833974
  AA12 1AA
are getting changed rest of the things will be same.
That is seperators  :   :+   :::++++ are will be in place
right ?
0
 
LVL 14

Expert Comment

by:Shiju Sasidharan
Comment Utility
more specifically 6 data

NAD+DP+::+content1:content2::content3:+content4:content5:::++++content6' >>>

right ?
0
 
LVL 3

Author Comment

by:sam_norian
Comment Utility
Yes this is right. Thank you
0
 
LVL 14

Expert Comment

by:Shiju Sasidharan
Comment Utility
ok append this much code

     'Getting NAD+DP+::+123 Made Up Street :Made Up Town::Ma...
     oReg.Pattern = "NAD\+\w+\+\:{2}\+(.+?)\:(.+?)\:{2}(.+?)\:\+(.+?)\:(.+?)\:{3}\+{4}(.+?)'[ \t]*>>>"
     Set oMCol = oReg.Execute(Content)
     Dim i As Integer
     Dim sData, sSep
     For Each oMat In oMCol
        sSep = ""
        sData = ""
        For i = 1 To 6
            sData = sData & sSep & oReg.Replace(oMat.Value, "$" & i)
            sSep = ","
        Next
        MsgBox "NAD : " & sData
     Next
0
 
LVL 3

Author Comment

by:sam_norian
Comment Utility
I have put this code under a command button but it doesn't seem to do anything, if I change the sFile = "C:\Your_File.txt" to file that doesn't exist I get the error message but when I have the file located correctly it doesn't do anything (no message boxes or error messages).

Thanks
0
 
LVL 14

Expert Comment

by:Shiju Sasidharan
Comment Utility
can u post the complete code u have ?
0
 
LVL 3

Author Comment

by:sam_norian
Comment Utility
Private Sub Command1_Click()
Dim oReg, oMat, oMCol, Content, sFile
    Set oReg = CreateObject("VBScript.RegExp")
    oReg.IgnoreCase = True
    oReg.Global = True
    sFile = "c:\23-10-06.txt" '<----- "Modify your file path
    Content = GetFileContent(sFile)
   
    'Getting IMD+++DU:::NAVY BED' >>>
     oReg.Pattern = "IMD\+{3}\w+\:{3}(.*?)'[ \t]*>>>"
     Set oMCol = oReg.Execute(Content)
     For Each oMat In oMCol
        MsgBox "IMD : " & oReg.Replace(oMat.Value, "$1")
     Next
     'Getting QTY+21:1.00' >>>i just want '1.00'
     oReg.Pattern = "QTY\+\d+\:(.+?)'[ \t]*>>>"
     Set oMCol = oReg.Execute(Content)
     For Each oMat In oMCol
        MsgBox "QTY : " & oReg.Replace(oMat.Value, "$1")
     Next
     'DTM+11:20061031:102' >>>
     oReg.Pattern = "DTM\+\w+\:(\d{4})(\d{2})(\d{2})\:\d+'[ \t]*>>>"
     Set oMCol = oReg.Execute(Content)
     For Each oMat In oMCol
        MsgBox "DTM : " & oReg.Replace(oMat.Value, "$3") & "/" & oReg.Replace(oMat.Value, "$2") & "/" & oReg.Replace(oMat.Value, "$1")
     Next
     
          'Getting NAD+DP+::+123 Made Up Street :Made Up Town::Ma...
     oReg.Pattern = "NAD\+\w+\+\:{2}\+(.+?)\:(.+?)\:{2}(.+?)\:\+(.+?)\:(.+?)\:{3}\+{4}(.+?)'[ \t]*>>>"
     Set oMCol = oReg.Execute(Content)
     Dim i As Integer
     Dim sData, sSep
     For Each oMat In oMCol
        sSep = ""
        sData = ""
        For i = 1 To 6
            sData = sData & sSep & oReg.Replace(oMat.Value, "$" & i)
            sSep = ","
        Next
        MsgBox "NAD : " & sData
     Next
End Sub
Private Function GetFileContent(ByVal sFile As String) As String
Dim oFso, oText
    Set oFso = CreateObject("Scripting.FileSystemObject")
    If Not oFso.FileExists(sFile) Then
        MsgBox "File does not exist", vbExclamation
        Exit Function
    End If
    Set oText = oFso.OpenTextFile(sFile)
    GetFileContent = oText.ReadAll
    oText.Close
    Set oFso = Nothing
End Function
0
 
LVL 14

Expert Comment

by:Shiju Sasidharan
Comment Utility
can u post the sample file content you tried
0
 
LVL 14

Expert Comment

by:Shiju Sasidharan
Comment Utility
or u can upload file to
http://www.ee-stuff.com/
against your question
0
 
LVL 3

Author Comment

by:sam_norian
Comment Utility
0
 
LVL 14

Expert Comment

by:Shiju Sasidharan
Comment Utility
Oops
there is no string  ">>>" in the file
i thought we need to take only lines ending with >>>
0
 
LVL 14

Accepted Solution

by:
Shiju Sasidharan earned 500 total points
Comment Utility
ok try this

Private Sub Command1_Click()
Dim oReg, oMat, oMCol, Content, sFile
Dim i As Integer
Dim sData, sSep
    Set oReg = CreateObject("VBScript.RegExp")
    oReg.IgnoreCase = True
    oReg.Global = True
    sFile = "c:\23-10-06.txt" '<----- "Modify your file path
    Content = GetFileContent(sFile)
    Content = Content & vbCrLf
    'Getting IMD+++DU:::NAVY BED' >>>
     oReg.Pattern = "IMD\+{3}\w+\:{3}(.*?)'[ \t]*\r"
     Set oMCol = oReg.Execute(Content)
     sData = "":  sSep = ""
     For Each oMat In oMCol
        sData = sData & sSep & "IMD : " & oReg.Replace(oMat.Value, "$1")
        sSep = vbCrLf
     Next
     If sData <> "" Then MsgBox sData
     'Getting QTY+21:1.00' >>>i just want '1.00'
     oReg.Pattern = "QTY\+\d+\:(.+?)'[ \t]*>>>"
     Set oMCol = oReg.Execute(Content)
     sData = "": sSep = ""
     For Each oMat In oMCol
        sData = sData & sSep & "QTY : " & oReg.Replace(oMat.Value, "$1")
        sSep = vbCrLf
     Next
     If sData <> "" Then MsgBox sData
     'DTM+11:20061031:102'>>>
     oReg.Pattern = "DTM\+\w+\:(\d{4})(\d{2})(\d{2})\:\d+'[ \t]*\r"
     Set oMCol = oReg.Execute(Content)
     sData = "": sSep = ""
     For Each oMat In oMCol
        sData = sData & sSep & "DTM : " & oReg.Replace(oMat.Value, "$3") & "/" & oReg.Replace(oMat.Value, "$2") & "/" & oReg.Replace(oMat.Value, "$1")
        sSep = vbCrLf
     Next
     If sData <> "" Then MsgBox sData
     'Getting NAD+DP+::+123 Made Up Street :Made Up Town::Ma...
     oReg.Pattern = "NAD\+\w+\+\:{2}\+(.+?)\:(.+?)\:{2}(.+?)\:\+(.+?)\:(.+?)\:{3}\+{4}(.+?)'[ \t]*\r"
     Set oMCol = oReg.Execute(Content)
     For Each oMat In oMCol
        sSep = ""
        sData = ""
        For i = 1 To 6
            sData = sData & sSep & oReg.Replace(oMat.Value, "$" & i)
            sSep = ","
        Next
        MsgBox "NAD : " & sData
     Next
End Sub
0
 
LVL 14

Expert Comment

by:Shiju Sasidharan
Comment Utility
sorry , missed out one line

pls change
           oReg.Pattern = "QTY\+\d+\:(.+?)'[ \t]*>>>"

to
           oReg.Pattern = "QTY\+\d+\:(.+?)'[ \t]*\r"

0
 
LVL 3

Author Comment

by:sam_norian
Comment Utility
Ok brilliant that seems to work.

Thank you very much for all your help
0
 
LVL 14

Expert Comment

by:Shiju Sasidharan
Comment Utility
:-)
Thank you for the grade
0

Featured Post

What Should I Do With This Threat Intelligence?

Are you wondering if you actually need threat intelligence? The answer is yes. We explain the basics for creating useful threat intelligence.

Join & Write a Comment

Introduction While answering a recent question about filtering a custom class collection, I realized that this could be accomplished with very little code by using the ScriptControl (SC) library.  This article will introduce you to the SC library a…
Have you ever wanted to restrict the users input in a textbox to numbers, and while doing that make sure that they can't 'cheat' by pasting in non-numeric text? Of course you can do that with code you write yourself but it's tedious and error-prone …
As developers, we are not limited to the functions provided by the VBA language. In addition, we can call the functions that are part of the Windows operating system. These functions are part of the Windows API (Application Programming Interface). U…
Get people started with the process of using Access VBA to control Excel using automation, Microsoft Access can control other applications. An example is the ability to programmatically talk to Excel. Using automation, an Access application can laun…

771 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now