cobybenson

Parse HTML code into an Access database

Hi,
I'd like to write some code that automatically parses various HTML pages and inserts the table data into an Access database.

The webpages are used for tracking courier data.

You can see an example here: http://www.hdnl.co.uk/tracker.aspx?UPI=806290025850a

The tracking number is held between <strong> tags, so it shouldn't be too difficult to isolate this.
I'd like to then extract the <tr> and <td> data and dump it into a table next to the tracking number.

Then I'll be able to query this information and find consignments that aren't being delivered on time.
Any help or advice you can give would be great.

Thanks very much
ASKER CERTIFIED SOLUTION
Markus Fischer
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Addendum: in the link above I used ....documentElement.innerText. In your case it would have to be ....documentElement.innerHTML of course.
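
For what it's worth, once the page's HTML is in a string, the <tr> and <td> data can be walked through the HTML document object. This is only a rough sketch, not the code from the link: the procedure name ListTableCells is made up, and it assumes the late-bound "htmlfile" object is available on the machine.

```vba
' Sketch only: ListTableCells is a hypothetical name, and the late-bound
' "htmlfile" (MSHTML) object is assumed to be installed.
Public Sub ListTableCells(ByVal strHtml As String)
    Dim doc As Object
    Dim row As Object
    Dim cell As Object

    ' Load the HTML string into an in-memory document.
    Set doc = CreateObject("htmlfile")
    doc.write strHtml
    doc.Close

    ' Print every cell, one table row per line in the Immediate window.
    ' Note: rows and cells of nested tables are visited as well.
    For Each row In doc.getElementsByTagName("tr")
        For Each cell In row.getElementsByTagName("td")
            Debug.Print Trim$(cell.innerText),
        Next cell
        Debug.Print
    Next row
End Sub
```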
(°v°)
SOLUTION
Ryan Chong
This solution is only available to members.
Once you have the file locally, this will import the central table:

DoCmd.TransferText acImportHTML, _
    TableName:="<choose table name here>", _
    FileName:="C:\Full Path To File\tracker.aspx.html", _
    HasFieldNames:=True, _
    HTMLTableName:="Home Delivery Network Limited"
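
If the file still has to be fetched, something along these lines might do it. Treat it as a sketch under assumptions: SaveTrackingPage is a made-up name, MSXML2.XMLHTTP is assumed to be installed, and the page is written out as plain ANSI text, which may mangle characters outside the system code page.

```vba
' Sketch only: SaveTrackingPage is a hypothetical helper, not part of Access.
Public Sub SaveTrackingPage(ByVal strUrl As String, ByVal strPath As String)
    Dim http As Object
    Dim intFile As Integer

    ' Synchronous GET request for the tracking page.
    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", strUrl, False
    http.send
    If http.Status <> 200 Then
        Err.Raise vbObjectError + 1, , "Download failed: HTTP " & http.Status
    End If

    ' Write the response text to the local file.
    intFile = FreeFile
    Open strPath For Output As #intFile
    Print #intFile, http.responseText
    Close #intFile
End Sub
```

It could be called as SaveTrackingPage "http://www.hdnl.co.uk/tracker.aspx?UPI=806290025850a", "C:\Full Path To File\tracker.aspx.html" before running the import.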

(°v°)
Again, once you have the file locally, this would extract the parcel number:

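    Dim strLine As String
    Dim intFile As Integer

    ' Read the saved page line by line until the first <strong> tag appears.
    intFile = FreeFile
    Open "C:\Full Path To File\tracker.aspx.html" For Input As #intFile
    Do While Not EOF(intFile)
        Line Input #intFile, strLine
        If InStr(1, strLine, "<strong>", vbTextCompare) > 0 Then Exit Do
        strLine = ""    ' no match on this line
    Loop
    Close #intFile

    ' Keep only the text between <strong> and </strong> (case-insensitive).
    If strLine <> "" Then
        strLine = Split(strLine, "<strong>", -1, vbTextCompare)(1)
        strLine = Split(strLine, "</strong>", -1, vbTextCompare)(0)
        MsgBox strLine
    End If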
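
To store the parcel number next to the table data, as asked in the question, a small DAO routine could append one record per cell. This is a sketch with invented names: the table tblTracking and its fields ParcelNumber, RowNo, ColNo and CellText are assumptions and would have to exist in the database already.

```vba
' Sketch only: tblTracking and its field names are hypothetical.
Public Sub SaveTrackingCell(ByVal strParcel As String, _
                            ByVal lngRow As Long, _
                            ByVal lngCol As Long, _
                            ByVal strText As String)
    Dim rs As DAO.Recordset

    ' Append one record holding the parcel number and one table cell.
    Set rs = CurrentDb.OpenRecordset("tblTracking", dbOpenDynaset)
    rs.AddNew
    rs!ParcelNumber = strParcel
    rs!RowNo = lngRow
    rs!ColNo = lngCol
    rs!CellText = strText
    rs.Update
    rs.Close
End Sub
```

Keeping row and column numbers instead of named columns avoids guessing the layout of the tracking table; a crosstab or a few queries can turn it back into columns later.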

Good luck!
(°v°)
SOLUTION
This solution is only available to members.
You can also use Excel to read the data live, without saving a local copy.
Use this .iqy web query:

WEB
1
http://www.hdnl.co.uk/tracker.aspx?UPI=806290025850a

Selection=1
Formatting=None
PreFormattedTextToColumns=True
ConsecutiveDelimitersAsOne=True
SingleBlockTextImport=False
DisableDateRecognition=False

Now, update this, link the worksheet to Access.
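
The linking step can also be done in code with DoCmd.TransferSpreadsheet. A minimal sketch, assuming the workbook was saved as tracking.xlsx and that the linked table should be called tblTrackingLinked (both names are placeholders):

```vba
' Sketch only: the table name and workbook path are placeholders.
Public Sub LinkTrackingSheet()
    ' Create a linked table that reads the worksheet in place.
    DoCmd.TransferSpreadsheet _
        TransferType:=acLink, _
        SpreadsheetType:=acSpreadsheetTypeExcel12Xml, _
        TableName:="tblTrackingLinked", _
        FileName:="C:\Full Path To File\tracking.xlsx", _
        HasFieldNames:=True
End Sub
```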

/gustav
SOLUTION
This solution is only available to members.
In the HTML wizard, select Tables, not Lists.

/gustav