Parse HTML code into an Access database

Hi,
I'd like to write some code that automatically parses various HTML pages and inserts the table data into an Access database.

The webpages are used for tracking courier data.

You can see an example here: http://www.hdnl.co.uk/tracker.aspx?UPI=806290025850a

The tracking number is held between <strong> tags, so it shouldn't be too difficult to isolate this.
I'd like to then extract the <tr> and <td> data and dump it into a table next to the tracking number.

Then I'll be able to query this information and find consignments that aren't being delivered on time.
Any help or advice you can give will be great.

Thanks very much

ASKER CERTIFIED SOLUTION

Markus Fischer

This solution is only available to members.

Markus Fischer

Addendum: in the link above I used ....documentElement.innerText. In your case it would have to be ....documentElement.innerHTML of course.
(°v°)
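
As a rough illustration of working from a page's HTML rather than its plain text, the sketch below loads an HTML string into a late-bound "htmlfile" document and prints the cells of each table row to the Immediate window. This is not the code from the linked solution, which is not shown here. The procedure name DumpTableRows and the choice of the htmlfile object are assumptions.

```vba
' Sketch only: DumpTableRows is a hypothetical helper, not code from the thread.
Public Sub DumpTableRows(ByVal strHtml As String)
    Dim doc As Object
    Dim row As Object
    Dim cell As Object
    Dim strOut As String

    ' "htmlfile" is the late-bound MSHTML document object on Windows.
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = strHtml

    ' Walk every <tr> and join its cell texts with tabs.
    For Each row In doc.getElementsByTagName("tr")
        strOut = ""
        For Each cell In row.Cells
            strOut = strOut & Trim$(cell.innerText) & vbTab
        Next cell
        Debug.Print strOut
    Next row
End Sub
```

Each printed line corresponds to one table row, so the same loop could feed a recordset instead of Debug.Print.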

SOLUTION

Ryan Chong

This solution is only available to members.

Markus Fischer

Once you have the file locally, this will import the central table:

DoCmd.TransferText acImportHTML, _
    TableName:="<choose table name here>", _
    FileName:="C:\Full Path To File\tracker.aspx.html", _
    HasFieldNames:=True, _
    HTMLTableName:="Home Delivery Network Limited"

(°v°)
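
For the "once you have the file locally" step, a minimal sketch of saving the tracker page to disk might look like this. It assumes the late-bound MSXML2.XMLHTTP object is available on the machine. The function name SaveTrackerPage is hypothetical, and plain Print # output ignores character encoding, which may matter for pages with non-ASCII text.

```vba
' Sketch only: SaveTrackerPage is a hypothetical helper, not code from the thread.
Public Function SaveTrackerPage(ByVal strUrl As String, ByVal strPath As String) As Boolean
    Dim http As Object
    Dim intFile As Integer

    ' Late binding, so no reference to Microsoft XML is required.
    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", strUrl, False      ' False = synchronous request
    http.send

    ' Write the response to disk only if the server answered OK.
    If http.Status = 200 Then
        intFile = FreeFile
        Open strPath For Output As #intFile
        Print #intFile, http.responseText
        Close #intFile
        SaveTrackerPage = True
    End If
End Function
```

It could be called with the example URL from the question and the same path that is passed to TransferText; the function returns False when the request does not succeed.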

Markus Fischer

Again, once you have the file locally, this would extract the parcel number:

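Dim strLine As String
Dim intFile As Integer
Dim blnFound As Boolean

' Read the saved page line by line until a line containing <strong> turns up.
intFile = FreeFile
Open "C:\Full Path To File\tracker.aspx.html" For Input As #intFile
Do While Not EOF(intFile)
    Line Input #intFile, strLine
    If InStr(strLine, "<strong>") > 0 Then
        blnFound = True
        Exit Do
    End If
Loop
Close #intFile

' Keep only the text between <strong> and </strong>.
If blnFound Then
    strLine = Split(Split(strLine, "<strong>")(1), "</strong>")(0)
    MsgBox strLine
End If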

Good luck!
(°v°)
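
To place the parcel number next to the imported rows, as the question asks, one possible follow-up step is sketched below. It assumes the page was imported into its own staging table and that the DAO library is referenced (the Access default). The names StampParcelNumber and UPI are hypothetical.

```vba
' Sketch only: the procedure and field names are hypothetical.
Public Sub StampParcelNumber(ByVal strTable As String, ByVal strParcel As String)
    Dim db As DAO.Database
    Set db = CurrentDb

    ' Add a text column to hold the parcel number (raises an error if it already exists).
    db.Execute "ALTER TABLE [" & strTable & "] ADD COLUMN [UPI] TEXT(50)", dbFailOnError

    ' Fill it for every imported row; doubled quotes guard the SQL literal.
    db.Execute "UPDATE [" & strTable & "] SET [UPI] = '" & _
               Replace(strParcel, "'", "''") & "'", dbFailOnError
End Sub
```

After importing a page into a staging table, a call such as StampParcelNumber "tblStaging", strLine would tag each row, and the rows could then be appended to a permanent table for querying late consignments.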

SOLUTION

rockiroads

This solution is only available to members.

Gustav Brock

You can also use Excel to read the data live, without saving a local copy.
Use this .iqy web query:

WEB
1
http://www.hdnl.co.uk/tracker.aspx?UPI=806290025850a

Selection=1
Formatting=None
PreFormattedTextToColumns=True
ConsecutiveDelimitersAsOne=True
SingleBlockTextImport=False
DisableDateRecognition=False

Refresh the query, then link the worksheet to Access.

/gustav
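
If several consignments need tracking, the query file above could be generated per tracking number. The following is a minimal sketch that reuses the same lines and URL pattern; the helper names BuildTrackerIqy and WriteTrackerIqy are hypothetical.

```vba
' Sketch only: BuildTrackerIqy and WriteTrackerIqy are hypothetical helpers.
Public Function BuildTrackerIqy(ByVal strUpi As String) As String
    ' Same lines as the query above, with the tracking number substituted.
    BuildTrackerIqy = Join(Array( _
        "WEB", _
        "1", _
        "http://www.hdnl.co.uk/tracker.aspx?UPI=" & strUpi, _
        "", _
        "Selection=1", _
        "Formatting=None", _
        "PreFormattedTextToColumns=True", _
        "ConsecutiveDelimitersAsOne=True", _
        "SingleBlockTextImport=False", _
        "DisableDateRecognition=False"), vbCrLf)
End Function

Public Sub WriteTrackerIqy(ByVal strUpi As String, ByVal strPath As String)
    Dim intFile As Integer
    intFile = FreeFile
    Open strPath For Output As #intFile
    Print #intFile, BuildTrackerIqy(strUpi)
    Close #intFile
End Sub
```

Keeping the text builder separate from the file writer makes it easy to check the generated query before saving it.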

SOLUTION

Gustav Brock

This solution is only available to members.

Gustav Brock

In the HTML import wizard, select Tables, not Lists.

/gustav