[Webinar] Streamline your web hosting managementRegister Today

  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 376
  • Last Modified:

Tools to get html table data to a local Access97 MDB?

What would be good tools or methods to gather html table data off the web into a local Access97 MDB?

My main interest is in private web, perhaps my data but more often not. Where data is presented to user in 'readable' format, one would think a number of tools should be able to gather the data somehow. A look at a table page, then using 'save' as text or anything leaves me with nothing that looks like data.

If the table is presented by a query, and I can programatically manage the URL content to achieve the query result desired, how can I programatically retrieve the data within the table? For example:


produces a result on the screen. I'd like it imported on demand thru the automagics.  I'm thinking that M$ Visual tools ought to do that by now if not Access itself, but am fogging out on the how at the moment.

A direct link from access to the web-based table would be nice as well, but not knowing filename off-hand, even if it is and Access MDB source, even tho' it looks like I have read access permitted, I doubt a direct connect is feasible over a long haul. But am open to suggestions in that area as well, especially since the contents may be less abbreviated (I have in mind one particular column that looks to be truncated to fit on the screen, and another column that 'should be' among items in table but is not posted to web table).

How to do/ what's best tool? or are there none?

I'll post a separate Q; on identifying public web samples for this exercise to get us a common ground.
  • 11
  • 4
1 Solution
SunBowAuthor Commented:

 - this URL should become a source of a sample table(s) for this discussion -
Nice discussion SunBow, did you also place a link on the Frontpage section of EE?

I would start looking whether it's possible to get the Frontpage library available (FPAPI 2.0 type Library is available for Office 2000) and what methods I can use on the objects.

Perhaps a work around is importing the page into Word and to manipulate it from there. In the developers toolkit there are some samples of that.

SunBowAuthor Commented:
FrontPage ToolKit? I hadn't thought of that one except in global inclusion. Used once early on, I thought of it more as an organizer, but headed towards some ODBC compatibility. I'll take some time to post there later  in week. In meantime, if interested, try putting up sample? (no input there yet, maybe EE thruput problem).

Workaround to use Word is ok, or excel, provided it is automatic. I'll assume it can later get to access with no much big a deal and take from there. The appearance of saving web output to disk is no longer as clean with data that can be extracted via parsing (or import to wordpad and delete extraneous matter) as I remember. I must have slept through another web upgrade.
The new generation of project management tools

With monday.com’s project management tool, you can see what everyone on your team is working in a single glance. Its intuitive dashboards are customizable, so you can create systems that work for you.

A perl routine (using the HTML::Parser module) would probably be your easiest solution.  Any HTML parser will have to be customized on a per page basis though.

It's like going through the reverse of report generation.  Pulling the data from the report, rather than generating the report via the data.  Depending upon how complex the report is (your HTML page) determines how easy it's going to be able to get the data back out.

Perl is designed with parsing in mind, so it's one of the easier tools available to do this.

If you want a sample perl parser that dumps data into an Access DB for the ebay listing or oldlook experts-exchange urls I posted, let me know and I'll write one up.
Found last week (in another section) a question about extracting information from tag's. Can't find it agian, but it used the object model of HTML and could get the variables by name.
SunBowAuthor Commented:
On parsing, my wish here is for 'automagic' of application like office type tools, even Visual Studio. But Parsing is viable backup plan if no one yet does. Since Access is destination, its VB would be preferred if viable. Isn't it amazing how PERL remains so popular! I don't know, but I assume that the PERL reference has webfeatures not available in VBA, which would then be more manual intensive effort.

I had hit up some help and book refs on Microsoft wares pre-2K, and found the increase html support rather anemic in this respect, like "Create your own page" would be only allowing anather definition of link or use of tag like filenmae. I had hope either I missed the right document or that there are upgrades that may handle some of this.

I did follow nico5038 and posted ref to here in another topic, but I may have messed it, placing in more general web than specific product.

I thought I'd seen ActiveServer topics/questions here in Access, so I hoped an answer would lie here, since Access in intended destination.

I know I can use mouse to swipe over webtable content and move it to Access. It seems just too magical to be expected to work, but it remains manual. One would think if copy/paste can do it, a more expen$ive application like office should be able to. (inc. Visual tools/access/etc., M$ web-enabled apps)
SunBowAuthor Commented:
Elsewhere I see at least some references to folk running word counters on EE threads. So the way this is going, results to date less than satisfying.

Lest it get autodeleted, should I give it a real 'low' grade?

(any final remarks?)
SunBowAuthor Commented:
The dupl on ASP topic also dead thread:
Sorry the HTML model was no solution.
Just kill this question fast, so it won't have to suffer until autodelete !
SunBowAuthor Commented:
I don't want it die, but was tempted.

I did revisit how you mentioned FrontPage, and I posted to ASP = not same, so earlier today I did finally post to FrontPage topic, so I have to let that run a course now:

SunBowAuthor Commented:
No news but this.  I just closed the Q# on URL suggestions, so's you don't have to pay more for the paq, here's the winning URL:
clockwatcher> The oldlook on experts-exchange is another fairly simple one.

SunBowAuthor Commented:
Comment accepted as answer
SunBowAuthor Commented:
Much disappointment here, seems I got nowhere, but reluctant to let it slip into the big bit_bucket in the sky, preferring to save as reference for some future retry.  Also disappointed in the grading scale, I thought there were more than three letters to pick from, and that desciptors were shown to help decide grade, like one answer is complete in itself, or another is very vague and I had to do about all the research and work myself.

I am sure there is more available on this, just could not get it in.  There's some EE references to people using MS_word to count words here by user, there's other references to people running their own top_ten or top_100, whatever. So I feel safe in assuming this should have gone better and remains unsatisfactory.  So let no newbie misunderstand my acceptance to PAQ this - it ain't over, ain't that good yet.

And now there's:


apparently some help in editing, perhaps controlling the viewer.  There surely must be more to be had.

nico5038 gave the best shots, and since I cannot figure out how to grade today, but want some closure, I choose rather to up the grade based on contributions noticed elsewhere, tho' I cannot be precise at the moment.  It was noticed, and as the wheel spins, it does come around. That's my view and I'm welcome to it.

and I do hope we get to return to the issue some day, I think it remains important.

That's a long time ago !

But you are a "bit lucky" as far as the extracting of text between tag's in Word is involved. (I assume you can get the sourcecode of a page available)
Last week I created a macro that sets the heading, but the main body is usefull as it's isolating a start and an end tag in a text file.

From the comment the macro text:
(so you don't have to buy it)

Sub Macro1()

Dim myRange As Range
Dim strStart As Integer
Dim strEnd As Integer
    SendKeys "^{HOME}"
    With Selection.Find
        .Text = "<Headline>"
        .Replacement.Text = "" 
        .Forward = True
        .Wrap = wdFindAsk
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Extend Character:=">"
    Set myRange = Selection.Range
    If Selection.Find.Found Then
        'replace tag's and CR's
        myRange.Find.Execute findtext:="<heading 1>", Replace:=wdReplaceAll, replacewith:=""
        myRange.Find.Execute findtext:="</heading 1>", Replace:=wdReplaceAll, replacewith:=""
        myRange.Find.Execute findtext:=vbCr, Replace:=wdReplaceAll, replacewith:=" " 
        myRange.Style = ActiveDocument.Styles("Headline")
        GoTo lblStart
    End If
End Sub

The question was: http://www.experts-exchange.com/jsp/qShow.jsp?ta=msoffice&qid=10382376 

It's not much but all small parts might add up !

Success SunBow !

And thanks !

SunBowAuthor Commented:
> It's not much but all small parts might add up !

yeh, that's where I went, I agree. The goal is to hit a database directly as much as I can, as authorized as I can. I 'assume' some kind of read-access available, but more is unknown. One target has little that looks like html. I can copy the screen to clipboard and move parts, but html 'source' shown, and even the html 'save' commands are anemic in results compared to simple clipboard.

But anything's better'n nothing.  To run a mile begins with the first step, so mucho gracias, I concur.  But deferring the link click til later in week, today I've need to do else, beginning with something foamy while my beely awaits the quality proteins.

If you missed above test/link, I'll likely run the thing on EE, perhaps even this page for 1st go at it.

Sorry I do not have the EE link handy where I got the prior url. I think from lounge, with many in EE commenting favorably on the client, if you get into those alternative styles, like delphi (now added to my 'list').
SunBowAuthor Commented:
(had notify = "on" so just received the ref.link):

has the ee discussion on that client

Featured Post

Receive 1:1 tech help

Solve your biggest tech problems alongside global tech experts with 1:1 help.

  • 11
  • 4
Tackle projects and never again get stuck behind a technical roadblock.
Join Now