Tools to get html table data to a local Access97 MDB?

Posted on 2000-04-05
Medium Priority
Last Modified: 2010-08-05
What would be good tools or methods to gather html table data off the web into a local Access97 MDB?

My main interest is in private web, perhaps my data but more often not. Where data is presented to user in 'readable' format, one would think a number of tools should be able to gather the data somehow. A look at a table page, then using 'save' as text or anything leaves me with nothing that looks like data.

If the table is presented by a query, and I can programatically manage the URL content to achieve the query result desired, how can I programatically retrieve the data within the table? For example:


produces a result on the screen. I'd like it imported on demand thru the automagics.  I'm thinking that M$ Visual tools ought to do that by now if not Access itself, but am fogging out on the how at the moment.

A direct link from access to the web-based table would be nice as well, but not knowing filename off-hand, even if it is and Access MDB source, even tho' it looks like I have read access permitted, I doubt a direct connect is feasible over a long haul. But am open to suggestions in that area as well, especially since the contents may be less abbreviated (I have in mind one particular column that looks to be truncated to fit on the screen, and another column that 'should be' among items in table but is not posted to web table).

How to do/ what's best tool? or are there none?

I'll post a separate Q; on identifying public web samples for this exercise to get us a common ground.
Question by:SunBow
  • 11
  • 4
LVL 24

Author Comment

ID: 2688235

 - this URL should become a source of a sample table(s) for this discussion -
LVL 54

Accepted Solution

nico5038 earned 400 total points
ID: 2689373
Nice discussion SunBow, did you also place a link on the Frontpage section of EE?

I would start looking whether it's possible to get the Frontpage library available (FPAPI 2.0 type Library is available for Office 2000) and what methods I can use on the objects.

Perhaps a work around is importing the page into Word and to manipulate it from there. In the developers toolkit there are some samples of that.

LVL 24

Author Comment

ID: 2691905
FrontPage ToolKit? I hadn't thought of that one except in global inclusion. Used once early on, I thought of it more as an organizer, but headed towards some ODBC compatibility. I'll take some time to post there later  in week. In meantime, if interested, try putting up sample? (no input there yet, maybe EE thruput problem).

Workaround to use Word is ok, or excel, provided it is automatic. I'll assume it can later get to access with no much big a deal and take from there. The appearance of saving web output to disk is no longer as clean with data that can be extracted via parsing (or import to wordpad and delete extraneous matter) as I remember. I must have slept through another web upgrade.
Easily Design & Build Your Next Website

Squarespace’s all-in-one platform gives you everything you need to express yourself creatively online, whether it is with a domain, website, or online store. Get started with your free trial today, and when ready, take 10% off your first purchase with offer code 'EXPERTS'.

LVL 25

Expert Comment

ID: 2696606
A perl routine (using the HTML::Parser module) would probably be your easiest solution.  Any HTML parser will have to be customized on a per page basis though.

It's like going through the reverse of report generation.  Pulling the data from the report, rather than generating the report via the data.  Depending upon how complex the report is (your HTML page) determines how easy it's going to be able to get the data back out.

Perl is designed with parsing in mind, so it's one of the easier tools available to do this.

If you want a sample perl parser that dumps data into an Access DB for the ebay listing or oldlook experts-exchange urls I posted, let me know and I'll write one up.
LVL 54

Expert Comment

ID: 2696830
Found last week (in another section) a question about extracting information from tag's. Can't find it agian, but it used the object model of HTML and could get the variables by name.
LVL 24

Author Comment

ID: 2709062
On parsing, my wish here is for 'automagic' of application like office type tools, even Visual Studio. But Parsing is viable backup plan if no one yet does. Since Access is destination, its VB would be preferred if viable. Isn't it amazing how PERL remains so popular! I don't know, but I assume that the PERL reference has webfeatures not available in VBA, which would then be more manual intensive effort.

I had hit up some help and book refs on Microsoft wares pre-2K, and found the increase html support rather anemic in this respect, like "Create your own page" would be only allowing anather definition of link or use of tag like filenmae. I had hope either I missed the right document or that there are upgrades that may handle some of this.

I did follow nico5038 and posted ref to here in another topic, but I may have messed it, placing in more general web than specific product.

I thought I'd seen ActiveServer topics/questions here in Access, so I hoped an answer would lie here, since Access in intended destination.

I know I can use mouse to swipe over webtable content and move it to Access. It seems just too magical to be expected to work, but it remains manual. One would think if copy/paste can do it, a more expen$ive application like office should be able to. (inc. Visual tools/access/etc., M$ web-enabled apps)
LVL 24

Author Comment

ID: 2818506
Elsewhere I see at least some references to folk running word counters on EE threads. So the way this is going, results to date less than satisfying.

Lest it get autodeleted, should I give it a real 'low' grade?

(any final remarks?)
LVL 24

Author Comment

ID: 2818523
The dupl on ASP topic also dead thread:
LVL 54

Expert Comment

ID: 2818645
Sorry the HTML model was no solution.
Just kill this question fast, so it won't have to suffer until autodelete !
LVL 24

Author Comment

ID: 2819440
I don't want it die, but was tempted.

I did revisit how you mentioned FrontPage, and I posted to ASP = not same, so earlier today I did finally post to FrontPage topic, so I have to let that run a course now:

LVL 24

Author Comment

ID: 2901953
No news but this.  I just closed the Q# on URL suggestions, so's you don't have to pay more for the paq, here's the winning URL:
clockwatcher> The oldlook on experts-exchange is another fairly simple one.

LVL 24

Author Comment

ID: 2902022
Comment accepted as answer
LVL 24

Author Comment

ID: 2902023
Much disappointment here, seems I got nowhere, but reluctant to let it slip into the big bit_bucket in the sky, preferring to save as reference for some future retry.  Also disappointed in the grading scale, I thought there were more than three letters to pick from, and that desciptors were shown to help decide grade, like one answer is complete in itself, or another is very vague and I had to do about all the research and work myself.

I am sure there is more available on this, just could not get it in.  There's some EE references to people using MS_word to count words here by user, there's other references to people running their own top_ten or top_100, whatever. So I feel safe in assuming this should have gone better and remains unsatisfactory.  So let no newbie misunderstand my acceptance to PAQ this - it ain't over, ain't that good yet.

And now there's:


apparently some help in editing, perhaps controlling the viewer.  There surely must be more to be had.

nico5038 gave the best shots, and since I cannot figure out how to grade today, but want some closure, I choose rather to up the grade based on contributions noticed elsewhere, tho' I cannot be precise at the moment.  It was noticed, and as the wheel spins, it does come around. That's my view and I'm welcome to it.

and I do hope we get to return to the issue some day, I think it remains important.
LVL 54

Expert Comment

ID: 2902185

That's a long time ago !

But you are a "bit lucky" as far as the extracting of text between tag's in Word is involved. (I assume you can get the sourcecode of a page available)
Last week I created a macro that sets the heading, but the main body is usefull as it's isolating a start and an end tag in a text file.

From the comment the macro text:
(so you don't have to buy it)

Sub Macro1()

Dim myRange As Range
Dim strStart As Integer
Dim strEnd As Integer
    SendKeys "^{HOME}"
    With Selection.Find
        .Text = "<Headline>"
        .Replacement.Text = "" 
        .Forward = True
        .Wrap = wdFindAsk
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Extend Character:=">"
    Set myRange = Selection.Range
    If Selection.Find.Found Then
        'replace tag's and CR's
        myRange.Find.Execute findtext:="<heading 1>", Replace:=wdReplaceAll, replacewith:=""
        myRange.Find.Execute findtext:="</heading 1>", Replace:=wdReplaceAll, replacewith:=""
        myRange.Find.Execute findtext:=vbCr, Replace:=wdReplaceAll, replacewith:=" " 
        myRange.Style = ActiveDocument.Styles("Headline")
        GoTo lblStart
    End If
End Sub

The question was: http://www.experts-exchange.com/jsp/qShow.jsp?ta=msoffice&qid=10382376 

It's not much but all small parts might add up !

Success SunBow !

And thanks !

LVL 24

Author Comment

ID: 2902627
> It's not much but all small parts might add up !

yeh, that's where I went, I agree. The goal is to hit a database directly as much as I can, as authorized as I can. I 'assume' some kind of read-access available, but more is unknown. One target has little that looks like html. I can copy the screen to clipboard and move parts, but html 'source' shown, and even the html 'save' commands are anemic in results compared to simple clipboard.

But anything's better'n nothing.  To run a mile begins with the first step, so mucho gracias, I concur.  But deferring the link click til later in week, today I've need to do else, beginning with something foamy while my beely awaits the quality proteins.

If you missed above test/link, I'll likely run the thing on EE, perhaps even this page for 1st go at it.

Sorry I do not have the EE link handy where I got the prior url. I think from lounge, with many in EE commenting favorably on the client, if you get into those alternative styles, like delphi (now added to my 'list').
LVL 24

Author Comment

ID: 2902665
(had notify = "on" so just received the ref.link):

has the ee discussion on that client

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Sometimes MS breaks things just for fun... In Access 2003, only the maximum allowable SQL string length could cause problems as you built a recordset. Now, when using string data in a WHERE clause, the 'identifier' maximum is 128 characters. So, …
A Case Study of using the Windows API to provide RS232 communications capability in Access without the use of Active-X controls.
Have you created a query with information for a calendar? ... and then, abra-cadabra, the calendar is done?! I am going to show you how to make that happen. Visualize your data!  ... really see it To use the code to create a calendar from a q…
This lesson discusses how to use a Mainform + Subforms in Microsoft Access to find and enter data for payments on orders. The sample data comes from a custom shop that builds and sells movable storage structures that are delivered to your property. …

569 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question