Extract links from HTML page

Posted on 2006-05-01
Last Modified: 2012-05-05

I'd like to extract all the links, e.g. <a href='/XXXXXXXXX'>More Information...</a>, from an HTML page and import them into a table so I can later follow each link and pull more data from that page.

Question by:ishadowme
    LVL 17

    Expert Comment

    One way to do this is with Python. It is free, it has a urllib2 module for fetching a web page, and there is a free third-party module, BeautifulSoup, for locating and extracting tags. You would then write the results to a tab-delimited file for import into Access.

    If you want to consider this approach, let me know.
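The Python route described above might look roughly like the sketch below. It is not the commenter's code: it uses the standard library's html.parser in place of BeautifulSoup, and urllib.request, the Python 3 counterpart of urllib2. The names LinkCollector, extract_links, write_tsv and fetch, and the example.com address, are made up for illustration. Relative links such as /XXXXXXXXX are resolved against the page URL so each one can be fetched later.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect (absolute_url, link_text) pairs for every <a href=...> element."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []      # finished (absolute_url, link_text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the URL of the page itself.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html, base_url):
    """Return a list of (absolute_url, link_text) tuples found in html."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def write_tsv(links, out):
    """Write the links as tab-delimited rows with a header line."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["href", "text"])
    writer.writerows(links)


def fetch(url):
    """Download a page as text (needs network access; not used in the sample)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Hypothetical page content and URL, standing in for fetch(some_url).
    sample = "<p><a href='/XXXXXXXXX'>More Information...</a></p>"
    buffer = io.StringIO()
    write_tsv(extract_links(sample, "http://example.com/list.html"), buffer)
    print(buffer.getvalue(), end="")
```

For a real page, the sample string would be replaced by fetch(url) and the StringIO buffer by a file opened with newline="", which could then be imported into Access as a tab-delimited text file.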
    LVL 58

    Accepted Solution


    You would need a browser, such as M$'s WebBrowser ActiveX control. You can then navigate to the page. See examples at {http:/Q_21597349.html} and {http:/Q_21824078.html}.

    Once the page is open, the browser's document object will have:

        .links.length   -- number of links on the page
        .links(0).href   -- href of first link
        .links(0).innerText   -- displayed name of the link
        .links[0].text   -- JavaScript version of the above

    Incidentally, you can get the list of links from any page (including this one) by pasting the following into your address bar:

    javascript:(function(){var d=document,t='<html><body><h2>'+d.title+'</h2><h3>'+d.location.href+'</h3><hr><ol>';for(var i=0;i<d.links.length;i++){var l=d.links[i];t+='<li><b>'+l.text+'</b><br>'+l.href+'</li>';}t+='</ol></body></html>';d.write(t);d.close();})();

    If you like it, make a bookmark out of it...

    Hope this helps,

    Author Comment

    thank you
