XMLHTTP to scrape specific content

Posted on 2006-05-05
Last Modified: 2008-03-17
I'm using XMLHTTP to pull in a page and scrape some specific content, but I'm having trouble doing so. Is there a way to load the content and strip out everything except the content living within a certain div?

<div id='cal'>
some code

Question by: just1coder
    LVL 12

    Expert Comment

    This is definitely possible, but it is hard to say what the best approach is without seeing the entire block of code you are scraping. Additionally, it will be prone to breaking if the div id changes, or if other aspects of the page change dramatically, and you are not aware of it.

    Can you post the entire block of scraped content?

    LVL 15

    Expert Comment

    What's the exact problem you are facing? I mean, are you getting any error?
    LVL 22

    Expert Comment

    Regex would be a good tool to use. Do you want the div tag included, or just the text between the tags?
    LVL 2

    Author Comment

    Just the text between would be perfect.
    LVL 22

    Accepted Solution

    <div[ ]id='cal'>\s*([^<]+?)\s*</div>

    That pattern should work. Are you familiar with regular expression use? Note that it is specific to that one div tag, and because of the [^<] class it only matches when the div holds plain text with no nested tags.

    <div[ ]id='?(.+?)'?>\s*([^<]+?)\s*</div>

    This pattern will return 2 submatches: one will be the div tag's id, the other will be the text in between.
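
For anyone unsure how to apply the two patterns, here is a minimal sketch in Python (the thread never says which language the asker is using, so Python is an assumption here). The sample markup and the helper names `extract_div_text` and `extract_all_divs` are made up for illustration; in practice the HTML string would be the response text returned by the XMLHTTP request.

```python
import re

# Hypothetical sample page; in practice this would be the response text
# returned by the XMLHTTP request.
SAMPLE_HTML = """<html><body>
<div id='nav'>Home</div>
<div id='cal'>
  May 2006 events
</div>
</body></html>"""


def extract_div_text(html, div_id):
    """Return the text inside <div id='div_id'>, or None if nothing matches."""
    # Same shape as the first pattern in the thread, with the id filled in.
    pattern = r"<div[ ]id='" + re.escape(div_id) + r"'>\s*([^<]+?)\s*</div>"
    match = re.search(pattern, html)
    return match.group(1) if match else None


def extract_all_divs(html):
    """Return (id, text) pairs for every div the general pattern matches."""
    # Second pattern from the thread: group 1 is the id, group 2 is the text.
    pattern = r"<div[ ]id='?(.+?)'?>\s*([^<]+?)\s*</div>"
    return re.findall(pattern, html)


if __name__ == "__main__":
    print(extract_div_text(SAMPLE_HTML, "cal"))
    print(extract_all_divs(SAMPLE_HTML))
```

Because the capture group is `[^<]+?`, both helpers find nothing for a div that contains nested tags, so this only suits divs holding plain text. For anything more complex, an HTML parser would likely be more robust than a regex.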
    LVL 22

    Expert Comment

    Never heard back on whether the asker knows how to use regex. The pattern is good, though.
