HTML Screen Scraping

Hi all,

Please help me with this problem. I want to write an application that can fetch the HTML of any website, say "" (the same thing you see when you view the source of a page), and then search that HTML for specific data. Any tutorial, code, or tool that can search for that data would help me a lot.

Thanks in advance.
aherps commented:

using System;
using System.IO;
using System.Net;

namespace WebHelper
{
    public class webpage
    {
        // Raw HTML returned by the server for the requested address.
        public string results;

        public webpage(string address)
        {
            string strResult = "";
            WebRequest objRequest = WebRequest.Create(address);
            // The using blocks close the response and the StreamReader when done.
            using (WebResponse objResponse = objRequest.GetResponse())
            using (StreamReader sr = new StreamReader(objResponse.GetResponseStream()))
            {
                strResult = sr.ReadToEnd();
            }
            this.results = strResult;
        }
    }
}

Just be warned: the above won't work for pages that load their content with AJAX, because the result is captured on the initial load and does not include data fetched afterwards.
Ray Paseur commented:
If you have access to PHP, it's very easy.  Best, ~Ray
$html = file_get_contents('');
echo htmlentities($html);

Neeraj Soni (Sr. Architect) commented:
The code from aherps is a good starting point.
All you need to do is write a custom parser for the HTML and identify your landmark tags in the source. From these tags you can read the inner HTML or text, attributes, and other values.
You can even handle AJAX calls by identifying their URLs and downloading the partial data from those URLs directly.
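As a rough illustration of the landmark-tag idea, here is a minimal sketch in C# to match the code from aherps. The class LandmarkParser, its two methods, and the sample HTML string are all hypothetical names made up for this example; nothing here comes from the original posts. It assumes the HTML has already been downloaded into a string and that the landmark tags are not nested inside tags of the same name.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

namespace WebHelper
{
    // Hypothetical helper for illustration only.
    public static class LandmarkParser
    {
        // Returns the content between the first <tag ...> and the next </tag>,
        // or null when the tag is not found.
        public static string InnerHtml(string html, string tag)
        {
            string name = Regex.Escape(tag);
            string pattern = "<" + name + @"\b[^>]*>(.*?)</" + name + @"\s*>";
            Match m = Regex.Match(html, pattern,
                RegexOptions.IgnoreCase | RegexOptions.Singleline);
            return m.Success ? m.Groups[1].Value.Trim() : null;
        }

        // Collects the quoted value of one attribute from every occurrence of a tag.
        public static List<string> AttributeValues(string html, string tag, string attribute)
        {
            string pattern = "<" + Regex.Escape(tag) + @"\b[^>]*?\b"
                + Regex.Escape(attribute) + @"\s*=\s*[""']([^""']*)[""']";
            List<string> values = new List<string>();
            foreach (Match m in Regex.Matches(html, pattern, RegexOptions.IgnoreCase))
            {
                values.Add(m.Groups[1].Value);
            }
            return values;
        }
    }

    public static class Demo
    {
        public static void Main()
        {
            // Made-up sample markup; in practice this would be the downloaded page source.
            string html = "<html><head><title>Sample page</title></head>"
                + "<body><a href=\"/one\">One</a> <a class='x' href='/two'>Two</a></body></html>";

            Console.WriteLine(LandmarkParser.InnerHtml(html, "title"));
            foreach (string href in LandmarkParser.AttributeValues(html, "a", "href"))
            {
                Console.WriteLine(href);
            }
        }
    }
}
```

Regular expressions like these are only adequate for simple, predictable markup. They break on nested tags of the same name, unquoted attribute values, and malformed HTML, so for anything beyond a quick extraction a dedicated HTML parsing library such as Html Agility Pack is probably the safer route.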
Question has a verified solution.
