Solved

Reading HTML using System.IO.StreamReader

Posted on 2008-10-18
Last Modified: 2008-10-18
Hi Experts,

I am reading many HTML pages into my application so that I can retrieve (scrape) data from them. My problem is that the data I need lies about a thousand lines into each page. Having to read through these unwanted lines every time I scrape is slowing things down. Is it possible to request an HTML page from the server starting at, for example, line 1000?
Question by:DColin

Expert Comment

by:wmestrom
ID: 22747873
You can check whether the server supports the Range header in HTTP requests. If it does, you can specify a byte offset (not a line number) to start from. The HTTP request would look something like this:

GET /somepage HTTP/1.1
Host: www.xyz.org
Range: bytes=123456-
Accept: */*

Hope this will work for you.

Greets
Willem

Author Comment

by:DColin
ID: 22748077
Hi wmestrom:

Do you know how I can use your answer with my existing code? Thanks.
        Dim MyRequest As System.Net.HttpWebRequest
        Dim MyResponse As System.Net.HttpWebResponse
        Dim MyStream As System.IO.StreamReader
 
        MyRequest = System.Net.WebRequest.Create("http://www.abc.com")
        MyResponse = MyRequest.GetResponse()
        MyStream = New System.IO.StreamReader(MyResponse.GetResponseStream())

 

Accepted Solution

by:wmestrom (earned 500 total points)
ID: 22748477
This should work. However, many servers ignore the Range header and simply return the whole page, so it depends on the server. The page I used here should work, though.
        Dim MyRequest As System.Net.HttpWebRequest
        Dim MyResponse As System.Net.WebResponse
        Dim MyStream As System.IO.StreamReader
 
        ' Cast to HttpWebRequest so that AddRange is available
        MyRequest = DirectCast(System.Net.WebRequest.Create("http://www.gnu.org/projects/dotgnu/pnetlib-doc/System/Net/HttpWebRequest.html"), System.Net.HttpWebRequest)
        ' Ask for everything from byte 10000 onwards (adds a Range header to the request)
        MyRequest.AddRange(10000, Integer.MaxValue)
        MyResponse = MyRequest.GetResponse()
        ' If the server honors the range, the stream starts at byte 10000
        MyStream = New System.IO.StreamReader(MyResponse.GetResponseStream())
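
Since many servers ignore the range, it may help to check the status code of the response: a server that honors the range answers 206 (Partial Content), while one that ignores it answers 200 and sends the whole page. The following is only a minimal sketch of that check, not a tested implementation. The module name RangeFallbackSketch and the helper OpenFromOffset are hypothetical names made up for illustration; the calls themselves (WebRequest.Create, AddRange, HttpStatusCode.PartialContent) are standard System.Net members.

```vbnet
Imports System
Imports System.IO
Imports System.Net

Module RangeFallbackSketch
    ' Hypothetical helper: returns a reader positioned at byte "offset" of the page,
    ' whether or not the server honored the Range header.
    Function OpenFromOffset(ByVal url As String, ByVal offset As Integer) As StreamReader
        Dim request As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
        request.AddRange(offset) ' positive value: from "offset" to the end of the data

        Dim response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
        Dim body As Stream = response.GetResponseStream()

        If response.StatusCode <> HttpStatusCode.PartialContent Then
            ' The server ignored the range and sent the whole page,
            ' so read and discard the first "offset" bytes ourselves.
            Dim buffer(4095) As Byte
            Dim remaining As Integer = offset
            While remaining > 0
                Dim n As Integer = body.Read(buffer, 0, Math.Min(buffer.Length, remaining))
                If n = 0 Then Exit While ' page is shorter than "offset"
                remaining -= n
            End While
        End If

        Return New StreamReader(body)
    End Function
End Module
```

It could be called as OpenFromOffset("http://www.abc.com", 10000). Note that the fallback still downloads the skipped bytes, so it saves parsing time but not transfer time. Also, a byte offset can land in the middle of a tag or of a multi-byte character, so the first line read from the stream may be incomplete.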

