WebClient request excluding images and other elements

Posted on 2011-10-26
Last Modified: 2012-05-12
I would like to read a web page using WebClient.OpenRead, but for the sake of bandwidth I don't want to download any images or other elements (Flash, JS, ...). In short, I just want the bare page, like when you disable images in IE.

How can I do this?
Question by:jiiins2

    Accepted Solution

    That's exactly what you get! When your browser downloads a web page, it reads all the <img>, <embed>, <object>, etc. tags, and if it sees that any of them reference resources external to the page, it sends a separate request for each of those resources. This is easy to see if you run a tool like Fiddler and examine the output. WebClient does none of that parsing: calling OpenRead on a web page returns only the markup of that page, which is exactly what you are after.
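
    To make the point concrete, here is a minimal sketch showing that the references are just text until something chooses to request them. The markup string, the example.com addresses, and the module name are all made up for illustration, and the regular expression is only a rough stand-in for a real HTML parser.

    ```vbnet
    Imports System
    Imports System.Text.RegularExpressions

    Module ListReferences
        Sub Main()
            ' Hypothetical markup standing in for what OpenRead would return.
            Dim html As String = "<html><body>" & _
                "<img src=""/images/logo.png"">" & _
                "<script src=""http://example.com/site.js""></script>" & _
                "<p>Hello</p>" & _
                "</body></html>"

            ' Rough pattern for the src attribute on img, script, and embed tags.
            ' This is an assumption for the demo, not a complete HTML grammar.
            Dim pattern As String = "<(img|script|embed)\b[^>]*\bsrc\s*=\s*[""']([^""']+)[""']"

            ' Nothing is downloaded here; the references are only listed.
            For Each m As Match In Regex.Matches(html, pattern, RegexOptions.IgnoreCase)
                Console.WriteLine(m.Groups(1).Value & " -> " & m.Groups(2).Value)
            Next
        End Sub
    End Module
    ```

    For the sample string this lists the img and script references and makes no network requests at all. A browser would go on to fetch each of them; your code simply doesn't.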

    However, one thing to note: if you save this data to a file and open that file in a web browser, you are letting the browser examine the data. Any resources referenced by absolute paths will be requested when the browser parses the page. Relative references will fail because they (most likely) do not exist on your machine at that location, so you will see placeholders where a relatively-linked resource should be.
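
    If you want to see which address a relative reference would have pointed to on the original site, System.Uri can combine it with the page's address. This is only a sketch: the page URL and the src values below are hypothetical examples, not taken from any real page.

    ```vbnet
    Imports System

    Module ResolveLinks
        Sub Main()
            ' Hypothetical address the page was downloaded from.
            Dim pageUri As New Uri("http://example.com/articles/page.html")

            ' Example src values as they might appear in the saved markup.
            Dim sources() As String = {"/images/logo.png", "thumb.jpg", "http://cdn.example.com/banner.gif"}

            For Each src As String In sources
                ' Uri(baseUri, relativeUri) resolves relative references against
                ' the base and leaves absolute ones unchanged.
                Dim resolved As New Uri(pageUri, src)
                Console.WriteLine(src & " -> " & resolved.AbsoluteUri)
            Next
        End Sub
    End Module
    ```

    With these inputs, the first two values resolve to http://example.com/images/logo.png and http://example.com/articles/thumb.jpg, while the third is already absolute and stays as it is. Opened from a local file, the browser has no such base address, which is why the relative ones break.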

    Expert Comment

    by:käµfm³d 👽
    As an example, using this code:

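    Imports System

    Module Module1
        Sub Main()
            Dim client As New Net.WebClient()
            ' The URL is empty in the post as it survives; put the page address here.
            Dim webStream As IO.Stream = client.OpenRead("")
            Dim buffer(4096) As Byte
            Dim bytesRead As Integer

            ' Read the response in chunks until the stream is exhausted.
            ' The bytes are discarded; only the network traffic matters here.
            Do
                bytesRead = webStream.Read(buffer, 0, buffer.Length)
            Loop While bytesRead > 0

            webStream.Close()
            Console.WriteLine(vbNewLine & vbNewLine & "**** DONE ****")
        End Sub
    End Module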

    ...against the URL to your question, this is what Fiddler sees:

    [Screenshot: Fiddler and Code]

    ...and this is what it sees when I go through the browser:

    [Screenshot: Fiddler and Browser]

    I've highlighted the images that were requested by my browser. As you can see, each one is a request of its own. (A few of the highlighted images were already cached on my computer. Had they not been cached, they would have been separate requests to the web server as well.)

    Author Closing Comment

    Perfect, thanks!