In VB.NET, why would my WebClient.DownloadString call produce a string that is missing several links?

Posted on 2014-08-14
Last Modified: 2016-06-21
I'm writing an application to read a website that includes a link to a spreadsheet that I need to download on a daily basis. I want to automate this.

Dim strWebPageSource As String
strWebPageSource = New System.Net.WebClient().DownloadString("websiteURL")

A string is returned successfully, and parts of the web page are included. However, some parts are missing.

When I look at the web page manually and do a "VIEW SOURCE", I can see the link (<a href="...">xxx</a>) to the spreadsheet just fine. But the entire tag is missing from my string. Certain other tags are missing as well.

Is this because of some JavaScript somewhere? And is there anything I can do to capture these tags?
Question by:TexanDonnaP

    Expert Comment

    by:käµfm³d 👽
    Is this because of some JavaScript somewhere?
    Most likely.

    And is there anything I can do to capture these tags?
    Probably not with WebClient. You may want to switch to a WebBrowser control, which is a scaled-down version of IE. The WebBrowser control can run JavaScript. However, since you are basically working with an instance of IE, you may be limited by any browser-specific coding the site owner has done.

    Expert Comment

    Usually VIEW SOURCE is a good check of what your code should also pick up. The source contains only the original JavaScript code, not the result of the browser evaluating it.
    To double-check, run your code while capturing with Wireshark and see what is actually received. Also check whether this matches what is captured when you load the page in your browser (which should be the same as your VIEW SOURCE).
    If the raw data really is different, one thing you cannot control is server-side code (for instance: if the browser type/version is x, return this page; or if a certain cookie is set, return that page). Since this is server code, you can only guess what is causing the different responses. You could try to make your request look like your browser's by setting all the headers the same.

    Author Comment

    What do you mean, "Set all the headers the same?"

    Expert Comment

    Setting the headers the same is ONLY useful IF:

    The data captured from your code differs from the data captured from your browser.

    In that case, to check whether the server is indeed sending different data depending on the headers, look at the full set of headers sent by the browser and imitate as many as possible in your code.
    Since your code creates the WebClient and downloads in a single line, you have to split it up so that you can set headers first (more lines, more variables):

    Dim myWebClient As New System.Net.WebClient()
    myWebClient.Headers.Add("Content-Type", "application/x-www-form-urlencoded")
    myWebClient.Headers.Add("User-Agent", "Opera/9.80 (Windows NT 6.2; Win64; x64) Presto/2.12.388 Version/12.17")

    and so on.

    Then check whether your code now receives different data.
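
    Put together, the idea might look something like the sketch below. This is only an illustration: the module and function names, the placeholder URL, and the Accept and Accept-Language values are assumptions. In practice you would copy whatever headers your own browser actually sends.

```vbnet
Imports System
Imports System.Net

Module BrowserLikeRequestExample
    ' Hypothetical helper: builds a WebClient that sends browser-like headers.
    ' The header values are examples only; copy the ones your browser sends.
    Function CreateBrowserLikeClient() As WebClient
        Dim client As New WebClient()
        client.Headers.Add("User-Agent", "Opera/9.80 (Windows NT 6.2; Win64; x64) Presto/2.12.388 Version/12.17")
        client.Headers.Add("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8")
        client.Headers.Add("Accept-Language", "en-US,en;q=0.5")
        Return client
    End Function

    Sub Main()
        ' Placeholder URL (an assumption); substitute the page you are reading.
        Dim pageUrl As String = "http://example.com/"
        Using client As WebClient = CreateBrowserLikeClient()
            Dim strWebPageSource As String = client.DownloadString(pageUrl)
            ' Compare this length (or the text itself) with your VIEW SOURCE.
            Console.WriteLine("Downloaded " & strWebPageSource.Length & " characters")
        End Using
    End Sub
End Module
```

    If the downloaded text still matches what you got without the headers, the server is probably not varying its response by header, and the cause lies elsewhere.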

    Accepted Solution

    It seems that neither answer here was pertinent. I found that the "source" I screen-scraped on a previous page had converted part of my site's HREF info from "&" to "&amp;". I'm a web newbie and didn't realize this conversion was done. It caused my WebClient to return a partial web page along with an error message, and I had missed the error message.
    Once I took the URL string and replaced the "&amp;" with "&", the web page string was returned correctly using the WebClient.
    Thanks so much to KAUFMED and KIMPUTER for helping.
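
    For anyone who hits the same thing, here is a minimal sketch of the fix described above. The module name, function name, and sample URL are made up for illustration. Instead of replacing "&amp;" by hand, it uses System.Net.WebUtility.HtmlDecode (available from .NET Framework 4.0), which also handles other HTML entities.

```vbnet
Imports System
Imports System.Net

Module ScrapedUrlExample
    ' Hypothetical helper: turns an href copied out of HTML source back into
    ' a usable URL by decoding entities such as "&amp;" into "&".
    Function DecodeScrapedUrl(ByVal scrapedHref As String) As String
        Return WebUtility.HtmlDecode(scrapedHref)
    End Function

    Sub Main()
        ' Made-up href as it might appear in scraped page source.
        Dim scraped As String = "http://example.com/report.aspx?id=7&amp;type=xls"
        Dim cleanUrl As String = DecodeScrapedUrl(scraped)
        Console.WriteLine(cleanUrl) ' prints http://example.com/report.aspx?id=7&type=xls

        ' The decoded URL is what should be passed to WebClient, for example:
        ' Dim strWebPageSource As String = New WebClient().DownloadString(cleanUrl)
    End Sub
End Module
```

    It is also worth scanning the returned string for an error message before parsing it, since in this case the server sent back a partial page rather than failing outright.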

    Author Comment

    Sorry my response took so long. I had a week of vacation!

    Author Comment

    This problem is resolved.
