• Status: Solved

Reading source of a web page created dynamically

I would like to read the contents of a table on a web page.
I can use WebClient.OpenRead to read the source HTML. Viewing the page source in Firefox gives what appears to be identical HTML. So far so good.
Also in Firefox, I can inspect an element: the table I want to read. The ID of that table doesn't exist in the page source; in fact, the source contains no tables at all. Looking at the source, I think a frame is being created dynamically and its contents are then copied into a div.

In the source returned by WebClient.OpenRead I get:   <div id="xyz_abc_div"></div>
In the browser (inspect element) I see a lot more HTML inside that div, including the table I want.

Question: how can I get the contents of this dynamically filled div?
Asked by: AndyAinscow

Dave Baldwin (Fixer of Problems) commented:
It is more likely that the data is loaded through AJAX / JavaScript after the page itself has loaded. In that case it would not show up in the view-source output. That is a common method these days. The only way to get that content is to run the JavaScript that makes the request.
 
AndyAinscow (Freelance programmer / Consultant, author) commented:
Hmmm, thanks.

I think it is JavaScript (a number of .js scripts are being loaded). I also see a function called LoadFrame that works with a frame and contains the following (snippet):

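                  // Point the iframe at the content, then run the copy helper
                  // (presumably frame-to-div, judging by its name) once it has loaded.
                  var iframe = document.getElementById('xyz_frame');
                  iframe.src = 'path goes here';
                  iframe.onload = function () {
                        xyz_copyFrameToDiv(iframe, 'xyz_div');
                  };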


>>The only way to get that content is to run the javascript that makes the request.
I understand the words, but the result already exists in the browser. Can't I get at that?
 
Dave Baldwin (Fixer of Problems) commented:
Browsers run JavaScript and your function does not. WebClient.OpenRead basically just reads a single file from a 'resource'. It does not do any processing.  https://msdn.microsoft.com/en-us/library/ms144209(v=vs.110).aspx    The JavaScript, CSS, and images are often in other files, which are not loaded with the original request. In addition, even if you download the JavaScript files, you need a JavaScript interpreter and the HTML DOM to make any sense of them. JavaScript is a programming language; just reading the source file doesn't produce the results.
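
The difference between downloading a script and running it can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: `makeDocument` is a hypothetical stand-in for the browser's DOM, and `scriptSource` is an invented page script, not the one from the site in question.

```javascript
// Hypothetical stand-in for a DOM: a single element, keyed by id, that starts
// out empty, like <div id="xyz_abc_div"></div> in the downloaded source.
function makeDocument() {
  const elements = { xyz_abc_div: { innerHTML: '' } };
  return { getElementById: (id) => elements[id] || null };
}

// Hypothetical page script, held as plain text, which is all a plain download sees.
const scriptSource =
  "document.getElementById('xyz_abc_div').innerHTML = " +
  "'<table><tr><td>42</td></tr></table>';";

// Case 1: download only. The script is just a string, so the div stays empty.
const downloaded = makeDocument();
const withoutRun = downloaded.getElementById('xyz_abc_div').innerHTML;

// Case 2: browser-like. The script text is interpreted against the DOM.
const rendered = makeDocument();
new Function('document', scriptSource)(rendered);
const afterRun = rendered.getElementById('xyz_abc_div').innerHTML;

console.log('without running the script:', JSON.stringify(withoutRun));
console.log('after running the script:  ', afterRun);
```

The table only appears in the second case, which matches what "inspect element" shows versus what "view source" shows.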

 
AndyAinscow (Freelance programmer / Consultant, author) commented:
Found what I wanted. I can use the DocumentText property of a WebBrowser control to get the HTML the page is currently using, after the browser has run the JavaScript functions.
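
Once the rendered HTML is available as a string, the table cells still have to be pulled out of it. The following is a rough sketch in JavaScript, assuming a simple, well-formed table without nesting; `renderedHtml` and `extractRows` are hypothetical names and the markup is invented for illustration. A real HTML parser would be more robust than regular expressions.

```javascript
// Hypothetical rendered markup, standing in for the HTML the browser holds
// after the page scripts have run.
const renderedHtml =
  '<div id="xyz_abc_div"><table>' +
  '<tr><th>Name</th><th>Value</th></tr>' +
  '<tr><td>alpha</td><td>1</td></tr>' +
  '<tr><td>beta</td><td>2</td></tr>' +
  '</table></div>';

// Returns one array of cell texts per <tr>. Regex-based, so it only suits
// simple tables: no nested tables and no '>' inside attribute values.
function extractRows(html) {
  const rows = [];
  const rowPattern = /<tr\b[^>]*>([\s\S]*?)<\/tr>/gi;
  const cellPattern = /<t[dh]\b[^>]*>([\s\S]*?)<\/t[dh]>/gi;
  for (const rowMatch of html.matchAll(rowPattern)) {
    const cells = [];
    for (const cellMatch of rowMatch[1].matchAll(cellPattern)) {
      // Strip any inline tags left inside the cell and trim whitespace.
      cells.push(cellMatch[1].replace(/<[^>]+>/g, '').trim());
    }
    rows.push(cells);
  }
  return rows;
}

const rows = extractRows(renderedHtml);
console.log(rows);
```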
 
AndyAinscow (Freelance programmer / Consultant, author) commented:
Thanks for the info on how it works in the background. It made me think of an alternative way to get what I wanted.
 
Dave Baldwin (Fixer of Problems) commented:
You're welcome, glad to help.
Question has a verified solution.
