Solved

CHtmlView::GetSource() Problem Retrieving Full Source

Posted on 2009-05-05
Last Modified: 2013-11-20
When I CHtmlView::Navigate2(..) to a particular Web page, a subsequent call to CHtmlView::GetSource(..) from CHtmlView::OnNavigateComplete2() does not retrieve the full HTML source code. Instead, it is cut short. That is, only some of the source is retrieved. For example, if I open up IE to the web page and View->Source, and the source is, say, 100kB, then calling CHtmlView::GetSource(..) after navigating to the same page retrieves something < 100kB.

Any ideas on how to get the full source? Thanks
Question by:Raymun
 
LVL 49

Accepted Solution

by:
DanRollins earned 500 total points
ID: 24319732
The Microsoft documentation indicates that the two operations are functionally identical. In all of my experience, they are the same.
The only thing that comes to mind is... perhaps the anomaly relates to the containing object -- a CString. For instance, a CString will end at the first embedded NULL character. I don't know how one could get into the HTML source, but that's one thing I'd look at (in the debugger, use the memory viewer to see if there is eyeball-readable data beyond the end of the CString).
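To illustrate the embedded-NULL point, here is a minimal portable sketch. It uses std::string as a stand-in for the raw buffer (an assumption made so the example runs without MFC), and the helper names lengthAsCString and makeBufferWithEmbeddedNul are hypothetical. It shows that anything treating the buffer as a NUL-terminated string sees only the bytes before the first embedded NUL, even though more data follows.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Length seen by any code that treats the buffer as a NUL-terminated C string.
std::size_t lengthAsCString(const std::string& raw) {
    return std::strlen(raw.c_str());
}

// Build a sample buffer with one embedded NUL between two pieces of markup.
// (Hypothetical data, only for illustration.)
std::string makeBufferWithEmbeddedNul() {
    std::string raw = "<html>";
    raw.push_back('\0');              // the embedded NUL
    raw += "<body></body></html>";    // data that a C-string view never reaches
    return raw;
}
```

Comparing the two lengths for the same buffer is the programmatic equivalent of the memory-viewer check: if the byte count is larger than the C-string length, there is data beyond the apparent end of the string.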
An error you might be making is to call CHtmlView::GetSource(..) before you have received the OnDocumentComplete() call (or, alternatively, before the READYSTATE_COMPLETE status is set).
If you have checked these things, please provide a URL and I'll try an experiment to see if I can reproduce the problem.
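The timing issue can be sketched with a toy model. The FakeBrowser class below is a hypothetical stand-in, not the MFC or WebBrowser API; it only mimics the event order, in which NavigateComplete2 fires once the document has started to arrive and DocumentComplete fires after the whole document has loaded. Its source() method plays the role of GetSource(..) by returning whatever has loaded so far.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical stand-in for a browser control; NOT the real MFC/WebBrowser API.
class FakeBrowser {
public:
    std::function<void()> onNavigateComplete2;
    std::function<void()> onDocumentComplete;

    // Deliver the page in two chunks, firing each event where a real
    // control would fire its counterpart.
    void navigate(const std::string& firstChunk, const std::string& rest) {
        loaded_ = firstChunk;                         // document has only begun to arrive
        if (onNavigateComplete2) onNavigateComplete2();
        loaded_ += rest;                              // remainder finishes loading
        if (onDocumentComplete) onDocumentComplete();
    }

    // Plays the role of GetSource(): returns whatever has loaded so far.
    std::string source() const { return loaded_; }

private:
    std::string loaded_;
};

struct Captured {
    std::size_t atNavigateComplete = 0;
    std::size_t atDocumentComplete = 0;
};

// Record how much source is visible from each event handler.
Captured captureSourceSizes(const std::string& scripts, const std::string& body) {
    FakeBrowser browser;
    Captured sizes;
    browser.onNavigateComplete2 = [&] { sizes.atNavigateComplete = browser.source().size(); };
    browser.onDocumentComplete  = [&] { sizes.atDocumentComplete = browser.source().size(); };
    browser.navigate(scripts, body);
    return sizes;
}
```

In this model the handler for the earlier event sees only the leading chunk, while the later one sees everything. In a real CHtmlView-derived class, the corresponding change would be to override OnDocumentComplete(LPCTSTR lpszURL) and call GetSource(..) from there instead of from OnNavigateComplete2(..).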
 
LVL 4

Author Comment

by:Raymun
ID: 24320714
I want to add a little more info:

When I CHtmlView::Navigate2(..) to a page with a big source file, a subsequent call to CHtmlView::GetSource(..) from CMyHtmlView::OnNavigateComplete2(..) gives me a source full of scripting code and the actual HTML code (supposed to be at the end) is cut off. When I do the same procedure on a page with a small source file, I get the full code (script + HTML).

I also tried overriding GetSource(..) to use IHTMLwhatever to get the source:
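BOOL CMyHtmlView::GetSource(CString &strRef)
{
   // GetHtmlDocument() returns a pointer that is already AddRef'd,
   // so Attach() takes ownership without adding a second reference.
   CComPtr<IDispatch> pDisp;
   pDisp.Attach(GetHtmlDocument());
   CComPtr<IHTMLDocument2> pDoc;
   pDisp->QueryInterface(IID_IHTMLDocument2, (void**)&pDoc);
   CComPtr<IHTMLElement> pBody;
   pDoc->get_body(&pBody);

   ...
}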

However, after the call to get_body(..), pBody is always NULL regardless of what page I've navigated to, even though both pDisp and pDoc are NOT NULL.

Another idea I tried was overriding CHtmlView::GetSource(..) and keeping everything the same except changing
   hMemory = GlobalAlloc(GMEM_MOVEABLE, 0);
to
   hMemory = GlobalAlloc(GMEM_MOVEABLE, <some large number>);
and after testing different numbers, the returned source is still cut short.

I am running out of ideas. Please help.

DanRollins:
Thanks for the input. What two identical functions are you referring to? Anyway, I will look into the CString and see what I can find. I wasn't familiar with OnDocumentComplete(..). According to Microsoft, the event is fired after a Web page has finished downloading. I thought CHtmlView::OnNavigateComplete2(..) was called on the same event? What's the difference?

Thanks
 
LVL 4

Author Closing Comment

by:Raymun
ID: 31578311
OnDocumentComplete() did the trick. Thanks Dan