Solved

Using IFilter to examine a PDF downloaded in a TWebBrowser

Posted on 2005-04-01
Last Modified: 2010-04-05

There is an excellent discussion of using an IFilter to get the text out of a PDF document here:

http://www.experts-exchange.com/Programming/Programming_Languages/Delphi/Q_20293579.html

but my problem is a bit different.

I want to extract the text of a PDF that has been downloaded in a browser (IE, TWebBrowser), but I need to know:

a) when that download is complete. Can I use OnDocumentComplete, or does that only work for HTML pages?

b) where the PDF is, so I can examine it. I suppose it is in a cache somewhere, but how can I find it and establish the correspondence between the original PDF URL and the file name in the cache?

Thanks
Question by:Mutley2003
LVL 10

Accepted Solution

by:
Jacco earned 2000 total points
ID: 13687119
I have tried the following:

I monitored all events generated by the TWebBrowser. OnDownloadComplete occurs three times and OnDocumentComplete occurs once. The last OnDownloadComplete, the one that comes after OnDocumentComplete, should mark the correct moment.
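
A common way to pick out the completion event for the whole navigation, sketched here as an assumption and not tested against a PDF navigation, is to compare the pDisp parameter of OnDocumentComplete with the browser control's own interface. The names TForm1, WebBrowser1 and Memo1 are hypothetical, and the handler signature is the one used by the older SHDocVw import (newer Delphi versions declare URL as const).

```delphi
// Event handler fragment for a form that owns a TWebBrowser and a TMemo.
procedure TForm1.WebBrowser1DocumentComplete(Sender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
begin
  // pDisp identifies the (sub)document that finished loading. Only the one
  // that belongs to the browser control itself is the top-level document.
  // Casting both sides to IUnknown gives a reliable COM identity comparison.
  if (pDisp as IUnknown) = (WebBrowser1.DefaultInterface as IUnknown) then
    Memo1.Lines.Add('Top-level document complete: ' + string(URL));
end;
```

Whether this check distinguishes the events when the document is hosted by the Acrobat plug-in would need to be verified by logging.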

Then I started looking for the downloaded PDF, but it is nowhere to be found. IE probably streams it directly to Acrobat Reader and it is not on disk...

You could download the PDF using the IdHTTP component (of Indy) and save the PDF to a file and inspect it then. Let me know if you need a sample of that.
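
A minimal sketch of that approach, assuming Indy is installed. DownloadToFile is a hypothetical helper name, and error handling is left to the caller (TIdHTTP.Get raises an exception on HTTP errors).

```delphi
uses
  SysUtils, Classes, IdHTTP;

// Downloads Url and writes the response body to FileName.
procedure DownloadToFile(const Url, FileName: string);
var
  Http: TIdHTTP;
  Stream: TFileStream;
begin
  Http := TIdHTTP.Create(nil);
  try
    Stream := TFileStream.Create(FileName, fmCreate);
    try
      // Get streams the response body straight into the file.
      Http.Get(Url, Stream);
    finally
      Stream.Free;
    end;
  finally
    Http.Free;
  end;
end;
```

Note that this fetches the document a second time, independently of the browser session, so cookies or logins held by IE are not carried over.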

Regards Jacco

Author Comment

by:Mutley2003
ID: 13687185
Hi Jacco

I also monitored TWebBrowser events (logging the Busy and ReadyState values with each one) and got:
onBeforeNavigate2 not busy , loading
http://www.fia.com/resources/documents/1797101136__Appendix_L_a.pdf
onDownloadBegin busy , loading
onDownloadComplete not busy , loading
onDownloadBegin busy , loading
onNavigateComplete2 busy , loading
http://www.fia.com/resources/documents/1797101136__Appendix_L_a.pdf
onCommandStateChange busy , interactive
onDownloadComplete not busy , interactive
onDocumentComplete not busy , complete
http://www.fia.com/resources/documents/1797101136__Appendix_L_a.pdf
onDocumentComplete not busy , complete
http://www.fia.com/resources/documents/1797101136__Appendix_L_a.pdf
onDownloadBegin busy , complete
onDownloadComplete not busy , complete

As you say, a whole bunch of completion events.

This reminds me of what TWebBrowser does with frames.



As for using Indy and a direct download, thanks for the idea, but that won't work for what I want.

So that leaves the problem:

b) where the PDF is, so I can examine it. I suppose it is in a cache somewhere, but how can I find it and establish the correspondence between the original PDF URL and the file name in the cache?

and you suggest that
"IE probably streams it directly to Acrobat Reader and it is not on disk".

I guess that is possible, and I might believe it if I had a good utility app that watched changes to the disk: some wrapper around FindFirstChangeNotification or the like.
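
A rough sketch of such a watcher as a console program, assuming the directory to watch is passed as the first command-line argument (the program name WatchDir is made up). FindFirstChangeNotification only signals that something changed under the directory; it does not say which file, so ReadDirectoryChangesW would be needed to get names.

```delphi
program WatchDir;
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

var
  H: THandle;
begin
  // Watch the directory tree for new or renamed files and for writes.
  H := FindFirstChangeNotification(PChar(ParamStr(1)), True,
    FILE_NOTIFY_CHANGE_FILE_NAME or FILE_NOTIFY_CHANGE_SIZE or
    FILE_NOTIFY_CHANGE_LAST_WRITE);
  if H = INVALID_HANDLE_VALUE then
    RaiseLastOSError;
  try
    // Each time the handle is signalled, report it and re-arm the watch.
    while WaitForSingleObject(H, INFINITE) = WAIT_OBJECT_0 do
    begin
      Writeln(FormatDateTime('hh:nn:ss', Now), ' change detected');
      if not FindNextChangeNotification(H) then
        Break;
    end;
  finally
    FindCloseChangeNotification(H);
  end;
end.
```

Pointing it at the Temporary Internet Files folder and the user's temp folder while the PDF loads would show whether anything is written at all.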


Also, I vaguely remember that there is a mechanism for telling IE how to handle certain file types. It is not plugins and not pluggable protocols; the name escapes me. If I knew how that worked, then maybe I would know what IE does with a PDF.


Any ideas?


 
LVL 10

Expert Comment

by:Jacco
ID: 13687333
I have searched my whole C drive and found nothing. I really think the PDF exists only in memory.
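
As an alternative to searching the disk, the WinInet cache can be asked directly whether it holds an entry for a given URL. The sketch below assumes the GetUrlCacheEntryInfo declaration from Delphi's WinInet unit, which takes var parameters; CacheFileForUrl is a hypothetical helper name. An empty result would be consistent with the PDF never having been written to the cache.

```delphi
uses
  Windows, SysUtils, WinInet;

// Returns the local cache file name for Url, or '' if WinInet has no
// cache entry for it.
function CacheFileForUrl(const Url: string): string;
var
  Size: DWORD;
  Info: PInternetCacheEntryInfo;
begin
  Result := '';
  Size := 0;
  Info := nil;
  // First call with an empty buffer: on failure with
  // ERROR_INSUFFICIENT_BUFFER, Size receives the required byte count.
  if GetUrlCacheEntryInfo(PChar(Url), Info^, Size) then
    Exit;
  if GetLastError <> ERROR_INSUFFICIENT_BUFFER then
    Exit; // typically ERROR_FILE_NOT_FOUND: the URL is not in the cache
  GetMem(Info, Size);
  try
    if GetUrlCacheEntryInfo(PChar(Url), Info^, Size) then
      Result := Info^.lpszLocalFileName;
  finally
    FreeMem(Info);
  end;
end;
```

Calling it with the exact URL reported by OnNavigateComplete2 would answer the question of how the URL maps to a cache file name, if such an entry exists.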

Regards Jacco

Author Comment

by:Mutley2003
ID: 13714961
Well, when I get some time I will monitor disk changes with FindFirstChangeNotification and let you know what I find out.