Solved

Look for words in TWebBrowser once the download has completed

Posted on 2009-05-19
Last Modified: 2012-05-07
I need to look for specific words in a TWebBrowser once it has completed its download of a website. I need to wait for the component to finish downloading all the frames from the website before I start looking for specific words in the web source of any and all frames.

The key is that the component must trigger this when it is in a rest state, after all frames have been downloaded.
Question by:Code2009
7 Comments
 
LVL 4

Expert Comment

by:irishbuddha
ID: 24423657
Something along the lines of the following should do the trick:


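   oToBrowser.Navigate('my web doc');
   // Wait until the document has finished loading, not merely become interactive.
   // READYSTATE_COMPLETE is declared in SHDocVw.
   while oToBrowser.ReadyState <> READYSTATE_COMPLETE do
      begin
         Application.ProcessMessages;
      end;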
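
As an event-driven alternative to polling ReadyState, the "rest state after all frames" requirement might be handled in OnDocumentComplete. This is only a sketch under some assumptions: the browser is a TWebBrowser named WB on TForm1, SearchLoadedPage is a hypothetical method standing in for your word lookup, and your Delphi version exposes DefaultInterface (some expose the same thing as ControlInterface) and declares the URL parameter as `var` (newer versions use `const`). It relies on DocumentComplete firing once per frame and one final time for the top-level browser, where pDisp is the control's own interface.

```pascal
// Assumes: uses SHDocVw; WB is a TWebBrowser on TForm1 and this handler
// is assigned to WB.OnDocumentComplete.
procedure TForm1.WBDocumentComplete(Sender: TObject; const pDisp: IDispatch;
  var URL: OleVariant);
begin
  // DocumentComplete fires for every frame and finally for the top-level
  // browser. Comparing IUnknown identities tells the two cases apart.
  if (pDisp as IUnknown) = (WB.DefaultInterface as IUnknown) then
  begin
    // Every frame has reported completion, so searching can start here.
    SearchLoadedPage;  // hypothetical method that performs the word lookup
  end;
end;
```

Pages that keep loading content from script after this event has fired would still need extra handling.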

 
LVL 4

Expert Comment

by:irishbuddha
ID: 24423716
Sorry, I left out the portion for digging into the actual document's HTML. The following will give you the full document as a simple string, which you can then parse, search, or otherwise process.
//Requires SHDocVw and MSHTML in the uses clause
function GetHTMLFromBrowser(oFromBrowser: TWebBrowser): string;
var iMyHTML : IHTMLElement;
begin
   result := '';
   if Assigned(oFromBrowser.Document) then
      begin
         iMyHTML := (oFromBrowser.Document AS IHTMLDocument2).body;
         //body can be nil (e.g. before the page has loaded), so guard against it
         if iMyHTML = nil then
            Exit;
         //now, back up to the parent/full document
         while iMyHTML.parentElement <> nil do
            begin
               iMyHTML := iMyHTML.parentElement;
            end;
         result := iMyHTML.outerHTML;
      end;
end;
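
Once the HTML is in a string, checking it for several words could look something like the sketch below. FoundWords is a hypothetical helper name, not part of the VCL; it uses AnsiContainsText from StrUtils for a case-insensitive substring match and returns the matches as a comma-separated string.

```pascal
uses
  SysUtils, StrUtils;

// Returns a comma-separated list of the entries in Wanted that occur
// somewhere in Html, ignoring case. Hypothetical helper for illustration.
function FoundWords(const Html: string; const Wanted: array of string): string;
var
  I: Integer;
begin
  Result := '';
  for I := Low(Wanted) to High(Wanted) do
    if AnsiContainsText(Html, Wanted[I]) then
    begin
      if Result <> '' then
        Result := Result + ',';
      Result := Result + Wanted[I];
    end;
end;
```

It could be called as FoundWords(GetHTMLFromBrowser(WB), ['name', 'address']). Note that this is a plain substring match, so it also hits text inside tags and attributes.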

 
LVL 26

Expert Comment

by:EddieShipman
ID: 24426359
Try something like this. It highlights every occurrence of the keyword in Edit1.Text when you click the Find button.
Be aware that MSHTML.DLL has a bug that causes an 800A025E exception if the element you are trying to select is hidden. I believe I've caught the exception, but if you double-click an item in the ListBox and it doesn't scroll into view, it is one of the hidden ones.
unit Unit1;
 
interface
 
uses
  Windows, SysUtils, Forms, Graphics, Controls, Dialogs, ComCtrls, ExtCtrls, Classes,
  OleCtrls, SHDocVw, StdCtrls, MSHTML;
 
type
  TForm1 = class(TForm)
    Button1: TButton;
    Edit1: TEdit;
    Button2: TButton;
    Edit2: TEdit;
    ListBox1: TListBox;
    Panel1: TPanel;
    WB: TWebBrowser;
    procedure FormCreate(Sender: TObject);
    procedure Button1Click(Sender: TObject);
    procedure Button2Click(Sender: TObject);
    procedure FormDestroy(Sender: TObject);
    procedure ListBox1DblClick(Sender: TObject);
    procedure WBDocumentComplete(Sender: TObject; const pDisp: IDispatch;
      var URL: OleVariant);
  private
    { Private declarations }
  public
    { Public declarations }
    TextRange: IHTMLTxtRange;
    ilist:     TInterfaceList;
    procedure WBLocateHighlight(WB: TWebBrowser; Text: string);
  end;
 
var
  Form1: TForm1;
 
implementation
 
{$R *.dfm}
 
procedure TForm1.FormCreate(Sender: TObject);
begin
  WB.Navigate('about:blank');
  ilist:=TInterfaceList.Create;
end;
 
procedure TForm1.WBLocateHighlight(WB: TWebBrowser; Text: string);
const
   prefix = '<span style="color:white; background-color: red;">';
   suffix = '</span>';
var
   tr: IHTMLTxtRange;
begin
   if Assigned(WB.Document) then
   begin
     tr := ((wb.Document AS IHTMLDocument2).body AS IHTMLBodyElement).createTextRange;
     while tr.findText(Text, 1, 0) do
     begin
       // This try..except..finally block keeps us from getting the 800A025E error
       // that occurs due to a bug in MSHTML.DLL when the element is hidden.
       try
         try
           tr.select;
         except
           // ignore: the matched element is hidden and cannot be selected
         end;
       finally
         ilist.Add(tr.parentElement);
         ListBox1.Items.Add(tr.Text);
         tr.pasteHTML(prefix + tr.htmlText + suffix);
         tr.scrollIntoView(True);
       end;
     end;
   end;
end;
 
procedure TForm1.Button1Click(Sender: TObject);
begin
  // Clear both lists together so ListBox1 indexes stay aligned with ilist.
  ListBox1.Clear;
  ilist.Clear;
  WBLocateHighlight(WB, Edit1.Text);
  TextRange := ((WB.Document as IHTMLDocument2).Body As IHTMLBodyElement).CreateTextRange;
end;
 
procedure TForm1.Button2Click(Sender: TObject);
begin
  WB.navigate(Edit2.Text);
end;
 
procedure TForm1.FormDestroy(Sender: TObject);
begin
  ilist.Free;
end;
 
procedure TForm1.ListBox1DblClick(Sender: TObject);
var
  pe: IHTMLElement;
begin
  // Nothing selected: ilist[-1] would raise an exception.
  if ListBox1.ItemIndex < 0 then
    Exit;
  pe := (ilist[ListBox1.ItemIndex] as IHTMLElement);
  if pe <> nil then
  begin
    TextRange.moveToElementText(pe);
    TextRange.findText(ListBox1.Items[ListBox1.ItemIndex], 1, 0);
    // This try..except..finally block keeps us from getting the 800A025E error
    // that occurs due to a bug in MSHTML.DLL when the element is hidden.
    try
      try
        TextRange.select;
      except
        // ignore: the element is hidden and cannot be selected
      end;
    finally
      TextRange.scrollIntoView(True);
    end;
  end;
end;
 
procedure TForm1.WBDocumentComplete(Sender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
begin
  Button1.Enabled := True;
  TextRange := ((WB.Document as IHTMLDocument2).Body As IHTMLBodyElement).CreateTextRange;
end;
 
end.
 
{DFM}
object Form1: TForm1
  Left = 304
  Top = 193
  Width = 779
  Height = 652
  Caption = 'Form1'
  Color = clBtnFace
  Font.Charset = DEFAULT_CHARSET
  Font.Color = clWindowText
  Font.Height = -11
  Font.Name = 'MS Sans Serif'
  Font.Style = []
  OldCreateOrder = False
  OnCreate = FormCreate
  OnDestroy = FormDestroy
  PixelsPerInch = 96
  TextHeight = 13
  object Button1: TButton
    Left = 8
    Top = 64
    Width = 75
    Height = 25
    Caption = 'Find'
    TabOrder = 0
    OnClick = Button1Click
  end
  object Edit1: TEdit
    Left = 104
    Top = 64
    Width = 121
    Height = 21
    TabOrder = 1
    Text = 'expert'
  end
  object Button2: TButton
    Left = 8
    Top = 24
    Width = 75
    Height = 25
    Caption = 'Navigate'
    TabOrder = 2
    OnClick = Button2Click
  end
  object Edit2: TEdit
    Left = 104
    Top = 24
    Width = 193
    Height = 21
    TabOrder = 3
    Text = 'http://www.experts-exchange.com'
  end
  object ListBox1: TListBox
    Left = 32
    Top = 112
    Width = 185
    Height = 433
    ItemHeight = 13
    TabOrder = 4
    OnDblClick = ListBox1DblClick
  end
  object Panel1: TPanel
    Left = 304
    Top = 0
    Width = 467
    Height = 625
    Align = alRight
    Anchors = [akLeft, akTop, akRight, akBottom]
    BevelOuter = bvNone
    Caption = 'Panel1'
    TabOrder = 5
    object WB: TWebBrowser
      Left = 0
      Top = 0
      Width = 467
      Height = 625
      Align = alClient
      TabOrder = 0
      OnDocumentComplete = WBDocumentComplete
      ControlData = {
        4C00000044300000984000000000000000000000000000000000000000000000
        000000004C000000000000000000000001000000E0D057007335CF11AE690800
        2B2E126208000000000000004C0000000114020000000000C000000000000046
        8000000000000000000000000000000000000000000000000000000000000000
        00000000000000000100000000000000000000000000000000000000}
    end
  end
end

 
LVL 26

Expert Comment

by:EddieShipman
ID: 24426377
Oh, BTW, I'm not sure whether it will work with frames/iframes, and I don't think the other code will either, because you can't get the source of any cross-domain frames/iframes.
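
For frames that can be read, something along these lines might collect their HTML. Treat it as a sketch: GetFramesHTML is a hypothetical name, it assumes the MSHTML import declares IHTMLFramesCollection2.item as taking and returning an OleVariant (the exact signature can differ between Delphi versions), and it only visits direct child frames, not nested ones. Frames from another domain normally raise an access-denied exception, which is caught and skipped here.

```pascal
// Assumes: uses SysUtils, SHDocVw, MSHTML;
// Returns the body HTML of every directly nested frame that can be read.
function GetFramesHTML(WB: TWebBrowser): string;
var
  Doc: IHTMLDocument2;
  FrameWin: IHTMLWindow2;
  Index: OleVariant;
  I: Integer;
begin
  Result := '';
  // Supports returns False when Document is nil or is not an HTML document.
  if not Supports(WB.Document, IHTMLDocument2, Doc) then
    Exit;
  for I := 0 to Doc.frames.length - 1 do
  begin
    Index := I;
    try
      if Supports(IDispatch(Doc.frames.item(Index)), IHTMLWindow2, FrameWin)
        and Assigned(FrameWin.document.body) then
        Result := Result + FrameWin.document.body.outerHTML + sLineBreak;
    except
      on E: Exception do
        ; // typically "access denied" for a cross-domain frame: skip it
    end;
  end;
end;
```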
 

Author Comment

by:Code2009
ID: 24431607
Thanks so far ... what if I want the word or sentence next to the word that I found?

If I am looking for the word "name" in the source code and it is found, I would like to read the word or sentence next to that word. I will give double points for this one ... can I even do that?
 
LVL 4

Accepted Solution

by:
irishbuddha earned 500 total points
ID: 24431970
You can accomplish that in several different ways, one of which is a simple Copy(), as seen below. Beyond this, you could parse the document and pull out what you are after, but you'll need to work out the logic that identifies where the next 'sentence' or string you want to extract ends, so that you extract the correct piece:


procedure TForm1.BitBtn1Click(Sender: TObject);
begin
   ShowMessage('Found: "' + ExtractMyString('<html><body><h1>Simple String Extraction</h1></body></html>','<h1>','</h1>') + '"');
end;
 
//Requires StrUtils in the uses clause (for PosEx)
function TForm1.ExtractMyString(cSource           : string;
                                cExtractStartText : string;
                                cExtractToText    : string): string;
var nStartPos : integer;
    nStopPos : integer;
begin
   result := '';
   //first, identify where you want to start from
   //Pos() gives the position of cExtractStartText within cSource, or 0 if it is not found
   nStartPos := Pos(cExtractStartText,cSource);
   if nStartPos = 0 then
      Exit;
   //the text to extract starts at the end of cExtractStartText
   nStartPos := nStartPos + Length(cExtractStartText);
   //Pos() is case-sensitive, just a heads up
   //if you want it to be case-insensitive for locating the string:
   //   Pos(UpperCase(cExtractStartText),UpperCase(cSource))
   //for the ending position, find cExtractToText after the start, so an earlier occurrence is ignored
   nStopPos  := PosEx(cExtractToText,cSource,nStartPos);
   if nStopPos = 0 then
      Exit;
   //next, copy out the string that is in the middle of the Start/End pieces you passed in
   result := Copy(cSource,              //source to copy from
                  nStartPos,            //starting position
                  nStopPos - nStartPos);//how many characters to copy
end;
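
If all you want is the single word that follows the one you searched for, a variation along these lines might do. WordAfter is a hypothetical helper name; it treats anything at or below the space character as whitespace and matches the keyword case-insensitively with LowerCase, which only folds ASCII letters.

```pascal
uses
  SysUtils;

// Returns the whitespace-delimited word that follows the first
// case-insensitive occurrence of Keyword in Source, or '' if there is none.
function WordAfter(const Source, Keyword: string): string;
var
  P, Start: Integer;
begin
  Result := '';
  if Keyword = '' then
    Exit;
  P := Pos(LowerCase(Keyword), LowerCase(Source));
  if P = 0 then
    Exit;
  // Skip the keyword itself, then any whitespace after it.
  P := P + Length(Keyword);
  while (P <= Length(Source)) and (Source[P] <= ' ') do
    Inc(P);
  // Collect characters up to the next whitespace or the end of the string.
  Start := P;
  while (P <= Length(Source)) and (Source[P] > ' ') do
    Inc(P);
  Result := Copy(Source, Start, P - Start);
end;
```

To read a whole sentence instead, the second loop could stop at '.' rather than at whitespace. On raw HTML the "next word" may well be a tag, so running this on the document's visible text (for example the body's innerText) is probably more useful.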

 

Author Closing Comment

by:Code2009
ID: 31582917
Thank you. You are a star :)