Solved

Look for words in TWebBrowser when the download has completed

Posted on 2009-05-19
Last Modified: 2012-05-07
I need to look for specific words in a TWebBrowser once it has completed its download of a website. I need to wait for the component to finish downloading all the frames from the website before I start looking for specific words in the web source of any and all frames.

The key is that the component must trigger this when it is in a rest state, after all frames have been downloaded.
Question by:Code2009
 
LVL 4

Expert Comment

by:irishbuddha
ID: 24423657
Something along the lines of the following should do the trick:


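   oToBrowser.Navigate('my web doc');
   //READYSTATE_INTERACTIVE is reached while parts of the page are still loading,
   //so wait for READYSTATE_COMPLETE before touching the document
   while oToBrowser.ReadyState < READYSTATE_COMPLETE do
      begin
         Application.ProcessMessages;
      end;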
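
As an alternative to the polling loop above, the OnDocumentComplete event can be used. It fires once for every frame and one last time for the top-level browser, so comparing the pDisp argument with the browser's own interface tells you when the whole page is done. This is only a minimal sketch: it assumes a form TForm1 that owns a TWebBrowser named WB with this handler assigned, and PageFullyLoaded is a hypothetical method where your word search would start. In some Delphi versions the URL parameter is declared as const instead of var.

```pascal
// Assumes SHDocVw is in the uses clause and WB.OnDocumentComplete
// points at this handler. PageFullyLoaded is a hypothetical method,
// not part of the VCL.
procedure TForm1.WBDocumentComplete(Sender: TObject; const pDisp: IDispatch;
  var URL: OleVariant);
begin
  // pDisp identifies the browser object (a frame or the top level)
  // that just finished loading. COM identity is compared via IUnknown.
  if (pDisp as IUnknown) = (WB.DefaultInterface as IUnknown) then
    PageFullyLoaded;
end;
```

Pages that keep loading content from script after this point can still change, so treat the event as "the initial document and its frames are loaded" and not as a guarantee that nothing will change later.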

 
LVL 4

Expert Comment

by:irishbuddha
ID: 24423716
Sorry, left out the portion for digging into the actual document's HTML. The following will give you the full document in a simple string, which you can then parse/search or perform whatever action you are after.

//needs SHDocVw and MSHTML in the uses clause
function GetHTMLFromBrowser(oFromBrowser: TWebBrowser): string;
var iMyHTML : IHTMLElement;
begin
   result := '';
   if Assigned(oFromBrowser.Document) then
      begin
         iMyHTML := (oFromBrowser.Document AS IHTMLDocument2).body;
         //body can still be nil while the page is loading, so check before walking up
         if iMyHTML = nil then
            Exit;
         //now, back up to the parent/full document
         while iMyHTML.parentElement <> nil do
            begin
               iMyHTML := iMyHTML.parentElement;
            end;
         result := iMyHTML.outerHTML;
      end;
end;
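
Once you have the HTML as a string, the search itself can be a plain case-insensitive Pos(). Here is a small sketch of that step as a console program so it can be tried on its own. HtmlContainsWord is a name I made up for illustration, and the sample HTML is invented; in the real application you would pass in the string returned by the function above.

```pascal
program FindWordDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: True when Word occurs anywhere in Html,
// ignoring case. An empty Word never matches.
function HtmlContainsWord(const Html, Word: string): Boolean;
begin
  Result := (Word <> '') and
            (Pos(AnsiUpperCase(Word), AnsiUpperCase(Html)) > 0);
end;

begin
  if HtmlContainsWord('<html><body><h1>Ask an Expert</h1></body></html>', 'expert') then
    Writeln('found')
  else
    Writeln('not found');
end.
```

Note that this matches substrings too ("expert" also matches "experts"), and it searches tag names and attributes as well as visible text.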

 
LVL 26

Expert Comment

by:EddieShipman
ID: 24426359
Try something like this. This highlights all the keywords from Edit1.Text when you click on the Find button.
Be aware that MSHTML.DLL has a bug that will cause an 800A025E exception if the element you are trying to select is hidden. I believe I've captured the exception, but if you try double-clicking on an item in the Listbox and it doesn't scroll into view, it is one of the ones that was hidden.

unit Unit1;

interface

uses
  Windows, SysUtils, Forms, Graphics, Controls, Dialogs, ComCtrls, ExtCtrls, Classes,
  OleCtrls, SHDocVw, StdCtrls, MSHTML;

type
  TForm1 = class(TForm)
    Button1: TButton;
    Edit1: TEdit;
    Button2: TButton;
    Edit2: TEdit;
    ListBox1: TListBox;
    Panel1: TPanel;
    WB: TWebBrowser;
    procedure FormCreate(Sender: TObject);
    procedure Button1Click(Sender: TObject);
    procedure Button2Click(Sender: TObject);
    procedure FormDestroy(Sender: TObject);
    procedure ListBox1DblClick(Sender: TObject);
    procedure WBDocumentComplete(Sender: TObject; const pDisp: IDispatch;
      var URL: OleVariant);
  private
    { Private declarations }
  public
    { Public declarations }
    TextRange: IHTMLTxtRange;
    ilist:     TInterfaceList;
    procedure WBLocateHighlight(WB: TWebBrowser; Text: string);
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure TForm1.FormCreate(Sender: TObject);
begin
  WB.Navigate('about:blank');
  ilist:=TInterfaceList.Create;
end;

procedure TForm1.WBLocateHighlight(WB: TWebBrowser; Text: string);
const
   prefix = '<span style="color:white; background-color: red;">';
   suffix = '</span>';
var
   tr: IHTMLTxtRange;
begin
   if Assigned(WB.Document) then
   begin
     tr := ((WB.Document as IHTMLDocument2).body as IHTMLBodyElement).createTextRange;
     while tr.findText(Text, 1, 0) do
     begin
       // this try..except..finally block keeps us from getting the 800A025E error
       // that occurs due to a bug in MSHTML.DLL when the element is hidden
       try
         try
           tr.select;
         except
         end;
       finally
         ilist.Add(tr.parentElement);
         ListBox1.Items.Add(tr.Text);
         tr.pasteHTML(prefix + tr.htmlText + suffix);
         tr.scrollIntoView(True);
       end;
     end;
   end;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  ListBox1.Clear;
  // clear the element list too, so its indexes stay in step with ListBox1
  ilist.Clear;
  WBLocateHighlight(WB, Edit1.Text);
  TextRange := ((WB.Document as IHTMLDocument2).body as IHTMLBodyElement).createTextRange;
end;

procedure TForm1.Button2Click(Sender: TObject);
begin
  WB.Navigate(Edit2.Text);
end;

procedure TForm1.FormDestroy(Sender: TObject);
begin
  ilist.Free;
end;

procedure TForm1.ListBox1DblClick(Sender: TObject);
var
  pe: IHTMLElement;
begin
  // nothing to do when no item is selected
  if ListBox1.ItemIndex < 0 then
    Exit;
  pe := (ilist[ListBox1.ItemIndex] as IHTMLElement);
  if pe <> nil then
  begin
    TextRange.moveToElementText(pe);
    TextRange.findText(ListBox1.Items[ListBox1.ItemIndex], 1, 0);
    // this try..except..finally block keeps us from getting the 800A025E error
    // that occurs due to a bug in MSHTML.DLL when the element is hidden
    try
      try
        TextRange.select;
      except
      end;
    finally
      TextRange.scrollIntoView(True);
    end;
  end;
end;

procedure TForm1.WBDocumentComplete(Sender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
begin
  Button1.Enabled := True;
  TextRange := ((WB.Document as IHTMLDocument2).body as IHTMLBodyElement).createTextRange;
end;

end.
 

{DFM}

object Form1: TForm1
  Left = 304
  Top = 193
  Width = 779
  Height = 652
  Caption = 'Form1'
  Color = clBtnFace
  Font.Charset = DEFAULT_CHARSET
  Font.Color = clWindowText
  Font.Height = -11
  Font.Name = 'MS Sans Serif'
  Font.Style = []
  OldCreateOrder = False
  OnCreate = FormCreate
  OnDestroy = FormDestroy
  PixelsPerInch = 96
  TextHeight = 13
  object Button1: TButton
    Left = 8
    Top = 64
    Width = 75
    Height = 25
    Caption = 'Find'
    TabOrder = 0
    OnClick = Button1Click
  end
  object Edit1: TEdit
    Left = 104
    Top = 64
    Width = 121
    Height = 21
    TabOrder = 1
    Text = 'expert'
  end
  object Button2: TButton
    Left = 8
    Top = 24
    Width = 75
    Height = 25
    Caption = 'Navigate'
    TabOrder = 2
    OnClick = Button2Click
  end
  object Edit2: TEdit
    Left = 104
    Top = 24
    Width = 193
    Height = 21
    TabOrder = 3
    Text = 'http://www.experts-exchange.com'
  end
  object ListBox1: TListBox
    Left = 32
    Top = 112
    Width = 185
    Height = 433
    ItemHeight = 13
    TabOrder = 4
    OnDblClick = ListBox1DblClick
  end
  object Panel1: TPanel
    Left = 304
    Top = 0
    Width = 467
    Height = 625
    Align = alRight
    Anchors = [akLeft, akTop, akRight, akBottom]
    BevelOuter = bvNone
    Caption = 'Panel1'
    TabOrder = 5
    object WB: TWebBrowser
      Left = 0
      Top = 0
      Width = 467
      Height = 625
      Align = alClient
      TabOrder = 0
      OnDocumentComplete = WBDocumentComplete
      ControlData = {
        4C00000044300000984000000000000000000000000000000000000000000000
        000000004C000000000000000000000001000000E0D057007335CF11AE690800
        2B2E126208000000000000004C0000000114020000000000C000000000000046
        8000000000000000000000000000000000000000000000000000000000000000
        00000000000000000100000000000000000000000000000000000000}
    end
  end
end

 
LVL 26

Expert Comment

by:EddieShipman
ID: 24426377
OH, BTW, I'm not sure if it will work with frames/IFrames and I don't think the other code will either because you can't get the source to any cross-domain frames/iframes.
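
For frames that come from the same domain, one possible approach is to walk the frames collection of the document and collect each frame's HTML, skipping the ones that refuse access. This is an untested sketch, not a confirmed solution: CollectFrameHTML is a hypothetical helper name, it relies on the frames collection as declared in Delphi's MSHTML import unit, and cross-domain frames are expected to raise an exception that is simply swallowed here.

```pascal
// Assumes SysUtils, Classes and MSHTML are in the uses clause.
// CollectFrameHTML is a hypothetical helper, not an RTL/VCL routine.
procedure CollectFrameHTML(const Doc: IHTMLDocument2; Output: TStrings);
var
  i: Integer;
  Index: OleVariant;
  FrameWin: IHTMLWindow2;
begin
  if Doc = nil then
    Exit;
  // HTML of this document (top level or a frame)
  if Doc.body <> nil then
    Output.Add(Doc.body.outerHTML);
  // Recurse into every child frame/iframe
  for i := 0 to Doc.frames.length - 1 do
  begin
    Index := i;
    try
      if Supports(IDispatch(Doc.frames.item(Index)), IHTMLWindow2, FrameWin) then
        CollectFrameHTML(FrameWin.document, Output);
    except
      // Cross-domain frames deny access to their document; skip them.
    end;
  end;
end;
```

You would call it with something like CollectFrameHTML(WB.Document as IHTMLDocument2, Memo1.Lines) after the page has finished loading (Memo1 being whatever TStrings holder you have) and then search each collected string.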
 

Author Comment

by:Code2009
ID: 24431607
Thanks so far ... what if I want the word or sentence next to the word that I found?

If I am looking for the word "name" in the source code and it is found ... I would like to read the word or sentence next to that word. I will give double points for this one ... can I even do that?
 
LVL 4

Accepted Solution

by:
irishbuddha earned 500 total points
ID: 24431970
You can accomplish that several different ways, one of which is a simple Copy() as seen below. Beyond this, you could parse the document and pull out what you are after, but you'll need to determine a few bits of logic that identify where the next 'sentence' or string you want to extract ends so that you can extract the correct piece:

procedure TForm1.BitBtn1Click(Sender: TObject);
begin
   ShowMessage('Found: "' + ExtractMyString('<html><body><h1>Simple String Extraction</h1></body></html>','<h1>','</h1>') + '"');
end;

function TForm1.ExtractMyString(cSource           : string;
                                cExtractStartText : string;
                                cExtractToText    : string): string;
var nStartPos : integer;
    nStopPos : integer;
begin
   result := '';
   //first, identify where you want to start from
   //Pos() will give you the starting position of cExtractStartText within cSource (0 if it is not found)
   nStartPos := Pos(cExtractStartText,cSource);
   if nStartPos = 0 then
      Exit;
   //we want to start copying at the end of that text
   nStartPos := nStartPos + Length(cExtractStartText);
   //Pos() is Case-Sensitive, just a heads up
   //if you want it to be case-insensitive for locating the string:
   //   Pos(UpperCase(cExtractStartText),UpperCase(cSource))
   //for our ending position, find cExtractToText at or after nStartPos
   //(PosEx lives in StrUtils, so add StrUtils to your uses clause)
   nStopPos  := PosEx(cExtractToText,cSource,nStartPos);
   if nStopPos = 0 then
      Exit;
   //next, copy out the string that is in the middle of the Start/End pieces you passed in
   result := Copy(cSource,              //source to copy from
                  nStartPos,            //starting position
                  nStopPos - nStartPos);//how many characters to copy
end;
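
If you only want the single word that follows the one you found, and you don't know an end marker in advance, a sketch along these lines may be enough. WordAfter and IsSeparator are names I made up for illustration, and the set of separator characters is an assumption you will want to tune for your pages. It is written as a console program so it can be tried on its own.

```pascal
program WordAfterDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: characters treated as "between words".
function IsSeparator(c: Char): Boolean;
begin
  case c of
    ' ', #9, #10, #13, ':', '=', '<', '>', '"': Result := True;
  else
    Result := False;
  end;
end;

// Hypothetical helper: returns the word that follows the first
// case-insensitive occurrence of Keyword in Source, or '' if the
// keyword is missing or nothing follows it.
function WordAfter(const Source, Keyword: string): string;
var
  p, StartPos: Integer;
begin
  Result := '';
  p := Pos(AnsiUpperCase(Keyword), AnsiUpperCase(Source));
  if p = 0 then
    Exit;
  // move past the keyword, then past any separators
  p := p + Length(Keyword);
  while (p <= Length(Source)) and IsSeparator(Source[p]) do
    Inc(p);
  // read up to the next separator
  StartPos := p;
  while (p <= Length(Source)) and not IsSeparator(Source[p]) do
    Inc(p);
  Result := Copy(Source, StartPos, p - StartPos);
end;

begin
  Writeln(WordAfter('<td>Name: John Smith</td>', 'name')); // prints John
end.
```

For a whole sentence instead of one word, you could keep the same idea and stop at a different character (for example '<' or '.') instead of the first separator.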

 

Author Closing Comment

by:Code2009
ID: 31582917
Thank you. You are a star :)

Question has a verified solution.
