Solved

Look for words in TWebBrowser once the download has completed

Posted on 2009-05-19
Last Modified: 2012-05-07
I need to look for specific words in a TWebBrowser when it has completed its download of a website. I need to wait for the component to finish downloading all the frames from the website before I start looking for specific words in the web source of any and all frames.

The key is that the component must trigger this when it is in a rest state, after all frames have been downloaded.
Question by:Code2009
 
LVL 4

Expert Comment

by:irishbuddha
Something along the lines of the following should do the trick:


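   oToBrowser.Navigate('my web doc');
   // wait for READYSTATE_COMPLETE: READYSTATE_INTERACTIVE is reached while
   // the page is still downloading, which is too early to search it
   while oToBrowser.ReadyState <> READYSTATE_COMPLETE do
      begin
         Application.ProcessMessages;
      end;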
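
Polling ReadyState is one option. Another, sketched here, is to react to the OnDocumentComplete event, which fires once for every frame and finally for the top-level document. The sketch assumes a form with a TWebBrowser named WebBrowser1 and a method SearchLoadedPage; both names are made up for illustration and are not part of the original answer.

```pascal
// Sketch only. Assumes SHDocVw is in the uses clause and that this handler
// is assigned to WebBrowser1.OnDocumentComplete (WebBrowser1 is hypothetical).
procedure TForm1.WebBrowser1DocumentComplete(Sender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
begin
  // pDisp identifies the browser object whose document has just finished.
  // It is the control's own interface only for the top-level document,
  // so frames that complete earlier are ignored here.
  if (pDisp as IUnknown) = (WebBrowser1.DefaultInterface as IUnknown) then
    SearchLoadedPage; // hypothetical method: start looking for words here
end;
```

Pages that keep loading content from script after this point would still need extra handling; the check only covers the initial download of the page and its frames.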

 
LVL 4

Expert Comment

by:irishbuddha
Sorry, I left out the portion for digging into the actual document's HTML. The following will give you the full document in a simple string, which you can then parse, search, or otherwise process.
function GetHTMLFromBrowser(oFromBrowser: TWebBrowser): string;
var iMyHTML : IHTMLElement;
begin
   result := '';
   if Assigned(oFromBrowser.Document) then
      begin
         iMyHTML := (oFromBrowser.Document AS IHTMLDocument2).body;
         //body is nil until the document has been parsed, so bail out early
         if iMyHTML = nil then
            Exit;
         //now, back up to the parent/full document
         while iMyHTML.parentElement <> nil do
            begin
               iMyHTML := iMyHTML.parentElement;
            end;
         result := iMyHTML.outerHTML;
      end;
end;
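
Once the HTML is available as a string, checking it for a word can be as simple as a case-insensitive Pos(). The helper below is a minimal sketch; HtmlContainsWord is a hypothetical name, and it matches the raw source, so it will also hit text inside tags and attributes.

```pascal
// Requires SysUtils in the uses clause (for AnsiLowerCase).
// Hypothetical helper: True if AWord occurs anywhere in AHtml, ignoring case.
function HtmlContainsWord(const AHtml, AWord: string): Boolean;
begin
  // An empty search word is treated as "not found" rather than matching everything.
  Result := (AWord <> '') and
            (Pos(AnsiLowerCase(AWord), AnsiLowerCase(AHtml)) > 0);
end;
```

It could be called as HtmlContainsWord(GetHTMLFromBrowser(oToBrowser), 'expert') once the page has finished loading.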

 
LVL 26

Expert Comment

by:EddieShipman
Try something like this. It highlights every occurrence of the keyword in Edit1.Text when you click the Find button.
Be aware that MSHTML.DLL has a bug that raises an 800A025E exception if the element you are trying to select is hidden. I believe I've caught the exception, but if you double-click an item in the ListBox and it doesn't scroll into view, it is one of the hidden ones.
unit Unit1;

interface

uses
  Windows, SysUtils, Forms, Graphics, Controls, Dialogs, ComCtrls, ExtCtrls, Classes,
  OleCtrls, SHDocVw, StdCtrls, MSHTML;

type
  TForm1 = class(TForm)
    Button1: TButton;
    Edit1: TEdit;
    Button2: TButton;
    Edit2: TEdit;
    ListBox1: TListBox;
    Panel1: TPanel;
    WB: TWebBrowser;
    procedure FormCreate(Sender: TObject);
    procedure Button1Click(Sender: TObject);
    procedure Button2Click(Sender: TObject);
    procedure FormDestroy(Sender: TObject);
    procedure ListBox1DblClick(Sender: TObject);
    procedure WBDocumentComplete(Sender: TObject; const pDisp: IDispatch;
      var URL: OleVariant);
  private
    { Private declarations }
  public
    { Public declarations }
    TextRange: IHTMLTxtRange;
    ilist:     TInterfaceList;
    procedure WBLocateHighlight(WB: TWebBrowser; Text: string);
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure TForm1.FormCreate(Sender: TObject);
begin
  WB.Navigate('about:blank');
  ilist:=TInterfaceList.Create;
end;
 

procedure TForm1.WBLocateHighlight(WB: TWebBrowser; Text: string);
const
   prefix = '<span style="color:white; background-color: red;">';
   suffix = '</span>';
var
   tr: IHTMLTxtRange;
begin
   if Assigned(WB.Document) then
   begin
     tr := ((WB.Document as IHTMLDocument2).body as IHTMLBodyElement).createTextRange;
     while tr.findText(Text, 1, 0) do
     begin
       // this try..except block keeps us from getting the 800A025E error
       // that occurs due to a bug in MSHTML.DLL when the element is hidden
       try
         tr.select;
       except
         // deliberately ignored: the match is still recorded below
       end;
       ilist.Add(tr.parentElement);
       ListBox1.Items.Add(tr.text);
       tr.pasteHTML(prefix + tr.htmlText + suffix);
       tr.scrollIntoView(True);
     end;
   end;
end;
 

procedure TForm1.Button1Click(Sender: TObject);
begin
  // nothing to search until a document has been loaded
  if not Assigned(WB.Document) then
    Exit;
  ListBox1.Clear;
  // keep ilist in step with the list box so the indexes still match
  ilist.Clear;
  WBLocateHighlight(WB, Edit1.Text);
  TextRange := ((WB.Document as IHTMLDocument2).body as IHTMLBodyElement).createTextRange;
end;

procedure TForm1.Button2Click(Sender: TObject);
begin
  WB.Navigate(Edit2.Text);
end;

procedure TForm1.FormDestroy(Sender: TObject);
begin
  ilist.Free;
end;
 

procedure TForm1.ListBox1DblClick(Sender: TObject);
var
  pe: IHTMLElement;
begin
  // ignore double-clicks that do not land on an item
  if (ListBox1.ItemIndex < 0) or (TextRange = nil) then
    Exit;
  pe := (ilist[ListBox1.ItemIndex] as IHTMLElement);
  if pe <> nil then
  begin
    TextRange.moveToElementText(pe);
    TextRange.findText(ListBox1.Items[ListBox1.ItemIndex], 1, 0);
    // this try..except block keeps us from getting the 800A025E error
    // that occurs due to a bug in MSHTML.DLL when the element is hidden
    try
      TextRange.select;
    except
      // deliberately ignored: hidden elements cannot be selected
    end;
    TextRange.scrollIntoView(True);
  end;
end;
 

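procedure TForm1.WBDocumentComplete(Sender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
begin
  // this event fires once per frame; only act when the top-level document is done
  if (pDisp as IUnknown) <> (WB.DefaultInterface as IUnknown) then
    Exit;
  Button1.Enabled := True;
  TextRange := ((WB.Document as IHTMLDocument2).body as IHTMLBodyElement).createTextRange;
end;

end.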
 

{DFM}
object Form1: TForm1
  Left = 304
  Top = 193
  Width = 779
  Height = 652
  Caption = 'Form1'
  Color = clBtnFace
  Font.Charset = DEFAULT_CHARSET
  Font.Color = clWindowText
  Font.Height = -11
  Font.Name = 'MS Sans Serif'
  Font.Style = []
  OldCreateOrder = False
  OnCreate = FormCreate
  OnDestroy = FormDestroy
  PixelsPerInch = 96
  TextHeight = 13
  object Button1: TButton
    Left = 8
    Top = 64
    Width = 75
    Height = 25
    Caption = 'Find'
    TabOrder = 0
    OnClick = Button1Click
  end
  object Edit1: TEdit
    Left = 104
    Top = 64
    Width = 121
    Height = 21
    TabOrder = 1
    Text = 'expert'
  end
  object Button2: TButton
    Left = 8
    Top = 24
    Width = 75
    Height = 25
    Caption = 'Navigate'
    TabOrder = 2
    OnClick = Button2Click
  end
  object Edit2: TEdit
    Left = 104
    Top = 24
    Width = 193
    Height = 21
    TabOrder = 3
    Text = 'http://www.experts-exchange.com'
  end
  object ListBox1: TListBox
    Left = 32
    Top = 112
    Width = 185
    Height = 433
    ItemHeight = 13
    TabOrder = 4
    OnDblClick = ListBox1DblClick
  end
  object Panel1: TPanel
    Left = 304
    Top = 0
    Width = 467
    Height = 625
    Align = alRight
    Anchors = [akLeft, akTop, akRight, akBottom]
    BevelOuter = bvNone
    Caption = 'Panel1'
    TabOrder = 5
    object WB: TWebBrowser
      Left = 0
      Top = 0
      Width = 467
      Height = 625
      Align = alClient
      TabOrder = 0
      OnDocumentComplete = WBDocumentComplete
      ControlData = {
        4C00000044300000984000000000000000000000000000000000000000000000
        000000004C000000000000000000000001000000E0D057007335CF11AE690800
        2B2E126208000000000000004C0000000114020000000000C000000000000046
        8000000000000000000000000000000000000000000000000000000000000000
        00000000000000000100000000000000000000000000000000000000}
    end
  end
end

LVL 26

Expert Comment

by:EddieShipman
Oh, BTW, I'm not sure whether it will work with frames/iframes, and I don't think the other code will either, because you can't get the source of any cross-domain frames/iframes.
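
For frames that are on the same domain, the frames collection of the document can be walked recursively. The following is a rough sketch under that assumption; CollectFrameHTML is a hypothetical helper, not something from the code above, and frames it cannot read are simply skipped.

```pascal
// Requires Classes and MSHTML in the uses clause.
// Hypothetical helper: appends the HTML of ADoc, and of every frame it is
// allowed to read, to AOutput.
procedure CollectFrameHTML(const ADoc: IHTMLDocument2; AOutput: TStrings);
var
  i: Integer;
  idx: OleVariant;
  frameWin: IHTMLWindow2;
begin
  if ADoc = nil then
    Exit;
  if ADoc.body <> nil then
    AOutput.Add(ADoc.body.outerHTML);
  for i := 0 to ADoc.frames.length - 1 do
  begin
    idx := i;
    try
      // item() returns the frame's window as a variant holding an IDispatch
      frameWin := IDispatch(ADoc.frames.item(idx)) as IHTMLWindow2;
      CollectFrameHTML(frameWin.document, AOutput);
    except
      // Reading the document of a cross-domain frame is refused by the
      // browser and surfaces as an exception; skip that frame and carry on.
      // The handler is deliberately broad for the sake of a short sketch.
    end;
  end;
end;
```

A call might look like CollectFrameHTML(WB.Document as IHTMLDocument2, Memo1.Lines), where Memo1 is an assumed TMemo on the form.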
 

Author Comment

by:Code2009
Thanks so far ... what if I want the word or sentence next to the word that I found?

If I am looking for the word "name" in the source code and it is found, I would like to read the word or sentence next to that word. I will give double points for this one ... can I even do that?
 
LVL 4

Accepted Solution

by:irishbuddha (earned 500 total points)
You can accomplish that in several different ways, one of which is a simple Copy(), as seen below. Beyond this, you could parse the document and pull out what you are after, but you'll need to work out the logic that identifies where the next 'sentence' or string you want to extract ends, so that you can extract the correct piece:



procedure TForm1.BitBtn1Click(Sender: TObject);
begin
   ShowMessage('Found: "' + ExtractMyString('<html><body><h1>Simple String Extraction</h1></body></html>','<h1>','</h1>') + '"');
end;

//requires StrUtils in the uses clause for PosEx()
function TForm1.ExtractMyString(cSource           : string;
                                cExtractStartText : string;
                                cExtractToText    : string): string;
var nStartPos : integer;
    nStopPos : integer;
begin
   result := '';
   //first, identify where you want to start from
   //Pos() gives the position of cExtractStartText within cSource, or 0 if it is not found
   nStartPos := Pos(cExtractStartText,cSource);
   if nStartPos = 0 then
      Exit;
   //start copying at the end of the start text
   nStartPos := nStartPos + Length(cExtractStartText);
   //Pos() is case-sensitive, just a heads up
   //if you want it to be case-insensitive for locating the string:
   //   Pos(UpperCase(cExtractStartText),UpperCase(cSource))
   //for our ending position, look for cExtractToText from nStartPos onward,
   //so that an earlier occurrence of the end text cannot be picked up
   nStopPos  := PosEx(cExtractToText,cSource,nStartPos);
   if nStopPos = 0 then
      Exit;
   //next, copy out the string that is in the middle of the Start/End pieces you passed in
   result := Copy(cSource,              //source to copy from
                  nStartPos,            //starting position
                  nStopPos - nStartPos);//how many characters to copy
end;
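
When the goal is just the single word that follows a keyword, rather than everything up to a known end marker, a small scanner is enough. This is a minimal sketch; WordAfter and its list of separator characters are assumptions made for illustration and would need adjusting to the pages being searched.

```pascal
// Hypothetical helper: returns the word that follows the first occurrence
// of AKeyword in AText, or '' when the keyword or a following word is missing.
function WordAfter(const AText, AKeyword: string): string;
const
  // characters treated as gaps between words (an assumption, adjust as needed)
  Separators = ' :="''<>' + #9#10#13;
var
  p, startPos: Integer;
begin
  Result := '';
  p := Pos(AKeyword, AText);
  if (AKeyword = '') or (p = 0) then
    Exit;
  p := p + Length(AKeyword);
  // skip the separators between the keyword and the next word
  while (p <= Length(AText)) and (Pos(AText[p], Separators) > 0) do
    Inc(p);
  startPos := p;
  // consume characters up to the next separator or the end of the text
  while (p <= Length(AText)) and (Pos(AText[p], Separators) = 0) do
    Inc(p);
  Result := Copy(AText, startPos, p - startPos);
end;
```

Like Pos(), the lookup is case-sensitive. Reading a whole sentence instead of a word would only mean changing the separator list, for example to stop at '.' or '<'.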

 

Author Closing Comment

by:Code2009
Thank you. You are a star :)