Solved

PARER QUESTION! QUICK EASY POINTS!

Posted on 1998-03-26
5
137 Views
Last Modified: 2010-04-06
I need an example of a parser to parse information pulled off the web.  Can someone please give a good example of that? Such as image, links, pictures, and mailto etc...  I need asap if possible and will possibly increase points for a good example.
0
Comment
Question by:aj85
  • 3
  • 2
5 Comments
 
LVL 4

Expert Comment

by:d003303
ID: 1360879
You get an extendable, good working parser code with sample app for 200 points.
0
 

Author Comment

by:aj85
ID: 1360880
Thanks, I have increased points to 200.

Tony
0
 
LVL 4

Accepted Solution

by:
d003303 earned 200 total points
ID: 1360881
OK, here we go.
This code is a generic parser that searches for a HTML tag. A HTML tag is mainly defined that the tag keyword is embedded in '<' and '>' characters. This is how the function GetBlockFromBuffer works.
The GetTagOptionValues procedure searches a PChar buffer for a tag with the specified tag option and adds all found values to the string list.
The demo app uses a memo where you can paste some HTML source in. In the two edit fields, put in e.g.
Tag To Find : a
Extract From Tag : href
(be sure to get NO spaces in). Press Go! and the list box will be filled with all hrefs in the HTML source. You could also use
Tag To Find : img
Extract From Tag : src
or anything you like. Now here's the source.

////////////////////////
// to be _tagfind.pas

unit _tagfind;

interface

uses
  Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms, Dialogs,
  StdCtrls;

type
  TForm1 = class(TForm)
    Memo1: TMemo;
    ListBox1: TListBox;
    Button1: TButton;
    Edit1: TEdit;
    Label1: TLabel;
    Label2: TLabel;
    Edit2: TEdit;
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.DFM}

function GetBlockFromBuffer(SearchBuffer, StartTag, KeyTag, EndTag : PChar; var Buffer : string): PChar;
var BufPtr,
    TagStartPtr,
    TagKeyPtr,
    TagEndPtr    : PChar;
    SafeChar     : Char;
begin
  Buffer := '';
  SafeChar := #0;
  TagEndPtr := nil;
  Result := nil;
  BufPtr := SearchBuffer;
  repeat
  // search for start tag
  TagStartPtr := StrPos(BufPtr, StartTag);
  if (TagStartPtr <> nil) and (TagStartPtr[1] <> #0) then
   begin
     // search for end of tag
     TagEndPtr := StrPos(@TagStartPtr[1], EndTag);
     if TagEndPtr <> nil then
      begin
        // tag did close before EOF
        if TagEndPtr[1] <> #0 then
         begin
           // terminate string temporarily
           SafeChar := TagEndPtr[1];
           TagEndPtr[1] := #0;
           BufPtr := @TagEndPtr[1];
         end
        else BufPtr := nil;
        // search for key tag
        TagKeyPtr := StrPos(TagStartPtr, KeyTag);
        if TagKeyPtr <> nil then
         begin
           Result := BufPtr;
           Buffer := StrPas(TagStartPtr);
         end;
        // undo temporarily terminated string
        if BufPtr <> nil
         then TagEndPtr[1] := SafeChar;
      end;
   end;
  until (Buffer <> '') or (TagStartPtr = nil) or (TagEndPtr = nil);
end;

procedure GetTagOptionValues(Buffer : PChar; Tag, Option : string; Strings : TStrings);
var Index       : Integer;
    BufPtr      : PChar;
    CR,
    LF,
    FoundTag,
    FoundOption : string;
begin
  CR := Chr($0A);
  LF := Chr($0D);
  // lowercase
  StrLower(Buffer);
  // kick out CR/LF
  for Index := 0 to StrLen(Buffer) do
   if (Buffer[Index] = CR) or (Buffer[Index] = LF)
    then Buffer[Index] := ' ';
  // truncate multiple spaces, this takes most of the processing time
  // this could also be done before the function is called if you search
  // for several tags, like A HREF, IMG SRC, etc.
  Index := 0;
  while Index < StrLen(Buffer) do
   if (Buffer[Index] = #32) and (Buffer[Index + 1] = #32)
    then StrCopy(@Buffer[Index], @Buffer[Index + 1])
    else Inc(Index);

  BufPtr := Buffer;
  repeat
    // first step : get matching tag
    // added a trailing space because we search for options in a tag
    BufPtr := GetBlockFromBuffer(BufPtr, '<', PChar(Tag + ' '), '>', FoundTag);
    if FoundTag <> '' then
     // second step : validate tag
     if (FoundTag[Pos(Tag, FoundTag) -1] = ' ')
      or (FoundTag[Pos(Tag, FoundTag) -1] = '<') then
      begin
       // tag found, options have to be space limited and end with space or >
       GetBlockFromBuffer(PChar(FoundTag), ' ', PChar(Option), ' ', FoundOption);
       if FoundOption = ''
        then GetBlockFromBuffer(PChar(FoundTag), ' ', PChar(Option), '>', FoundOption);
       if FoundOption <> '' then
        begin
          // trim start and end tag fields
          Index := Pos(Option, FoundOption);
          FoundOption := Copy(FoundOption, Index - 1, Length(FoundOption) - Index + 1);
          Strings.Add(FoundOption);
        end;
     end;
  until BufPtr = nil;
end;

procedure TForm1.Button1Click(Sender: TObject);
var TextBuf   : PChar;
    MemHandle : HGlobal;
begin
  Enabled := false;
  Screen.Cursor := crHourglass;
  Memo1.SelectAll;
  MemHandle := GlobalAlloc(GMEM_MOVEABLE, Memo1.SelLength + 1);
  TextBuf := GlobalLock(MemHandle);
  try
    Memo1.GetSelTextBuf(TextBuf, Memo1.SelLength);
    // terminate string
    TextBuf[Memo1.SelLength] := #0;
    GetTagOptionValues(TextBuf, Edit1.Text, Edit2.Text, ListBox1.Items);
  finally
    GlobalUnLock(MemHandle);
    GlobalFree(MemHandle);
    Screen.Cursor := crDefault;
    Enabled := true;
  end;
end;

end.

////////////////////////
// to be _tagfind.dfm

object Form1: TForm1
  Left = 200
  Top = 108
  Width = 820
  Height = 478
  Caption = 'Form1'
  Font.Charset = DEFAULT_CHARSET
  Font.Color = clWindowText
  Font.Height = -11
  Font.Name = 'MS Sans Serif'
  Font.Style = []
  PixelsPerInch = 96
  TextHeight = 13
  object Label1: TLabel
    Left = 296
    Top = 360
    Width = 54
    Height = 13
    Caption = 'Tag to find:'
  end
  object Label2: TLabel
    Left = 296
    Top = 384
    Width = 77
    Height = 13
    Caption = 'Extract from tag:'
  end
  object Memo1: TMemo
    Left = 8
    Top = 12
    Width = 277
    Height = 425
    TabOrder = 0
  end
  object ListBox1: TListBox
    Left = 296
    Top = 12
    Width = 509
    Height = 329
    ItemHeight = 13
    TabOrder = 1
  end
  object Button1: TButton
    Left = 296
    Top = 412
    Width = 75
    Height = 25
    Caption = 'Go!'
    TabOrder = 2
    OnClick = Button1Click
  end
  object Edit1: TEdit
    Left = 380
    Top = 356
    Width = 121
    Height = 21
    TabOrder = 3
  end
  object Edit2: TEdit
    Left = 380
    Top = 380
    Width = 121
    Height = 21
    TabOrder = 4
  end
end

////////////////////////
// to be tagfind.dpr

program tagfind;

uses
  Forms,
  _tagfind in '_tagfind.pas' {Form1};

{$R *.RES}

begin
  Application.Initialize;
  Application.CreateForm(TForm1, Form1);
  Application.Run;
end.

////////////////////////

Have fun,
Slash/d003303
0
 

Author Comment

by:aj85
ID: 1360882
Thanks very good answer...
0
 
LVL 4

Expert Comment

by:d003303
ID: 1360883
Thx, have fun with it !
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Delphi cmd execution 6 62
URL for downloading Google Chrome for Win XP 2 159
Need Help Delphi 2010 CheckBox1 Stored value in memo 13 67
PHP preg_replace code convert to Delphi 14 57
This article explains how to create forms/units independent of other forms/units object names in a delphi project. Have you ever created a form for user input in a Delphi project and then had the need to have that same form in a other Delphi proj…
Creating an auto free TStringList The TStringList is a basic and frequently used object in Delphi. On many occasions, you may want to create a temporary list, process some items in the list and be done with the list. In such cases, you have to…

831 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question