Solved

Delphi: IHTMLDocument2, Extract Link

Posted on 2014-01-21
Last Modified: 2014-01-21
Hi,

I'm trying to extract some information from HTML source, and I need to get the URL (link) out of one block of that markup.

For example, I have this HTML structure:

<div class="browse-info">
<span class="info">
<span class="browseTitleLink"><a href="http://xxx.com/movie/xxx">xxx</a></span><br />
<span class="browseInfoList" ><b>Size:</b> 1.85 GB</span><br />
<span class="browseInfoList" ><b>Quality:</b> 1080p</span><br />
<span class="browseInfoList" ><b>Genre:</b> Crime | Drama</span><br />
<span class="browseInfoList" ><b>IMDB Rating:</b> 6.0/10</span><br />
<span class="browseSeeds">
<span class="peers"><b>Peers:</b> 1454</span>
<span class="seeds"><b>Seeds:</b> 3412</span>
</span>
</span>
<span class="links">
<a href="http://xxx" class="std-btn-small mright">View Info<span></span></a>
<a href="http://xxx" class="std-btn-small mleft downloadDwl" data-movieID="4502" data-downloadID="4694">Download<span></span></a>
</span>
</div>
</div>
<div class="divider"></div>
</div>



I'm using this code to get some info:

procedure TForm1.Button3Click(Sender: TObject);
Var
  Documento : OleVariant;
  Elementos : OleVariant;
  I         : Integer;
  Item : TListItem;
  Source : TMemoryStream;
  Memo : Tmemo;
  IdHttp : TidHttp;
  Qualidade : String;
begin
Listview1.Clear;
idHttp := TIdHttp.Create(Self);
idHttp.AllowCookies := True;
idHttp.HandleRedirects := True;
memo := Tmemo.Create(Self);
Memo.Visible := False;
memo.Parent := Form1;
Source := TMemoryStream.Create;
Qualidade := 'http://xxx';
if CheckBox1.Checked then
 Qualidade := 'http:/xxx';
if CheckBox2.Checked then
 Qualidade := 'http:/xxx';
if CheckBox1.Checked and Checkbox2.Checked then
 Qualidade := 'http://xxx';
if Edit1.Text <> '' then
 Qualidade := 'http://xxx';
 Try
  Try
   IdHTTP.Get(Qualidade, Source);
   Source.Position := 0;
  Except on E: Exception do
   Begin
    ShowMessage(e.Message);
    Exit; // the outer finally block frees Source, memo and idHttp
   End;
  End;
  memo.Lines.LoadFromStream(Source);
  Documento := coHTMLDocument.Create as IHTMLDocument2;
   if Source.Size > 0 then
    Documento.write(memo.Lines.Text)
   else
    Begin
     ShowMessage('erro');
     Exit; // the outer finally block frees Source, memo and idHttp
   End;
  Documento.close;
  Listview1.Items.BeginUpdate;
   for i := 0 to Documento.body.all.length - 1 do
    begin
     Elementos := Documento.body.all.item(i);
      if (Elementos.tagName = 'SPAN') and (Elementos.className = 'browseTitleLink') then
       Begin
        item := Listview1.Items.Add;
        Item.Caption := Elementos.innerText;
       End;
        if (Elementos.tagName = 'SPAN') and (Elementos.className = 'browseInfoList') then
         item.SubItems.Add(Elementos.innerText);
        if (Elementos.tagName = 'SPAN') and (Elementos.className = 'browseSeeds') then
         Item.SubItems.Add(Elementos.innerText);      
    end;
  ListView1.Items.EndUpdate;
 Finally
  Source.Free;
  memo.Free;
  idHttp.Free;
 End;
end;



If I add:

if (Elementos.tagName = 'SPAN') and (Elementos.classname = 'links') then
         Item.SubItems.Add(elementos.innerText);



This gives me the text "View Info Download" and not the links.

What do I need to do? I need code for this: I don't want to extract every link on the page, only the URL for each block, in the same order in which I extract the other info, so I can put it in a ListView.

http://imageshack.com/a/img23/9100/htyy.png
Question by:Júlio
 

Expert Comment

by:Marco Gasi
ID: 39798127
Have you tried elementos.innerHTML?
 

Author Comment

by:Júlio
ID: 39798143
Yes, and it doesn't work the way I want.

 if (Elementos.tagName = 'SPAN') and (Elementos.classname = 'links') then
         Item.SubItems.Add(elementos.innerHTML);



It returns the tags and class names all mixed together.
 

Accepted Solution

by:
Marco Gasi earned 500 total points
ID: 39798147
No, try elementos.href instead. This should work if you fetch all the anchor tags:

Elementos := Documento.all.tags('A');

but perhaps it works even with all.item.
 

Author Comment

by:Júlio
ID: 39798161
if Documento.all.tags('A') <> 0 then
         Item.SubItems.Add(elementos.href);



Returns: "Member not found"
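
A likely reason for "Member not found" here: elementos is still whichever element the loop is currently on (usually a SPAN), and a SPAN has no href member. A minimal late-bound sketch, assuming Documento is the OleVariant document from Button3Click; the unit name, the ListAnchorHrefs helper and the Dest parameter are hypothetical and not part of the original code:

```pascal
unit LateBoundHrefSketch;

interface

uses
  Classes;

// Hypothetical helper: appends the href of every <a> element to Dest.
procedure ListAnchorHrefs(const Documento: OleVariant; Dest: TStrings);

implementation

uses
  ComObj, Variants; // ComObj enables late-bound calls on OleVariant

procedure ListAnchorHrefs(const Documento: OleVariant; Dest: TStrings);
var
  Anchors, Anchor: OleVariant;
  J: Integer;
begin
  // tags('A') returns a collection, so href must be read from each item,
  // not from the collection or from an unrelated element.
  Anchors := Documento.all.tags('A');
  for J := 0 to Anchors.length - 1 do
  begin
    Anchor := Anchors.item(J);
    Dest.Add(Anchor.href);
  end;
end;

end.
```

This lists every link in document order, so it would still need filtering (for example by className) to keep only the links that belong to each ListView row.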
 

Expert Comment

by:Marco Gasi
ID: 39798184
I'm sorry. Now I see you define Documento as OleVariant. What I suggested requires it be defined as IHTMLDocument2...

 

Author Comment

by:Júlio
ID: 39798187
But if I do that, I need to rewrite all the code, right?
 

Expert Comment

by:Marco Gasi
ID: 39798201
I'm not sure, but I think you don't. Give it a try within the Button3Click event.
 

Author Comment

by:Júlio
ID: 39798614
OMG, I don't understand:

procedure TForm1.Button1Click(Sender: TObject);
Var
 Documento : IHTMLDocument2;
 ArrayV    : OleVariant;
 InfoV     : IHTMLElement;

 Buffer    : String;
 http      : TidHttp;
 ListItem  : TListItem;
 I         : Integer;
begin
http := TIdHttp.Create(Self);
http.AllowCookies := True;
http.HandleRedirects := True;

Try
 Buffer := http.Get('http://xxx');
Except on E: Exception do
 Begin
  ShowMessage(e.Message);
  Exit;
 End;
End;

Documento :=  coHTMLDocument.Create as IHTMLDocument2;
ArrayV := VarArrayCreate([0,0], varVariant);
ArrayV[0] := Buffer;
Documento.Write(PSafeArray(TVarData(ArrayV).VArray));
Documento.Close;

Listview1.Items.BeginUpdate;

infoV := Documento.body.all as IHTMLElement;
for I := 0 to Documento.all.length -1  do
 Begin
  if (infoV.tagName = 'SPAN') and (infoV.className = 'browseTitleLink') then
   Begin
    Listitem := Listview1.Items.Add;
    ListItem.Caption := infoV.innerText;
   End;
 End;

Listview1.Items.EndUpdate;
end;



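I'm rewriting it.

The problem is between lines 33 and 41 of the code above (the "infoV := Documento.body.all as IHTMLElement" assignment and the for loop that follows it). What am I doing wrong? Please help me and show me how.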

Error: "Interface not supported"
 

Author Comment

by:Júlio
ID: 39798658
OK, I got it. Now I need to get the link:

procedure TForm1.Button1Click(Sender: TObject);
Var
 Documento : IHTMLDocument2;
 ArrayV    : OleVariant;
 InfoV     : IHTMLElement;

 Buffer    : String;
 http      : TidHttp;
 ListItem  : TListItem;
 I         : Integer;
 ElCount   : Integer;
begin
http := TIdHttp.Create(Self);
http.AllowCookies := True;
http.HandleRedirects := True;

Try
 Buffer := http.Get('http://xxx');
Except on E: Exception do
 Begin
  ShowMessage(e.Message);
  Exit;
 End;
End;

Documento :=  coHTMLDocument.Create as IHTMLDocument2;
ArrayV := VarArrayCreate([0,0], varVariant);
ArrayV[0] := Buffer;
Documento.Write(PSafeArray(TVarData(ArrayV).VArray));
Documento.Close;

Listview1.Items.BeginUpdate;
ElCount := Documento.all.length;
//infoV := Documento.body.all as IHTMLElement;
for I := 0 to Elcount -1  do
 Begin
  infoV := Documento.all.item(I, '') as IHTMLElement;
   if (infoV.tagName = 'SPAN') and (infoV.className = 'browseTitleLink') then
    Begin
     Listitem := Listview1.Items.Add;
     ListItem.Caption := infoV.innerText;
    End;
   if (infoV.tagName = 'SPAN') and (infoV.className = 'browseInfoList') then
    ListItem.SubItems.Add(infoV.innerText);
   if (infoV.tagName = 'SPAN') and (infoV.className = 'browseSeeds') then
    ListItem.SubItems.Add(infoV.innerText);
 End;

Listview1.Items.EndUpdate;
end;



If I add

Var
 LinkV     : IHTMLElement;
(...)
LinkV := Documento.links.item('', I) as IHTMLElement;
  ListItem.subitems.Add(LinkV.innerText);



That doesn't work either.


UPDATE:

So easy, I can't believe it:

  if (infoV.tagName = 'A') and (infoV.className = 'std-btn-small mright') then
    ListItem.SubItems.Add(InfoV.getAttribute('href', 0));
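
For reference, the same idea can be pulled out of the form into a small helper that pairs each title with its "View Info" link in document order. This is only a sketch under assumptions: the unit name, the CollectTitleLinks helper and the 'title=url' output format are hypothetical, and Doc is expected to be an already loaded IHTMLDocument2 like Documento above.

```pascal
unit TitleLinkSketch;

interface

uses
  Classes, MSHTML;

// Hypothetical helper: adds one 'title=url' line to Dest per movie block.
procedure CollectTitleLinks(const Doc: IHTMLDocument2; Dest: TStrings);

implementation

uses
  SysUtils, Variants;

procedure CollectTitleLinks(const Doc: IHTMLDocument2; Dest: TStrings);
var
  Elements: IHTMLElementCollection;
  El: IHTMLElement;
  Title: string;
  I: Integer;
begin
  Title := '';
  Elements := Doc.all;
  for I := 0 to Elements.length - 1 do
  begin
    // item() returns IDispatch; skip anything that is not an element.
    if not Supports(Elements.item(I, 0), IHTMLElement, El) then
      Continue;
    if SameText(El.tagName, 'SPAN') and (El.className = 'browseTitleLink') then
      Title := El.innerText // remember the title until its link shows up
    else if SameText(El.tagName, 'A') and
      (El.className = 'std-btn-small mright') and (Title <> '') then
    begin
      // VarToStr guards against a Null result when href is missing.
      Dest.Add(Title + '=' + VarToStr(El.getAttribute('href', 0)));
      Title := '';
    end;
  end;
end;

end.
```

Checking that a title has been seen before adding the link plays the same role as making sure ListItem is assigned before calling ListItem.SubItems.Add in the loop above.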



TY!!!
 

Author Closing Comment

by:Júlio
ID: 39798779
He gave the solution; it just took me a while to understand the concept.

Ty Marco Gasi!
 

Expert Comment

by:Marco Gasi
ID: 39799108
I'm sorry I couldn't help you more, but I had gone away (to sleep!). I'm happy you solved your problem. I've never used that, but AFAIK it should have worked with

  if (infoV.tagName = 'A') and (infoV.className = 'std-btn-small mright') then
    ListItem.SubItems.Add(InfoV.href);



but for this you would have to look only for the A tags.
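
A rough sketch of that variant, under assumptions: with early binding, href is declared on IHTMLAnchorElement rather than on IHTMLElement, so each item from all.tags('A') is cast before href is read. The unit name, the CollectAnchorHrefs helper and its parameters are hypothetical; the document-loading lines mirror the code posted earlier in the thread, and the unscoped unit names assume the usual VCL unit scope settings.

```pascal
unit AnchorHrefSketch;

interface

uses
  Classes;

// Hypothetical helper: parses Html and appends every <a> href to Dest.
procedure CollectAnchorHrefs(const Html: string; Dest: TStrings);

implementation

uses
  SysUtils, Variants, ActiveX, MSHTML;

procedure CollectAnchorHrefs(const Html: string; Dest: TStrings);
var
  Doc: IHTMLDocument2;
  V: OleVariant;
  Anchors: IHTMLElementCollection;
  Anchor: IHTMLAnchorElement;
  I: Integer;
begin
  // Load the HTML into an in-memory document, as in the thread's code.
  Doc := coHTMLDocument.Create as IHTMLDocument2;
  V := VarArrayCreate([0, 0], varVariant);
  V[0] := Html;
  Doc.write(PSafeArray(TVarData(V).VArray));
  Doc.close;

  // Only the A elements are visited, so every item can expose href.
  Anchors := Doc.all.tags('A') as IHTMLElementCollection;
  for I := 0 to Anchors.length - 1 do
    if Supports(Anchors.item(I, 0), IHTMLAnchorElement, Anchor) then
      Dest.Add(Anchor.href);
end;

end.
```

To keep only the "View Info" links, the same loop could also query IHTMLElement on each item and compare className, as in the accepted approach.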

Thanks for points and good luck with your project.
Marco