Solved

Delphi: IHTMLDocument2, Extract Link

Posted on 2014-01-21
Last Modified: 2014-01-21
Hi,

I'm trying to extract some info from HTML source, and I need to extract the URL (link) from a structured block in it.

For example, I have this HTML structure:

<div class="browse-info">
<span class="info">
<span class="browseTitleLink"><a href="http://xxx.com/movie/xxx">xxx</a></span><br />
<span class="browseInfoList" ><b>Size:</b> 1.85 GB</span><br />
<span class="browseInfoList" ><b>Quality:</b> 1080p</span><br />
<span class="browseInfoList" ><b>Genre:</b> Crime | Drama</span><br />
<span class="browseInfoList" ><b>IMDB Rating:</b> 6.0/10</span><br />
<span class="browseSeeds">
<span class="peers"><b>Peers:</b> 1454</span>
<span class="seeds"><b>Seeds:</b> 3412</span>
</span>
</span>
<span class="links">
<a href="http://xxx" class="std-btn-small mright">View Info<span></span></a>
<a href="http://xxx" class="std-btn-small mleft downloadDwl" data-movieID="4502" data-downloadID="4694">Download<span></span></a>
</span>
</div>
</div>
<div class="divider"></div>
</div>



I'm using this code to get some info:

procedure TForm1.Button3Click(Sender: TObject);
Var
  Documento : OleVariant;
  Elementos : OleVariant;
  I         : Integer;
  Item : TListItem;
  Source : TMemoryStream;
  Memo : Tmemo;
  IdHttp : TidHttp;
  Qualidade : String;
begin
Listview1.Clear;
idHttp := TIdHttp.Create(Self);
idHttp.AllowCookies := True;
idHttp.HandleRedirects := True;
memo := Tmemo.Create(Self);
Memo.Visible := False;
memo.Parent := Form1;
Source := TMemoryStream.Create;
Qualidade := 'http://xxx';
if CheckBox1.Checked then
 Qualidade := 'http://xxx';
if CheckBox2.Checked then
 Qualidade := 'http://xxx';
if CheckBox1.Checked and Checkbox2.Checked then
 Qualidade := 'http://xxx';
if Edit1.Text <> '' then
 Qualidade := 'http://xxx';
 Try
  Try
   IdHTTP.Get(Qualidade, Source);
   Source.Position := 0;
  Except on E: Exception do
   Begin
    ShowMessage(e.Message);
    Exit; // the Finally block below frees Source, memo and idHttp
   End;
  End;
  memo.Lines.LoadFromStream(Source);
  Documento := coHTMLDocument.Create as IHTMLDocument2;
   if Source.Size > 0 then
    Documento.write(memo.Lines.Text)
   else
    Begin
     ShowMessage('erro');
     Exit; // the Finally block below frees Source, memo and idHttp
    End;
  Documento.close;
  Listview1.Items.BeginUpdate;
   for i := 0 to Documento.body.all.length - 1 do
    begin
     Elementos := Documento.body.all.item(i);
      if (Elementos.tagName = 'SPAN') and (Elementos.className = 'browseTitleLink') then
       Begin
        item := Listview1.Items.Add;
        Item.Caption := Elementos.innerText;
       End;
        if (Elementos.tagName = 'SPAN') and (Elementos.className = 'browseInfoList') then
         item.SubItems.Add(Elementos.innerText);
        if (Elementos.tagName = 'SPAN') and (Elementos.className = 'browseSeeds') then
         Item.SubItems.Add(Elementos.innerText);      
    end;
  ListView1.Items.EndUpdate;
 Finally
  Source.Free;
  memo.Free;
  idHttp.Free;
 End;
end;



If I call:

if (Elementos.tagName = 'SPAN') and (Elementos.classname = 'links') then
         Item.SubItems.Add(elementos.innerText);



This gives me the text "View Info Download" and not the links.

What do I need to do? I need code for this, because I don't want to extract every link on the page, only the URL for each block, in the same order in which I extract the other info, so I can put it in a ListView.

http://imageshack.com/a/img23/9100/htyy.png
Question by:Júlio
LVL 31

Expert Comment

by:Marco Gasi
ID: 39798127
Have you tried elementos.innerHTML?
 

Author Comment

by:Júlio
ID: 39798143
Yes, and it doesn't work the way I want.

 if (Elementos.tagName = 'SPAN') and (Elementos.classname = 'links') then
         Item.SubItems.Add(elementos.innerHTML);



It returns the tags and class names all mixed together.
 
LVL 31

Accepted Solution

by:
Marco Gasi earned 500 total points
ID: 39798147
No, try elementos.href instead. This should work if you get all the A tags:

Elementos := Documento.all.tags('A');

but perhaps it works with all.item too.
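
For readers who want to see the tags('A') idea spelled out with early binding, here is a minimal sketch. It assumes the document is held as an IHTMLDocument2 rather than an OleVariant. CollectHrefsByClass is a hypothetical helper name, not something from the thread.

```pascal
// Requires Classes, SysUtils and MSHTML in the uses clause.
// Hypothetical helper: appends the href of every <a> whose class attribute
// equals AClassName to Hrefs, in document order.
procedure CollectHrefsByClass(const Doc: IHTMLDocument2;
  const AClassName: string; Hrefs: TStrings);
var
  Anchors: IHTMLElementCollection;
  Node: IDispatch;
  Element: IHTMLElement;
  Anchor: IHTMLAnchorElement;
  I: Integer;
begin
  // tags('A') returns the sub-collection of anchor elements as an IDispatch.
  Anchors := Doc.all.tags('A') as IHTMLElementCollection;
  for I := 0 to Anchors.length - 1 do
  begin
    Node := Anchors.item(I, 0);
    // className lives on IHTMLElement, href on IHTMLAnchorElement.
    if Supports(Node, IHTMLElement, Element) and
       Supports(Node, IHTMLAnchorElement, Anchor) and
       (Element.className = AClassName) then
      Hrefs.Add(Anchor.href);
  end;
end;
```

With the HTML from the question, a call such as CollectHrefsByClass(Documento, 'std-btn-small mright', Memo1.Lines) would be expected to list the "View Info" URLs in page order. Memo1 is only a placeholder here.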

Author Comment

by:Júlio
ID: 39798161
if Documento.all.tags('A') <> 0 then
         Item.SubItems.Add(elementos.href);



Returns: "Member not found"
 
LVL 31

Expert Comment

by:Marco Gasi
ID: 39798184
I'm sorry. Now I see you declared Documento as OleVariant. What I suggested requires it to be declared as IHTMLDocument2...
 

Author Comment

by:Júlio
ID: 39798187
But if I do that, I need to rewrite all the code, right?
 
LVL 31

Expert Comment

by:Marco Gasi
ID: 39798201
I'm not sure, but I don't think you do. Give it a try inside the Button3Click event handler.
 

Author Comment

by:Júlio
ID: 39798614
OMG, I don't understand:

procedure TForm1.Button1Click(Sender: TObject);
Var
 Documento : IHTMLDocument2;
 ArrayV    : OleVariant;
 InfoV     : IHTMLElement;

 Buffer    : String;
 http      : TidHttp;
 ListItem  : TListItem;
 I         : Integer;
begin
http := TIdHttp.Create(Self);
http.AllowCookies := True;
http.HandleRedirects := True;

Try
 Buffer := http.Get('http://xxx');
Except on E: Exception do
 Begin
  ShowMessage(e.Message);
  Exit;
 End;
End;

Documento :=  coHTMLDocument.Create as IHTMLDocument2;
ArrayV := VarArrayCreate([0,0], varVariant);
ArrayV[0] := Buffer;
Documento.Write(PSafeArray(TVarData(ArrayV).VArray));
Documento.Close;

Listview1.Items.BeginUpdate;

infoV := Documento.body.all as IHTMLElement;
for I := 0 to Documento.all.length -1  do
 Begin
  if (infoV.tagName = 'SPAN') and (infoV.className = 'browseTitleLink') then
   Begin
    Listitem := Listview1.Items.Add;
    ListItem.Caption := infoV.innerText;
   End;
 End;

Listview1.Items.EndUpdate;
end;



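I'm rewriting it.

The problem is between lines 33 and 41 of the code above (from the infoV := Documento.body.all as IHTMLElement; assignment to the end of the for loop). What am I doing wrong? Please help me and show me how.

Error: "Interface not supported"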
 

Author Comment

by:Júlio
ID: 39798658
OK, I got it. Now I need to get the link:

procedure TForm1.Button1Click(Sender: TObject);
Var
 Documento : IHTMLDocument2;
 ArrayV    : OleVariant;
 InfoV     : IHTMLElement;

 Buffer    : String;
 http      : TidHttp;
 ListItem  : TListItem;
 I         : Integer;
 ElCount   : Integer;
begin
http := TIdHttp.Create(Self);
http.AllowCookies := True;
http.HandleRedirects := True;

Try
 Buffer := http.Get('http://xxx');
Except on E: Exception do
 Begin
  ShowMessage(e.Message);
  Exit;
 End;
End;

Documento :=  coHTMLDocument.Create as IHTMLDocument2;
ArrayV := VarArrayCreate([0,0], varVariant);
ArrayV[0] := Buffer;
Documento.Write(PSafeArray(TVarData(ArrayV).VArray));
Documento.Close;

Listview1.Items.BeginUpdate;
ElCount := Documento.all.length;
//infoV := Documento.body.all as IHTMLElement;
for I := 0 to Elcount -1  do
 Begin
  infoV := Documento.all.item(I, '') as IHTMLElement;
   if (infoV.tagName = 'SPAN') and (infoV.className = 'browseTitleLink') then
    Begin
     Listitem := Listview1.Items.Add;
     ListItem.Caption := infoV.innerText;
    End;
   if (infoV.tagName = 'SPAN') and (infoV.className = 'browseInfoList') then
    ListItem.SubItems.Add(infoV.innerText);
   if (infoV.tagName = 'SPAN') and (infoV.className = 'browseSeeds') then
    ListItem.SubItems.Add(infoV.innerText);
 End;

Listview1.Items.EndUpdate;
end;



If I add

Var
 LinkV     : IHTMLElement;
(...)
LinkV := Documento.links.item('', I) as IHTMLElement;
  ListItem.subitems.Add(LinkV.innerText);



That doesn't work either.
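
A possible explanation for the Documento.links attempt, offered as a reading of the MSHTML interfaces rather than a confirmed diagnosis: item takes the position (or a name) as its first argument, the links collection is numbered independently of Documento.all, and innerText returns the caption rather than the URL. A minimal sketch under those assumptions follows. ListLinkUrls is a hypothetical helper name.

```pascal
// Requires Classes, SysUtils and MSHTML in the uses clause.
// Hypothetical helper: adds the URL of every entry in Doc.links to Urls.
procedure ListLinkUrls(const Doc: IHTMLDocument2; Urls: TStrings);
var
  Links: IHTMLElementCollection;
  Anchor: IHTMLAnchorElement;
  I: Integer;
begin
  Links := Doc.links;
  // Loop over the links collection's own length, not Doc.all.length.
  for I := 0 to Links.length - 1 do
    // The position goes in the first argument of item. links may also hold
    // <area> elements, which the Supports check skips here.
    if Supports(Links.item(I, 0), IHTMLAnchorElement, Anchor) then
      Urls.Add(Anchor.href);
end;
```

This lists every link in the page, so it does not by itself pair each URL with its ListView row.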


UPDATE:

So easy, I can't believe it:

  if (infoV.tagName = 'A') and (infoV.className = 'std-btn-small mright') then
    ListItem.SubItems.Add(InfoV.getAttribute('href', 0));



TY!!!
 

Author Closing Comment

by:Júlio
ID: 39798779
He gave the solution; it took me a while to understand the concept.

Ty Marco Gasi!
 
LVL 31

Expert Comment

by:Marco Gasi
ID: 39799108
I'm sorry I didn't help you more, but I had gone away (to sleep!). I'm happy you solved your problem. I have never used that, but AFAIK it should have worked with

  if (infoV.tagName = 'A') and (infoV.className = 'std-btn-small mright') then
    ListItem.SubItems.Add(InfoV.href);



but for this you would have to look only for A tags.

Thanks for points and good luck with your project.
Marco
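
A note on InfoV.href for later readers: the IHTMLElement interface declared in Delphi's MSHTML unit has no href member, so with InfoV typed as IHTMLElement that line would not compile. The property belongs to IHTMLAnchorElement. A minimal sketch of reading it from an element follows. TryGetHref is a hypothetical helper name.

```pascal
// Requires SysUtils and MSHTML in the uses clause.
// Hypothetical helper: returns True and the link target when Element is an
// anchor; returns False and an empty string for any other element.
function TryGetHref(const Element: IHTMLElement; out Href: string): Boolean;
var
  Anchor: IHTMLAnchorElement;
begin
  // Supports queries the element for the anchor interface without raising.
  Result := Supports(Element, IHTMLAnchorElement, Anchor);
  if Result then
    Href := Anchor.href
  else
    Href := '';
end;
```

Inside the loop from the thread this could replace the getAttribute call, for example: if (infoV.className = 'std-btn-small mright') and TryGetHref(infoV, Url) then ListItem.SubItems.Add(Url); where Url is an assumed local string variable.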