
Extracting web links from string

Hello,

How can I extract web links from a string? For example, I have a string containing web links and other text, and I need only those web links that match a particular code or text.
Asked by: H-styler
mokule commented:

I recommend regular expressions.

TRegExpr is a freeware regular expression library for Delphi:
http://regexpstudio.com/TRegExpr/TRegExpr.html
 
mokule commented:

This is an example of a regular expression, taken from the TRegExpr test program, for extracting URLs from text.

(?i)                          # we need caseInsensitive mode
(FTP|HTTP)://                 # protocol
([_a-z\d\-]+(\.[_a-z\d\-]+)+) # TCP addr
((/[ _a-z\d\-\\\.]+)+)*       # unix path
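
As a rough sketch of how such an expression might be applied, the program below walks through all matches with TRegExpr and keeps only the URLs that contain a given marker text. It assumes the library's unit is named RegExpr (as it is in the copy shipped with Free Pascal), and it uses a simplified pattern of my own rather than the one above. The names ExtractLinks, Source, Marker and the example.com addresses are made up for illustration.

```pascal
program ExtractLinksRegex;

{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}

uses
  Classes, RegExpr;

// Adds to Links every URL found in Source whose address contains Marker.
procedure ExtractLinks(const Source, Marker: string; Links: TStrings);
var
  Re: TRegExpr;
begin
  Re := TRegExpr.Create;
  try
    // Assumption: a URL runs until the next space, double quote or
    // angle bracket. This is a simplification, not a full URL grammar.
    Re.Expression := '(?i)(ftp|https?)://[^ "<>]+';
    if Re.Exec(Source) then
      repeat
        // Match[0] is the whole text matched by the expression.
        if Pos(Marker, Re.Match[0]) > 0 then
          Links.Add(Re.Match[0]);
      until not Re.ExecNext;
  finally
    Re.Free;
  end;
end;

var
  Found: TStringList;
begin
  Found := TStringList.Create;
  try
    ExtractLinks(
      '<a href="http://example.com/scripts/runner.php?PA=116">x</a> ' +
      'see http://example.com/other.html too',
      'scripts/runner.php?PA=', Found);
    Write(Found.Text);
  finally
    Found.Free;
  end;
end.
```

With this sample input only the first link should be kept, because the second one does not contain the marker. Filtering with Pos after matching keeps the expression itself short; the marker could also be built into the pattern instead.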
 
Kunfufaresi commented:
Hello, this is not as fancy as a regex, but you could use this code:

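var
  s: String;
  i: Integer;

begin
  s := texttoparse;
  repeat
    i := Pos('http://', s);
    if i > 0 then
    begin
      Delete(s, 1, i - 1);                   // drop everything before the URL
      i := Pos(' ', s);                      // the URL ends at the next space
      if i = 0 then i := Length(s) + 1;      // no space left: URL runs to the end
      ListBox1.Items.Add(Copy(s, 1, i - 1)); // so the URL is Copy(s, 1, i - 1)
      Delete(s, 1, i);                       // remove the URL and the space after it
    end;
  until i = 0;
end;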

Well, this works as long as your URLs don't contain spaces, which they shouldn't, since spaces in URLs are already encoded.

Kunfu Faresi
 
mokule commented:
I recommended regular expressions because he wrote:
"I need only those web links that match special code or text."
 
Kunfufaresi commented:
Yes, you are right. Regular expressions are much more powerful, but I have only used them a few times and still haven't quite gotten the hang of them.
 
H-styler (author) commented:
Kunfufaresi, with your code I can't extract links from this type of string:

<table border=0 cellpadding=0 cellspacing=0 bgcolor=ffffff><tr><td><a href="http://www.bythebeachemails.com/scripts/runner.php?PA=116" target=_ptc onclick="javascript:reloadpage(30)"><img src=http://www.bythebeachemails.com/scripts/runner.php?REDIRECT=http%3A%2F%2Fmpam2.free.fr%2Fcash%2Fbann_cash2.gif alt="Great PTR/PTC sites" width=468 height=60 border=0></a></td></tr></table>The ad above is worth 50 cent(s)<br><br><br><br><br><b>1</b><br> <br>

And I only need to extract those links which match, for example, this code: "scripts/runner.php?PA="
 
Kunfufaresi commented:
Hello,

Well, you could add a check like this:

if pos('scripts/runner.php?PA=',copy(s,1,i-1))>0 then listbox1.items.add(copy(s,1,i-1));

Also, in this case you should look for pos('"',s) instead of pos(' ',s), because a double quote terminates the URL, not a space.
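
Put together, the two changes might look like the sketch below. It is only an illustration under a few assumptions: the helper names IsUrlEnd and ExtractLinks are invented here, results go into a TStrings instead of the list box, and a URL is taken to end at a space, a double quote or a '>' so that both quoted and unquoted HTML attributes are handled.

```pascal
program ExtractLinksPos;

{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}

uses
  Classes;

// True for the characters assumed to terminate a URL in this sketch.
function IsUrlEnd(C: Char): Boolean;
begin
  Result := (C = ' ') or (C = '"') or (C = '>');
end;

// Adds to Links every 'http://' URL in Source that contains Marker.
// Source is passed by value because the procedure consumes its local copy.
procedure ExtractLinks(Source: string; const Marker: string; Links: TStrings);
var
  P, E: Integer;
  Url: string;
begin
  P := Pos('http://', Source);
  while P > 0 do
  begin
    Delete(Source, 1, P - 1);        // Source now starts with the URL
    E := 1;
    while (E <= Length(Source)) and not IsUrlEnd(Source[E]) do
      Inc(E);                        // E stops on the terminator or past the end
    Url := Copy(Source, 1, E - 1);
    if Pos(Marker, Url) > 0 then     // keep only links with the marker text
      Links.Add(Url);
    Delete(Source, 1, E);            // remove the URL and its terminator
    P := Pos('http://', Source);
  end;
end;

var
  Found: TStringList;
begin
  Found := TStringList.Create;
  try
    ExtractLinks(
      '<a href="http://example.com/scripts/runner.php?PA=116" target=_ptc>' +
      '<img src=http://example.com/banner.gif alt="ad"></a>',
      'scripts/runner.php?PA=', Found);
    Write(Found.Text);
  finally
    Found.Free;
  end;
end.
```

For this sample input the list should end up with just the runner.php?PA=116 link, since the image URL does not contain the marker.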
 
ceoworks commented:
As mokule said, I suggest you use regular expressions. There is some useful material at http://www.regular-expressions.info/, and if you search for regular expressions on www.torry.ru, you may find some freeware components.

Regards,