  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 341

Extracting web links from string

Hello,

How can I extract web links from a string? For example, I have a string containing web links and other text, and I need only those links that match a specific code or text.
H-styler
1 Solution
 
mokule commented:

I recommend regular expressions.

TRegExpr is a freeware regular expression library for Delphi:
http://regexpstudio.com/TRegExpr/TRegExpr.html
 
mokule commented:

This is an example regular expression from the TRegExpr test program for extracting URLs from text:

(?i)                          # we need caseInsensitive mode
(FTP|HTTP)://                 # protocol
([_a-z\d\-]+(\.[_a-z\d\-]+)+) # TCP addr
((/[ _a-z\d\-\\\.]+)+)*       # unix path
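
As a rough illustration of how a pattern like this could be applied, here is a minimal sketch using TRegExpr's Exec/ExecNext loop. It assumes the RegExpr unit is available on the search path. The helper name ExtractUrls and the sample text are made up for the example, and the pattern is written on one line with the space removed from the path character class so that a match stops at the first space.

```pascal
program RegexUrlDemo;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes,
  RegExpr; // TRegExpr unit; assumed to be installed and on the search path

// Hypothetical helper: adds every FTP/HTTP URL found in Source to Links.
procedure ExtractUrls(const Source: string; Links: TStrings);
var
  Re: TRegExpr;
begin
  Re := TRegExpr.Create;
  try
    // Same pattern as above, on a single line (no comments, no space in the path class).
    Re.Expression :=
      '(?i)(FTP|HTTP)://([_a-z\d\-]+(\.[_a-z\d\-]+)+)((/[_a-z\d\-\\\.]+)+)*';
    if Re.Exec(Source) then
      repeat
        Links.Add(Re.Match[0]); // Match[0] holds the whole matched URL
      until not Re.ExecNext;    // move on to the next match, if any
  finally
    Re.Free;
  end;
end;

var
  Links: TStringList;
  k: Integer;
begin
  Links := TStringList.Create;
  try
    // Sample text invented for this sketch.
    ExtractUrls('See http://www.example.com/docs/index.html and ' +
                'ftp://ftp.example.org/pub/file.zip for details', Links);
    for k := 0 to Links.Count - 1 do
      WriteLn(Links[k]);
  finally
    Links.Free;
  end;
end.
```

Note that the path part of this pattern does not allow characters such as ? or =, so a URL with a query string would be cut off at the ?. The character class would need to be widened for links like that.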
 
Kunfufaresi commented:
Hello, this is not as fancy as regex, but you could use this code:

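var
  s: string;
  i: Integer;
begin
  s := texttoparse;
  repeat
    i := Pos('http://', s);
    if i > 0 then
    begin
      Delete(s, 1, i - 1);                    // drop everything before the URL
      i := Pos(' ', s);                       // the URL ends at the next space...
      if i = 0 then i := Length(s) + 1;       // ...or at the end of the string
      ListBox1.Items.Add(Copy(s, 1, i - 1));  // so the URL is Copy(s, 1, i - 1)
      Delete(s, 1, i);                        // remove the URL and its terminator
    end;
  until i = 0;
end;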

This works as long as your URLs don't contain spaces, which they shouldn't, since spaces in URLs are normally encoded.

Kunfu Faresi
 
mokule commented:
I recommended regular expressions because he wrote:
"I need only those web links that match special code or text."
 
Kunfufaresi commented:
Yes, you are right, regular expressions are much more powerful, but I've only used them a few times and still haven't quite gotten the hang of them.
 
H-styler (author) commented:
Kunfufaresi, with your code I can't extract links from this type of string:

<table border=0 cellpadding=0 cellspacing=0 bgcolor=ffffff><tr><td><a href="http://www.bythebeachemails.com/scripts/runner.php?PA=116" target=_ptc onclick="javascript:reloadpage(30)"><img src=http://www.bythebeachemails.com/scripts/runner.php?REDIRECT=http%3A%2F%2Fmpam2.free.fr%2Fcash%2Fbann_cash2.gif alt="Great PTR/PTC sites" width=468 height=60 border=0></a></td></tr></table>The ad above is worth 50 cent(s)<br><br><br><br><br><b>1</b><br> <br>

And I only need to extract those links which match, for example, this code: "scripts/runner.php?PA="
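
For a requirement like this, one possible approach is to put the marker directly into the regular expression. The following is only a sketch, assuming the TRegExpr library mentioned earlier in the thread. The helper name ExtractRunnerLinks is hypothetical, the sample input is a shortened version of the HTML above, and a link is assumed to end at the first space, double quote, < or >.

```pascal
program RegexMarkerDemo;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes,
  RegExpr; // TRegExpr unit; assumed to be installed and on the search path

// Hypothetical helper: collects only the links containing 'scripts/runner.php?PA='.
procedure ExtractRunnerLinks(const Source: string; Links: TStrings);
var
  Re: TRegExpr;
begin
  Re := TRegExpr.Create;
  try
    // [^ "<>]* matches any run of characters that cannot end a link here;
    // the dot and question mark of the marker are escaped so they match literally.
    Re.Expression := 'http://[^ "<>]*scripts/runner\.php\?PA=[^ "<>]*';
    if Re.Exec(Source) then
      repeat
        Links.Add(Re.Match[0]);
      until not Re.ExecNext;
  finally
    Re.Free;
  end;
end;

var
  Links: TStringList;
  k: Integer;
begin
  Links := TStringList.Create;
  try
    // Shortened version of the sample HTML, for illustration only.
    ExtractRunnerLinks(
      '<a href="http://www.bythebeachemails.com/scripts/runner.php?PA=116" target=_ptc>' +
      '<img src=http://www.bythebeachemails.com/scripts/runner.php?REDIRECT=x border=0></a>',
      Links);
    for k := 0 to Links.Count - 1 do
      WriteLn(Links[k]);
  finally
    Links.Free;
  end;
end.
```

With this input, only the href link ending in PA=116 should be collected, because the img link contains REDIRECT= rather than PA= after runner.php?.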
 
Kunfufaresi commented:
Hello,

Well, you could add a check like this before adding the link to the list:

if Pos('scripts/runner.php?PA=', Copy(s, 1, i - 1)) > 0 then ListBox1.Items.Add(Copy(s, 1, i - 1));

Also, in this case you should look for Pos('"', s) instead of Pos(' ', s), because here a double quote terminates the URL, not a space.
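
Putting these two suggestions together, a self-contained version might look like the sketch below. The procedure name ExtractLinks and its Marker parameter are hypothetical, and the set of terminating characters (quotes, space, < and >) is an assumption chosen so that both quoted and unquoted attribute values from the sample HTML are handled. The input is a shortened version of that sample.

```pascal
program ExtractLinksDemo;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: collects every 'http://' link in Source that contains Marker.
// A link is assumed to end at the first quote, space, '<' or '>' after it starts.
procedure ExtractLinks(const Source, Marker: string; Links: TStrings);
var
  s, url: string;
  i, n: Integer;
begin
  s := Source;
  i := Pos('http://', s);
  while i > 0 do
  begin
    Delete(s, 1, i - 1);                 // s now starts at the link
    n := 1;
    while (n <= Length(s)) and not (s[n] in ['"', '''', ' ', '<', '>']) do
      Inc(n);                            // n stops on the terminator or past the end
    url := Copy(s, 1, n - 1);
    if Pos(Marker, url) > 0 then         // keep only links containing the marker
      Links.Add(url);
    Delete(s, 1, n - 1);                 // remove the link before searching again
    i := Pos('http://', s);
  end;
end;

var
  Links: TStringList;
  k: Integer;
begin
  Links := TStringList.Create;
  try
    // Shortened version of the sample HTML, for illustration only.
    ExtractLinks(
      '<a href="http://www.bythebeachemails.com/scripts/runner.php?PA=116" target=_ptc>' +
      '<img src=http://www.bythebeachemails.com/scripts/runner.php?REDIRECT=x border=0></a>',
      'scripts/runner.php?PA=', Links);
    for k := 0 to Links.Count - 1 do
      WriteLn(Links[k]);
  finally
    Links.Free;
  end;
end.
```

For this input the program prints only http://www.bythebeachemails.com/scripts/runner.php?PA=116. In a GUI application, ListBox1.Items could be passed as the Links argument instead of a TStringList.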
 
ceoworks commented:
As mokule said, I suggest you use regular expressions. There is some useful material at http://www.regular-expressions.info/, and if you search for regular expressions on www.torry.ru, you may find some freeware components.

Regards,
