bcops

Pattern match / regular expression to extract a link (a href) title

Hi,
Anyone know how to do the following?

Suppose there were two string variables:
$htmlString = "lorem ipsum lorem ipsum <a href=\"http://www.some-url-or-other.com\" >LINK-TEXT</a>lorem ipsum lorem ipsum";
$urlString = "http://www.some-url-or-other.com";

Anyone know how in PHP (e.g. using a regular expression) to extract the "LINK-TEXT" from $htmlString given $urlString?
Thanks, Ben


TeRReF

<?php

$htmlString = "lorem ipsum lorem ipsum <a href=\"http://www.some-url-or-other.com\" >LINK-TEXT</a>lorem ipsum lorem ipsum";
$urlString = "http://www.some-url-or-other.com";

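// preg_quote() makes characters such as "." in the URL match literally; "|" is the pattern delimiter.
// The closing quote after the URL stops longer URLs that merely start with $urlString from matching.
$pattern = '|<a\s+href\s*=\s*"' . preg_quote($urlString, '|') . '"[^>]*>(.*?)</a>|is';

// Only read $match[1] if the pattern actually matched.
if (preg_match($pattern, $htmlString, $match)) {
    $linktext = $match[1];
    print($linktext);
}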

?>
VoteyDisciple

How about...

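// Pass '/' to preg_quote() so the slashes in the URL are escaped too, and escape the one in </a>.
if (preg_match('/<a[^>]*href="' . preg_quote($urlString, '/') . '"[^>]*>([^<]*)<\/a>/i', $htmlString, $matches)) {
    echo $matches[1];
}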

This works only if there are no other HTML tags inside the anchor (though, of course, HTML allows all sorts of other tags in there). There are more sophisticated ways of doing it if that's a concern, but this is a quick solution for plain-text links only.
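
One of the "more sophisticated ways" hinted at above could be to parse the markup instead of pattern-matching it. The sketch below is not from the thread: it assumes PHP 7.1+ with the DOM extension enabled, and linkTextForUrl() is a hypothetical helper name chosen for illustration. It returns the text of the first anchor whose href equals the given URL exactly, even when the anchor contains nested tags.

```php
<?php
// Hypothetical helper (not from the thread): returns the text of the first
// <a> element whose href equals $url exactly, or null if there is none.
function linkTextForUrl(string $html, string $url): ?string
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of printing them; the input
    // is an HTML fragment, not a complete document.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->getAttribute('href') === $url) {
            // textContent joins the text of all descendants, dropping nested tags.
            return trim($anchor->textContent);
        }
    }

    return null;
}

// Same URL as in the question, but with a nested <b> tag inside the link.
$htmlString = 'lorem ipsum <a href="http://www.some-url-or-other.com" ><b>LINK</b>-TEXT</a> lorem ipsum';
$urlString = 'http://www.some-url-or-other.com';

echo linkTextForUrl($htmlString, $urlString);
```

With the sample string above this should print LINK-TEXT, whereas the `[^<]*` pattern would not match because of the nested <b> tag. The comparison is an exact string match on the href attribute, so a trailing slash or different letter case in the URL would count as a different link.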
ASKER CERTIFIED SOLUTION
Terry Woods
This solution is only available to members.
bcops

ASKER

TerryAtOpus's solution does the trick excellently. Thanks to all for your posts!
Ben