macgruder

Strip out all title attributes

I want to parse a page of HTML and strip out the title attributes (and the alt ones).

So:

title = "foobar" ----> NULL

A solution must be able to deal with non-ASCII characters (e.g. characters in the upper ranges of UTF-8).
ASKER CERTIFIED SOLUTION
nicholassolutions

This solution is only available to members.
star_trek

You can use
preg_match_all("/<.*title[\s]*=[\s]*['\"]?([^'\"\s]+?)['\"]?.*>/Ui", $html, $array_elem, PREG_SET_ORDER);

With PREG_SET_ORDER, $array_elem[0][1], $array_elem[1][1], and so on should contain the values of the title attributes.
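
Note that the pattern above excludes whitespace from the captured value, so a multi-word title would be cut at the first space, and it does not set the `u` modifier for UTF-8. A minimal sketch of a variant that handles quoted, multi-word, non-ASCII titles might look like the following. The function name `extract_titles` is hypothetical, PHP 7.4 or later is assumed, and unquoted attribute values or a `>` inside an earlier attribute value are not handled.

```php
<?php
// Hypothetical helper (not from the thread): collect the values of quoted title attributes.
function extract_titles(string $html): array
{
    // (["\']) captures the opening quote and \1 requires the same closing quote,
    // so the value itself may contain spaces and the other quote character.
    // Modifiers: i = case-insensitive, s = dot matches newlines, u = treat input as UTF-8.
    $pattern = '/<[^>]*?\stitle\s*=\s*(["\'])(.*?)\1/isu';

    // preg_match_all returns false on error (for example, invalid UTF-8 input).
    if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER) === false) {
        return [];
    }

    // With PREG_SET_ORDER, each $m is one match; group 2 is the attribute value.
    return array_map(fn(array $m): string => $m[2], $matches);
}

$html = '<a href="#" title="Größe ändern">x</a><img src="a.png" title=\'日本語 テキスト\'>';
print_r(extract_titles($html));
```

Because a second capture group is used for the quote character, the value sits in group 2 here rather than group 1.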
macgruder

ASKER

Sorry about the delay! Been away, and forgot this one :-)

I'm giving the solution to Nicholassolutions, because that one strips out the attributes as requested.

Star_trek's answer will *retrieve* the values of the titles, but not strip them out (although that is useful too).

Thanks.
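
For readers without access to the accepted answer, here is one possible way to do the stripping, as opposed to the retrieving. This is only a sketch and not the accepted solution, which is not shown in this thread. The function name `strip_title_alt` is hypothetical and PHP 7.4 or later is assumed. It matches each tag first, then removes `title` and `alt` attributes inside it, with the `u` modifier so that UTF-8 values are handled.

```php
<?php
// Hypothetical helper (not from the thread): remove title and alt attributes from every tag.
function strip_title_alt(string $html): string
{
    // Outer pattern: one whole opening tag; quoted attribute values may contain ">".
    $tag = '/<[a-z][^>"\']*(?:(?:"[^"]*"|\'[^\']*\')[^>"\']*)*>/iu';
    // Inner pattern: whitespace, the attribute name, then a double-quoted,
    // single-quoted, or unquoted value.
    $attr = '/\s+(?:title|alt)\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^\s>]+)/iu';

    $result = preg_replace_callback(
        $tag,
        // Fall back to the untouched tag if the inner replace reports an error.
        fn(array $m): string => preg_replace($attr, '', $m[0]) ?? $m[0],
        $html
    );

    // preg_replace_callback returns null on error (for example, invalid UTF-8 input).
    return $result ?? $html;
}

$html = '<img src="a.png" alt="Füße > Hände" title=\'日本語\'><p title="Übersicht">title = "kept"</p>';
echo strip_title_alt($html), PHP_EOL;
// prints: <img src="a.png"><p>title = "kept"</p>
```

Text outside tags is left alone, so a literal `title = "kept"` in the page body survives. The sketch can still misfire when another attribute's value itself contains something like ` title=x`; for untrusted or messy markup, a real HTML parser is likely the safer choice.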