
Problem with PHP preg_match_all

I'm trying to scrape a web site, but I think an ASCII character code is messing it up.

Everything works fine until I try to get the title of a video. I load the page into $html in the beginning section of my script. I need to get the video code first, so $item is already set by that point.

The complete line in the HTML is "<h1><a href="/watch/265126/come-on-in-im-home-pt-13/"> come on in, i&#39;m home,  pt 1/3</a></h1>". I want to extract the title of the video between "<h1><a href="/watch/265126/come-on-in-im-home-pt-13/">" and "</a></h1>".

The problem is, I could extract the title as is, but the HTML is always changing, so my variable $item will always be updated with the next code as needed. I think the &#39; is giving me a problem on that one video, which breaks my script.

$item = '<h1><a href="/watch/265126/come-on-in-im-home-pt-13/">';
preg_match_all("/".$item."(.+?)\<\/a><\/h1>/is", $html, $match );
 
 
Debug Warning: D:\www\youtube1.php line 125 - preg_match_all() [<a href='function.preg-match-all'>function.preg-match-all</a>]: Unknown modifier 'w'

Asked by: megarry
 
Hube02 commented:
The error comes from the unescaped / characters in $item. You are using / as the delimiter, so preg_match_all sees the / in "/watch/265126/come" as the closing delimiter and treats whatever follows as pattern modifiers. That is why it says "Unknown modifier 'w'".

Try the following (note that I removed your escaping of the /s and changed the delimiter to #):

$item = '<h1><a href="/watch/265126/come-on-in-im-home-pt-13/">';
preg_match_all("#".$item."(.+?)\</a></h1>#is", $html, $match );
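
To see the delimiter problem in isolation, here is a minimal sketch using the sample line from the question. The names $broken, $fixed, $m1 and $m2 are illustrative and do not come from the original script, and the @ operator is there only to keep the warning out of the output.

```php
<?php
// Sample data taken from the question.
$html = '<h1><a href="/watch/265126/come-on-in-im-home-pt-13/"> come on in, i&#39;m home,  pt 1/3</a></h1>';
$item = '<h1><a href="/watch/265126/come-on-in-im-home-pt-13/">';

// With "/" as the delimiter, the pattern ends at the first "/" inside $item,
// so "watch..." is read as a list of modifiers and compilation fails.
$broken = @preg_match_all('/' . $item . '(.+?)<\/a><\/h1>/is', $html, $m1);

// With "#" as the delimiter, the slashes in $item are ordinary characters.
$fixed = preg_match_all('#' . $item . '(.+?)</a></h1>#is', $html, $m2);

var_dump($broken); // bool(false): the pattern did not compile
var_dump($fixed);  // int(1): one match found
echo $m2[1][0], "\n"; // the captured title, still containing &#39;
```

Note that the &#39; in the title is matched by (.+?) like any other text, so the entity itself is not what breaks the original call.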

 
Hube02 commented:
Another minor change: the < does not need to be escaped.

$item = "<h1><a href="/watch/265126/come-on-in-im-home-pt-13/">";
preg_match_all("/".$item."(.+?)<\/a><\/h1>/is", $html, $match );

 
Hube02 commented:
Sorry, copied the wrong code. This is the one I meant:

$item = '<h1><a href="/watch/265126/come-on-in-im-home-pt-13/">';
preg_match_all("#".$item."(.+?)</a></h1>#is", $html, $match );
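
Because $item changes from video to video, it may someday contain a regex metacharacter or even the delimiter itself. A hedged sketch of one way to guard against that uses preg_quote, which escapes both. The helper name extract_title is hypothetical, preg_match is used instead of preg_match_all on the assumption that only one title per page is wanted, and the html_entity_decode step is an optional extra for turning &#39; back into an apostrophe.

```php
<?php
// Hypothetical helper (not from the thread): return the link text that follows $item.
function extract_title(string $html, string $item): ?string
{
    // preg_quote escapes regex metacharacters in $item; the second argument
    // makes sure the chosen delimiter (#) is escaped as well.
    $pattern = '#' . preg_quote($item, '#') . '(.+?)</a></h1>#is';

    if (preg_match($pattern, $html, $match) !== 1) {
        return null; // no match, or the pattern failed
    }

    // Decode entities such as &#39; and trim the surrounding whitespace.
    return trim(html_entity_decode($match[1], ENT_QUOTES | ENT_HTML5, 'UTF-8'));
}

// Sample data taken from the question.
$html = '<h1><a href="/watch/265126/come-on-in-im-home-pt-13/"> come on in, i&#39;m home,  pt 1/3</a></h1>';
$item = '<h1><a href="/watch/265126/come-on-in-im-home-pt-13/">';

$title = extract_title($html, $item);
echo $title, "\n";
```

Returning null on failure lets the calling script skip a page whose markup has changed instead of working with an empty match array.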

 
megarry (author) commented:
I never knew you could use a delimiter of your choosing. I always thought you needed "/" at both ends. Thanks! That will help not only with this problem but going forward too.

Works perfectly now. :)
 
Hube02 commented:
You can also use paired characters as delimiters, for instance < and > or [ and ], though I don't generally use these. I like to use # when dealing with HTML because that character usually does not appear in the regular expression.

The online documentation I've found doesn't cover this very well.

An excellent resource for anyone interested in regular expressions is Mastering Regular Expressions by Jeffrey Friedl: http://www.amazon.com/Mastering-Regular-Expressions-Jeffrey-Friedl/dp/0596528124/ref=sr_1_1?ie=UTF8&s=books&qid=1239376459&sr=1-1
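
As a small illustration of the delimiter choices mentioned above, the sketch below runs the same pattern with three different delimiters against the sample line from the question. The braces are my own pick for the paired style (the comment names < > and [ ]), and $patterns and $results are illustrative names.

```php
<?php
// Sample data taken from the question.
$html = '<h1><a href="/watch/265126/come-on-in-im-home-pt-13/"> come on in, i&#39;m home,  pt 1/3</a></h1>';

// The same expression written with three different delimiters.
$patterns = [
    '#<h1><a href="/watch/265126/come-on-in-im-home-pt-13/">(.+?)</a></h1>#is', // single character
    '~<h1><a href="/watch/265126/come-on-in-im-home-pt-13/">(.+?)</a></h1>~is', // another single character
    '{<h1><a href="/watch/265126/come-on-in-im-home-pt-13/">(.+?)</a></h1>}is', // paired braces
];

$results = [];
foreach ($patterns as $pattern) {
    // Each pattern captures the title in group 1.
    if (preg_match($pattern, $html, $match) === 1) {
        $results[] = $match[1];
    }
}

// All three delimiters should yield the same capture.
var_dump(count($results) === 3 && count(array_unique($results)) === 1);
```

Whichever delimiter is used, it should be one that does not occur unescaped inside the expression, which is why / is an awkward choice for patterns built from URLs or closing tags.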
Question has a verified solution.
