Greedy regex in php

Hi everyone!

I am trying to highlight each occurrence of a certain keyword in the text of an HTML file (i.e. outside the HTML tags). For instance, in the following HTML code, I want to highlight the word "keyword":

<html>
blabla
<p> bla keyword bla keyword bla</p>
<p>keyword bla keyword</p>
blabla
</html>

I'm using this regex within the eregi_replace function (the $html_code variable holds the contents of an HTML file):

eregi_replace(">([^<]*)(keyword)([^<]*)","\\1<b style=\"background-color:yellow\">\\2</b>\\3",$html_code);

The problem is that the greediness of the regex means only the last occurrence of "keyword" within each HTML tag is highlighted, i.e. the above HTML code becomes:

<html>
blabla
<p> bla keyword bla <b style="background-color:yellow">keyword</b> bla</p>
<p>keyword bla <b style="background-color:yellow">keyword</b></p>
blabla
</html>

As you can see, the occurrences of the keyword other than the last one in each HTML tag aren't highlighted. I know this is because the ([^<]*) part of my regex is greedy, i.e. it matches the longest possible string. I have tried to add an [^(keyword)], so that it avoids matching any keyword occurrences, but then it only highlights the first occurrence in each HTML tag.
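
To make the behaviour concrete, here is a minimal sketch of the same effect. It makes a few assumptions that are not in the code above: it uses preg_replace with the "i" flag instead of eregi_replace (the ereg functions were removed in PHP 7), it moves the ">" inside the first group so the replacement keeps it, and it uses a plain <b> tag to keep the output short. Because one match swallows the whole run of text between two tags, only one keyword per run can ever be wrapped, whether the first group is greedy or lazy.

```php
<?php
// Sample line taken from the HTML in the question.
$html = '<p> bla keyword bla keyword bla</p>';

// Greedy first group: [^<]* grabs as much as possible, then backtracks
// only far enough to find a "keyword", which is the LAST one in the run.
$greedy = preg_replace('/(>[^<]*)(keyword)([^<]*)/i', '$1<b>$2</b>$3', $html);

// Lazy first group: [^<]*? stops at the FIRST "keyword", but the greedy
// third group then consumes the rest of the run, including the second one.
$lazy = preg_replace('/(>[^<]*?)(keyword)([^<]*)/i', '$1<b>$2</b>$3', $html);

echo $greedy, "\n"; // <p> bla keyword bla <b>keyword</b> bla</p>
echo $lazy, "\n";   // <p> bla <b>keyword</b> bla keyword bla</p>
```

Either way the pattern matches once per text run, so the fix has to be a pattern that matches each keyword on its own rather than the text around it.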

How can I adapt this regex so that it matches every occurrence of the keyword in each HTML tag?

Thank you!
Asked by: muntel

Batalf Commented:
Maybe a simple preg_replace is a better alternative.

Example:


<?php
$html_code = "<html>
blabla
<p> bla keyword bla keyword bla</p>
<p>keyword bla keyword</p>
blabla
</html>";

$html_code = preg_replace("/\b(keyword)\b/si","<span style=\"background-color:yellow;font-weight:bold\">\\1</span>",$html_code);

echo $html_code;


?>
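
One caveat with this simple version, sketched below under the assumption that the keyword can also appear as a tag name: \b only checks for word boundaries and knows nothing about HTML, so a keyword such as "table" is wrapped inside the tags as well. The input string and the short <b> replacement are made up for illustration.

```php
<?php
// Hypothetical input: the keyword "table" is also the name of an HTML tag.
$html = '<table><td>a table</td></table>';

// Same word-boundary pattern as above, with "table" as the keyword.
$out = preg_replace('/\b(table)\b/i', '<b>$1</b>', $html);

echo $out, "\n";
// <<b>table</b>><td>a <b>table</b></td></<b>table</b>>
```

The opening and closing tags get mangled, so this version is only safe for keywords that never occur inside the markup.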

Batalf Commented:
Or maybe this, if you want to avoid keywords inside tags:


$html_code = preg_replace("/([^\B<])(keyword)([^\B>])/si","\\1<span style=\"background-color:yellow;font-weight:bold\">\\2</span>\\3",$html_code);

jdpipe Commented:
I think that you might find this is better done with JavaScript. Take a look at this page:

http://www.nsftools.com/misc/SearchAndHighlight.htm

Hope that helps

JP

muntel (Author) Commented:
Batalf, I have tried your solution (the second one), but it doesn't avoid the occurrences inside HTML tags (i.e. when I try to highlight the word "table", everything falls apart: the HTML code inside the tags gets highlighted).

I'm not familiar with Perl regexes. Could you figure out what's wrong with the /([^\B<])(keyword)([^\B>])/si regex you've provided?

Batalf Commented:
The second one seems to work when I try it, but maybe \B should be \w. Another example below.

This pattern

/([^\w<])(table)([^\w>])/si

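breaks down as follows:
/ = pattern delimiter (marks the start and the end of the pattern)
[^\w<] = [ ] is a character class, and ^ inside [ ] means "any character except the following ones". \w matches alphanumeric characters and the underscore, and < matches a literal "<", so this is one character that is neither a word character nor "<"
(table) = the literal keyword, captured as group 2
[^\w>] = one character that is neither a word character nor ">"

"si" are flags. "s" = treat the whole string as a single line (it makes "." match newlines too, so it has no effect here because the pattern contains no "."), "i" = case insensitive.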
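
A small sketch of what this pattern does and does not catch. The sample strings and the short <b> replacement are made up for illustration; only the pattern itself comes from the explanation above.

```php
<?php
// Pattern from the explanation above; <b> is a shortened stand-in for the span markup.
$pattern = '/([^\w<])(table)([^\w>])/si';
$replacement = '$1<b>$2</b>$3';

// Hypothetical inputs, one case per line.
$samples = [
    '<p>a table here</p>',             // keyword in plain text
    '<table border=1>',                // keyword right after "<"
    '</table>',                        // keyword right before ">"
    '<p>table table</p>',              // two keywords sharing one separator
    '<div class="table wide">x</div>', // keyword inside an attribute value
];

$results = [];
foreach ($samples as $sample) {
    $results[$sample] = preg_replace($pattern, $replacement, $sample);
    echo $sample, ' => ', $results[$sample], "\n";
}
```

The first three cases behave as intended: the keyword in the text is wrapped and the two tags are left alone. The last two show the limits. In "table table" only the first keyword is wrapped, because the space between them is consumed by the first match and is no longer available as the leading character of the second. In the attribute case the keyword is wrapped even though it sits inside a tag, since the pattern only looks at the one character on each side.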

New example:

<?php
$html_code = "<html>
blabla
<p> bla table keyword bla keyword bla</p>
<p>keyword bla keyword</p>
<table border=1>
<tr>
      <td>TEst</td>
</tr>
<tr>
      <td>TEst</td>
</tr>
</table>
blabla
</html>";

$html_code = preg_replace("/([^<\w])(table)([^\w>])/si","\\1<span style=\"background-color:yellow;font-weight:bold\">\\2</span>\\3",$html_code);

echo $html_code;


?>
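
If the keyword can also show up deeper inside a tag (for example in an attribute value), a negative lookahead is one possible alternative. This is only a sketch, not a full HTML parser: the function name highlight_outside_tags is made up, and it assumes reasonably well-formed HTML where text between tags contains no raw ">" and attribute values contain no "<". It also does not treat script, style or comment sections specially.

```php
<?php
// Hypothetical helper: wrap every whole-word occurrence of $keyword that is
// not inside a tag. The lookahead (?![^<>]*>) rejects a match when the next
// angle bracket after the keyword is ">", i.e. when we are inside a tag.
function highlight_outside_tags(string $html, string $keyword): string
{
    // preg_quote() escapes regex metacharacters in the keyword.
    $pattern = '/\b(' . preg_quote($keyword, '/') . ')\b(?![^<>]*>)/i';
    $replacement = '<span style="background-color:yellow;font-weight:bold">$1</span>';

    return preg_replace($pattern, $replacement, $html);
}

$html_code = '<table border=1><tr><td>a table table</td></tr></table>';
echo highlight_outside_tags($html_code, 'table'), "\n";
```

Because the lookahead does not consume any characters, neighbouring occurrences such as "table table" are both wrapped, while "<table border=1>" and "</table>" stay untouched.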

Batalf Commented:
You can find more info on Perl compatible Regexp here:

http://www.php.net/manual/en/ref.pcre.php

Batalf

muntel (Author) Commented:
Thanks, Batalf! This last regex seems to do the trick! You have earned the points at stake here.

jdpipe, your suggestion is also reasonable, but it's not exactly what I need (i.e. a PHP solution). I will keep it as an alternative, should I ever need an exclusively HTML/JavaScript-based highlighting function.

Thanks, guys!