Solved

Get <div> with preg_replace

Posted on 2008-06-16
Last Modified: 2010-04-21
I have a function that extracts all <div> elements from HTML code. It gives me each div's id and content, so they can be handled in a function.

It is working very well, but the problem is that I now also have to extract divs nested inside a div!

Ex.

<div id="foo">
This is just fill <div id="foo2">This is a new block</div> This is more fill
</div>

How can I do that? I don't necessarily need the parent div, just the ones inside (those not containing any child divs).
// Function to extract div's from HTML code
 
$pattern = '/(<div.*?id="([a-z09_]+)".*?>)(.*?)<\/div>/ise';
		
$replacement = 'get_div_content("$1", "$2", "$3")';
		
$proccesed_html = preg_replace($pattern, $replacement, $html);

Question by:Thingmand
9 Comments
 
LVL 49

Expert Comment

by:Roonaan
ID: 21796521
I think you'll have a hard time getting this done with preg_replace.

You could try using arrays:
      $parts = explode('</div>', $html);
      foreach ($parts as $i => $p) {
            // The last '<div' before each '</div>' opens an innermost div.
            $div_start = strrpos($p, '<div');
            if ($div_start === false) {
                  continue;
            }
            $div = substr($p, $div_start) . '</div>';

            // ... do something with the div html ...
      }
 
LVL 50

Expert Comment

by:Steve Bink
ID: 21796524
Sending $proccesed_html[0] back through the same algorithm should isolate the inner <div>. Basically, make this a function with the possibility of recursion.

For example, say I have this HTML:

This is not in a div<div>div Stuff<div>more div stuff</div>last stuff</div>Out of div again

The first preg should match the entire first div (assuming you're using 'greedy' mode).  Submitting the contents of that div (your '(.*?)' marker) back in should isolate the second div.

BTW, (.*?) is a little redundant, yes?  Any character (.) repeated 0 or more times (*), repeated 0 or 1 times (?).  
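A note on the quantifier, as a small sketch rather than a definitive answer: in PCRE, a `?` placed directly after `*` does not mean "optional". It makes the `*` lazy, so it matches as few characters as possible. The example below runs both forms against the sample string from this comment; the variable names `$greedy` and `$lazy` are only illustrative.

```php
<?php
// Sample string from the comment above.
$html = 'This is not in a div<div>div Stuff<div>more div stuff</div>last stuff</div>Out of div again';

// Greedy star: runs on to the LAST closing div tag in the string.
preg_match('/<div>(.*)<\/div>/s', $html, $greedy);

// Lazy star: stops at the FIRST closing div tag after the opening tag.
preg_match('/<div>(.*?)<\/div>/s', $html, $lazy);

echo $greedy[1], "\n"; // div Stuff<div>more div stuff</div>last stuff
echo $lazy[1], "\n";   // div Stuff<div>more div stuff
```

Neither form isolates the inner div on its own: the greedy match spans the whole outer div, and the lazy match pairs the outer opening tag with the inner closing tag.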
 
LVL 50

Expert Comment

by:Steve Bink
ID: 21796535
Roonaan's comment addresses the point I left out: use preg_match() or preg_match_all() instead.  If you need to replace text, it might be easier to accomplish once you've already isolated the inner divs (working from the inside out)
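A minimal sketch of the preg_match_all() idea, assuming only the innermost divs are wanted (those with no child div). The pattern is hypothetical, not taken from the thread: it refuses to let the content step onto another opening div tag, so an outer div can never match. It also assumes attribute values contain no `>` character.

```php
<?php
// Example markup from the question.
$html = <<<HTML
<div id="foo">
This is just fill <div id="foo2">This is a new block</div> This is more fill
</div>
HTML;

// Hypothetical pattern: a div with an id whose content holds no further
// opening div tag. The negative lookahead is checked before every character
// of the content, so the content cannot run past a nested div.
$pattern = '/<div\b[^>]*\bid="([a-z0-9_]+)"[^>]*>((?:(?!<div\b).)*?)<\/div>/is';

preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$innerDivs = [];
foreach ($matches as $m) {
    $innerDivs[$m[1]] = $m[2]; // id => content
}

print_r($innerDivs); // only foo2 is reported; foo is skipped
```

To work from the inside out, each matched div could be replaced with a placeholder and the pattern run again, so that the former parents become the new innermost divs.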
 

Author Comment

by:Thingmand
ID: 21797056
Roonaan: Thanks for the idea

routinet: I had figured that I would need to re-run the results, but I thought maybe there was a trick with regexes. I don't quite follow the suggestion to use preg_match instead. Could you give a simple example?
 
LVL 50

Expert Comment

by:Steve Bink
ID: 21810619
On further reflection, I have to agree with Roonaan.  I don't know of a way to express the potential for nesting.  Searching the string manually sounds like what you need.
 

Author Comment

by:Thingmand
ID: 21826845
Well, the strange thing is that it's partly working! The attached test code finds these ids:

ID: header
ID: menu
ID: page
ID: newsletter
ID: news
ID: content
ID: footer

I can't see the logic in it!
<?php
 
	$html = <<<END
<body>
<!-- start header -->
<div id="header">
	<div id="logo"></div>
	<div id="menu"></div>
</div>
<!-- end header -->
<!-- start page -->
<div id="page">
	<!-- start sidebar -->
	<div id="sidebar">
		<div id="box5"></div>
		<!-- start newsletter form -->
		<div id="newsletter"></div>
		<!-- end newsletter form -->
		<!-- start recent news -->
		<div id="news"></div>
		<!-- end recent news -->
	</div>
	<!-- end sidebar -->
	<!-- start content -->
	<div id="content">
		<div id="box6"></div>
	</div>
	<!-- end content -->
	<div style="clear: both; height: 30px;">&nbsp;</div>
</div>
<!-- end page -->
<div id="footer"></div>
</body>
		
END;
 
 
		
	$pattern = '/(<div.*?id="([a-z09_]+)".*?>)(.*?)<\/div>/ise';
	
	$replacement = 'get_div_content("$1", "$2", "$3")';
	
	$proccesed_html = preg_replace($pattern, $replacement, $html);
		
		
	function get_div_content($orgDiv, $classID, $rules) {
 
			echo "ID: $classID<br>\n";
	}
	
	echo "<br>Pattern: <pre>" . htmlentities($pattern) . "</pre><br>\n";
	
	echo "<pre>" . htmlentities($html) . "</pre><br><br>\n";
?>
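One way to read the result: the lazy `(.*?)` stops at the first `</div>` it meets, so the match that starts at `<div id="header">` ends at the `</div>` that belongs to `logo`. Matching then resumes after that point, so `logo` is never tried as a starting position. The same effect hides `sidebar` and `box5` inside the `page` match, and `box6` inside the `content` match. The reduced sketch below uses preg_match_all() purely to display what each match consumes; the `e` modifier is left out because it was removed in PHP 7.

```php
<?php
// Reduced version of the header part of the test markup.
$html = '<div id="header"><div id="logo"></div><div id="menu"></div></div>';

// Same pattern as in the post, minus the "e" modifier.
$pattern = '/(<div.*?id="([a-z09_]+)".*?>)(.*?)<\/div>/is';

preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[0] is the full text consumed by this match, $m[2] is the id.
    echo "ID: {$m[2]}  consumed: {$m[0]}\n";
}
```

The first line of output shows that the `header` match has already swallowed the opening tag of `logo`, which is why only `header` and `menu` are reported.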

 
LVL 49

Accepted Solution

by:
Roonaan earned 500 total points
ID: 21828571
If your HTML is valid XHTML, you could try parsing it as XML?
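A minimal sketch of the XML route, under a few assumptions: the markup is well-formed, it has a single root element, and PHP's DOM extension is available. The choice of DOMDocument and DOMXPath is an illustration only; the thread does not prescribe a specific XML API. The XPath expression selects divs that have an id and no descendant div.

```php
<?php
// Example markup from the question, wrapped in a single root element.
$xhtml = <<<XML
<body>
<div id="foo">
This is just fill <div id="foo2">This is a new block</div> This is more fill
</div>
</body>
XML;

$doc = new DOMDocument();
$doc->loadXML($xhtml);

$xpath = new DOMXPath($doc);

// Collect id => text content for every div that contains no other div.
$leafDivs = [];
foreach ($xpath->query('//div[@id][not(.//div)]') as $div) {
    $leafDivs[$div->getAttribute('id')] = $div->textContent;
}

print_r($leafDivs);
```

Because the parser tracks nesting itself, no recursion or repeated matching is needed. Note that HTML-only entities such as `&nbsp;` are not defined in plain XML, so markup containing them would likely need those entities replaced or declared before parsing.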
 

Author Comment

by:Thingmand
ID: 21829039

// $pattern = '/(<div.*?id="([a-z09_]+)".*?>)(.*?)<\/div>/ise';
 
// Should be:
 
$pattern = '/(<div.*?id="([a-z0-9_]+)".*?>)(.*?)<\/div>/ise';
 
// It doesn't change the result, though...

 

Author Closing Comment

by:Thingmand
ID: 31467728
Goddamn, that's a brilliant idea! It works like a charm with xml_parser :o)