Getting content inside DIV with dynamic class name and ID.

The attached function gets the content inside the defined DIV. The function works perfectly.

The defined DIV here is:

$str = '<div class="post-333 post hentry category-libros" id="post-333">';

When I define the DIV, I want to ignore the "333 post hentry category-libros" id="post-333" part. In other words, I want to get the content inside the DIV whose opening tag starts with "<div class="post-", ignoring anything after "post-".

That is, I want the content of the div whose class name starts with "post-", ignoring the rest of the class name and the id.

Please help.
Thank you.

function get_content ($url) {
// FIND ALL OF THE DESIRED DIV
$htm = file_get_contents($url);
$str = '<div class="post-333 post hentry category-libros" id="post-333">';
$arr = explode($str, $htm);
$new = $arr[1];
$len = strlen($new);

// ACCUMULATE THE OUTPUT STRING HERE
$out = NULL;

// WE ARE INSIDE ONE DIV TAG
$cnt = 1;

// UNTIL THE END OF STRING OR UNTIL WE ARE OUT OF ALL DIV TAGS
while ($len)
{
    // COPY A CHARACTER
    $chr = substr($new,0,1);

    // IF THE DIV NESTING LEVEL INCREASES OR DECREASES
    if (substr($new,0,4) == '<div')  $cnt++;
    if (substr($new,0,5) == '</div') $cnt--;

    // ACTIVATE THIS TO FOLLOW THE COUNT OF NESTING LEVELS
    // echo " $cnt";

    // WHEN THE NESTING LEVEL GOES BACK TO ZERO
    if (!$cnt) break;

    // WHEN THE NESTING LEVEL IS STILL POSITIVE
    $len--;
    $out .= $chr;
    $new = substr($new,1);
}
return $out;
}


Asked by: Fernanditos

Ray Paseur commented:
Sorry - I do not keep track of things from one question to the next.  Please post the test data that you want us to use, thanks.
Fernanditos (author) commented:
Thank you, Ray. You can find the test data here: http://www.frostwave.com/data.html

I want my function to get all content inside DIV:
 
<div class="post-333 post hentry category-libros" id="post-333">


The value "333 post hentry category-libros" id="post-333" is dynamic and will always change, so I need to check only the first part of the DIV tag, starting with "<div class="post-", and ignore the rest of the class name and the id.

Thank you so much for your support.
StingRaY commented:
If you use jQuery, this is easily solved with the following selector.

$('div[class^="post"]')

For example...

alert($('div[class^="post"]').html());
Fernanditos (author) commented:
I have in mind something like:

$str = '<div class="post-(.*)" id="(.*)">';

StingRaY commented:
Ah, sorry, I misunderstood you.

You can use preg_split instead of explode.

function get_content ($url) {
// FIND ALL OF THE DESIRED DIV
$htm = file_get_contents($url);

$str = '{<div class="post-[^"]+"[^>]+>}';
$arr = preg_split($str, $htm);
$new = $arr[1];
$len = strlen($new);

// ACCUMULATE THE OUTPUT STRING HERE
$out = NULL;

// WE ARE INSIDE ONE DIV TAG
$cnt = 1;

// UNTIL THE END OF STRING OR UNTIL WE ARE OUT OF ALL DIV TAGS
while ($len)
{
    // COPY A CHARACTER
    $chr = substr($new,0,1);

    // IF THE DIV NESTING LEVEL INCREASES OR DECREASES
    if (substr($new,0,4) == '<div')  $cnt++;
    if (substr($new,0,5) == '</div') $cnt--;

    // ACTIVATE THIS TO FOLLOW THE COUNT OF NESTING LEVELS
    // echo " $cnt";

    // WHEN THE NESTING LEVEL GOES BACK TO ZERO
    if (!$cnt) break;

    // WHEN THE NESTING LEVEL IS STILL POSITIVE
    $len--;
    $out .= $chr;
    $new = substr($new,1);
}
return $out;
}
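
One thing worth guarding in the function above: if the pattern never matches, $arr[1] does not exist. A minimal sketch of the split step on its own, assuming a small hypothetical sample string in place of the fetched page, and using the optional limit argument of preg_split so only the first matching tag is used:

```php
<?php
// Hypothetical sample markup standing in for file_get_contents($url).
$htm = '<body><div class="post-42 post hentry" id="post-42"><p>Hi</p></div></body>';

// Same pattern as in the function above.
$str = '{<div class="post-[^"]+"[^>]+>}';

// Limit of 2: split only at the first matching opening tag.
$arr = preg_split($str, $htm, 2);

// If nothing matched, $arr has a single element, so fall back to an empty string.
$new = count($arr) > 1 ? $arr[1] : '';

echo $new, "\n";
```

Here $new holds everything after the matched opening tag, which is what the character loop then walks through.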

Fernanditos (author) commented:
@StingRaY that worked great, I will test once again. Thank you.

$str = '{<div class="post-[^"]+"[^>]+>}';
Ray Paseur commented:
@Fernanditos: I believe that solution can work as long as you are only looking for one <div> per page, and the attributes of the <div> tags are all on one line and in an exact order.  Good test data, including edge cases, is fairly important when you're working with external input.  Consider what your program will do with these, all of which are valid and equivalent HTML.

<div class="post-333 post hentry category-libros" id="post-333">
<div id="post-333" class="post-333 post hentry category-libros">
<div class='post-333 post hentry category-libros' id="post-333">
<div
    class="post-333
               post hentry
               category-libros"
    id="post-333">
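
To make the point concrete, here is a small sketch that runs the pattern from the preg_split answer against the four tags above. The labels and variable names are only for illustration:

```php
<?php
// The pattern from the preg_split answer earlier in the thread.
$pattern = '{<div class="post-[^"]+"[^>]+>}';

// The four equivalent opening tags listed above.
$variants = [
    'class first'   => '<div class="post-333 post hentry category-libros" id="post-333">',
    'id first'      => '<div id="post-333" class="post-333 post hentry category-libros">',
    'single quotes' => "<div class='post-333 post hentry category-libros' id=\"post-333\">",
    'multi-line'    => "<div\n    class=\"post-333\n               post hentry\n               category-libros\"\n    id=\"post-333\">",
];

// Record whether the pattern matches each variant.
$results = [];
foreach ($variants as $label => $tag) {
    $results[$label] = preg_match($pattern, $tag) === 1;
    echo $label, ': ', $results[$label] ? 'match' : 'no match', "\n";
}
```

Only the first form matches, because the pattern expects a single space after "<div", the class attribute first, and double quotes.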

Executive summary: Using regular expressions to parse HTML is not a very professional approach.  A state engine is more reliable.

If you are parsing HTML to try to get information from a web publisher you might want to consider asking the publishers if they expose an API.  That way you would have a formal interface which is much more dependable than trying to scrape HTML.  If the publisher wants you to have their information they will almost certainly want to expose an API that is versioned and dependable.

Anyway, good luck with your project. ~Ray
StingRaY commented:
@Fernanditos: Ray is correct. My solution is not the best one. Another approach would be better, for example Simple HTML DOM Parser (http://simplehtmldom.sourceforge.net/).
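
As an illustration of the parser-based approach, here is a minimal sketch that uses PHP's built-in DOM extension (DOMDocument and DOMXPath) rather than Simple HTML DOM Parser. The function name get_post_div_content is hypothetical, and the sketch assumes PHP 7.1 or later with the dom extension enabled and that only the first matching div is wanted:

```php
<?php
// Hypothetical helper (not from the thread): returns the inner HTML of the
// first <div> that has a class token starting with "post-", or null if none.
function get_post_div_content(string $html): ?string
{
    if ($html === '') {
        return null;
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Prefix the normalized class list with a space so that " post-" matches
    // a class token beginning with "post-", wherever it sits in the attribute.
    $query = '//div[contains(concat(" ", normalize-space(@class)), " post-")]';
    $div = $xpath->query($query)->item(0);
    if ($div === null) {
        return null;
    }

    // Serialize the children only, leaving out the <div> wrapper itself.
    $inner = '';
    foreach ($div->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    return $inner;
}
```

Because the parser handles attribute order, quoting style, and line breaks, all four tag forms from Ray's comment are treated the same. The serialized output may differ from the source in whitespace.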