Search string for img src and alt, using PHP

I would like to search a long string for the first instance of img. If found, I want to extract its src and title attributes. The order of the attributes may or may not be consistent. I thought this could be done using preg_match_all, but it doesn't seem to be working.
while($row = mysql_fetch_array( $result )) { 

	preg_match_all('/<img[^>]+>/i',$row['article'], $imgResult); 	// Find all images
	if ($imgResult != '') {
		preg_match_all('/(title|src)=("[^"]*")/i',$imgResult[0], $imgAttributes);
		 
	}			
}

jej07Asked:

SamsonChungCommented:
I wouldn't reach for preg just yet; think of it in a simpler way...

$position_of_img = strpos($row['article'], 'img'); gives you the position of the first occurrence of the text img.

Now take this and alter your algorithm so it reads each img position, and then extract the src using substr.
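
For what it's worth, here is a rough sketch of that string-function idea. The helper name firstImgTag and the sample markup are made up for illustration, and it assumes PHP 7.1+ for the type hints. It also assumes no ">" appears inside an attribute value, which is not guaranteed for arbitrary HTML.

```php
<?php
// Hypothetical helper: find the first <img ...> tag using plain string functions.
function firstImgTag(string $html): ?string
{
    $start = stripos($html, '<img');      // position of the first "<img", or false
    if ($start === false) {
        return null;
    }
    $end = strpos($html, '>', $start);    // first ">" after the tag opens
    if ($end === false) {
        return null;
    }
    return substr($html, $start, $end - $start + 1);
}

// Made-up sample text with two images.
$html = '<p>Text <img src="a.png" title="First"> more <img src="b.png"></p>';
echo firstImgTag($html), "\n";
```

This only isolates the tag itself; pulling src and title out of it still needs either more substr work or a regex.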

Beverley PortlockCommented:
Change the second regex to be like this

     preg_match_all( '#\s+([a-z]+)\s*?=\s*?[\'|"]([^"|\']*?)[\'|"]#i', $imgResult[0], $imgAttributes );

     echo "<pre>"; print_r( $imgAttributes );  echo "</pre>"; // testing - so you can see the output


I'm assuming that the first regex works; it looks like it should. The second will give you all the attribute names in $imgAttributes[1], and you can look for SRC and TITLE in there. The corresponding values will have the same indices in $imgAttributes[2].

Run it and see; the output is fairly self-explanatory.
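
One possible way to work with those two parallel arrays is to merge them into a single name => value map with array_combine, so the attribute order stops mattering. This is only a sketch: the sample tag is invented, and lower-casing the names is my own assumption so that SRC and src are treated alike.

```php
<?php
// Hypothetical sample tag; in the thread this would be one element of $imgResult[0].
$img = '<img class="imgWhiteBorder" src="logo.png" alt="Logo 2" width="100" />';

// Same pattern as above: names land in [1], values in [2].
preg_match_all('#\s+([a-z]+)\s*?=\s*?[\'|"]([^"|\']*?)[\'|"]#i', $img, $imgAttributes);

// Pair each attribute name with its value, lower-casing the names first.
$attrs = array_combine(array_map('strtolower', $imgAttributes[1]), $imgAttributes[2]);

// Missing attributes are simply absent from the map.
echo isset($attrs['src']) ? $attrs['src'] : '(no src)', "\n";
echo isset($attrs['title']) ? $attrs['title'] : '(no title)', "\n";
```

If an attribute appears twice in the tag, the later value wins in the combined map.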

jej07Author Commented:
@bportlock,

I simplified my code so you could also see an example string. It looks like I may be having trouble with both regexes.
$row = '&lt;p&gt;Praesent non lorem nisi. &lt;img class="imgWhiteBorder" src="http://www.google.com/images/logos/ps_logo2.png" alt="Logo 2" title="Google\'s Logo" width="100" align="left" /&gt;In hac habitasse platea dictumst. Praesent viverra rutrum feugiat. Nulla velit lorem, sodales et porta a, aliquam et leo. Morbi sed odio at neque faucibus egestas.&lt;img class="imgWhiteBorder" src="http://www.google.com/images/logos/ps_logo2.png" alt="Logo 2" title="Google\'s Logo" width="100" align="left" /&gt;&lt;p&gt;';

preg_match_all('/<img[^>]+>/i',$row, $imgResult); 	// Find all images
echo "<pre>"; print_r( $imgResult );  echo "</pre>"; 

	if ($imgResult != '') {
		preg_match_all( '#\s+([a-z]+)\s*?=\s*?[\'|"]([^"|\']*?)[\'|"]#i', $imgResult[0], $imgAttributes );

		echo "<pre>"; print_r( $imgAttributes );  echo "</pre>"; // testing - so you can see the output
		 
	}


Terry WoodsIT GuruCommented:
The < character isn't matching because the < characters in your text have been html-encoded to &lt; - do you know why that might be?
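
A quick way to see the effect, as a sketch with a shortened, made-up version of the encoded text: the pattern finds nothing until the entities are decoded. I'm using html_entity_decode here; whether that is the right decoder depends on how the text was encoded in the first place.

```php
<?php
// Shortened, hypothetical version of the encoded article text from the question.
$encoded = '&lt;p&gt;Intro &lt;img src="logo.png" title="Logo" /&gt;&lt;/p&gt;';

$pattern = '/<img[^>]+>/i';

// There is no literal "<" in the encoded text, so nothing matches.
$before = preg_match_all($pattern, $encoded, $m1);

// Decode the entities first, then search again.
$after = preg_match_all($pattern, html_entity_decode($encoded), $m2);

echo "before: $before, after: $after\n";
echo $m2[0][0], "\n";
```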
jej07Author Commented:
@TerryAtOpus, Good catch. It must have been inserted using the htmlspecialchars function.
Okay, the first regex works, but not the second.
preg_match_all('/<img[^>]+>/i',html_entity_decode($row), $imgResult); 	// Find all images
echo "<pre>"; print_r( $imgResult );  echo "</pre>"; 

	if ($imgResult != '') {
		preg_match_all( '#\s+([a-z]+)\s*?=\s*?[\'|"]([^"|\']*?)[\'|"]#i', $imgResult[0], $imgAttributes );

		echo "<pre>"; print_r( $imgAttributes );  echo "</pre>"; // testing - so you can see the output
		 
	}

Terry WoodsIT GuruCommented:
You're passing an array to the 2nd preg_match_all command, but it is expecting a string. This works for me:

$row = '<p>Praesent non lorem nisi. <img class="imgWhiteBorder" src="http://www.google.com/images/logos/ps_logo2.png" alt="Logo 2" title="Google\'s Logo" width="100" align="left" />In hac habitasse platea dictumst. Praesent viverra rutrum feugiat. Nulla velit lorem, sodales et porta a, aliquam et leo. Morbi sed odio at neque faucibus egestas.<img class="imgWhiteBorder" src="http://www.google.com/images/logos/ps_logo2.png" alt="Logo 2" title="Google\'s Logo" width="100" align="left" /><p>';

preg_match_all('/<img[^>]+>/i',$row, $imgResult);       // Find all images
#echo "<pre>\n"; print_r( $imgResult ); echo "</pre>\n";

if ($imgResult != '') {
  foreach ($imgResult[0] as $img) {
    #print "\$img: $img\n";
    preg_match_all( '#\s+([a-z]+)\s*?=\s*?[\'|"]([^"|\']*?)[\'|"]#i', $img, $imgAttributes );
    echo "<pre>"; print_r( $imgAttributes ); echo "</pre>"; //
  }
}

-------
Result:
-------
<pre>Array
(
    [0] => Array
        (
            [0] =>  class="imgWhiteBorder"
            [1] =>  src="http://www.google.com/images/logos/ps_logo2.png"
            [2] =>  alt="Logo 2"
            [3] =>  title="Google'
            [4] =>  width="100"
            [5] =>  align="left"
        )

    [1] => Array
        (
            [0] => class
            [1] => src
            [2] => alt
            [3] => title
            [4] => width
            [5] => align
        )

    [2] => Array
        (
            [0] => imgWhiteBorder
            [1] => http://www.google.com/images/logos/ps_logo2.png
            [2] => Logo 2
            [3] => Google
            [4] => 100
            [5] => left
        )

)
</pre><pre>Array
(
    [0] => Array
        (
            [0] =>  class="imgWhiteBorder"
            [1] =>  src="http://www.google.com/images/logos/ps_logo2.png"
            [2] =>  alt="Logo 2"
            [3] =>  title="Google'
            [4] =>  width="100"
            [5] =>  align="left"
        )

    [1] => Array
        (
            [0] => class
            [1] => src
            [2] => alt
            [3] => title
            [4] => width
            [5] => align
        )

    [2] => Array
        (
            [0] => imgWhiteBorder
            [1] => http://www.google.com/images/logos/ps_logo2.png
            [2] => Logo 2
            [3] => Google
            [4] => 100
            [5] => left
        )

)
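
Note that the title comes back as "Google" rather than "Google's Logo": the value pattern stops at the first quote character of either kind, so the apostrophe ends the match early. One possible fix, sketched below, is to capture the opening quote and require the same character to close the value via a backreference. The pattern is my own variation, not the one used in the thread, and it shifts the values from group 2 to group 3.

```php
<?php
// Sample tag from the thread, with an apostrophe inside a double-quoted value.
$img = '<img class="imgWhiteBorder" src="http://www.google.com/images/logos/ps_logo2.png" alt="Logo 2" title="Google\'s Logo" width="100" align="left" />';

// (["\']) captures the opening quote; \2 requires the same character to close the value.
preg_match_all('/\s([a-z][a-z0-9-]*)\s*=\s*(["\'])(.*?)\2/is', $img, $m);

// Names are in $m[1], values in $m[3] (group 2 is the quote character).
$attrs = array_combine(array_map('strtolower', $m[1]), $m[3]);

echo $attrs['title'], "\n"; // Google's Logo
```

Unquoted attribute values (width=100) are still not handled by this pattern.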

jej07Author Commented:
@TerryAtOpus, Thank you!

I'm looking at the results, and I'm not sure how I can pull out the src and title for only the first image. Especially if the order is different.

For example, this time I have 2 images in $row with attributes in the order class, src, alt, title, width and align. But next time I might have 4 images in $row with only the alt and src attributes. How do I extract just the src and title from the first image when the tag order could be different?
Beverley PortlockCommented:
"How do I extract just the src and title from the first image when the tag order could be different?"

That's why I did it the way I did. You can scan $imgAttributes[1] for the attribute name using array_search (http://www.php.net/array_search) and then use the resulting index to access $imgAttributes[2]. So, in Terry's example above:

$index = array_search( "alt",  $imgAttributes[1] );
$value = '';

// array_search returns false when nothing is found; 0 is a valid index
if ( $index !== false ) {
     $value =  $imgAttributes[2][$index];
}
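
One thing to watch with array_search: it returns the key of the match, and that key is 0 when the attribute happens to come first in the tag. A loose truthiness check treats 0 as "not found", so a strict comparison against false is safer. A small sketch with made-up attribute lists shaped like $imgAttributes[1] and $imgAttributes[2]:

```php
<?php
// Hypothetical attribute lists shaped like $imgAttributes[1] and $imgAttributes[2].
$names  = array('alt', 'src');
$values = array('Logo 2', 'logo.png');

$index   = array_search('alt', $names);   // int(0): found at the first position
$missing = array_search('title', $names); // bool(false): not present

// Strict comparison keeps index 0 from being mistaken for "not found".
$alt   = ($index !== false)   ? $values[$index]   : '';
$title = ($missing !== false) ? $values[$missing] : '';

var_dump($index, $missing, $alt, $title);
```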


To ensure that you only get the first image, simply do not use a loop for the second regex:

<?php

$row = '<p>Praesent non lorem nisi. <img class="imgWhiteBorder" src="http://www.google.com/images/logos/ps_logo2.png" alt="Logo 2" title="Google\'s Logo" width="100" align="left" />In hac habitasse platea dictumst. Praesent viverra rutrum feugiat. Nulla velit lorem, sodales et porta a, aliquam et leo. Morbi sed odio at neque faucibus egestas.<img class="imgWhiteBorder" src="http://www.google.com/images/logos/ps_logo2.png" alt="Logo 2" title="Google\'s Logo" width="100" align="left" /><p>';

preg_match_all('/<img[^>]+>/i',$row, $imgResult);       // Find all images
#echo "<pre>\n"; print_r( $imgResult ); echo "</pre>\n";

$imgAttributes = array( array(), array(), array() );    // Default when no image is found

if ( !empty( $imgResult[0] ) ) {
    // $imgResult[0][0] is the first image tag only
    preg_match_all( '#\s+([a-z]+)\s*?=\s*?[\'|"]([^"|\']*?)[\'|"]#i', $imgResult[0][0], $imgAttributes );
    echo "<pre>"; print_r( $imgAttributes ); echo "</pre>"; // testing
}


// Pull out the alt attribute and its value if present
//

$index = array_search( "alt",  $imgAttributes[1] );
$value = '';

// array_search returns false when nothing is found; 0 is a valid index
if ( $index !== false ) {
     $value =  $imgAttributes[2][$index];
     echo "alt attribute found and its value is '$value'<br/>";
}


?>
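
As an alternative to regexes, PHP's DOM extension can parse the markup and hand back the attributes regardless of their order or quoting. This is a sketch rather than a drop-in replacement: the helper name firstImageInfo and the sample string are invented, it assumes PHP 7.1+ with the dom extension enabled, and loadHTML makes its own assumptions about character encoding that may need attention for non-ASCII text.

```php
<?php
// Hypothetical helper: return src and title of the first <img>, or null if there is none.
function firstImageInfo(string $html): ?array
{
    $doc = new DOMDocument();
    // Article fragments are rarely valid documents; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $img = $doc->getElementsByTagName('img')->item(0);
    if ($img === null) {
        return null;
    }
    // getAttribute() returns an empty string when the attribute is absent.
    return array(
        'src'   => $img->getAttribute('src'),
        'title' => $img->getAttribute('title'),
    );
}

// Made-up sample: title comes before src, and a second image follows.
$row = '<p>Text <img title="Google\'s Logo" src="http://www.google.com/images/logos/ps_logo2.png" alt="Logo 2" /> more <img src="second.png" /></p>';
print_r(firstImageInfo($row));
```

The trade-off is a heavier dependency in exchange for not having to maintain the attribute regex.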

jej07Author Commented:
Fantastic!! Thank you so much for the help and for introducing me to array_search.

Question has a verified solution.
