Solved

Replace image tags with alt descriptions

Posted on 2004-04-27
Last Modified: 2006-11-17
I'm trying to create a dynamically generated text-only version of my site, and what I want to do is replace all the image tags (or designated images) with their alt description, and then spit out the rest of the HTML code.

I found a script that strips the images using eregi_replace, but I can't figure out how to make it do what I want.

Any ideas, or suggestions on a better way to do this?

Thanks in advance
carlene
0
Comment
Question by:carlenevs
14 Comments
 
LVL 10

Expert Comment

by:eeBlueShadow
ID: 10931709
Hi, you could try

//----START CODE----
$pattern = "/<img.*alt=[\"'](\w*)[\"'][^>]*>/i";
$replace = "[IMG: $1]";
$newhtml = preg_replace($pattern,$replace,$oldhtml);
//----END CODE----

you can collapse this into one line if you like:

$newhtml = preg_replace("/<img.*alt=[\"'](\w*)[\"'][^>]*>/i","[IMG: $1]",$oldhtml);

but the first block is easier for you to read

_Blue
0
 
LVL 10

Expert Comment

by:eeBlueShadow
ID: 10931737
No, just realised that for some reason this only replaces the first image :-/

I'll keep trying
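
One likely cause, offered as a sketch rather than a diagnosis of the asker's page: `.*` is greedy and can run past the end of a tag, so when several images share a line a single match can swallow everything from the first `<img` to the last `alt=`. The sample HTML and variable names below are made up for illustration; the bounded variant swaps `.*` for `[^>]*`, which is not part of the original answer.

```php
<?php
// Hypothetical sample input: two images on one line.
$oldhtml = '<p><img src="a.gif" alt="one"> and <img src="b.gif" alt="two"></p>';

// Original pattern: the greedy .* runs from the first <img to the LAST alt=.
$greedy = "/<img.*alt=[\"'](\w*)[\"'][^>]*>/i";
$greedyResult = preg_replace($greedy, "[IMG: $1]", $oldhtml);
echo $greedyResult, "\n";
// prints: <p>[IMG: two]</p>

// Bounded variant: [^>]* cannot cross the end of a tag.
$bounded = "/<img[^>]*alt=[\"'](\w*)[\"'][^>]*>/i";
$boundedResult = preg_replace($bounded, "[IMG: $1]", $oldhtml);
echo $boundedResult, "\n";
// prints: <p>[IMG: one] and [IMG: two]</p>
```

With the bounded pattern each tag is matched on its own, so every image with a single-word alt is replaced.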

_Blue
0
 
LVL 6

Expert Comment

by:aolXFT
ID: 10931769
The BBC have a script that does this. It is, however, a Perl script, and you use it as a proxy to the website you want to visit.

http://www.bbc.co.uk/education/betsie/download.html
0

 
LVL 10

Accepted Solution

by:
eeBlueShadow earned 500 total points
ID: 10931818
OK, my original code was failing when the alt attribute had a space in it.

This should do the job:

//----START CODE----
$pattern = "/<img.*alt=[\"']([\w ]*)[\"'][^>]*>/i";
$replace = "[IMG: $1]";
$newhtml = preg_replace($pattern,$replace,$oldhtml);
//----END CODE----

or

$newhtml = preg_replace("/<img.*alt=[\"']([\w ]*)[\"'][^>]*>/i","[IMG: $1]",$oldhtml);

_Blue
0
 

Author Comment

by:carlenevs
ID: 10931876
eeBlueShadow -

That works, sorta.

When I use the script as written, it only finds & replaces the first 4 of the 7 images...

I tried changing the 'alt' to 'name' (as the two images I want to change are the only ones in this section with a name) and it finds the two I want (which it didn't change previously)....

How can I modify what's above to work if there's an alt AND a name attribute, in any order?

Sorry, I know naught about Perl, and am just getting my feet wet with PHP.

Thanks,
carlenevs
0
 

Author Comment

by:carlenevs
ID: 10931926
Never mind, I found out why it wasn't showing the images I wanted... but it does fail if there's a . (period) in the alt attribute.

Still, how would I set this up to only replace if there's a name attribute as well?

carlene
0
 
LVL 10

Expert Comment

by:eeBlueShadow
ID: 10932518
I assume that you would only want to use the name attribute if the alt attribute isn't there...

In that case, since the first replace leaves behind only the img tags that don't have an 'alt' attribute, just run a second replace to pick up the name attributes:

$replace1 = preg_replace("/<img.*alt=[\"']([\w._ ]*)[\"'][^>]*>/i","[IMG: $1]",$oldhtml);
$replace2 = preg_replace("/<img.*name=[\"']([\w._ ]*)[\"'][^>]*>/i","[IMG: $1]",$replace1);

If you want to see exactly how this pattern works, copy the following into Notepad or another monospaced editor (it might look OK below, it might not, but it'll be perfectly understandable in the text editor):

##START

/<img.*alt=[\"']([\w_. ]*)[\"'][^>]*>/i
/____________________________________/_  - mark the start and end of the pattern
_<img__________________________________  - find this text
_____.*________________________________  - the dot means 'any character' and the * means 'any number of them in a row' (as many as possible)
_______alt=____________________________  - followed by this text
___________[___]_______________________  - followed by one of these characters
____________\"'________________________  - these characters = " or ' - the backslash is to tell PHP this isn't the " closing the string
________________(________)_____________  - brackets mean 'mark whatever is inside as a backreference' - more on that later
_________________[_____]*______________  - any one of these characters, as many as you can find in a row
__________________\w_. ________________  - \w = any letter, digit or underscore; then a literal underscore, full stop or space (the only characters this pattern accepts in the attribute value)
__________________________[\"']________  - followed by another quote (single or double)
_______________________________[^>]*___  - the ^ means "a character that isn't in the following set" if the ^ is the first character inside the [], so as many non '>' as possible
____________________________________>__  - followed by the > that closes the tag
______________________________________i  - 'i' after the closing pattern delimiter means 'ignore case', so upper and lower case both match

In the replacement string, the $1 is a special identifier that is replaced with the contents of the first set of brackets in the pattern. A $2 would be replaced by the contents of the second set of brackets, if there were another set.

##END
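
To see the backreference in action, a small sketch with `preg_match` can print what the pattern captures. The sample tag and variable names are invented for illustration; the pattern is the one from the walkthrough, written in a single-quoted PHP string so only the single quote needs escaping.

```php
<?php
// Hypothetical one-tag input for inspecting what the pattern captures.
$html = '<img src="map.gif" alt="Site map_v2. final" width="10">';

// Same pattern as the walkthrough above.
$pattern = '/<img.*alt=["\']([\w_. ]*)["\'][^>]*>/i';

if (preg_match($pattern, $html, $matches)) {
    echo $matches[0], "\n"; // the whole tag that matched
    echo $matches[1], "\n"; // contents of the first ( ) group, i.e. what $1 refers to
}
// prints the full tag, then: Site map_v2. final
```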
0
 
LVL 10

Expert Comment

by:eeBlueShadow
ID: 10932532
By the way, the above two lines also fix the problem with periods in the alt text not working.
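
The class `[\w._ ]` still rejects other punctuation such as apostrophes, commas or parentheses. As a hedged alternative (not part of the original answer), the sketch below captures everything up to the quote that opened the value. The sample tag is hypothetical, and because the quote now occupies group 1, the alt text is in `$2`.

```php
<?php
// Hypothetical input: alt text with punctuation the [\w._ ] class would reject.
$oldhtml = '<img src="bob.jpg" alt="Bob\'s photo (Apr. 2004)">';

// (["\']) remembers which quote opened the value; (.*?) lazily takes
// everything up to the same quote again (\1).
$pattern = '/<img\b[^>]*\balt=(["\'])(.*?)\1[^>]*>/i';
$newhtml = preg_replace($pattern, '[IMG: $2]', $oldhtml);

echo $newhtml, "\n";
// prints: [IMG: Bob's photo (Apr. 2004)]
```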

_Blue
0
 
LVL 10

Expert Comment

by:eeBlueShadow
ID: 10932558
Ah, I didn't read your comment properly (need to stop doing that ;))

If you only wanted to test for tags with both a name and an alt, it becomes more tricky, because you can't guarantee the order the attributes will appear in the tag. It's most likely possible, but I can't think of an easy way to do it at the moment.

_Blue
0
 

Author Comment

by:carlenevs
ID: 10932564
Hmm.

Thanks for the tutorial! I'll have to study it, and maybe someday I'll figure it out....

What I meant with the alt and the name attributes is this:

If the img has a name attribute, replace the tag with the alt text; if no name attribute is present, leave the image.

so,

<img src="blah" name="pic" alt="REPLACED"> would change to REPLACED

but

<img src="blah2" alt="SAME"> would stay as an image tag

Does that make more sense?
0
 

Author Comment

by:carlenevs
ID: 10932578
Yeah, we posted at the same time.

Thanks for your help!

Carlene
0
 
LVL 10

Expert Comment

by:eeBlueShadow
ID: 10932669
OK, that looks possible, and again could be solved in 2 replaces, simply by considering both possibilities: name then alt, or alt then name. This is a shoddy method though. You couldn't easily extend it to depend on 3 attributes, because one replace is needed per ordering: 3 attributes would need 6 replaces, and 4 attributes would need 24! It should be OK for what you need it for though:

$replace1 = preg_replace("/<img[^>]+name=[\"'][\w._ ]*[\"'][^>]+alt=[\"']([\w._ ]*)[\"'][^>]*>/i","[IMG: $1]",$oldhtml);
$newhtml = preg_replace("/<img[^>]+alt=[\"']([\w._ ]*)[\"'][^>]+name=[\"'][\w._ ]*[\"'][^>]*>/i","[IMG: $1]",$replace1);

should work. If you want only a particular name to trigger the replacement, replace both occurrences of
name=[\"'][\w._ ]*[\"']
with
name=[\"']triggerName[\"']

The patterns have been modified a bit from the original. Try to work out why if you like: figuring out some regular expressions for yourself is the best way to learn about them.
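
If the number of required attributes ever grows, one way to avoid a replace per ordering is to match each tag once and test the attributes separately in a callback. This is only a sketch under assumptions: `replaceNamedImages` is a made-up helper name, the type declarations need PHP 7 or later, and the simple attribute checks could be fooled by text like `name=` inside another attribute's value.

```php
<?php
// Hypothetical helper: replace an <img> tag with its alt text only when
// the tag also carries a name attribute; attribute order does not matter.
function replaceNamedImages(string $html): string
{
    return preg_replace_callback(
        '/<img\b[^>]*>/i',
        function (array $m): string {
            $tag = $m[0];
            $hasName = preg_match('/\sname=(["\']).*?\1/i', $tag) === 1;
            $hasAlt  = preg_match('/\salt=(["\'])(.*?)\1/i', $tag, $alt) === 1;
            // Leave the tag untouched unless both attributes are present.
            return ($hasName && $hasAlt) ? '[IMG: ' . $alt[2] . ']' : $tag;
        },
        $html
    ) ?? $html; // fall back to the input if the regex engine reports an error
}

$oldhtml = '<img src="blah" name="pic" alt="REPLACED"> <img alt="ALSO" name="x" src="y"> <img src="blah2" alt="SAME">';
echo replaceNamedImages($oldhtml), "\n";
// prints: [IMG: REPLACED] [IMG: ALSO] <img src="blah2" alt="SAME">
```

Adding a third required attribute then means one more check inside the callback rather than six patterns.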

_Blue
0
 
LVL 10

Expert Comment

by:eeBlueShadow
ID: 10932681
I'll even give you a hint: a + sign means 'the previous item (here the character class [^>]) one or more times'
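
A quick way to check the difference between `*` and `+`, sketched with throwaway patterns and inputs that are not from the thread:

```php
<?php
// [^>]* allows zero characters after "<img"; [^>]+ demands at least one.
$bareStar = preg_match('/<img[^>]*>/i', '<img>');          // 1: zero characters is enough for *
$barePlus = preg_match('/<img[^>]+>/i', '<img>');          // 0: + needs at least one character
$attrPlus = preg_match('/<img[^>]+>/i', '<img src="a">');  // 1: the attribute text satisfies +
var_dump($bareStar, $barePlus, $attrPlus);
```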

_Blue
0
 

Author Comment

by:carlenevs
ID: 10932717
YES!

That works perfectly! thank you!!

I thought I'd tried that before with one of the earlier iterations, and couldn't get it to work... oh well, it's working now!
0
