Solved

Replace image tags with alt descriptions

Posted on 2004-04-27
Last Modified: 2006-11-17
I'm trying to create a dynamically generated text-only version of my site, and what I want to do is replace all the image tags (or designated images) with their alt description, and then spit out the rest of the HTML code.

I found a script that strips the images using eregi_replace, but I can't figure out how to make it do what I want.

Any ideas, or suggestions on a better way to do this?

Thanks in advance
carlene
Question by:carlenevs
 
LVL 10

Expert Comment

by:eeBlueShadow
ID: 10931709
Hi, you could try

//----START CODE----
$pattern = "/<img.*alt=[\"'](\w*)[\"'][^>]*>/i";
$replace = "[IMG: $1]";
$newhtml = preg_replace($pattern,$replace,$oldhtml);
//----END CODE----

you can collapse this into one line if you like:

$newhtml = preg_replace("/<img.*alt=[\"'](\w*)[\"'][^>]*>/i","[IMG: $1]",$oldhtml);

but the first block is easier for you to read

_Blue
0
 
LVL 10

Expert Comment

by:eeBlueShadow
ID: 10931737
No, just realised that for some reason this only replaces the first image :-/

I'll keep trying
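
A side note on the .* in the pattern above, offered as a sketch rather than a diagnosis of what was going wrong here: .* is greedy and can run past the end of one tag, so when two images share a line it may swallow both into a single match. The sample HTML and the variable names below are made up for illustration; [^>]* is one common way to keep the match inside a single tag.

```php
<?php
// Hypothetical sample input: two images on the same line.
$oldhtml = '<p><img src="a.gif" alt="Logo"> and <img src="b.gif" alt="Map"></p>';

// Greedy ".*" runs to the LAST alt= on the line, so both tags become one match.
$greedy = preg_replace('/<img.*alt=["\'](\w*)["\'][^>]*>/i', '[IMG: $1]', $oldhtml);

// "[^>]*" cannot cross a ">", so each tag is matched on its own.
$perTag = preg_replace('/<img[^>]*alt=["\'](\w*)["\'][^>]*>/i', '[IMG: $1]', $oldhtml);

echo $greedy, "\n"; // <p>[IMG: Map]</p>
echo $perTag, "\n"; // <p>[IMG: Logo] and [IMG: Map]</p>
```

With the greedy version the first image and the text between the tags disappear, which can look as if only some images were handled.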

_Blue
0
 
LVL 6

Expert Comment

by:aolXFT
ID: 10931769
The BBC have a script that does this. It is a Perl script, however, and you use it as a proxy to the website you want to visit.

http://www.bbc.co.uk/education/betsie/download.html
0
 
LVL 10

Accepted Solution

by:
eeBlueShadow earned 500 total points
ID: 10931818
OK, my original code was failing when the alt attribute had a space in it.

This should do the job:

//----START CODE----
$pattern = "/<img.*alt=[\"']([\w ]*)[\"'][^>]*>/i";
$replace = "[IMG: $1]";
$newhtml = preg_replace($pattern,$replace,$oldhtml);
//----END CODE----

or

$newhtml = preg_replace("/<img.*alt=[\"']([\w ]*)[\"'][^>]*>/i","[IMG: $1]",$oldhtml);
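
The class [\w ]* accepts only letters, digits, underscores and spaces, so alt text with other punctuation will still be skipped. As a possible alternative (a sketch, not the pattern from this answer), the value can be captured as "everything up to the matching closing quote". The function name imgToAlt and the sample tags are hypothetical.

```php
<?php
// Sketch: replace each <img> that has an alt attribute with "[IMG: alt text]".
function imgToAlt(string $html): string
{
    // (["\'])  opening quote, remembered as group 1
    // (.*?)    the alt text, as short as possible, remembered as group 2
    // \1       the same kind of quote that opened the value
    $pattern = '/<img\b[^>]*\salt=(["\'])(.*?)\1[^>]*>/i';
    return preg_replace($pattern, '[IMG: $2]', $html);
}

// Made-up sample tags covering a space, punctuation, single quotes and an apostrophe.
$samples = [
    '<img src="a.gif" alt="Site logo">',
    '<img src="b.gif" alt="Fig. 2, north view">',
    "<img src='c.gif' alt='E-mail us'>",
    '<img src="d.gif" alt="Bob\'s photo">',
    '<img src="e.gif">', // no alt attribute: left untouched
];

foreach ($samples as $tag) {
    echo imgToAlt($tag), "\n";
}
```

Because the alt text is now group 2, the replacement uses $2 instead of $1. Tags without an alt attribute are left as they are.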

_Blue
0
 

Author Comment

by:carlenevs
ID: 10931876
eeBlueShadow -

That works, sorta.

When I use the script as written, it only finds & replaces the first 4 of the 7 images...

I tried changing the 'alt' to 'name' (as the two images i want to change are the only ones in this section with name) and it finds the two i want (which it didn't change previously)....

how can i modify what's above to work if there's an alt AND a name tag, in any order?

sorry, i know naught about Perl, and am just getting my feet wet with php.

thanks,
carlenevs
0
 

Author Comment

by:carlenevs
ID: 10931926
Never mind, i found out why it wasn't showing the images i wanted... but it does fail if there's a . (period) in the alt tag.

Still, how would I set this up to only replace if there's a name attribute as well?

carlene
0
 
LVL 10

Expert Comment

by:eeBlueShadow
ID: 10932518
I assume that you would only want to use the name attribute if the alt attribute isn't there...

In that case, since the replace as it is only leaves img tags that don't have an 'alt' attribute, just do another replace to pick up the name attributes:

$replace1 = preg_replace("/<img.*alt=[\"']([\w._ ]*)[\"'][^>]*>/i","[IMG: $1]",$oldhtml);
$replace2 = preg_replace("/<img.*name=[\"']([\w._ ]*)[\"'][^>]*>/i","[IMG: $1]",$replace1);

If you want to see exactly how this pattern works, copy the following into Notepad or another monospaced editor (it might look OK below, it might not, but it'll be perfectly understandable in the text editor):

##START

/<img.*alt=[\"']([\w_. ]*)[\"'][^>]*>/i
/____________________________________/_  - mark the start and end of the pattern
_<img__________________________________  - find this text
_____.*________________________________  - the dot means 'any character' and the * means 'any number of the preceding item in a row, including none'
_______alt=____________________________  - followed by this text
___________[___]_______________________  - followed by one of these characters
____________\"'________________________  - these characters = " or ' - the backslash is to tell PHP this isn't the " closing the string
________________(________)_____________  - brackets mean 'mark whatever is inside as a backreference' - more on that later
_________________[_____]*______________  - any one of these characters, as many as you can find in a row
__________________\w_. ________________  - \w = any letter, digit or underscore; then underscore, full stop or space (the only characters this pattern accepts in the attribute value)
__________________________[\"']________  - followed by another quote (single or double)
_______________________________[^>]*___  - the ^ means "a character that isn't in the following set" if the ^ is the first character inside the [], so as many non '>' as possible
____________________________________>__  - followed by the > that closes the tag
______________________________________i  - 'i' after the closing delimiter means 'match letters in either upper or lower case'

In the replacement string, the $1 is a special identifier that is replaced with the contents of the first set of brackets in the pattern. A $2 would be replaced by the contents of the second set of brackets, if there were another set.

##END
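
To see the bracketed groups in action, here is a small sketch using preg_match alongside preg_replace. The sample tag and the second, two-group pattern are assumptions made up for illustration; only the first pattern comes from the walkthrough above.

```php
<?php
// Hypothetical sample tag.
$tag = '<img src="logo.gif" alt="Company logo" width="80">';

// The pattern from the walkthrough: one set of brackets, so one group ($1).
$pattern = "/<img.*alt=[\"']([\w_. ]*)[\"'][^>]*>/i";
preg_match($pattern, $tag, $m);
// $m[0] is the whole match, $m[1] is what the first set of brackets caught.
$one = preg_replace($pattern, '[IMG: $1]', $tag);

// An illustrative pattern with two sets of brackets: $1 = src value, $2 = alt value.
$twoGroups = '/<img[^>]*src=["\']([^"\']*)["\'][^>]*alt=["\']([^"\']*)["\'][^>]*>/i';
$two = preg_replace($twoGroups, '[IMG: $2 ($1)]', $tag);

echo $m[1], "\n"; // Company logo
echo $one, "\n";  // [IMG: Company logo]
echo $two, "\n";  // [IMG: Company logo (logo.gif)]
```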
0
 
LVL 10

Expert Comment

by:eeBlueShadow
ID: 10932532
By the way, the above two lines also fix the problem with periods in the attribute values not working.

_Blue
0
 
LVL 10

Expert Comment

by:eeBlueShadow
ID: 10932558
Ah, I didn't read your comment properly (need to stop doing that ;))

If you only wanted to test for tags with both a name and an alt, it becomes more tricky, because you can't guarantee the order the attributes will appear in the tag. It's most likely possible, but I can't think of an easy way to do it at the moment.

_Blue
0
 

Author Comment

by:carlenevs
ID: 10932564
Hmm.

Thanks for the tutorial! I'll have to study it, and maybe someday I'll figure it out....

what i was meaning with the alt and the name attributes is thus:

if the img has a name attribute, replace the tag with the alt, if no name attribute is present, leave the image.

so,

<img src="blah" name="pic" alt="REPLACED"> would change to REPLACED

but

<img src="blah2" alt="SAME"> would stay as an image tag

Does that make more sense?
0
 

Author Comment

by:carlenevs
ID: 10932578
Yeah, we posted at the same time.

Thanks for your help!

Carlene
0
 
LVL 10

Expert Comment

by:eeBlueShadow
ID: 10932669
OK, that looks possible, and again it can be solved in 2 replaces by considering both possible orders: name then alt, or alt then name. This is a shoddy method though. You couldn't easily extend it to depend on 3 attributes, because that would need 6 replaces, and 4 attributes would need 24! It should be OK for what you need it for though:

$replace1 = preg_replace("/<img[^>]+name=[\"'][\w._ ]*[\"'][^>]+alt=[\"']([\w._ ]*)[\"'][^>]*>/i","[IMG: $1]",$oldhtml);
$newhtml = preg_replace("/<img[^>]+alt=[\"']([\w._ ]*)[\"'][^>]+name=[\"'][\w._ ]*[\"'][^>]*>/i","[IMG: $1]",$replace1);

should work. If you want only a particular name to trigger the replacement, switch both
name=[\"'][\w._ ]*[\"']
with
name=[\"']triggerName[\"']

The patterns have been modified a bit from the original. Try to work out why if you like: figuring out some regular expressions for yourself is the best way to learn about them.
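
For anyone who needs this to scale past two attributes, one possible alternative is a lookahead, which checks that a name attribute exists somewhere in the tag without caring where. This is a sketch of a different technique, not the two-replace method above. The function name replaceNamedImages and its $triggerName parameter are hypothetical; preg_quote is used so a literal name can't be misread as regex syntax.

```php
<?php
// Sketch: replace <img> tags that carry a name attribute (optionally one
// specific name) with their alt text, and leave every other image alone.
function replaceNamedImages(string $html, ?string $triggerName = null): string
{
    // Either "any name value" or one literal name, escaped for use in a regex.
    $name = $triggerName === null ? '[^"\']*' : preg_quote($triggerName, '/');

    $pattern = '/<img'
        . '(?=[^>]*\sname=["\']' . $name . '["\'])' // lookahead: name= appears somewhere in this tag
        . '[^>]*\salt=["\']([^"\']*)["\']'          // capture the alt text as group 1
        . '[^>]*>/i';

    return preg_replace($pattern, '[IMG: $1]', $html);
}

// The two cases from the question, plus the reversed attribute order.
echo replaceNamedImages('<img src="blah" name="pic" alt="REPLACED">'), "\n"; // [IMG: REPLACED]
echo replaceNamedImages('<img src="blah2" alt="SAME">'), "\n";               // unchanged
echo replaceNamedImages('<img alt="REPLACED" name="pic" src="blah">'), "\n"; // [IMG: REPLACED]
```

Each extra required attribute would add one more lookahead instead of multiplying the number of replaces.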

_Blue
0
 
LVL 10

Expert Comment

by:eeBlueShadow
ID: 10932681
I'll even give you a hint: a + sign means 'the previous item (a character, a [] set or a bracketed group) one or more times'
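
A quick way to check the difference between * and + for yourself. This is only an illustrative sketch; the test strings are made up.

```php
<?php
// * accepts zero repetitions of the preceding item; + needs at least one.
$star = preg_match('/ab*c/', 'ac');    // "b" may be absent, so this matches
$plus = preg_match('/ab+c/', 'ac');    // "b" must appear at least once, so no match
$many = preg_match('/ab+c/', 'abbbc'); // several "b"s are fine

// The item before + can be a whole set, as in the [^>]+ used in the patterns above.
preg_match('/<img[^>]+>/', '<img src="x.gif"> more > text', $m);

var_dump($star, $plus, $many); // int(1) int(0) int(1)
echo $m[0], "\n";              // <img src="x.gif">
```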

_Blue
0
 

Author Comment

by:carlenevs
ID: 10932717
YES!

That works perfectly! thank you!!

i thought i'd tried that before with one of the earlier iterations, and couldn't get it to work... oh well, it's working now!
0
