Solved

Replace image tags with alt descriptions

Posted on 2004-04-27
Last Modified: 2006-11-17
I'm trying to create a dynamically generated text-only version of my site, and what I want to do is replace all the image tags (or designated images) with their alt description, and then spit out the rest of the HTML code.

I have a script I found that strips the images using eregi_replace, but I can't figure out how to do what I want.

Any ideas, or suggestions on a better way to do this?

Thanks in advance
carlene
Question by:carlenevs
14 Comments

Expert Comment

by:eeBlueShadow
Hi, you could try

//----START CODE----
$pattern = "/<img.*alt=[\"'](\w*)[\"'][^>]*>/i";
$replace = "[IMG: $1]";
$newhtml = preg_replace($pattern,$replace,$oldhtml);
//----END CODE----

you can collapse this into one line if you like:

$newhtml = preg_replace("/<img.*alt=[\"'](\w*)[\"'][^>]*>/i","[IMG: $1]",$oldhtml);

but the first block is easier for you to read

_Blue

Expert Comment

by:eeBlueShadow
No, just realised that for some reason this only replaces the first image :-/

I'll keep trying

_Blue
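
One plausible explanation for the behaviour described above, assuming several img tags sit on the same line, is that `.*` is greedy and runs past the end of the first tag. The sketch below uses a made-up two-image sample (not carlene's page) and compares the posted pattern with a variant that uses `[^>]*` so the match cannot leave the current tag.

```php
<?php
// Two images on one line (made-up sample for illustration).
$oldhtml = '<p><img src="a.png" alt="One"> and <img src="b.png" alt="Two"></p>';

// Pattern from the post above: ".*" is greedy and may cross tag boundaries.
$greedy = preg_replace('/<img.*alt=["\'](\w*)["\'][^>]*>/i', '[IMG: $1]', $oldhtml);

// Variant: "[^>]*" cannot run past the ">" that ends the current tag.
$perTag = preg_replace('/<img[^>]*alt=["\'](\w*)["\'][^>]*>/i', '[IMG: $1]', $oldhtml);

echo $greedy, "\n";  // <p>[IMG: Two]</p>
echo $perTag, "\n";  // <p>[IMG: One] and [IMG: Two]</p>
```

With the greedy version both tags are swallowed by a single match, so only one replacement appears in the output.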

Expert Comment

by:aolXFT
The BBC have a script that does this. It is, however, a Perl script, and you use it as a proxy to the website you want to visit.

http://www.bbc.co.uk/education/betsie/download.html

Accepted Solution

by:eeBlueShadow (earned 500 total points)
OK, my original code was failing when the alt attribute had a space in it.

This should do the job:

//----START CODE----
$pattern = "/<img.*alt=[\"']([\w ]*)[\"'][^>]*>/i";
$replace = "[IMG: $1]";
$newhtml = preg_replace($pattern,$replace,$oldhtml);
//----END CODE----

or

$newhtml = preg_replace("/<img.*alt=[\"']([\w ]*)[\"'][^>]*>/i","[IMG: $1]",$oldhtml);

_Blue

Author Comment

by:carlenevs
eeBlueShadow -

That works, sorta.

When I use the script as written, it only finds & replaces the first 4 of the 7 images...

I tried changing the 'alt' to 'name' (as the two images I want to change are the only ones in this section with a name) and it finds the two I want (which it didn't change previously)...

How can I modify what's above to work if there's an alt AND a name attribute, in any order?

Sorry, I know naught about Perl, and am just getting my feet wet with PHP.

thanks,
carlenevs

Author Comment

by:carlenevs
Never mind, I found out why it wasn't showing the images I wanted... but it does fail if there's a . (period) in the alt attribute.

Still, how would I set this up to only replace if there's a name attribute as well?

carlene
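
The period problem comes from listing the allowed characters one by one. A minimal sketch of an alternative, which is not the pattern used in this thread: capture the opening quote and take everything up to the matching closing quote. The sample tag is hypothetical.

```php
<?php
// Hypothetical sample with a period, a hyphen and an apostrophe in the alt text.
$oldhtml = '<img src="x.gif" alt="Fig. 1 - it\'s here">';

// (["\']) remembers which quote opened the value; \1 requires the same quote to close it.
// (.*?) lazily takes everything in between, so no list of allowed characters is needed.
$pattern = '/<img\b[^>]*\balt=(["\'])(.*?)\1[^>]*>/i';

// The alt text now sits in the second group, hence $2 instead of $1.
$newhtml = preg_replace($pattern, '[IMG: $2]', $oldhtml);

echo $newhtml, "\n";  // [IMG: Fig. 1 - it's here]
```

This assumes the alt value is always quoted; unquoted attribute values would still be skipped.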

Expert Comment

by:eeBlueShadow
I assume that you would only want to use the name attribute if the alt attribute isn't there...

In that case, since the replace as it stands leaves behind only the img tags that don't have an 'alt' attribute, just do another replace to pick up the name attributes:

$replace1 = preg_replace("/<img.*alt=[\"']([\w._ ]*)[\"'][^>]*>/i","[IMG: $1]",$oldhtml);
$replace2 = preg_replace("/<img.*name=[\"']([\w._ ]*)[\"'][^>]*>/i","[IMG: $1]",$replace1);

If you want to see exactly how this pattern works, copy the following into Notepad or another editor with a monospaced font (it might look OK below, or it might not, but it'll be perfectly understandable in the text editor):

##START

/<img.*alt=[\"']([\w_. ]*)[\"'][^>]*>/i
/____________________________________/_  - mark the start and end of the pattern
_<img__________________________________  - find this text
_____.*________________________________  - the dot means 'any character except a newline' and the * means 'any number of them in a row'
_______alt=____________________________  - followed by this text
___________[___]_______________________  - followed by one of these characters
____________\"'________________________  - these characters = " or ' - the backslash is to tell PHP this isn't the " closing the string
________________(________)_____________  - brackets mean 'mark whatever is inside as a backreference' - more on that later
_________________[_____]*______________  - any one of these characters, as many as you can find in a row
__________________\w_. ________________  - \w = any letter, digit or underscore; the set also allows a full stop or a space (the only characters this pattern accepts in the attribute value, not the only ones HTML allows)
__________________________[\"']________  - followed by another quote (single or double)
_______________________________[^>]*___  - the ^ means "a character that isn't in the following set" if the ^ is the first character inside the [], so as many non '>' as possible
____________________________________>__  - followed by the > that closes the tag
______________________________________i  - 'i' after the closing pattern delimiter means 'ignore case, so match upper and lower case alike'

In the replacement string, the $1 is a special identifier that is replaced with the contents of the first set of brackets in the pattern. A $2 would be replaced by the second set of brackets, if there were another set.

##END
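
To make the $1 / $2 idea concrete, here is a small sketch with two sets of brackets. The sample tag is hypothetical, and the pattern assumes src appears before alt, which real pages do not guarantee.

```php
<?php
// Hypothetical tag; this sketch assumes src comes before alt.
$oldhtml = '<img src="logo.png" alt="Company logo">';

// The first (...) captures the src value, the second (...) captures the alt value.
$pattern = '/<img[^>]*src=["\']([^"\']*)["\'][^>]*alt=["\']([^"\']*)["\'][^>]*>/i';

// $2 is the alt text, $1 is the file name.
$newhtml = preg_replace($pattern, '[IMG: $2 ($1)]', $oldhtml);

echo $newhtml, "\n";  // [IMG: Company logo (logo.png)]
```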

Expert Comment

by:eeBlueShadow
By the way, the above two lines also fix the problem with periods in the attribute values not working.

_Blue

Expert Comment

by:eeBlueShadow
Ah, I didn't read your comment properly (need to stop doing that ;))

If you only wanted to test for tags with both a name and an alt, it becomes more tricky, because you can't guarantee the order the attributes will appear in the tag. It's most likely possible, but I can't think of an easy way to do it at the moment.

_Blue

Author Comment

by:carlenevs
Hmm.

Thanks for the tutorial! I'll have to study it, and maybe someday I'll figure it out....

What I meant with the alt and the name attributes is this:

If the img has a name attribute, replace the tag with the alt; if no name attribute is present, leave the image.

so,

<img src="blah" name="pic" alt="REPLACED"> would change to REPLACED

but

<img src="blah2" alt="SAME"> would stay as an image tag

Does that make more sense?

Author Comment

by:carlenevs
Yeah, we posted at the same time.

Thanks for your help!

Carlene

Expert Comment

by:eeBlueShadow
OK, that looks possible, and again it can be solved in two replaces, simply by considering both orders: name then alt, or alt then name. This is a shoddy method, though: you couldn't easily extend it to depend on 3 attributes, because that would need 6 replaces, and 4 attributes would need 24! It should be OK for what you need it for, though:

$replace1 = preg_replace("/<img[^>]+name=[\"'][\w._ ]*[\"'][^>]+alt=[\"']([\w._ ]*)[\"'][^>]*>/i","[IMG: $1]",$oldhtml);
$newhtml = preg_replace("/<img[^>]+alt=[\"']([\w._ ]*)[\"'][^>]+name=[\"'][\w._ ]*[\"'][^>]*>/i","[IMG: $1]",$replace1);

should work. If you want only a particular name to trigger the replacement, switch both
name=[\"'][\w._ ]*[\"']
with
name=[\"']triggerName[\"']

The patterns have been modified a bit from the original. If you want to, try to work out why; figuring some regular expressions out for yourself is the best way to learn about them.

_Blue
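
As an alternative to one replace per attribute order, a lookahead can check for the name attribute without caring where it sits in the tag. This is a sketch of a different technique from the two-replace method above, tried here only on the sample tags from this thread; it assumes quoted alt values and does not guard against the text "name=" appearing inside another attribute's value.

```php
<?php
// Replace an <img> with its alt text only when the tag also has a name attribute.
// (?=[^>]*\bname=) looks ahead inside the tag without consuming anything,
// so name may sit before or after alt.
$pattern = '/<img\b(?=[^>]*\bname=)[^>]*\balt=(["\'])(.*?)\1[^>]*>/i';

$samples = [
    '<img src="blah" name="pic" alt="REPLACED">',   // name before alt
    '<img alt="REPLACED" name="pic" src="blah">',   // alt before name
    '<img src="blah2" alt="SAME">',                 // no name: left alone
];

$results = [];
foreach ($samples as $tag) {
    $results[] = preg_replace($pattern, '[IMG: $2]', $tag);
}

print_r($results);
```

Each further required attribute would be one more lookahead rather than another batch of permutations.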

Expert Comment

by:eeBlueShadow
I'll even give you a hint: a + sign means 'the previous item (a character or a [] set) one or more times'

_Blue
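
A tiny sketch of the difference between * and +, using made-up strings: with `+` there must be at least one character between `<img` and `alt=`.

```php
<?php
// "*" accepts zero or more of the preceding item, "+" needs at least one.
$starNoGap = preg_match('/<img[^>]*alt=/i', '<imgalt="x">');   // 1: zero characters is fine for *
$plusNoGap = preg_match('/<img[^>]+alt=/i', '<imgalt="x">');   // 0: + needs something before alt=
$plusGap   = preg_match('/<img[^>]+alt=/i', '<img alt="x">');  // 1: the space satisfies +

var_dump($starNoGap, $plusNoGap, $plusGap);
```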

Author Comment

by:carlenevs
YES!

That works perfectly! thank you!!

i thought i'd tried that before with one of the earlier iterations, and couldn't get it to work... oh well, it's working now!
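
On the original question's "better way": one option not discussed in this thread is to parse the page with PHP's DOM extension instead of using regular expressions, which sidesteps attribute order and odd characters in the alt text. A minimal sketch, assuming the DOM extension is available; the function name imagesToAltText and the sample page are made up for illustration.

```php
<?php
// Hypothetical helper: swap every <img> for a text node holding its alt text.
function imagesToAltText(string $html): string
{
    $doc = new DOMDocument();
    // Keep warnings about imperfect real-world markup out of the output.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    // Copy the live node list first, because replacing nodes changes it.
    $images = iterator_to_array($doc->getElementsByTagName('img'));
    foreach ($images as $img) {
        $text = $doc->createTextNode('[IMG: ' . $img->getAttribute('alt') . ']');
        $img->parentNode->replaceChild($text, $img);
    }
    return $doc->saveHTML();
}

$page = '<html><body><p><img src="logo.png" name="pic" alt="Logo"> Welcome</p></body></html>';
$textOnly = imagesToAltText($page);
echo $textOnly;
```

To limit this to images that have a name attribute, a check such as `$img->hasAttribute('name')` inside the loop would be one way to do it.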