Solved

LOOKING FOR REGEX TO SPOT MISSING ALT TAGS

Posted on 2006-07-05
Last Modified: 2011-10-03
Hi there

I'm looking for a VBScript regex to look through a web page's source code and return IMG tags, but only those which have the ALT attribute missing. The regex would ideally return the name of the image, but simply returning the whole string or nothing would be fine.

Peter.
Question by:fsbsupport
LVL 7

Expert Comment

by:yotamsher
ID: 17048643
Hi Peter
can you give some info?

What are you trying to achieve?
Is this script supposed to be embedded in a web page?
Why VBSCRIPT?

Yotam

Author Comment

by:fsbsupport
ID: 17048775
It's part of a content management system. When someone edits a page, I look through the source and set a flag for pasted Word content (which they have to remove) and for tables that are too large. I also want to be able to flag a page that contains one or more images where the ALT attribute is missing. I don't really need to know how many times this occurs, just that it occurs.

VBscript (server side ASP) - because that's what the system is written in - though I would assume the regular expressions are similar in most languages. I know how to actually write the code - it's just the expression I'm unclear on.

Regards

Peter.
LVL 7

Expert Comment

by:yotamsher
ID: 17049045
Just to be sure, can you post a minimal example of a page containing both a bad IMG and a good IMG?

Author Comment

by:fsbsupport
ID: 17049341

Good example - bearing in mind that there is no guarantee that double quotes are used - might be single....

<img src="fred.gif" alt="this is a picture of fred">

bad example

<img src="fred.gif">

There may be other attributes such as size etc....
LVL 7

Accepted Solution

by:
yotamsher earned 2000 total points
ID: 17050597
Hi Peter

I had a problem writing a single expression that matches "starts with IMG and has no ALT",
but what about using two regular expressions?
The following code (working as a stand-alone .vbs) assumes each tag is on its own line.
I guess you are already breaking the HTML into tags (if not, there are examples on the internet).

hope this helps

Yotam

' *****************
' filter-images.vbs
' *****************
'open the HTML file
Const ForReading = 1
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFile = objFSO.OpenTextFile _
    (".\images.html", ForReading)

'Set up RegExp objects

Set ImgRegularExpressionObject = New RegExp
Set GoodImgRegularExpressionObject = New RegExp

'Matches the start of an IMG tag: "<img" followed by a word boundary,
'so ordinary text that merely contains "img" is not counted
With ImgRegularExpressionObject
    .Pattern = "<img\b"
    .IgnoreCase = True
    .Global = True
End With

'Matches an ALT attribute: whitespace, "alt", optional whitespace, "="
'(the original "alt=*" matched any line containing "alt", e.g. "salt")
With GoodImgRegularExpressionObject
    .Pattern = "\salt\s*="
    .IgnoreCase = True
    .Global = True
End With

'Read the file line by line

Do Until objTextFile.AtEndOfStream
    strNextLine = objTextFile.ReadLine

    'check for an IMG tag
    Set image_match = ImgRegularExpressionObject.Execute(strNextLine)
    If image_match.Count > 0 Then
        'check whether that line also carries an ALT attribute
        Set good_image_match = GoodImgRegularExpressionObject.Execute(strNextLine)
        If good_image_match.Count > 0 Then
            WScript.Echo "[" & strNextLine & "] Is a good Img tag."
        Else
            WScript.Echo "[" & strNextLine & "] Is an Img tag without ALT."
        End If
    Else
        WScript.Echo "[" & strNextLine & "] Is not an Img tag."
    End If
Loop

'Clean up: close the file and release the objects
objTextFile.Close
Set ImgRegularExpressionObject = Nothing
Set GoodImgRegularExpressionObject = Nothing
Set objTextFile = Nothing
Set objFSO = Nothing
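
If the tags are not one per line, one possible alternative is a single pattern with a negative lookahead, run against the whole page source. This is only a minimal sketch, not the method used in the code above. It assumes the VBScript RegExp engine in use (version 5.5 or later) supports the (?!...) lookahead, that the page source is already held in a string, and that no attribute value contains a ">" character. The function name HasImgWithoutAlt is made up for illustration.

```vbscript
' Sketch only: HasImgWithoutAlt is a hypothetical helper name.
' Returns True if strHtml contains at least one <img ...> tag
' that has no alt attribute, which is all the flag needs.
Function HasImgWithoutAlt(strHtml)
    Dim re
    Set re = New RegExp
    ' "<img" and a word boundary, then a lookahead that rejects the tag
    ' if whitespace + "alt" + optional whitespace + "=" appears before
    ' the closing ">", then the rest of the tag up to ">".
    ' Quote style (single or double) does not matter to this pattern.
    re.Pattern = "<img\b(?![^>]*\salt\s*=)[^>]*>"
    re.IgnoreCase = True
    re.Global = False   ' one match is enough to set the flag
    HasImgWithoutAlt = re.Test(strHtml)
    Set re = Nothing
End Function

' Stand-alone usage with the two examples from the question
WScript.Echo HasImgWithoutAlt("<img src=""fred.gif"" alt=""this is a picture of fred"">")
WScript.Echo HasImgWithoutAlt("<img src=""fred.gif"">")
```

In server-side ASP the function body should work unchanged, with the two WScript.Echo lines replaced by whatever sets the page flag. To get the image name rather than a yes/no answer, Execute could be used instead of Test and the src value pulled from each match.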