Solved

LOOKING FOR REGEX TO SPOT MISSING ALT TAGS

Posted on 2006-07-05
Last Modified: 2011-10-03
Hi there

I'm looking for a VBScript regex that looks through a web page's source code and returns IMG tags, but only those with the ALT attribute missing. Ideally the regex would return the name of the image, but simply returning the whole string or nothing would be fine.

Peter.
Question by:fsbsupport
LVL 7

Expert Comment

by:yotamsher
ID: 17048643
Hi Peter
Can you give some more info?

What are you trying to achieve?
Is this script supposed to be embedded in a web page?
Why VBScript?

Yotam
 

Author Comment

by:fsbsupport
ID: 17048775
It's part of a content management system. When someone edits a page, I look through the source and set a flag for pasted Word content (which they have to remove) and for tables that are too large. I also want to be able to flag a page that contains one or more images where the ALT attribute is missing. I don't really need to know how many times this occurs, just that it occurs.

VBScript (server-side ASP), because that's what the system is written in, though I would assume the regular expressions are similar in most languages. I know how to write the code itself; it's just the expression I'm unclear on.

Regards

Peter.
 
LVL 7

Expert Comment

by:yotamsher
ID: 17049045
Just to be sure, can you post a minimal example of a page containing a bad IMG and a good IMG?
 

Author Comment

by:fsbsupport
ID: 17049341

Good example (bearing in mind that there is no guarantee that double quotes are used; they might be single):

<img src="fred.gif" alt="this is a picture of fred">

Bad example:

<img src="fred.gif">

There may be other attributes such as size etc....
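
For the two samples above, one way to set a yes/no flag is a single pattern with a negative lookahead. This is only a sketch under stated assumptions: HasImgWithoutAlt is a hypothetical helper name, the script engine is VBScript 5.5 or later (where RegExp supports lookahead), and no attribute value contains a ">" character. An empty alt="" counts as present.

```vbscript
' Sketch only: HasImgWithoutAlt is a hypothetical helper, not part of the CMS.
Function HasImgWithoutAlt(strHtml)
    Dim re
    Set re = New RegExp
    ' "<img", then a negative lookahead that rejects the tag if
    ' whitespace + "alt=" appears anywhere before the closing ">".
    re.Pattern = "<img\b(?![^>]*\salt\s*=)[^>]*>"
    re.IgnoreCase = True
    re.Global = False   ' one hit is enough to set the flag
    HasImgWithoutAlt = re.Test(strHtml)
End Function

' Good sample from the post: the function returns False
WScript.Echo HasImgWithoutAlt("<img src=""fred.gif"" alt=""this is a picture of fred"">")
' Bad sample, single quotes and an extra attribute: the function returns True
WScript.Echo HasImgWithoutAlt("<img src='fred.gif' width='10'>")
```

WScript.Echo is only available when the script runs under cscript or wscript; in a server-side ASP page the same function could be called and its result written out with Response.Write instead.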
 
LVL 7

Accepted Solution

by:yotamsher (earned 500 total points)
ID: 17050597
Hi Peter

I had trouble writing a single expression that matches "starts with IMG and has no ALT",
but what about using two regular expressions?
The following code (which works as a stand-alone .vbs) assumes each tag is on its own line.
I guess you are already breaking the HTML into tags (if not, there are examples on the internet).

Hope this helps

Yotam

' *****************
' filter-images.vbs
' *****************
'open the HTML file
Const ForReading = 1
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFile = objFSO.OpenTextFile _
    (".\images.html", ForReading)

'Set up RegExp objects

Set ImgRegularExpressionObject = New RegExp
Set GoodImgRegularExpressionObject = New RegExp

'Match the start of an IMG tag ("<img" as a whole word), not just the letters "img"
With ImgRegularExpressionObject
    .Pattern = "<img\b"
    .IgnoreCase = True
    .Global = True
End With

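'Match an ALT attribute: whitespace, "alt", optional spaces, then "="
'(the original "alt=*" matched the letters "alt" anywhere, e.g. in src="salt.gif")
With GoodImgRegularExpressionObject
    .Pattern = "\salt\s*="
    .IgnoreCase = True
    .Global = True
End With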

'Read the file line by line

Do Until objTextFile.AtEndOfStream
    strNextLine = objTextFile.Readline

'check for an IMG tag
   Set image_match = ImgRegularExpressionObject.Execute(strNextLine)
   If image_match.Count > 0 Then
      Set good_image_match = GoodImgRegularExpressionObject.Execute(strNextLine)
      If good_image_match.Count > 0 Then
         WScript.Echo "[" & strNextLine & "] Is a good Img tag."
      Else
         WScript.Echo "[" & strNextLine & "] Is an Img tag without ALT."
      End If
   Else
      WScript.Echo "[" & strNextLine & "] Is not an Img tag."
   End If
Loop

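'Clean up: close the file and release the objects that were actually created
objTextFile.Close
Set ImgRegularExpressionObject = Nothing
Set GoodImgRegularExpressionObject = Nothing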
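
If the HTML is not already split into one tag per line, a variation on the same two-step idea is to pull each IMG tag out of the whole source first and then check it for ALT. The sketch below is an illustration, not the accepted answer's method: ListImagesWithoutAlt is a hypothetical name, the file path is the one used in the script above, and it assumes no attribute value contains ">" and that the src value has no spaces.

```vbscript
' Sketch only: ListImagesWithoutAlt is a hypothetical name.
Sub ListImagesWithoutAlt(strHtml)
    Dim reImg, reAlt, reSrc, m, srcMatches

    Set reImg = New RegExp
    reImg.Pattern = "<img\b[^>]*>"   ' one whole IMG tag; [^>] also spans line breaks
    reImg.IgnoreCase = True
    reImg.Global = True              ' find every IMG tag, not just the first

    Set reAlt = New RegExp
    reAlt.Pattern = "\salt\s*="      ' an ALT attribute inside the tag
    reAlt.IgnoreCase = True

    Set reSrc = New RegExp
    ' src value with double quotes, single quotes, or none (stops at the first space)
    reSrc.Pattern = "\ssrc\s*=\s*[""']?([^""'\s>]+)"
    reSrc.IgnoreCase = True

    For Each m In reImg.Execute(strHtml)
        If Not reAlt.Test(m.Value) Then
            Set srcMatches = reSrc.Execute(m.Value)
            If srcMatches.Count > 0 Then
                WScript.Echo "Missing ALT: " & srcMatches(0).SubMatches(0)
            Else
                WScript.Echo "Missing ALT: " & m.Value
            End If
        End If
    Next
End Sub

' Read the whole file at once instead of line by line
Dim objFS, strPage
Set objFS = CreateObject("Scripting.FileSystemObject")
strPage = objFS.OpenTextFile(".\images.html", 1).ReadAll   ' 1 = ForReading
ListImagesWithoutAlt strPage
```

This reports the image name where a src attribute can be found and falls back to the whole tag otherwise, which covers both outcomes the question said would be acceptable.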