Solved

Using VBScript to parse out illegal filename characters with RegExp

Posted on 2012-03-26
Last Modified: 2012-06-27
I'm trying to write a script that will parse out a list of file names and check them for illegal characters. I figured I'd use some sort of regular expression for it, because it seemed like the easiest way to go, but I'm not sure what that string would look like. It would need to include:
0-9
A-Z
a-z
.
_
-

 Any suggestions as to how to write it?
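One possible pattern for the list above is a whitelist character class anchored at both ends, so that the entire name has to consist of allowed characters. The following is only a sketch using VBScript's built-in RegExp object; the helper name IsValidName and the sample file names are made up for illustration.

```vbscript
Option Explicit

' Hypothetical helper: True when strName contains only 0-9, A-Z, a-z, ".", "_" and "-"
Function IsValidName(strName)
    Dim objRegEx
    Set objRegEx = New RegExp
    ' ^ and $ anchor the match to the whole string.
    ' "-" is placed last in the class so it is treated as a literal hyphen.
    objRegEx.Pattern = "^[0-9A-Za-z._-]+$"
    IsValidName = objRegEx.Test(strName)
End Function

' Made-up sample names
WScript.Echo IsValidName("report_2012-03.pdf")   ' only allowed characters, so Test succeeds
WScript.Echo IsValidName("bad,name.pdf")         ' the comma is outside the class, so Test fails
```

If spaces should also be accepted, a space can be added inside the brackets, ahead of the trailing hyphen.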
Question by:d0ughb0y
 
LVL 1

Expert Comment

by:dominicreina
ID: 37767992
Where are you getting the list of file names from?  Are these existing files or some text list?
 
LVL 8

Author Comment

by:d0ughb0y
ID: 37772011
They are existing files, and would be accessed via an FSO. Then, in a For Each loop, we'd pull each file name, and compare it to the RegExp. If it matches, move along. If not, dump it to an array variable. At the end, if there's anything in the array, dump that to a MsgBox or WScript.Echo.

But again, I don't know how to structure the RegExp for it.
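The loop described above could be sketched roughly as follows. This assumes a whitelist pattern of letters, digits, ".", "_" and "-", and uses "C:\scripts" purely as a placeholder folder; the variable names are illustrative rather than taken from any existing script.

```vbscript
Option Explicit

Dim objFSO, objFolder, objFile, objRegEx
Dim arrBadNames(), intBadCount

' Placeholder folder; point this at the real FTP staging folder
Const FOLDER_PATH = "C:\scripts"

' Whitelist pattern: the whole name must consist of allowed characters
Set objRegEx = New RegExp
objRegEx.Pattern = "^[0-9A-Za-z._-]+$"

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFolder = objFSO.GetFolder(FOLDER_PATH)

intBadCount = 0
For Each objFile In objFolder.Files
    ' A name that does not match the whitelist is appended to the array
    If Not objRegEx.Test(objFile.Name) Then
        ReDim Preserve arrBadNames(intBadCount)
        arrBadNames(intBadCount) = objFile.Name
        intBadCount = intBadCount + 1
    End If
Next

' Report only when at least one bad name was found
If intBadCount > 0 Then
    WScript.Echo "Invalid file names:" & vbCrLf & Join(arrBadNames, vbCrLf)
End If
```

Because Test is called once per name, each file can appear in the array at most once, however many bad characters it contains.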
 
LVL 1

Expert Comment

by:dominicreina
ID: 37772156
Maybe I'm missing something, but if the files already exist, they can't have an illegal character in their names, because the file system wouldn't have allowed them to be created in the first place.  Can you provide some context and your code so I can fully understand the goal?
 
LVL 8

Author Comment

by:d0ughb0y
ID: 37772243
Yeah, I think some context would help.

The files are placed in a folder for transfer via FTP. The problem is that users sometimes give a file a name like "Doe, John - 1955.pdf". In Windows, with long filenames, that's a legitimate name, but the FTP software won't transfer it because it contains illegal characters, specifically the comma. (It seems to be okay with spaces.)

So what I'm trying to do is include a little piece to check whether the filenames are all valid. Windows only prohibits \/:*?"<> and |. I want to also exclude ,';!@#$%^&()+={}[] because I think the FTP application may have a problem with those too. Also any escaped characters: believe it or not, one user gave a file a name that I couldn't see anything wrong with, but it wouldn't transfer until I renamed it entirely.

I figured it'd be easier to just tell the script what I would accept, rather than list everything I wouldn't.

Is that clearer?
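For a name that looks fine but still fails, a negated character class can point at the exact characters that fall outside the accepted set. The sketch below assumes the whitelist is letters, digits, ".", "_", "-" and space (since spaces appear to transfer fine); the helper name OffendingChars is hypothetical.

```vbscript
Option Explicit

' Hypothetical helper: lists every character in strName that is outside the whitelist,
' together with its character code so that invisible characters still show up
Function OffendingChars(strName)
    Dim objRegEx, colMatches, objMatch, strFound
    Set objRegEx = New RegExp
    objRegEx.Global = True                    ' find every offending character, not just the first
    objRegEx.Pattern = "[^0-9A-Za-z._ -]"     ' anything NOT in the accepted set
    Set colMatches = objRegEx.Execute(strName)
    strFound = ""
    For Each objMatch In colMatches
        strFound = strFound & "'" & objMatch.Value & "' (code " & AscW(objMatch.Value) & ") "
    Next
    OffendingChars = strFound
End Function

' The comma is the only character here that is outside the accepted set
WScript.Echo OffendingChars("Doe, John - 1955.pdf")
```

An empty result means every character in the name is in the accepted set.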
 
LVL 1

Accepted Solution

by:dominicreina (earned 180 total points)
ID: 37772770
That makes sense.  Try this out:

'Define valid characters
strValidChars = "abcdefghijklmnopqrstuvwxyz1234567890-_. "

'Define target folder
strFolderPath = "c:\scripts"

'Create File System Object
Set objFSO = CreateObject("Scripting.FileSystemObject")

'Bind to folder
Set objFolder = objFSO.GetFolder(strFolderPath)
For Each objFile In objFolder.Files
	
	'Get file name
	strFileName = objFile.Name
	
	'Parse characters of name into individual strings
	For intChar = 1 To Len(strFileName)
	
		'Make all letters lower case - This is not changing the actual file names just the strings in memory for analysis
		strChar = LCase(Mid(strFileName, intChar, 1))
		
		'Call out invalid names and add to string of invalid names
		If InStr(strValidChars, strChar) = 0 Then
			strInvalidNames = strInvalidNames & strFileName & VBCrLf
		End If
	Next
Next

'Echo list of invalid names
WScript.Echo strInvalidNames

 
LVL 8

Author Closing Comment

by:d0ughb0y
ID: 37772883
Almost perfect! I added a little bit just to prevent it from listing a file multiple times for multiple violations. The new code looks like this:


'Define valid characters
strValidChars = "abcdefghijklmnopqrstuvwxyz1234567890-_. "

'Define target folder
strFolderPath = "c:\scripts"

'Create File System Object
Set objFSO = CreateObject("Scripting.FileSystemObject")

'Bind to folder
Set objFolder = objFSO.GetFolder(strFolderPath)
For Each objFile In objFolder.Files
	
	'Get file name
	strFileName = objFile.Name
	boolBadFile = False
	
	'Parse characters of name into individual strings
	For intChar = 1 To Len(strFileName)
	
		'Make all letters lower case - This is not changing the actual file names just the strings in memory for analysis
		strChar = LCase(Mid(strFileName, intChar, 1))
		
		'Call out invalid names and add to string of invalid names
		If boolBadFile = False Then
			If InStr(strValidChars, strChar) = 0 Then
				strInvalidNames = strInvalidNames & strFileName & VBCrLf
				boolBadFile = True
			End If
		End If
	Next
Next

'Echo list of invalid names
WScript.Echo strInvalidNames

Set objFolder = Nothing
Set objFSO = Nothing



Thanks much!
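An alternative to the boolBadFile flag, offered only as a sketch, is to move the character check into a function and leave the loop with Exit For at the first violation. The helper name HasInvalidChar and the VALID_CHARS constant are illustrative; the character list is the same one used above.

```vbscript
Option Explicit

Const VALID_CHARS = "abcdefghijklmnopqrstuvwxyz1234567890-_. "

' Hypothetical helper: True as soon as one character falls outside VALID_CHARS
Function HasInvalidChar(strFileName)
    Dim intChar
    HasInvalidChar = False
    For intChar = 1 To Len(strFileName)
        ' Lower-case the character in memory only; the file itself is not renamed
        If InStr(VALID_CHARS, LCase(Mid(strFileName, intChar, 1))) = 0 Then
            HasInvalidChar = True
            Exit For   ' one violation is enough, so stop scanning this name
        End If
    Next
End Function
```

Inside the For Each loop, the body then reduces to a single check: If HasInvalidChar(objFile.Name) Then strInvalidNames = strInvalidNames & objFile.Name & vbCrLf.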
 
LVL 1

Expert Comment

by:dominicreina
ID: 37772891
I'm glad it worked for you...Thanks for the points!!!