David Spigelman
asked on
Using VBScript to parse out illegal filename characters with RegExp
I'm trying to write a script that will parse out a list of file names and check them for illegal characters. I figured I'd use some sort of regular expression for it, because it seemed like the easiest way to go, but I'm not sure what that string would look like. It would need to include:
0-9
A-Z
a-z
.
_
-
Any suggestions as to how to write it?
Where are you getting the list of file names from? Are these existing files or some text list?
ASKER
They are existing files, and would be accessed via an FSO. Then, in a For Each loop, we'd pull each file name, and compare it to the RegExp. If it matches, move along. If not, dump it to an array variable. At the end, if there's anything in the array, dump that to a MsgBox or WScript.Echo.
But again, I don't know how to structure the RegExp for it.
Maybe I'm missing something but if the files already exist then they wouldn't be able to have an illegal character in the name because the file system wouldn't allow it to have been created in the first place. Can you provide some context and your code so I can fully understand the goal?
ASKER
Yeah, I think some context would help.
The files are placed in a folder for transfer via FTP. The problem is that sometimes, the users put in a file name like, "Doe, John - 1955.pdf" as the filename. In Windows, with long filenames, that's a legitimate name. But the FTP software won't transfer it because it contains illegal characters - specifically, the comma. (Spaces, it seems okay with.)
So what I'm trying to do is include a little piece to check whether the filenames are all valid. Windows only prohibits \/:*?"<> and |. I want to also exclude ,';!@#$%^&()+={}[] because I think the FTP application may have a problem with those too. Also any Escaped characters - Believe it or not, one user gave a file a name that I couldn't see anything wrong with, but it wouldn't transfer until I renamed it entirely.
I figured it'd be easier to just tell the script what I would accept, rather than list everything I wouldn't.
Is that clearer?
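A minimal sketch of how the "list what I accept" approach described above might look as a VBScript RegExp. The pattern assumes the accepted set is exactly A-Z, a-z, 0-9, period, underscore, hyphen and space (space is included only because the FTP software reportedly tolerates it). The function name IsValidName and the sample file names are hypothetical, chosen for illustration.

```vbscript
Option Explicit

' Whitelist pattern: one or more accepted characters and nothing else,
' from the start (^) to the end ($) of the name.
' Inside the brackets the period is literal, and the hyphen is literal
' because it is the last character before the closing bracket.
Dim objRegEx
Set objRegEx = New RegExp
objRegEx.Pattern = "^[A-Za-z0-9._ -]+$"

' Hypothetical helper: True when every character of strName is accepted.
Function IsValidName(strName)
    IsValidName = objRegEx.Test(strName)
End Function

' Sample names (assumed, for illustration only)
WScript.Echo IsValidName("Doe_John-1955.pdf")     ' accepted: only whitelisted characters
WScript.Echo IsValidName("Doe, John - 1955.pdf")  ' rejected: contains a comma
```

Because the pattern lists only what is allowed, anything not on the list fails the test without having to be named, which should include control characters and other characters that are hard to spot in a file name.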
ASKER CERTIFIED SOLUTION
This solution is only available to members.
ASKER
Almost perfect! I added a little bit just to prevent it from listing a file multiple times for multiple violations. The new code looks like this:
'Define valid characters
strValidChars = "abcdefghijklmnopqrstuvwxyz1234567890-_. "

'Define target folder
strFolderPath = "c:\scripts"

'Create File System Object
Set objFSO = CreateObject("Scripting.FileSystemObject")

'Bind to folder
Set objFolder = objFSO.GetFolder(strFolderPath)

For Each objFile In objFolder.Files
    'Get file name
    strFileName = objFile.Name
    boolBadFile = False

    'Parse characters of name into individual strings
    For intChar = 1 To Len(strFileName)
        'Make all letters lower case - This is not changing the actual file names just the strings in memory for analysis
        strChar = LCase(Mid(strFileName, intChar, 1))

        'Call out invalid names and add to string of invalid names
        If boolBadFile = False Then
            If InStr(strValidChars, strChar) = 0 Then
                strInvalidNames = strInvalidNames & strFileName & vbCrLf
                boolBadFile = True
            End If
        End If
    Next
Next

'Echo list of invalid names
WScript.Echo strInvalidNames

Set objFolder = Nothing
Set objFSO = Nothing
Thanks much!
I'm glad it worked for you...Thanks for the points!!!
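For comparison, since the original question asked for a RegExp, here is a hedged sketch of the same folder check using the RegExp object instead of a character-by-character loop. It is not the accepted solution (which is not shown in this thread). It assumes the same folder path and the same accepted characters as the script above. The array name arrInvalid and the message text are hypothetical.

```vbscript
Option Explicit

Const FOLDER_PATH = "c:\scripts"   ' same target folder as the script above

Dim objFSO, objFolder, objFile, objRegEx
Dim arrInvalid(), intCount

' Negated character class: matches any single character that is NOT
' a letter, digit, period, underscore, space or hyphen.
Set objRegEx = New RegExp
objRegEx.Pattern = "[^A-Za-z0-9._ -]"

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFolder = objFSO.GetFolder(FOLDER_PATH)

intCount = 0
For Each objFile In objFolder.Files
    ' Test returns True as soon as one offending character is found,
    ' so each file name is added at most once.
    If objRegEx.Test(objFile.Name) Then
        ReDim Preserve arrInvalid(intCount)
        arrInvalid(intCount) = objFile.Name
        intCount = intCount + 1
    End If
Next

' Only report when at least one invalid name was collected
If intCount > 0 Then
    WScript.Echo "Invalid file names:" & vbCrLf & Join(arrInvalid, vbCrLf)
End If

Set objFolder = Nothing
Set objFSO = Nothing
```

Since a single Test call covers the whole name, no flag variable is needed to avoid listing a file twice, and nothing is echoed when every name passes.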