Using Regex functions to find a custom format of a file name

Hello all, I have file names supplied by users that I need to validate and accept only if they are in the following format:

abc-123ab-yyyymmdd
separator: "-"
The first three characters are the organization code (alphanumeric)
The next five characters are the name of the project (alphanumeric)
The last eight characters are the date in yyyymmdd format

I am using VB.NET and need an effective way to validate the file name format.
welcome 123 Asked:

käµfm³d 👽Commented:
Imports System.Text.RegularExpressions
Imports System.Globalization

...

Dim isMatch As Boolean = False

If Regex.IsMatch(strInput, "^[a-zA-Z0-9]{3}-[a-zA-Z0-9]{5}-[0-9]{8}$") Then
    Dim parsedDate As DateTime
    Dim datePart As String = strInput.Substring(strInput.LastIndexOf("-"c) + 1)
    Dim provider As IFormatProvider = System.Threading.Thread.CurrentThread.CurrentCulture.DateTimeFormat

    isMatch = DateTime.TryParseExact(datePart, "yyyyMMdd", provider, DateTimeStyles.None, parsedDate)
End If

While it is technically possible to validate dates with a regex alone, the pattern becomes complicated and cumbersome (month lengths, leap years). This is why the inner block instead parses the date part with DateTime.TryParseExact against the exact format.
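
To illustrate why the second step matters, here is a minimal sketch with two hypothetical sample names. Both have the right shape, but the second contains month 13 and day 40, so it should pass the regex and fail the date check. The sketch swaps in CultureInfo.InvariantCulture (an assumption, not what the snippet above uses) so the result does not depend on the machine's culture settings.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module RegexVersusDateParse
    Sub Main()
        ' Same pattern as in the snippet above.
        Dim pattern As String = "^[a-zA-Z0-9]{3}-[a-zA-Z0-9]{5}-[0-9]{8}$"
        ' Hypothetical sample inputs: the second one has month 13, day 40.
        Dim samples As String() = {"abc-123ab-20150406", "abc-123ab-20151340"}

        For Each name As String In samples
            ' Step 1: check the overall shape only.
            Dim shapeOk As Boolean = Regex.IsMatch(name, pattern)
            ' Step 2: check that the last eight digits form a real calendar date.
            Dim parsed As DateTime
            Dim dateOk As Boolean = shapeOk AndAlso DateTime.TryParseExact(
                name.Substring(name.LastIndexOf("-"c) + 1), "yyyyMMdd",
                CultureInfo.InvariantCulture, DateTimeStyles.None, parsed)
            Console.WriteLine("{0}: shape={1}, date={2}", name, shapeOk, dateOk)
        Next
    End Sub
End Module
```

A regex that rejected the second name on its own would have to encode month lengths and leap years, which is the complexity the two-step approach avoids.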

welcome 123 (Author) Commented:
I totally agree with you about not using regex for the date. This should work; I will try it first thing tomorrow morning and post an update. Thanks a lot for the quick response.
it_saige (Developer) Commented:
*No points*
I would just suggest one change to the regex to make it a little more readable:
"[\D\d]{3}-[\D\d]{5}-\d{8}"

But otherwise this does work nicely. It then really just comes down to implementation preference, e.g. using extension methods:
Imports System.Text.RegularExpressions
Imports System.Globalization
Imports System.Runtime.CompilerServices
Imports System.IO

Module Module1
	Private files As New List(Of String) From {"ABC-123AB-20141219.txt", "EFG-456EF-20141219.txt", "HIJ-789HI-20141220.txt", "KLM-012KL-20141221.txt"}

	Sub Main()
		For Each item In files
			Console.WriteLine("{0} matches the file name pattern: {1}", item, item.IsNameMatch())
		Next

		Console.ReadLine()
	End Sub
End Module

Module Extensions
	<Extension()> Public Function IsNameMatch(ByVal fileName As String) As Boolean
		Return New FileInfo(fileName).IsNameMatch()
	End Function

	<Extension()> Public Function IsNameMatch(ByVal file As FileInfo) As Boolean
		Dim dTime = DateTime.MinValue
		Dim pattern = New Regex("[\D\d]{3}-[\D\d]{5}-\d{8}")
		Dim format = New String() {"yyyyMMdd"}
		Dim matches = pattern.Matches(file.FullName)
		If matches.Count > 0 Then
			For Each [match] As Match In matches
				Return DateTime.TryParseExact(match.Value.Split("-")(2), format, CultureInfo.InvariantCulture, DateTimeStyles.NoCurrentDateDefault, dTime)
			Next
		End If
		Return False
	End Function
End Module

Produces the following output: [image: Capture.JPG]

-saige-

welcome 123 (Author) Commented:
The above works great. Before I close the question, can you help me get the date part of the string (in yyyymmdd format) into a Date property?

I tried Date.ParseExact("20150406", "yyyyMMdd", Nothing), but it gives me an error.
it_saige (Developer) Commented:
One way to do it:
Imports System.Text.RegularExpressions
Imports System.Globalization
Imports System.Runtime.CompilerServices
Imports System.IO

Module Module1
	Private files As New List(Of String) From {"ABC-123AB-20141219.txt", "EFG-456EF-20141219.txt", "HIJ-789HI-20141220.txt", "KLM-012KL-20141221.txt"}

	Sub Main()
		Dim [date] As Date
		Console.WriteLine("Using the extension method")
		For Each item In files
			If item.IsNameMatch([date]) Then
				Console.WriteLine("{0} matches the file name pattern: The date part is - {1}.", item, [date].ToString("yyyyMMdd"))
			Else
				Console.WriteLine("{0} does not match the file name pattern.", item)
			End If
		Next
		Console.ReadLine()
	End Sub
End Module

Module Extensions
	<Extension()> Public Function IsNameMatch(ByVal fileName As String, ByRef [date] As Date) As Boolean
		Return New FileInfo(fileName).IsNameMatch([date])
	End Function

	<Extension()> Public Function IsNameMatch(ByVal file As FileInfo, ByRef [date] As Date) As Boolean
		Dim pattern = New Regex("[\D\d]{3}-[\D\d]{5}-\d{8}")
		Dim format = New String() {"yyyyMMdd"}
		Dim matches = pattern.Matches(file.FullName)
		If matches.Count > 0 Then
			For Each [match] As Match In matches
				Return DateTime.TryParseExact(match.Value.Split("-")(2), format, CultureInfo.InvariantCulture, DateTimeStyles.NoCurrentDateDefault, [date])
			Next
		End If
		Return False
	End Function
End Module

Produces the following output: [image: Capture.JPG]

-saige-
käµfm³d 👽Commented:
@it_saige

I now use [0-9] rather than "\d" because, as another expert pointed out to me, "\d" encompasses more than just 0-9: it also matches digits from other languages/character sets. Also, "\D" means anything that is not a digit, so "[\D\d]" would also capture periods, parentheses, exclamation points, letters, etc. In my opinion, that makes the pattern incorrect.
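
A small sketch of both points, using made-up test strings. The first three checks compare "\d" with [0-9] on Arabic-Indic digits, and the last two compare "[\D\d]" with an explicit alphanumeric class on punctuation. RegexOptions.ECMAScript is included only as one known way to restrict "\d" to 0-9; it is not something the thread uses.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module DigitClassDemo
    Sub Main()
        ' "123" written with Arabic-Indic digits (U+0661, U+0662, U+0663).
        Dim arabicIndic As String = ChrW(&H661) & ChrW(&H662) & ChrW(&H663)

        ' \d matches any Unicode decimal digit by default.
        Console.WriteLine(Regex.IsMatch(arabicIndic, "^\d{3}$"))
        ' [0-9] matches ASCII digits only.
        Console.WriteLine(Regex.IsMatch(arabicIndic, "^[0-9]{3}$"))
        ' With RegexOptions.ECMAScript, \d is restricted to [0-9].
        Console.WriteLine(Regex.IsMatch(arabicIndic, "^\d{3}$", RegexOptions.ECMAScript))

        ' [\D\d] is "non-digit or digit", i.e. any character at all.
        Console.WriteLine(Regex.IsMatch("!.(", "^[\D\d]{3}$"))
        ' An explicit alphanumeric class rejects the punctuation.
        Console.WriteLine(Regex.IsMatch("!.(", "^[a-zA-Z0-9]{3}$"))
    End Sub
End Module
```

The expected results are True, False, False, True, False, which is why the explicit [a-zA-Z0-9] and [0-9] classes are the safer choice for this file name format.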
it_saige (Developer) Commented:
@kaufmed, Interesting point on [0-9] vs. "\d".

With the "\D", you are correct, that would match more than just letters. I did not even consider that when making my observation. I figured that using the FileInfo class would filter out any malformed file names.

-saige-
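
Putting the points from this exchange together, one possible sketch is shown below. TryGetFileDate, NamePattern and the "date" group name are hypothetical names chosen for illustration. The sketch assumes the extension and any directory should be ignored, so it validates only the bare file name, and it uses the anchored pattern with explicit character classes so that punctuation is rejected.

```vbnet
Imports System
Imports System.Globalization
Imports System.IO
Imports System.Text.RegularExpressions

Module FileNameValidator
    ' Anchored pattern with explicit character classes; the named group captures the date part.
    Private ReadOnly NamePattern As New Regex("^[a-zA-Z0-9]{3}-[a-zA-Z0-9]{5}-(?<date>[0-9]{8})$")

    ' Hypothetical helper: returns True and sets fileDate when the name matches the format.
    Function TryGetFileDate(ByVal fileName As String, ByRef fileDate As Date) As Boolean
        ' Assumption: validate only the bare name, without directory or extension.
        Dim baseName As String = Path.GetFileNameWithoutExtension(fileName)
        Dim m As Match = NamePattern.Match(baseName)
        If Not m.Success Then Return False
        ' The regex only guarantees eight digits; TryParseExact checks it is a real date.
        Return DateTime.TryParseExact(m.Groups("date").Value, "yyyyMMdd",
            CultureInfo.InvariantCulture, DateTimeStyles.None, fileDate)
    End Function

    Sub Main()
        Dim d As Date
        ' Well-formed name from the samples above.
        Console.WriteLine(TryGetFileDate("ABC-123AB-20141219.txt", d))
        ' Punctuation in the code parts: accepted by [\D\d], rejected here.
        Console.WriteLine(TryGetFileDate("#$%-(ab)!-20141219.txt", d))
    End Sub
End Module
```

The first call should return True and the second False. FileInfo is not needed for this check, since it does not validate a name against a custom format.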
welcome 123 (Author) Commented:
Thanks a lot to kaufmed, and also to it_saige for some good input.
.NET Programming

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.