Link to home
Start Free TrialLog in
Avatar of bfuchs
bfuchsFlag for United States of America

asked on

Correction of date field #2.

Hi Experts,

This is in regard of the following

https://www.experts-exchange.com/questions/29158415/Correction-of-date-field.html

Would like to expand the functionality suggested there to handle the attached list as much as possible.

So for example

01//01/19 gets converted to 01/01/19
01/011955 gets converted to 01/01/1955
01\01\1955 to 01/01/1955
01–01-1955 to 01/01/1955

and so on...

Thanks
InvalidDates.xlsx
Avatar of Gustav Brock
Gustav Brock
Flag of Denmark image

Some of these are easily handled with Replace, but some - like 906448 - takes some more.
I don't think you can expect anyone here to create a full solution for the fun only; I would consider it a paid job.
It would probably be quite a chunk of money...given the background of this particular question... :)
You already accepted a solution in the previous question...
Did you apply it to this list,...and how did it work?

Note that some of the values would be near impossible to "Fix"
for example:
2013
Bryan was
G66687y h
22
10/6/201+

Is this the only list that needs to be fixed?
If so, ...then with only around 190 entries, ...it might just be faster to use the "Replace" solution to do the bulk of the heavy lifting, ....then fix the remaining values manually...
Avatar of bfuchs

ASKER

Hi Experts,

I'm not expecting a full solution, just to enhance the current function to handle some additional cases.

You already accepted a solution in the previous question...
Did you apply it to this list,...and how did it work?
Yes big time, originally I had close to 500 instances and this reduced to less than 200.

Note that some of the values would be near impossible to "Fix"...
Of course those can be omitted.

How about if we start with the most common ones, and the rest will do it as Jeff suggested.

Attached the list according to # of occurrences.

Thanks,
Ben
BadDates.xlsx
First things first.
Try and format everything that already seems like a date, ...to a date
Select Column A
Select: "Short Date" from the Number format drop-down.
...Everything that Excel can interpret as a date will now be right justified

The most simple way to replace all the "\" and "-", ... with "/", ...would be to use the Find/Replace utility
Select Column A
Home-->Editing-->Find & Select-->Replace

Find What: \
Replace with: /
Click: Replace All

Find What: -
Replace with: /
Click: Replace All

However, ...this will only fix a small percentage of the values (only about 31 of the 190 values, ...or roughly 16%)
Leaving the other 159 values to be deleted, fixed manually, ...or dealt with using bespoke functions for each unique invalid date format/value.

FWIW
One way to avoid this issue in the future, would be to adopt one of the newer methods for allowing users to enter (8 Digit) dates in separate text-boxes or combo-boxes.

Enter two digit Month:
Provide two textboxes, ...each allowing only one number. limit the first box to 0 or 1, ...limit the second box to 1 through 9

Enter two digit day
Provide two textboxes, ...each allowing only one number. limit the first box to 0,1, 2, and 3, ...limit the second box to 0 through 9

Enter a 4 digit year
Provide four textboxes, ...each allowing only one number. limit the first box to 1 and 2, ...limit the second, third, and fourth boxes to 0 through 9

After a date is entered you can use any of the "IsDate" functions, that are available validate the date and provide whatever level of feedback/error handling you need.

Obviously you can build comboboxes/dropdows to select the Month and Day, but a full 8 digit system would force users to think about dates in this fashion
This allows you to build only valid dates based on the international date format:
YYYY-MM-DD
(Not thinking of dates in this way seems to be the reason for a lot of the errors)

In this way you can use the entered values to format the date in any country format you want, ...with the actual "Date" remaining the same.
mm/dd/yyyy
yyyy-mm/dd
dd/mm/yyyy

You can research this technique independently....

JeffCoachman
I'm not expecting a full solution, just to enhance the current function to handle some additional cases.

But where to stop? And what about the remaining values. It could easily take a day to write something that corrects all these entries, not say find an algorithm to make the "best" guess for those that don't represent a unique date.

A full solution to avoid such a mess in the future:

Entering ISO formatted date with input mask and full validation in Microsoft Access
Hi Ben,

I would also say, Gustav already wrote a parse function in the last question which I think would parse some of the cases in your list to a correct case - so that's the first step.

Then we are talking about only 200 cases - I would not even start to write a code for that, just "parse" them manually and entering the right date where possible, removing invalid entries to NULL. I think you will be faster with that method to correct your data.

Third step is to correct the database which seems to allow entering of texts where only a date is allowed. So change the datatype of the field to a date type after correcting the entries and in the frontend (which I guess is Access as you mostly used that in the past) the textbox will display a calendar icon where the user can select the date. If he enter it manually, Access will automatically show an error message if something is entered which is not a date. And so you will not get any problems with parsing anything in future.

Parsing is a bad idea if you allow the user to enter anything as a date can easily be misinterpreted. What is "12/01/01"? Is it 2012-Jan-01? Is it Dec-2001-01? Is it 1912-Jan-01? And so on. A parser would try to do it's best to find a valid date but is it valid in case of what this date represent in the database? In case of a birth date "1912" would be a very old human if this is an employee table, but if you want to save the birth date of Gene Kelley it would be correct - in case of a build date of a building it would also be valid - but not if you want to save build dates after 2000 only because the business which built them was founded in 2000.

So I would not use a parser, only let it suggest a date but then review it manually and enter the correct date manually, case by case. And in case of 200 entries I would even not take the time to write here, it would already be done in the same time...:-)

Cheers,

Christian
Avatar of bfuchs

ASKER

Hi Experts,

I've started doing like Jeff first suggested, as following

SELECT DISTINCT dbo_Skilled_Nursing_Visit_Note.Date_Of_Birth, Count(dbo_Skilled_Nursing_Visit_Note.SNV_ID) AS CountOfSNV_ID, Format([Date_Of_Birth],"Short Date") AS [Short Date]
FROM dbo_Skilled_Nursing_Visit_Note
WHERE (((ParseDate([Date_Of_Birth])) Is Null) AND ((IsDate([Date_Of_Birth]))=0))
GROUP BY dbo_Skilled_Nursing_Visit_Note.Date_Of_Birth
HAVING (((dbo_Skilled_Nursing_Visit_Note.Date_Of_Birth) Is Not Null))
ORDER BY Count(dbo_Skilled_Nursing_Visit_Note.SNV_ID) DESC;

Open in new window


And didn't see any improvement, perhaps the opposite, see second attachment.


Then took second approach of using replace, as follows

    ' Replace date typos.
    Text = Replace(Replace(Replace(Text, "O", "0"), "l", "1"), "I", "1")
    ' Replace separator typos.
    Text = Replace(Replace(Replace(Replace(Text, "_", "/"), ".", "/"), "-", "/"), ":", "/")
    Text = Replace(Replace(Replace(Replace(Text, "O", "0"), "l", "1"), "I", "1"), "d", "")
    Text = Replace(Text, "-", "/")
    Text = Replace(Text, "\", "/")
    Text = Replace(Text, ".", "")

Open in new window

However didn't got much further.
Also wondering, why didn't that took care of the first record shown in attached.

@Bit,

Then we are talking about only 200 cases - I would not even start to write a code for that, just "parse" them manually and entering the right date where possible, removing invalid entries to NULL. I think you will be faster with that method to correct your data.
Third step is to correct the database...
The problem is that I cannot change the datatype yet, and meanwhile looking for a way to keep converting all new added records to dates...



Thanks,
Ben
Untitled.png
Untitled.png
we are talking about only 200 cases

I agree. Coding this is waste of time. Do it manually which you eventually will have to anyway.
ASKER CERTIFIED SOLUTION
Avatar of Bitsqueezer
Bitsqueezer
Flag of Germany image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of bfuchs

ASKER

Thank you.