Data Profiling to identify data entry errors against standard conventions Access / Excel
Posted on 2008-06-18
I am trying to determine the most efficient way to "profile" data from my HR system to identify if data has been entered into the "name field in avvordance with the organisations standard naming conventions.
Conventions are as follows:
Standard: Enter the employee's name into PMKeyS in Lastname,Firstname Secondname format. There are some exceptions, for example when the name entered is for information purposes in a text field.
Surnames w Apostrophies: No space between the first letter, the apostrophe and the second letter, with the second letter in upper case. For example: - O'Brien
Surnames w Lower Case Letters: No space between the lower cased letter and the next letter in upper case. Ensure that the first letter after a Mc or Mac is a capital letter. For example: - McDonald and MacDonnell
Surnames w Full Stops: No space between the full stop and the next letter with it also being in upper case. For Example: - St.Clair
Surnames w Spaces : Where there is a gap the next letter should be in upper case. For Example: - Van Der Wing, De Lore & Le Fevre.
Surnames w Hyphens: No space between the last letter, the hyphen and the next letter which is to be in upper case. For Example: - Miller-Smith & Townsend-Marks.
Names w Accents: Surnames or Given Names with accents, such as René should be entered without the accent. Characters with accents cause problems with data loads to systems that interface with PMKeyS. Characters with accents can also limit search results within PMKeyS. For example, when searching for an employee by name, the employee may be excluded from search results unless the name is typed exactly the same, including the accent.
I have imported the data into an access database. I also have it in Excel. I am not really sure where to start on this one. Should I be trying to achieve this in Access or Excel, or is there another product that would serve my purpose better ? If possible I prefer to go with Access.
Sample data includes
de Somer,Spencer John
de Vries,James Scott
Any help would be appreciated. Note. I do not expect someone to write all the code etc to achieve the above. Assistance achieving a solution for the "Standard" and "Surnames w Lower Case Letters" conventions would provide enough help to gtet me on my way.
PS I am a complete novice with data profiling, this is my first attempt to automate it.