Data quality and validation

I need some form of audit checklist for data quality and validation. We have hundreds and hundreds of lines of rows of data in an access database that seems to be a complete mess when it comes to quality, data in wrong fields, missing data, inconsistent data etc. There is no application front end that controls the input of data, and there appears to be a multitude of ways that the data has ended up in the tables in the database (including manual typing). I need some sort of data quality and validation audit benchmark to say the way you are doing things is hopeless, here is a set of controls or best practices to prevent this happening again. I just want a document with “here’s control 1-15 on how to manage this to ensure data quality and useful data”.
LVL 4
pma111Asked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

wshark83Commented:
if its an access database you can force to users to ensure they put correct values by having field validation in place...

best way to show users how bad they are at entering details is some basic stats on each field...i.e. if you have a postcode file and it should only contain a format i.e. AANN NNAA then show them how many don't meet this requirement
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
pma111Author Commented:
Ok thanks. I was more after either a checklist of a workflow on how to audit data quality/validation. Ie a 1-10 checks, i.e. 1st check this, then this, then that etc etc.
0
ggzfabCommented:
Guess the audit depends on the definition of the fields, so start with creating a table/fields list where the field definition and possible value(s) are reorded. (You could use the Access table in designmode and fill the field description)

Next you can create GroupBy queries to count the occurrence of different values for fields with a limited number of values. Test date fields to hold only valid dates, etc.

The needed audit queries will depend on the definition and some definitions can't be tested automatically. How to tell or a Lastname holds also the Firstname when it's holding multiple strings... So manual checking will be necessary, but all based on the recorded field definition.
0
Protecting & Securing Your Critical Data

Considering 93 percent of companies file for bankruptcy within 12 months of a disaster that blocked access to their data for 10 days or more, planning for the worst is just smart business. Learn how Acronis Backup integrates security at every stage

pma111Author Commented:
Could either of you (if DBA's) give perhaps a top 5 "routes" that data could/usually will get into tables in access. Obviosuly one is input from some sort of user/application interface, but I am sure there are many others.
0
Jeffrey CoachmanMIS LiasonCommented:
Please know that without a sample database to examine, this is an extremely "Broad" question.
any "Top 5" list would depend on the *Specific* design of your existing database.

At the most basic level, the tables must be "normalized" and "Related" properly.
If this is not done first, nothing will ever work quite right.

At a higher level, you need to build tables, that list all the "Valid" values that you will need. (This is also part of the Normalization process)
Then use comboboxes in your forms to only allow those values in the database.

All this being said, ... You, as the person in charge of the project, need to have a firm grasp of these concepts, it is not enough to just get a "Top 5" list and work form that...

So you need to state the level of experience you have with database design.
If you are a beginner, you are facing an uphill battle here.

Beyond that you really need to post a sample database and point out the exact issues you need help with.

Sample database notes:
1. Back up your database(s).
2. Combine the front and back ends into one database file.
3. Remove any startup options, unless they are relevant to the issue.
4. Remove any records unless they are relevant to the issue.
5. Delete any objects that do not relate directly to the issue.
6. Remove any references to any "linked" files (files outside of the database, Images, OLE Files, ...etc)
7. Remove any references to any third party Active-x Controls (unless they are relevant to the issue)
8. Remove, obfuscate, encrypt, or otherwise disguise, any sensitive data.
9. Compile the code. (From the VBA code window, click: Debug-->Compile)
10. Run the compact/Repair utility.
11. Remove any Passwords and/or security.
12. If a form is involved in the issue, set the Modal and Popup properties to: No
    (Again, unless these properties are associated with the issue)
13. Post the explicit steps to replicate the issue.
14. Test the database before posting.

In other words, ...post a database that we can easily open and immediately see and/or troubleshoot the issue.
And if applicable, also include a clear graphical representation of the *Exact* results you are expecting, based on the sample data.


JeffCoachman
0
Helen FeddemaCommented:
For data entry, wherever possible, present users with a combo box or listbox of valid choices, as opposed to allowing free-form entry.  Use input masks where relevant, and validity checks on the AfterInsert event of forms.  If the situation is really bad (sounds like it is!), you may have to use unbound forms, with data being written to tables only after each field is checked for a valid entry.
0
ggzfabCommented:
Many applications can use Access as a database and the Access tables could also be 'linked' to e.g. an Excel spreadsheet.

Assuming the database is filled from another application, the best way to find out is to rename the database to e.g. "name_backup.mdb" and wait untill users complain that their application isn't working :-)

One possible check can be to see or there's an ODBC connection defined pointing to the database, but a .php or .asp application could also build such a connection from code...
The ODBC connections can be found under Start/Programs/Administrative Tools/Datasources (ODBC).
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft Access

From novice to tech pro — start learning today.