Solved

Data quality and validation

Posted on 2012-04-12
7
341 Views
Last Modified: 2012-06-21
I need some form of audit checklist for data quality and validation. We have hundreds and hundreds of lines of rows of data in an access database that seems to be a complete mess when it comes to quality, data in wrong fields, missing data, inconsistent data etc. There is no application front end that controls the input of data, and there appears to be a multitude of ways that the data has ended up in the tables in the database (including manual typing). I need some sort of data quality and validation audit benchmark to say the way you are doing things is hopeless, here is a set of controls or best practices to prevent this happening again. I just want a document with “here’s control 1-15 on how to manage this to ensure data quality and useful data”.
0
Comment
Question by:pma111
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
7 Comments
 
LVL 6

Accepted Solution

by:
wshark83 earned 125 total points
ID: 37836862
if its an access database you can force to users to ensure they put correct values by having field validation in place...

best way to show users how bad they are at entering details is some basic stats on each field...i.e. if you have a postcode file and it should only contain a format i.e. AANN NNAA then show them how many don't meet this requirement
0
 
LVL 3

Author Comment

by:pma111
ID: 37836905
Ok thanks. I was more after either a checklist of a workflow on how to audit data quality/validation. Ie a 1-10 checks, i.e. 1st check this, then this, then that etc etc.
0
 
LVL 5

Assisted Solution

by:ggzfab
ggzfab earned 125 total points
ID: 37837018
Guess the audit depends on the definition of the fields, so start with creating a table/fields list where the field definition and possible value(s) are reorded. (You could use the Access table in designmode and fill the field description)

Next you can create GroupBy queries to count the occurrence of different values for fields with a limited number of values. Test date fields to hold only valid dates, etc.

The needed audit queries will depend on the definition and some definitions can't be tested automatically. How to tell or a Lastname holds also the Firstname when it's holding multiple strings... So manual checking will be necessary, but all based on the recorded field definition.
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 3

Author Comment

by:pma111
ID: 37837536
Could either of you (if DBA's) give perhaps a top 5 "routes" that data could/usually will get into tables in access. Obviosuly one is input from some sort of user/application interface, but I am sure there are many others.
0
 
LVL 74

Assisted Solution

by:Jeffrey Coachman
Jeffrey Coachman earned 125 total points
ID: 37837832
Please know that without a sample database to examine, this is an extremely "Broad" question.
any "Top 5" list would depend on the *Specific* design of your existing database.

At the most basic level, the tables must be "normalized" and "Related" properly.
If this is not done first, nothing will ever work quite right.

At a higher level, you need to build tables, that list all the "Valid" values that you will need. (This is also part of the Normalization process)
Then use comboboxes in your forms to only allow those values in the database.

All this being said, ... You, as the person in charge of the project, need to have a firm grasp of these concepts, it is not enough to just get a "Top 5" list and work form that...

So you need to state the level of experience you have with database design.
If you are a beginner, you are facing an uphill battle here.

Beyond that you really need to post a sample database and point out the exact issues you need help with.

Sample database notes:
1. Back up your database(s).
2. Combine the front and back ends into one database file.
3. Remove any startup options, unless they are relevant to the issue.
4. Remove any records unless they are relevant to the issue.
5. Delete any objects that do not relate directly to the issue.
6. Remove any references to any "linked" files (files outside of the database, Images, OLE Files, ...etc)
7. Remove any references to any third party Active-x Controls (unless they are relevant to the issue)
8. Remove, obfuscate, encrypt, or otherwise disguise, any sensitive data.
9. Compile the code. (From the VBA code window, click: Debug-->Compile)
10. Run the compact/Repair utility.
11. Remove any Passwords and/or security.
12. If a form is involved in the issue, set the Modal and Popup properties to: No
    (Again, unless these properties are associated with the issue)
13. Post the explicit steps to replicate the issue.
14. Test the database before posting.

In other words, ...post a database that we can easily open and immediately see and/or troubleshoot the issue.
And if applicable, also include a clear graphical representation of the *Exact* results you are expecting, based on the sample data.


JeffCoachman
0
 
LVL 31

Assisted Solution

by:Helen Feddema
Helen Feddema earned 125 total points
ID: 37838736
For data entry, wherever possible, present users with a combo box or listbox of valid choices, as opposed to allowing free-form entry.  Use input masks where relevant, and validity checks on the AfterInsert event of forms.  If the situation is really bad (sounds like it is!), you may have to use unbound forms, with data being written to tables only after each field is checked for a valid entry.
0
 
LVL 5

Expert Comment

by:ggzfab
ID: 37839400
Many applications can use Access as a database and the Access tables could also be 'linked' to e.g. an Excel spreadsheet.

Assuming the database is filled from another application, the best way to find out is to rename the database to e.g. "name_backup.mdb" and wait untill users complain that their application isn't working :-)

One possible check can be to see or there's an ODBC connection defined pointing to the database, but a .php or .asp application could also build such a connection from code...
The ODBC connections can be found under Start/Programs/Administrative Tools/Datasources (ODBC).
0

Featured Post

Ransomware-A Revenue Bonanza for Service Providers

Ransomware – malware that gets on your customers’ computers, encrypts their data, and extorts a hefty ransom for the decryption keys – is a surging new threat.  The purpose of this eBook is to educate the reader about ransomware attacks.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article describes a method of delivering Word templates for use in merging Access data to Word documents, that requires no computer knowledge on the part of the recipient -- the templates are saved in table fields, and are extracted and install…
This post contains step-by-step instructions for setting up alerting in Percona Monitoring and Management (PMM) using Grafana.
Show developers how to use a criteria form to limit the data that appears on an Access report. It is a common requirement that users can specify the criteria for a report at runtime. The easiest way to accomplish this is using a criteria form that a…
Add bar graphs to Access queries using Unicode block characters. Graphs appear on every record in the color you want. Give life to numbers. Hopes this gives you ideas on visualizing your data in new ways ~ Create a calculated field in a query: …

688 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question