Solved

Data quality and validation

Posted on 2012-04-12
7
336 Views
Last Modified: 2012-06-21
I need some form of audit checklist for data quality and validation. We have hundreds and hundreds of lines of rows of data in an access database that seems to be a complete mess when it comes to quality, data in wrong fields, missing data, inconsistent data etc. There is no application front end that controls the input of data, and there appears to be a multitude of ways that the data has ended up in the tables in the database (including manual typing). I need some sort of data quality and validation audit benchmark to say the way you are doing things is hopeless, here is a set of controls or best practices to prevent this happening again. I just want a document with “here’s control 1-15 on how to manage this to ensure data quality and useful data”.
0
Comment
Question by:pma111
7 Comments
 
LVL 6

Accepted Solution

by:
wshark83 earned 125 total points
ID: 37836862
if its an access database you can force to users to ensure they put correct values by having field validation in place...

best way to show users how bad they are at entering details is some basic stats on each field...i.e. if you have a postcode file and it should only contain a format i.e. AANN NNAA then show them how many don't meet this requirement
0
 
LVL 3

Author Comment

by:pma111
ID: 37836905
Ok thanks. I was more after either a checklist of a workflow on how to audit data quality/validation. Ie a 1-10 checks, i.e. 1st check this, then this, then that etc etc.
0
 
LVL 5

Assisted Solution

by:ggzfab
ggzfab earned 125 total points
ID: 37837018
Guess the audit depends on the definition of the fields, so start with creating a table/fields list where the field definition and possible value(s) are reorded. (You could use the Access table in designmode and fill the field description)

Next you can create GroupBy queries to count the occurrence of different values for fields with a limited number of values. Test date fields to hold only valid dates, etc.

The needed audit queries will depend on the definition and some definitions can't be tested automatically. How to tell or a Lastname holds also the Firstname when it's holding multiple strings... So manual checking will be necessary, but all based on the recorded field definition.
0
IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 
LVL 3

Author Comment

by:pma111
ID: 37837536
Could either of you (if DBA's) give perhaps a top 5 "routes" that data could/usually will get into tables in access. Obviosuly one is input from some sort of user/application interface, but I am sure there are many others.
0
 
LVL 74

Assisted Solution

by:Jeffrey Coachman
Jeffrey Coachman earned 125 total points
ID: 37837832
Please know that without a sample database to examine, this is an extremely "Broad" question.
any "Top 5" list would depend on the *Specific* design of your existing database.

At the most basic level, the tables must be "normalized" and "Related" properly.
If this is not done first, nothing will ever work quite right.

At a higher level, you need to build tables, that list all the "Valid" values that you will need. (This is also part of the Normalization process)
Then use comboboxes in your forms to only allow those values in the database.

All this being said, ... You, as the person in charge of the project, need to have a firm grasp of these concepts, it is not enough to just get a "Top 5" list and work form that...

So you need to state the level of experience you have with database design.
If you are a beginner, you are facing an uphill battle here.

Beyond that you really need to post a sample database and point out the exact issues you need help with.

Sample database notes:
1. Back up your database(s).
2. Combine the front and back ends into one database file.
3. Remove any startup options, unless they are relevant to the issue.
4. Remove any records unless they are relevant to the issue.
5. Delete any objects that do not relate directly to the issue.
6. Remove any references to any "linked" files (files outside of the database, Images, OLE Files, ...etc)
7. Remove any references to any third party Active-x Controls (unless they are relevant to the issue)
8. Remove, obfuscate, encrypt, or otherwise disguise, any sensitive data.
9. Compile the code. (From the VBA code window, click: Debug-->Compile)
10. Run the compact/Repair utility.
11. Remove any Passwords and/or security.
12. If a form is involved in the issue, set the Modal and Popup properties to: No
    (Again, unless these properties are associated with the issue)
13. Post the explicit steps to replicate the issue.
14. Test the database before posting.

In other words, ...post a database that we can easily open and immediately see and/or troubleshoot the issue.
And if applicable, also include a clear graphical representation of the *Exact* results you are expecting, based on the sample data.


JeffCoachman
0
 
LVL 31

Assisted Solution

by:Helen_Feddema
Helen_Feddema earned 125 total points
ID: 37838736
For data entry, wherever possible, present users with a combo box or listbox of valid choices, as opposed to allowing free-form entry.  Use input masks where relevant, and validity checks on the AfterInsert event of forms.  If the situation is really bad (sounds like it is!), you may have to use unbound forms, with data being written to tables only after each field is checked for a valid entry.
0
 
LVL 5

Expert Comment

by:ggzfab
ID: 37839400
Many applications can use Access as a database and the Access tables could also be 'linked' to e.g. an Excel spreadsheet.

Assuming the database is filled from another application, the best way to find out is to rename the database to e.g. "name_backup.mdb" and wait untill users complain that their application isn't working :-)

One possible check can be to see or there's an ODBC connection defined pointing to the database, but a .php or .asp application could also build such a connection from code...
The ODBC connections can be found under Start/Programs/Administrative Tools/Datasources (ODBC).
0

Featured Post

Maximize Your Threat Intelligence Reporting

Reporting is one of the most important and least talked about aspects of a world-class threat intelligence program. Here’s how to do it right.

Join & Write a Comment

Read about achieving the basic levels of HRIS security in the workplace.
Using SQL Scripts we can save all the SQL queries as files that we use very frequently on our database later point of time. This is one of the feature present under SQL Workshop in Oracle Application Express.
In Microsoft Access, learn different ways of passing a string value within a string argument. Also learn what a “Type Mis-match” error is about.
In Microsoft Access, learn how to use Dlookup and other domain aggregate functions and one method of specifying a string value within a string. Specify the first argument, which is the expression to be returned: Specify the second argument, which …

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

13 Experts available now in Live!

Get 1:1 Help Now