Solved

How do I identify strange data in Access 2007?

Posted on 2011-03-25
Medium Priority
318 Views
Last Modified: 2012-05-11
I have a collection of data in which, every now and then, an incorrect setting in the inputs used to calculate a result has caused that result to be wrong.

Somewhat fortunately, when it goes wrong it goes really wrong, so at a glance it is obvious that something is off. The sheer amount of data, however, makes the bad records incredibly hard to find in the first place.

See the data I have provided.

The 49 items given should be similar, with an average value of 0.228, so it is obvious that the two values of 2600 are incorrect, or at least different.

I have a means of segregating the data into what should be like groups

Is there a way Access can flag the two odd records?

Thanks
Paul
ExampleData.xls
Question by:Zarbs
9 Comments
 
LVL 120

Accepted Solution

by:
Rey Obrero (Capricorn1) earned 375 total points
ID: 35220519
If you are expecting values less than one to be the correct values, you can use a query to pinpoint the wrong ones:

select product, totalcost
from table
where totalcost >1

 

Author Comment

by:Zarbs
ID: 35220542
I am really trying to isolate data that does not match the rest of its group. The cutoff need not be any particular number, as each "group" of data could have different values, minimums, maximums, averages, etc.
 

Author Comment

by:Zarbs
ID: 35220550
For example, if I have 40 values between 1 and 2, I would not consider a value of 100 to be part of that set.

 
LVL 120

Expert Comment

by:Rey Obrero (Capricorn1)
ID: 35220555
Well, that will be difficult if you cannot establish the correct parameters for pinpointing the incorrect values.
 

Author Comment

by:Zarbs
ID: 35220569
Thinking back to my school days, there was something called correlation that measured whether something was part of the same group.

Too long ago
 
LVL 40

Assisted Solution

by:als315
als315 earned 375 total points
ID: 35221536
You can continue studying statistics, of course :)
But for a simple analysis you can use StDev and Avg.
In this example you can set how many standard deviations from the average a value may lie before it is treated as suspect.
DB26913114.zip
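
To make the StDev/Avg idea concrete, here is a minimal sketch written in Python rather than Access, purely as an illustration. It is not the contents of the attached database. The function name flag_by_stdev, the group labels, and the sample values are all hypothetical, and the data is grouped first because the question says the records can be separated into like groups.

```python
from collections import defaultdict
from statistics import mean, stdev


def flag_by_stdev(records, k=2.0):
    """Return (group, value) pairs lying more than k sample standard
    deviations from the mean of their own group.

    records: iterable of (group, value) pairs (hypothetical layout).
    """
    by_group = defaultdict(list)
    for group, value in records:
        by_group[group].append(value)

    flagged = []
    for group, values in by_group.items():
        if len(values) < 2:
            continue  # the sample standard deviation needs two or more points
        avg = mean(values)
        sd = stdev(values)
        if sd == 0:
            continue  # every value is identical, so nothing stands out
        flagged.extend((group, v) for v in values if abs(v - avg) > k * sd)
    return flagged


# Made-up data: group "A" sits near 0.2 with two bad results of 2600,
# group "B" sits near 1.5 with one bad result of 100.
records = (
    [("A", 0.2)] * 47 + [("A", 2600.0)] * 2
    + [("B", 1.5)] * 40 + [("B", 100.0)]
)
print(flag_by_stdev(records))
```

The multiplier k plays the role of the "number of standard deviations" setting described above. A larger k flags fewer records.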
 
LVL 93

Assisted Solution

by:Patrick Matthews
Patrick Matthews earned 375 total points
ID: 35221826
>>Thinking back to my school days there was something called correlation that measured whether something was part of same group.


That's not what correlation is.  Correlation quantifies the degree to which two random variables are related.

:)

Anyway, be very careful about using an approach that simply uses sample standard deviation and sample mean.  In your sample set, the sample standard deviation is 533, despite the fact that only two members of the set are >1.  Thus, if you set a rule such as "reject any items >2 SD from the mean", you could be asking for trouble.  (In a normal distribution, approximately 95% of the observations are expected to fall within 2 SD of the mean.)

Consider that if I add a single value of 300 to your set, the sample standard deviation stays high--about 528--and so this simple approach would accept 300 as valid, despite it being wholly unlike almost every other member of the set.

So, can you formulate a couple of rules that would identify a possible outlier?
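
One possible rule, offered only as a sketch and not something proposed in this thread, is to measure distance from the median in units of the median absolute deviation (MAD), since neither is dragged upward by a few huge values. The Python below compares that rule with the plain mean/SD rule on made-up data. The function names are hypothetical, and the 0.6745 scale factor and 3.5 cutoff are a commonly used "modified z-score" convention, assumed here rather than taken from the question.

```python
from statistics import mean, median, stdev


def flag_by_stdev(values, k=2.0):
    """Plain rule: flag values more than k sample SDs from the mean."""
    avg, sd = mean(values), stdev(values)
    return [v for v in values if abs(v - avg) > k * sd]


def flag_by_mad(values, threshold=3.5):
    """Robust rule: flag values whose modified z-score exceeds threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Degenerate case: more than half the values equal the median,
        # so treat anything that differs from it as suspect.
        return [v for v in values if v != med]
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Made-up data: 45 ordinary values, two results of 2600, and one of 300.
values = [0.18, 0.20, 0.22, 0.24, 0.26] * 9 + [2600.0, 2600.0, 300.0]

print(flag_by_stdev(values))  # the 300 is hidden by the inflated SD
print(flag_by_mad(values))    # the 300 is flagged along with the 2600s
```

On this data the mean/SD rule misses the 300 for the reason given above: the two 2600s inflate the standard deviation so much that 300 falls well inside 2 SD of the mean.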
 
LVL 44

Assisted Solution

by:GRayL
GRayL earned 375 total points
ID: 35222900
I imported the xls file into an Access table named tblCosts  
Pick a number you want as the max value for the average under consideration. Then run this Query:

SELECT Avg(TotalCost) FROM tblCosts WHERE TotalCost < [EnterMaxValue];

0.3 yields 0.20738..
0.4 yields 0.21787..
0.5 yields 0.21816..

If all the data were together, you could enter both Min and Max values in the WHERE clause.  You get the idea.
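
For readers who want to try the same idea outside Access, the sketch below mirrors the query in Python: average only the values under a chosen cutoff, then watch how the result changes as the cutoff moves. The function name average_below and the sample values are hypothetical, and it returns None for an empty selection as a stand-in for the Null that Avg returns when no rows match.

```python
from statistics import mean


def average_below(values, max_value):
    """Average of the values strictly below max_value, or None if none qualify."""
    kept = [v for v in values if v < max_value]
    return mean(kept) if kept else None


# Made-up costs: mostly small values plus two obviously wrong results.
costs = [0.15, 0.19, 0.21, 0.22, 0.25, 0.31, 0.45, 2600.0, 2600.0]

for cutoff in (0.3, 0.4, 0.5):
    print(cutoff, average_below(costs, cutoff))
```

Once the average stops changing much as the cutoff rises, the cutoff is probably above the genuine values and below the bad ones.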
 

Author Closing Comment

by:Zarbs
ID: 35223184
Thank you all very much for your suggestions and ideas.  I will muddle through.

I may be a little old for school

Cheers
Paul

Question has a verified solution.
