Solved

Calculating confidence for sample sizes when testing pass/failure

Posted on 2004-04-30
Last Modified: 2013-11-13
I am designing a system for product safety inspectors. These inspectors go to a shop or warehouse, take a sample of the goods for sale and test them for safety. If they fail the test they are assigned certain failure codes. The system will allow them to store various types of information about the Inspection, including Quantity Inspected and Quantity Failed. It will also calculate optimum Sample Sizes and confidence intervals for these, as well as some kind of confidence data for the detected rate of failure.

I need to confirm that my understanding of this is correct and that the answers my system will give to the users will make sense.

I know how to calculate the confidence interval for a given sample size (assuming arbitrarily large population size and random sampling method):

cc = sqrt( Z^2 * p * (1-p) / ss )

Where
  cc = confidence interval as a percentage of detected failure rate
  Z = Z value (e.g. 1.96 for 95% confidence)
  p = failure rate (0.5 for worst case scenario)
  ss = sample size (Quantity Inspected)
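
A minimal Python sketch of the formula above, together with its rearrangement for the sample size needed to reach a target margin. The function names are illustrative, and the same assumptions apply (very large population, random sampling, normal approximation):

```python
import math


def margin_of_error(z, p, ss):
    """Half-width cc of the normal-approximation interval for a proportion."""
    return math.sqrt(z * z * p * (1.0 - p) / ss)


def sample_size_for_margin(z, p, cc):
    """Smallest ss whose margin of error does not exceed cc (same formula, solved for ss)."""
    return math.ceil(z * z * p * (1.0 - p) / (cc * cc))


# Worst case p = 0.5 at 95% confidence (Z = 1.96)
print(margin_of_error(1.96, 0.5, 100))
print(sample_size_for_margin(1.96, 0.5, 0.05))

# With p = 0 the formula collapses to zero for every sample size,
# which is the problem raised in the next paragraph.
print(margin_of_error(1.96, 0.0, 10))
```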

What I'm not so sure about is how to measure the confidence that an inspector may have in the detected failure rate *after* the inspection. For example, if no items in the sample fail, how much confidence can he/she have that no other items in the population will fail? If I calculate the confidence interval in this circumstance, with p = 0.0, the answer is always 0 (obviously!) whatever the sample size. However, this can't be true because if I sample only 10 items, surely the confidence will be significantly less than if I had sampled 100 items? Or, is the detected failure rate inconsequential here?

Thanks
Question by:jpkemp

Accepted Solution

by:
PointyEars earned 500 total points
ID: 10957180
If no items fail, it means that the failure rate is less than 1/(sample size).  With a sample size of 10, the failure rate can only be estimated to be less than 0.1.

This in fact represents the "sensitivity" of your measurement.  To have proper estimates of failure rates, you should have a sample large enough to ensure that in most cases there is at least one failure.

On the other hand, you are probably only interested in checking whether the failure rate remains below a maximum determined on the basis of safety laws, quality target, and what-have-you.  Therefore, you might be happy with the smallest sample size which satisfies the following:
1. applicability of statistical criteria (10? 20?)
2. enough spread to have the required maximum above 1 (2? 3?)

I would ask the manufacturer to tell you what failure rate they expect.  If that were not possible, I would start with a sample of 10 and see whether it needs to be increased.  After all, there will certainly be a period of testing, calibration, verification.
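
Once a maximum acceptable failure rate has been fixed, one common way to pick the sample size is to ask how many items must all pass before that rate can be ruled out at a chosen confidence level. The sketch below is not from the thread. It assumes a simple binomial model (independent random draws from a very large population), and the function name is illustrative. If the true rate were p_max, the chance of seeing no failures in n items is (1 - p_max)^n, so n is chosen to push that chance down to 1 - confidence or less.

```python
import math


def zero_failure_sample_size(p_max, confidence=0.95):
    """Smallest n such that n passes out of n rule out a failure rate of p_max.

    Assumes a binomial model: P(no failures in n) = (1 - p_max) ** n.
    """
    if not 0.0 < p_max < 1.0:
        raise ValueError("p_max must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log(1.0 - p_max))


# Items that must all pass to show, at 95% confidence, a failure rate below 1%
print(zero_failure_sample_size(0.01))
```

Under these assumptions, a limit of 1% needs roughly 300 items with no failures, which shows why very low limits imply large samples.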

Author Comment

by:jpkemp
ID: 10965295
The kind of failures we're talking about here are serious failures, possibly resulting in a total product recall and/or prosecution of the vendor or supplier. It would not be appropriate or useful to ask the manufacturers what failure rate they expect - the failure rate we expect would normally be zero.

Expert Comment

by:PointyEars
ID: 10966216
Even if their target is zero, they should define a limit below which they are satisfied.

In any case, if they set their limit so low, it means that any practicable sample will be comparatively small.  That is, no matter how large you take it, it will still give you a limited confidence.  For example, a sample of 100 with no failures tells you that the failure rate is expected to be below 1%.

If you manufacture telephone exchanges, you might be asked by your customers to ensure a maximum off-line time of 5 minutes per year.  Considering that there are more than half a million minutes in a year, this is almost equivalent to a zero-time failure.  But it is measurable.
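
For comparison with the 100-item example above, a standard way to attach a confidence level to a zero-failure sample is the exact binomial upper bound. This is a sketch under a binomial model (independent random draws, very large population), not something stated in the thread, and the function name is illustrative. The bound is the failure rate p at which seeing zero failures in n items would have probability exactly 1 - confidence, that is (1 - p)^n = 1 - confidence.

```python
def zero_failure_upper_bound(n, confidence=0.95):
    """One-sided upper confidence bound on the failure rate after 0 failures in n items."""
    if n <= 0:
        raise ValueError("n must be positive")
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n)


for n in (10, 20, 100):
    # The familiar "rule of three" approximation to the 95% bound is 3 / n
    print(n, zero_failure_upper_bound(n), 3.0 / n)
```

Under this model, 10 clean items give a 95% bound of about 26%, and 100 clean items give about 3%, so the bound does shrink with sample size even though the observed rate is zero in both cases.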

Author Comment

by:jpkemp
ID: 10966355
So, you're saying the confidence interval is 1/ss? At what confidence level is this?

I need more details and a more rigorous mathematical foundation.

Thanks

Expert Comment

by:PointyEars
ID: 10966390
No, 1/ss is not a confidence interval.

Normally, the expected value of an event and the standard deviation of its distribution are estimated from the distribution of events occurring in the sample.  To do so you have to hypothesise a particular distribution (let's say normal/Gaussian).

Then, you can ask yourself: "with this distribution, mean, and standard deviation, what is the range of values around the mean that contains 95% of the probability?"  This is your confidence interval.

The problem in your situation is that you have no statistical data to work with.  If all samples always return zero, you have no distribution of results on which you can calculate mean and standard deviation.  Therefore, you can only say that you expect an occurrence below 1 out of ss, but cannot really say with what probability.
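
If one is willing to assume a binomial model (independent random draws from a very large population), a probability can be attached to the "below 1 out of ss" statement. The sketch below is an illustration under that assumption, with an illustrative function name: the confidence that the true rate is below a bound p, after n items with no failures, is 1 - (1 - p)^n.

```python
def confidence_for_bound(n, p_bound):
    """Confidence that the failure rate is below p_bound after 0 failures in n items.

    Assumes a binomial model: a rate of p_bound would give zero failures
    with probability (1 - p_bound) ** n.
    """
    return 1.0 - (1.0 - p_bound) ** n


for n in (10, 20, 100):
    # Confidence attached to the bound 1/n
    print(n, confidence_for_bound(n, 1.0 / n))
```

Under this model the bound 1/ss corresponds to a confidence of only about 63% to 65%, whatever the sample size.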

Sorry, I have to rush, but I didn't want to leave you hanging.

Author Comment

by:jpkemp
ID: 10974025
Thanks PointyEars.

I think I'm starting to understand. My calculation of a confidence interval is meaningless with a zero mean failure rate because the formula assumes the rate for the entire population is proportional to the rate for the sample.

Considering the 1/ss formula; if we take a sample of 10 items from a crate of 100 and find zero failures, 1/ss yields 10% (i.e. up to 10 items in the crate might have failed); if a sample of 20 was taken, 1/ss is 5% (i.e. up to 5 items in the crate might have failed). Is this a valid interpretation?
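
For a crate of known size, the question can also be framed with the hypergeometric distribution (sampling without replacement). The sketch below is an illustration rather than part of the thread. It assumes the sample is drawn at random from the crate, and the function names are illustrative. It computes the chance of seeing no failures in the sample for a given number of defective items in the crate, and the largest number of defectives that a clean sample cannot rule out at a chosen confidence.

```python
from math import comb


def prob_zero_failures(crate_size, defectives, sample_size):
    """P(sample contains no defective items), sampling without replacement."""
    return comb(crate_size - defectives, sample_size) / comb(crate_size, sample_size)


def max_plausible_defectives(crate_size, sample_size, confidence=0.95):
    """Largest defective count for which a clean sample still has probability > 1 - confidence."""
    alpha = 1.0 - confidence
    k = 0
    while (k < crate_size - sample_size
           and prob_zero_failures(crate_size, k + 1, sample_size) > alpha):
        k += 1
    return k


# Crate of 100, sample of 10, no failures found
print(prob_zero_failures(100, 10, 10))
print(max_plausible_defectives(100, 10))
```

With 10 defective items in a crate of 100, a clean sample of 10 would still occur about a third of the time, so "up to 10" is not a 95% statement under this model.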

Expert Comment

by:PointyEars
ID: 10975287
Basically yes, but not exactly.

If "m" is the expected failure rate and "d" its standard deviation, the formula
  f(n) = (m + 3*d) * n
gives you the maximum number of failures that you would expect on average in a sample of size "n".

If "m" is very low, f(n) will remain below 1, and you will measure zero failures.  Then, you will only know that
  (m + 3*d) * n < 1

So, you cannot say that m < 1/n, but that m + 3*d < 1/n.

If you assume d = sqrt(m) and solve (here for n = 20):
  m + 3*sqrt(m) < 1/n
you get:
  m < 0.00027
  d = 0.017

BUT, can you really assume that d = sqrt(m) ?

I believe that it is more reasonable to stop at:
  m + 3*d < 1/n
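
The arithmetic behind the figures above can be checked with a short sketch. Substituting x = sqrt(m) turns m + 3*sqrt(m) = 1/n into the quadratic x^2 + 3x - 1/n = 0, whose positive root gives d, and m = d^2. With n = 20 this reproduces m < 0.00027 and d = 0.017. The function name is illustrative, and d = sqrt(m) is the assumption questioned in the comment itself.

```python
import math


def solve_m(n, k=3.0):
    """Solve m + k*sqrt(m) = 1/n for m, assuming d = sqrt(m).

    Returns (m, d) where d is the positive root of x**2 + k*x - 1/n = 0.
    """
    rhs = 1.0 / n
    d = (-k + math.sqrt(k * k + 4.0 * rhs)) / 2.0
    return d * d, d


m, d = solve_m(20)
print(m, d)
```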

Author Comment

by:jpkemp
ID: 10983281
What can m+3*d<1/n tell me when I only know the value of n (>0) and m=0, and how do I use this to measure confidence?

Expert Comment

by:PointyEars
ID: 10984053
Consider that you are looking at an upper limit.  Therefore, you can safely say that m < 1/n, because "d" certainly is > 0.

That's how you arrive at what I said in a previous email: 1/sampleSize = upper limit for the percentage of failures.

If you think about it, it makes perfectly [common] sense: if you make 10 measurements and none fails, you can just say that not more than 10% of your items fail.  This is obviously only statistically valid if your sample is large enough (10, 20...)

Author Comment

by:jpkemp
ID: 11004709
Ok, I think that makes sense to me.

Thanks for hanging in there.