I am designing a system for product safety inspectors. These inspectors go to a shop or warehouse, take a sample of the goods for sale and test them for safety. If an item fails the test, it is assigned certain failure codes. The system will allow inspectors to store various types of information about the Inspection, including Quantity Inspected and Quantity Failed. It will also calculate optimum Sample Sizes and confidence intervals for these, as well as some kind of confidence data for the detected rate of failure.

I need to confirm that my understanding of this is correct and that the answers my system will give to the users will make sense.

I know how to calculate the confidence interval for a given sample size (assuming arbitrarily large population size and random sampling method):

cc = sqrt( Z^2 * p * (1-p) / ss )

Where

cc = margin of error (half the width of the confidence interval), expressed as a proportion

Z = Z value (e.g. 1.96 for 95% confidence)

p = failure rate (0.5 for worst case scenario)

ss = sample size (Quantity Inspected)
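The formula above can be sketched directly in code; this is a minimal illustration of the same calculation, with the function name and defaults chosen here for clarity rather than taken from any existing system:

```python
import math

def margin_of_error(p, ss, z=1.96):
    """Half-width of the (Wald) confidence interval for a proportion.

    p  -- failure rate (use 0.5 for the worst-case scenario)
    ss -- sample size (Quantity Inspected)
    z  -- Z value for the desired confidence level (1.96 for 95%)
    """
    return math.sqrt(z ** 2 * p * (1 - p) / ss)
```

For example, `margin_of_error(0.5, 100)` is roughly 0.098, i.e. about ±10 percentage points around the detected rate, while quadrupling the sample to 400 halves the margin.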

What I'm not so sure about is how to measure the confidence that an inspector may have in the detected failure rate *after* the inspection. For example, if no items in the sample fail, how much confidence can he/she have that no other items in the population will fail? If I calculate the confidence interval in this circumstance, with p = 0.0, the answer is always 0 (obviously!) whatever the sample size. However, this can't be true because if I sample only 10 items, surely the confidence will be significantly less than if I had sampled 100 items? Or, is the detected failure rate inconsequential here?

Thanks

This is in fact the "sensitivity" of your measurement. To obtain proper estimates of failure rates, you need a sample large enough that, in most cases, at least one failure appears in it.
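One way to make "large enough" concrete: if each item fails independently with some expected rate, the chance of seeing no failures in n items is (1 - p)^n, which can be solved for the smallest n giving a desired detection probability. A sketch, assuming a guessed expected failure rate (not something your system would know in advance):

```python
import math

def sample_size_to_see_a_failure(expected_p, detect_prob=0.95):
    """Smallest n such that P(at least one failure among n items) >= detect_prob,
    assuming items fail independently with probability expected_p each.

    Derivation: P(no failures) = (1 - expected_p)**n, so we need
    1 - (1 - expected_p)**n >= detect_prob, i.e.
    n >= log(1 - detect_prob) / log(1 - expected_p).
    """
    return math.ceil(math.log(1 - detect_prob) / math.log(1 - expected_p))
```

With an expected failure rate of 10% this gives 29 items; at 1% it jumps to 299, which shows how quickly the required sample grows as failures become rarer.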

On the other hand, you are probably only interested in checking whether the failure rate stays below a maximum set by safety laws, quality targets, and what-have-you. Therefore, you might be happy with the smallest sample size that satisfies the following:

1. applicability of statistical criteria (10? 20?)

2. enough resolution that the required maximum failure rate corresponds to more than 1 expected failure in the sample (2? 3?)

I would ask the manufacturer what failure rate they expect. If that is not possible, I would start with a sample of 10 and see whether it needs to be increased. After all, there will certainly be a period of testing, calibration, and verification.
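On the original zero-failure question: even when the detected rate is exactly 0, an upper confidence bound on the true rate can still be computed, and it does shrink with sample size. One standard way is the one-sided Clopper-Pearson exact bound for zero observed failures, which for 95% confidence is well approximated by the "rule of three" (3/n). A sketch:

```python
def upper_bound_zero_failures(n, confidence=0.95):
    """Exact one-sided upper confidence bound on the failure rate when
    0 failures are observed in a sample of n items (Clopper-Pearson, x = 0).

    Derivation: the probability of 0 failures in n items at true rate p is
    (1 - p)**n; the bound is the p for which this equals 1 - confidence.
    """
    alpha = 1 - confidence
    return 1 - alpha ** (1 / n)
```

For n = 10 the 95% upper bound is about 0.26 (the true failure rate could still be as high as 26%), while for n = 100 it drops to about 0.03, confirming the intuition that sampling 100 items with no failures is far more reassuring than sampling 10.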