
Hi,

If I supply say 50 values in col A, how would I calculate the confidence interval and number of samples required to achieve a defined confidence level of say 95%?

So if my samples only returned a confidence level of say 46% it would say I need an additional 100 samples to achieve 95%?

Cheers

The key part of the post by matthewspatrick is:

"Taking a bigger sample is not going to increase your confidence level. Rather, having a larger sample makes the width of the confidence interval narrower."

He has given you a good introduction to sampling and a link to more information, which should help you understand and answer the problem.
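To put rough numbers on the quoted point, here is a minimal Python sketch (not taken from anyone's workbook). It assumes a sample standard deviation of 15, as in matthewspatrick's example, and uses the normal approximation (z of about 1.96) instead of a t value to keep it short. The function name `margin_of_error` is just an illustrative choice.

```python
import math
from statistics import NormalDist

# z value for a two-sided 95% confidence level (about 1.96).
# Assumption: normal approximation, not the t distribution.
z = NormalDist().inv_cdf(0.975)

def margin_of_error(std_dev, n):
    """Half-width of the confidence interval for a sample mean."""
    return z * std_dev / math.sqrt(n)

# The confidence level stays at 95% throughout; only the width changes.
for n in (50, 200, 800):
    print(n, round(margin_of_error(15, n), 2))
```

Each time the sample size is quadrupled the margin of error halves, while the confidence level itself is whatever you chose up front.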

Thanks for that. I have added it to my sheet, but it is not making much sense. Using the range from 2 deviations below the mean to 2 deviations above it, I get around 95%+ of values in that range, and yet the proper confidence calculation to the right of the sheet shows only about 6-7% of values falling between the mean and the confidence range?

Sample attached

SamplePX.xlsm

That is not how you use a confidence interval. As I indicated above, the common usage for a confidence interval is to show the precision of a sample mean when it is used to estimate a population mean.

You

In a normally distributed population, it is true that approximately 95% of the members will be within two standard deviations of the mean. However:

That is **not the same thing** as saying "95% will be within the confidence interval for the sample mean"

Looking at how your data are being generated, your source data are not themselves normally distributed.

Indeed, when I built my sample file posted in http:#a38822401, I used =ROUND(NORM.INV(RAND(),100
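To see the difference between the two ranges, here is a rough Python sketch. It uses hypothetical data (50 draws from a normal population with mean 100 and standard deviation 15), not the contents of either attached workbook, and the t value of about 2.01 quoted elsewhere in this thread. The helper `share_within` is an illustrative name.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical data: 50 draws from a normal population (mean 100, sd 15).
sample = [random.gauss(100, 15) for _ in range(50)]

mean = statistics.mean(sample)
sd = statistics.stdev(sample)             # sample standard deviation (n - 1)
margin = 2.01 * sd / len(sample) ** 0.5   # t value of about 2.01 times the standard error

def share_within(values, center, half_width):
    """Fraction of values lying within center +/- half_width."""
    hits = sum(1 for v in values if abs(v - center) <= half_width)
    return hits / len(values)

within_2sd = share_within(sample, mean, 2 * sd)
within_ci = share_within(sample, mean, margin)

print(f"within mean +/- 2 SD:      {within_2sd:.0%}")
print(f"within the CI of the mean: {within_ci:.0%}")
```

The first share should come out close to 95%, the second far lower, because the confidence interval describes where the population mean is likely to be, not where individual values fall.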

But now we're getting kind of far afield. I think your original question has been answered :)

To generate random data like that:

=MEDIAN(5,18,NORM.INV(RAND

The problem comes where you use something like 11.5 for the mean and 7.5 for the standard deviation. If you force all the values to be between 5 and 18, your sample will not be normally distributed, because you are truncating the tails.
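A quick way to see the effect is sketched below in Python. It assumes the mean of 11.5 and standard deviation of 7.5 mentioned above, and it mimics only the MEDIAN(5, 18, x) clamping idea, not the full (cut-off) formula. The names `clamp` and `at_limits` are illustrative.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

LOW, HIGH = 5, 18
MEAN, SD = 11.5, 7.5   # values mentioned in the post

def clamp(x):
    # Same trick as MEDIAN(5, 18, x) in Excel: take the middle of the three values.
    return statistics.median([LOW, HIGH, x])

values = [clamp(random.gauss(MEAN, SD)) for _ in range(10_000)]

# Share of values that were pushed onto one of the two limits.
at_limits = sum(1 for v in values if v in (LOW, HIGH)) / len(values)
print(f"share of values stuck at 5 or 18: {at_limits:.0%}")
```

With these parameters well over a third of the values end up sitting exactly on 5 or 18, which is nothing like a bell curve.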

Calculating the sample size needed to give a stated accuracy (resolution) with a 95% CI is, I believe, the essence of the question.

To determine sample size you need to know:

Process Deviation s

95% CI = x +/- r = x +/- 1.96s/sqrt(n)

95% CI ~ x +/- 2s/sqrt(n)

Resolution r = 2s/sqrt(n)

Rearranging for n gives n = (2s/r)^2

So for example if we wish to estimate cycle time within +/- one minute (r=1)

And we estimate the standard deviation to be five minutes (s = 5)

n = ((2*5)/1)^2 = 100

So if you know the resolution you are trying to achieve, you can calculate the sample size necessary to achieve it at 95% confidence.
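The same arithmetic as a small Python sketch, in case it helps. The function name `required_sample_size` is just an illustrative choice, and rounding up to a whole sample is my assumption.

```python
import math

def required_sample_size(std_dev, resolution, z=2.0):
    """Smallest whole n for which z * std_dev / sqrt(n) <= resolution."""
    return math.ceil((z * std_dev / resolution) ** 2)

# Cycle time example: s = 5 minutes, r = 1 minute
print(required_sample_size(5, 1))           # 100, matching the worked example
print(required_sample_size(5, 1, z=1.96))   # 97, using 1.96 instead of the rounded 2
```

Note that s has to be estimated first, for example from a pilot sample such as the 50 values already in column A.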

I hope this makes sense, I had to get my old SixSigma Black Belt notes out for this one :)

Please state more clearly what you are trying to do, because I think you may be confused as to what a confidence interval for a sample mean is :)

Suppose you drew a random, unbiased sample of 50 items from a normally distributed population. Further suppose the following:

Sample mean = 100

Sample standard deviation = 15

Confidence level = 95%

The confidence interval for your sample mean is a function of the sample mean, the sample standard deviation, and the confidence level.

Based on the above, first you would compute the standard error:

=sample_std_dev / sqrt(sample_size)

=15 / sqrt(50)

~ 2.12

Next, find the t value associated with 95% confidence and 49 degrees of freedom (sample size minus one). This is about 2.01.

Multiply the two to get your margin of error, which in this example is ~ 4.26.

Now, your confidence interval is (sample mean) +/- (margin of error), or 95.74 - 104.26.

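What this means is that we can be 95% confident that the true population mean is a value within the range 95.74 - 104.26. Strictly speaking, if you repeated the sampling many times, about 95% of the intervals built this way would contain the true population mean.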
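For anyone who prefers to check the steps outside Excel, here is a minimal Python sketch of the same calculation. The sample mean of 100 is inferred from the interval quoted above, the t value of 2.01 is typed in by hand, and the function name `confidence_interval` is just an illustrative choice.

```python
import math

def confidence_interval(sample_mean, sample_std_dev, sample_size, t_value):
    """Return (lower, upper) bounds of the confidence interval for the mean."""
    standard_error = sample_std_dev / math.sqrt(sample_size)
    margin = t_value * standard_error
    return sample_mean - margin, sample_mean + margin

# Sample of 50, sample standard deviation 15, t of about 2.01 for 95% confidence
low, high = confidence_interval(100, 15, 50, 2.01)
print(f"{low:.2f} - {high:.2f}")   # 95.74 - 104.26
```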

Taking a bigger sample is not going to increase your confidence level. Rather, having a larger sample, ceteris paribus, makes the width of the confidence interval narrower.

Please see the attached file for an example of how I would do this.

Q-28009322.xlsx

Patrick