Solved

Calculate Confidence Level and Number of required samples based on supplied data

Posted on 2013-01-25
11
844 Views
Last Modified: 2014-01-18
Hi,

If I supply say 50 values in col A, how would I calculate the confidence interval and number of samples required to achieve a defined confidence level of say 95%?

So if my samples only returned a confidence level of say 46% it would say I need an additional 100 samples to achieve 95%?

Cheers
0
Comment
Question by:MirageSF
11 Comments
 
LVL 92

Accepted Solution

by:
Patrick Matthews earned 500 total points
ID: 38822401
MirageSF,

Please state more clearly what you are trying to do, because I think you may be confused as to what a confidence interval for a sample mean is :)

Suppose you drew a random, unbiased sample of 50 items from a normally distributed population.  Further suppose the following:
You know neither the true population mean nor the true population variance
Your sample mean is 100
Your sample standard deviation is 15
Your desired confidence level is 95%

The confidence interval for your sample mean is a function of the sample mean, the sample standard deviation, and the confidence level.

Based on the above, first you would compute the standard error:

=sample_std_dev / sqrt(sample_size)
=15 / sqrt(50)
~ 2.12

Next, find the t value associated with 95% confidence and 50 degrees of freedom.  This is about 2.01.

Multiply the two to get your margin of error, which in this example is ~ 4.26.

Now, your confidence interval is (sample mean) +/- (margin of error), or 95.74 - 104.26.

What this means is that we expect that there is a 95% chance that the true population mean is a value within the range 95.74 - 104.26.

Taking a bigger sample is not going to increase your confidence level.  Rather, having a larger sample, ceteris paribus, makes the width of the confidence interval narrower.

Please see the attached file for an example of how I would do this.

Q-28009322.xlsx

Patrick
0
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 38822403
This web page does a good job of explaining it: http://stattrek.com/estimation/confidence-interval-mean.aspx
0
 
LVL 27

Expert Comment

by:aburr
ID: 38822456
Getting more experts to look at you question will not help much.
the key part of the post by matthewspatrick is

"Taking a bigger sample is not going to increase your confidence level.  Rather, having a larger sample makes the width of the confidence interval narrower."

He has given you a good introduction to sampling and a link to more infromation which should help you understand and answer the problem.
0
 

Author Comment

by:MirageSF
ID: 38822483
Hi,

Thanks for that I have added to my sheet, but not making much sense, using the deviation difference between the mean and 2points before and after the mean I get around 95%+ of values in that range, and yet the proper confidence calculation to the right of the sheet only shows about 6-7% of values fall between the mean and confidence range?

Sample attached
SamplePX.xlsm
0
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 38822587
You wrote:

using the deviation difference between the mean and 2points before and after the mean I get around 95%+ of values in that range, and yet the proper confidence calculation to the right of the sheet only shows about 6-7% of values fall between the mean and confidence range

That is not how you use a confidence interval.  As I indicated above, the common usage for a confidence interval is to show how the precision of a sample mean when it is used to estimate a population mean.

You cannot use a confidence interval in the way you tried above.  That is not what it is designed to do.

In a normally distributed population, it is true that approximately 95% of the members will be within two standard deviations of the mean.  However:
That is not the same thing as saying "95% will be within the confidence interval for the sample mean"
In looking at how your data are being generated, your source data are not themselves normally distributed
0
How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

 

Author Comment

by:MirageSF
ID: 38822598
Is there a way to fix that so it produces similar stuff as to what is in the example, but normal dist?
0
 

Author Comment

by:MirageSF
ID: 38822690
Sweeping the net I found this interesting command NORMINV(RAND(),$B$1,$B$2), this simple command does away with my SHEET2, which created 5 random numbers and then the average of these was used to create the Central Limit Theroum, this command also allows me to define the Average base and Standard deviation base.
0
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 38822860
Yes, if you wanted to generate some fake data assuming a normal distribution, you would use that function.  (If you are running Excel 2007 or later, I recommend using NORM.INV() instead; that version of the function performs better both in terms of speed and fidelity.)

Indeed, when I built my sample file posted in http:#a38822401, I used =ROUND(NORM.INV(RAND(),100,15),0) to generate my data points.

But now we're getting kind of far afield.  I think your original question has been answered :)
0
 

Author Closing Comment

by:MirageSF
ID: 38822909
Thanx for that, whilst were on NORM.INV is it possible to get it also to work within a range, say 5-18?
0
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 38823171
Yes, you could, but be very, very careful about doing that: you could end up with data that are not in fact normally distributed.

To generate random data like that:

=MEDIAN(5,18,NORM.INV(RAND(),mean,st_dev))

The problem comes where you use something like 11.5 for the mean and 7.5 for the standard deviation.  If you force all the values to be between 5 and 18, your sample will not be normally distributed, because you are truncating the tails.
0
 
LVL 24

Expert Comment

by:Steve
ID: 38825893
OK, I thought I would add my 2 cents on this one:

The calculation of Sample size to give an accuracy (resolution) with a 95% CI is, I belive, what the essence of the question is about.

To determine sample size you need to know:
Desired Resolution r
Process Deviation s

95% CI  =  x +/- r   =  x+/- 1.96s/ sqrrt(n)
95% CI  ~  x+/- 2s/sqrrt(n)
Resolution r = 2s/sqrrt(n)

n =  (2s/r)^2

So for example if we wish to estimate cycle time within +/- one minute (r=1)
And we estimate the standard deviation to be five minutes (s = 5)

n = ((2*5)/1)^2 = 100

So if you know the reolution you are trying to acheive you can calculate the nescessary sample size to acheive a 95% confidence.

I hope this makes sense, I had to get my old SixSigma Black Belt notes out for this one :)
0

Featured Post

How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

Join & Write a Comment

Article by: Nadia
Linear search (searching each index in an array one by one) works almost everywhere but it is not optimal in many cases. Let's assume, we have a book which has 42949672960 pages. We also have a table of contents. Now we want to read the content on p…
Lithium-ion batteries area cornerstone of today's portable electronic devices, and even though they are relied upon heavily, their chemistry and origin are not of common knowledge. This article is about a device on which every smartphone, laptop, an…
This Micro Tutorial demonstrates how to create Excel charts: column, area, line, bar, and scatter charts. Formatting tips are provided as well.
Excel styles will make formatting consistent and let you apply and change formatting faster. In this tutorial, you'll learn how to use Excel's built-in styles, how to modify styles, and how to create your own. You'll also learn how to use your custo…

757 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

21 Experts available now in Live!

Get 1:1 Help Now