Okay - it's been too long since I've done this stuff - but I can tell you for definite that you can derive the formula for standard deviation from a method called Maximum Likelihood Estimation (MLE). This is essentially a (quite complex) method which will give you an estimator for a statistic of your data. Because it is complex, it can be difficult to solve for some statistics, but it is (relatively) easy for the mean and variance. As part of the derivation it can be found that while dividing by N gives the right answer when your data is the whole population, it gives a biased estimator when your data is only a sample and the mean is itself estimated from that sample. Dividing by N - 1 solves the problem for a sample. If you really want, I can try to dig out some links for MLE, but quite honestly the logic ain't easy! Essentially in the calculation of an MLE there is also a bias element. You can trade off bias against precision, i.e. how much the estimate varies from sample to sample (if memory serves).
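For anyone who wants to see where the N - 1 comes from without the full MLE machinery, here is a rough sketch of the usual expectation argument. It assumes the samples x_1, ..., x_N are independent with the same mean \mu and variance \sigma^2, and writes \bar{x} for the sample average (these symbols are my own notation, not from the post above).

```latex
\begin{align*}
\sum_{i=1}^{N} (x_i - \bar{x})^2
  &= \sum_{i=1}^{N} (x_i - \mu)^2 - N(\bar{x} - \mu)^2 \\
E\left[\sum_{i=1}^{N} (x_i - \bar{x})^2\right]
  &= N\sigma^2 - N \cdot \frac{\sigma^2}{N} = (N-1)\,\sigma^2 \\
E\left[\frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2\right]
  &= \frac{N-1}{N}\,\sigma^2
  \qquad
E\left[\frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2\right]
  = \sigma^2
\end{align*}
```

So dividing by N underestimates the variance on average by a factor of (N-1)/N, and dividing by N - 1 removes that bias.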
I'm sorry the explanation isn't a simple one - but it's the best I can do without trying to relearn my college notes on the topic (and that's not worth 1000 points!!!).
by: acerola, posted on 2002-06-14 at 21:17:35, ID: 7080054
Standard deviation is the square root of the mean of the squared deviations:
average = A
sample = x
deviation = x-A
square of deviation = (x-A)^2
mean of the square of the deviation = Sum((x-A)^2) / N
N = number of samples.
standard deviation = Sqrt(Sum((x-A)^2) / N)
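The steps above can be sketched in a few lines of Python. This is only a minimal illustration of the divide-by-N formula; the function name `population_std` and the example numbers are my own, not from any particular library.

```python
import math

def population_std(samples):
    """Square root of the mean of the squared deviations (divides by N)."""
    n = len(samples)                                   # N = number of samples
    average = sum(samples) / n                         # A
    squared_deviations = [(x - average) ** 2 for x in samples]  # (x-A)^2
    return math.sqrt(sum(squared_deviations) / n)      # Sqrt(Sum((x-A)^2) / N)

# Example data with average 5 and squared deviations summing to 32.
print(population_std([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```

Python's standard library has `statistics.pstdev` for this same divide-by-N version, if you would rather not write it by hand.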
That's all I know. And this is what I found on a web page:
"You use the N-1 if the estimate is unbiased". And the definition of bias is:
"A statistic is biased if, in the long run, it consistently over or underestimates the parameter it is estimating. More technically it is biased if its expected value is not equal to the parameter. A stop watch that is a little bit fast gives biased estimates of elapsed time. Bias in this sense is different from the notion of a biased sample. A statistic is positively biased if it tends to overestimate the parameter; a statistic is negatively biased if it tends to underestimate the parameter. An unbiased statistic is not necessarily an accurate statistic. If a statistic is sometimes much too high and sometimes much too low, it can still be unbiased. It would be very imprecise, however. A slightly biased statistic that systematically results in very small overestimates of a parameter could be quite efficient."
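To make "consistently underestimates in the long run" concrete, here is a small simulation sketch. It assumes normally distributed data with a true variance of 4.0, small samples of 5 values, and a fixed random seed. All of these choices, and the function names, are just mine for illustration.

```python
import random

def variance(samples, divisor):
    """Sum of squared deviations from the sample average, over the given divisor."""
    average = sum(samples) / len(samples)
    return sum((x - average) ** 2 for x in samples) / divisor

def average_estimates(sample_size=5, trials=20000, true_sigma=2.0, seed=0):
    """Average the divide-by-N and divide-by-(N-1) variance estimates over many samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_n = 0.0
    total_n_minus_1 = 0.0
    for _ in range(trials):
        samples = [rng.gauss(0.0, true_sigma) for _ in range(sample_size)]
        total_n += variance(samples, sample_size)
        total_n_minus_1 += variance(samples, sample_size - 1)
    return total_n / trials, total_n_minus_1 / trials

biased, unbiased = average_estimates()
print("true variance:       4.0")
print("divide by N:        ", round(biased, 2))
print("divide by N - 1:    ", round(unbiased, 2))
```

With these settings the divide-by-N average should come out near 3.2 (that is, 4.0 times 4/5) and the divide-by-(N-1) average near 4.0, which is the "expected value equals the parameter" idea from the quote.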
Hope it helps. I didn't get it very well...