Link to home
Start Free TrialLog in
Avatar of lebeau26
lebeau26Flag for United States of America

asked on

Standard Deviation in Access

Hi,

I was searching on EE and found this SQL statement to calculate Standard Deviation in Access :

SELECT StDev([YourField]) AS StDevOfyourfield FROM yourTable;

However, I noticed that they mention some other form of StDevP or something.  Will the statement above be good enough on my table or do I need this P function.

Thanks
D
ASKER CERTIFIED SOLUTION
Avatar of mbizup
mbizup
Flag of Kazakhstan image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of lebeau26

ASKER

Cool so Im going to need the stdevp then

So is the following statement correct?

SELECT StDevP([YourField]) AS StDevPOfyourfield FROM yourTable;
The P does the whole population as opposed to a samle of the population

From http://office.microsoft.com/en-ca/assistance/HP052090681033.aspx

DSTDEV      The estimated standard deviation in the yield of apple and pear trees
             if the data in the database is only a sample of the total orchard population. (2.97)      
DSTDEVP      The true standard deviation in the yield of apple and pear trees if the
             data in the database is the entire population. (2.65)
Hi lebeau26,
It depends whether your table contains a sample (StDev) or the whole population(StDevP).

Pete
Wait im confused now, I thought the aggregate Function was STDEV / STDEVP

What is this DSTDEV / DSTDEVP ?
From what I remember, DSTDEV calculates the standard deviation as if the data were a sample.  DSTDDEVP calculates standard deviation as if the data were the population.  It's a fairly long lesson in statistics to get a good handle on it.
>What is this DSTDEV / DSTDEVP ?
Those are domain aggregate functions that you would call through VBA

> I thought the aggregate Function was STDEV / STDEVP
Thos are SQL aggregate functions
In straight SQL, you would use STDev/StdevP, as you did in your original post.

From VBA:

Something = DStDev("YourField","YourTable")
OK here's the lesson:  http://en.wikipedia.org/wiki/Standard_deviation

Whether aggregrate or domain aggregrate, there are different formulae involved when calculated SD on a sample versus a population.
Hello lebeau26

The first "devguru" link is wrong: all domain agregate functions use the entire table, never a sample.

The "mathematical standard deviation" or "population standard deviation" (StDevP) is easily calculated by:

    Select Sqr(Avg(X^2) - Avg(X)^2) From...   (where X is your field)

However, when used as an estimator, this is slightly biased for small samples – it overestimates. So that statisticians (and mostly just everybody) use this corrected formula (StDev):

    Select Sqr( N/(N-1) * (Avg(X^2) - Avg(X)^2) ) From... (where N is the nb of values)

The difference in value between StDevP and StDev is thus exactly Sqr(Count(X)/(Count(X)-1)), a number that is very close to 1 when you have enough data, and thus insignificant for over 100 records (less than 1% difference).

Bottom line: If you need to ask, use StDev(). The other variant is used only for very special cases.

(°v°)
lebeau26,

I'm glad you liked Miriam's comment {http:#16811056}, but I would like to repeat that is is simply wrong. Both StDev() and StDevP() use all records. The first is the function referred to when people just say "standard deviation", the rationale being that you really are measuring something that is greater than your data. This being said, it probably doesn't matter.

Success with your project!
(°v°)
I should have posted this earlier... I agree, and apologize for a misleading response.
lebeau26:  If anyone is to get any value from this thread I think you should correct the error.