assumptions for parametric test data

Hi
Why is it necessary for there to be no outliers in the data for parametric tests? I know the tests assume there are no outliers in the data, so that in itself is reason to make sure there aren't any.
Why do the tests assume there aren't any outliers? Is it because outliers affect the value of the mean and parametric tests rely on the mean?

Also why do parametric tests require normally distributed data and homogeneity of variance?

many thanks

WaterStreet

"Why is it necessary for there to be no outliers in the data for parametric tests."

Because the definition of a parametric test (as opposed to a non-parametric one) assumes the sample represents a normal population distribution so that simpler statistical methods can be used. Outliers violate this requirement.

See
http://en.wikipedia.org/wiki/Parametric_statistics
http://www.creative-wisdom.com/teaching/WBI/parametric_test.shtml
http://www.psychwiki.com/wiki/Dealing_with_Outliers
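
On the asker's guess that outliers matter because they move the mean: a minimal sketch, using made-up numbers (the two lists below are illustrative, not data from this thread), suggests how a single extreme value shifts the mean and standard deviation while barely touching the median.

```python
import statistics

# Illustrative values only; not taken from any real data set.
clean = [10, 11, 12, 13, 14]
with_outlier = clean + [100]

def summarize(values):
    """Return the mean, median and sample standard deviation of values."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }

for name, values in (("clean", clean), ("with outlier", with_outlier)):
    print(name, summarize(values))
```

Parametric tests such as the t-test are built from the sample mean and standard deviation, so anything that distorts those two numbers can distort the test result as well.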

andieje

ASKER

You have said outliers violate the requirement of a normal distribution. Does that mean that a normal distribution does not contain outliers?

aburr

Outliers are a touchy subject.
Most statistical analysis depends on the data having all errors normally distributed and independent.
What to do with outliers? The only correct way is to collect enough data so that outliers do not influence the result. (I know, this is always difficult and sometimes impossible.)
If you eliminate outliers from your data set, you can be in deep trouble. Outliers are difficult to define. If you give me a data set and allow me to define what an outlier is, I can give you any result you want.
Point 3 in the last link above is particularly troublesome. Repeated outlier removal can cause you to end up with only one data point in which case you can show that your result is guaranteed to be 100% right.
Even in a hard science like physics it is easy to show that the elimination of outliers leads to missed discoveries.
So much for soapbox. For answers see next post.
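
The point about repeated outlier removal can be sketched roughly as follows. This assumes a hypothetical "more than 2 sample standard deviations from the sample mean" rule and simulated normal data; neither the rule nor the names `trim_once` and `trim_repeatedly` come from the linked page. Under this rule the data will not shrink all the way to one point, but each pass shrinks the standard deviation, so the next pass tends to flag new "outliers" in data that had none to begin with.

```python
import random
import statistics

def trim_once(values, k=2.0):
    """Drop every point more than k sample SDs from the sample mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= k * sd]

def trim_repeatedly(values, k=2.0):
    """Apply trim_once until a pass removes nothing; return data and pass count."""
    current = list(values)
    passes = 0
    while len(current) > 2:
        trimmed = trim_once(current, k)
        if len(trimmed) == len(current):
            break  # nothing was flagged on this pass
        current = trimmed
        passes += 1
    return current, passes

# Simulated data: every point really is drawn from a normal distribution.
rng = random.Random(0)
data = [rng.gauss(0, 1) for _ in range(1000)]
kept, passes = trim_repeatedly(data)
print("started with", len(data), "kept", len(kept), "after", passes, "passes")
```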

aburr

Does that mean that a normal distribution does not contain outliers?
A normal distribution does not contain outliers. A small sample from a normal distribution might contain points which are called outliers by some common definition.
WaterStreet says it reasonably well:
"Why is it necessary for there to be no outliers in the data for parametric tests."

Because the definition of a parametric test (as opposed to a non-parametric one) assumes the sample represents a normal population distribution so that simpler statistical methods can be used.
The theory on which parametric tests rest requires that your sample be taken from a population which is normal. Outlier theory assumes that you can identify outliers and that when you remove them you will have a sample which you can say comes from a population with a normal distribution. (A big assumption, often ignored.)
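
To put a number on how often a genuinely normal sample contains a point that a common rule would call an outlier, here is a small sketch. It assumes the "more than 3 standard deviations from the mean" rule and, for simplicity, the true population mean and standard deviation rather than estimates from the sample; the function names are made up for illustration.

```python
import math

def prob_beyond(k):
    """P(|Z| > k) for a standard normal variable Z."""
    return math.erfc(k / math.sqrt(2))

def prob_at_least_one(k, n):
    """Chance that at least one of n independent normal draws lies beyond k SDs."""
    return 1 - (1 - prob_beyond(k)) ** n

print("P(|Z| > 3) =", prob_beyond(3))
for n in (10, 100, 1000):
    print(n, "points:", prob_at_least_one(3, n))
```

A single draw lands beyond 3 standard deviations roughly 0.27% of the time, so such points are rare in a small sample but become more likely than not once the sample runs into the hundreds.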

andieje

ASKER

I don't think I am wording my questions very well. I understand that the assumptions of parametric tests are that the data are normally distributed and that the data do not contain outliers. My question is why the tests make these assumptions. I think I understand why the tests assume the data are normally distributed (because that allows you to make all sorts of other assumptions), but I don't understand why there can't be any outliers in the data. Perhaps my understanding of outliers is wrong. I thought outliers were values more than 3 standard deviations from the mean. This is probably what is confusing me: a normal distribution has most of its values clustered around the mean, but it does have some extreme values; in other words, it does have some outliers. So surely your data can be normally distributed and contain outliers?

ASKER CERTIFIED SOLUTION

WaterStreet

This solution is only available to members.

andieje

ASKER

thanks, that last answer cleared it up for me