Before doing a linear regression on a data set, one needs to find out a few things about the data:
- is there a linear relation between the dependent and independent variables
- are the variables normally distributed (if not, do a data transformation)
- In Excel, do I simply use the CORREL() function to determine a linear relation between the dependent and independent variables?
- How do I interpret these results (e.g. low value = low correlation, high value = high correlation)?
- To test normality, can I simply use a Q-Q plot, or should I also use other tests, such as the Kolmogorov-Smirnov (K-S) test?
- How do I interpret the K-S Test results?
- How do I go about doing a "Data Transformation" and how do I know whether I should do it?
- Please do not answer only in statistics jargon, because I might not understand what you're saying. I've uploaded a sample of the data set.
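
To make the correlation question concrete: as far as I understand, CORREL() returns the Pearson correlation coefficient, a number between -1 and 1. Below is a minimal sketch of that calculation in Python. The function name `pearson_r` and the numbers are made up for illustration and are not taken from my data set.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative values, not the uploaded data set.
x = [1, 2, 3, 4, 5]
y_linear = [2.1, 3.9, 6.2, 8.0, 9.9]   # roughly a straight line
y_curved = [(v - 3) ** 2 for v in x]   # depends on x, but not linearly

print(pearson_r(x, y_linear))  # close to 1
print(pearson_r(x, y_curved))  # 0.0
```

The second result is 0 even though `y_curved` is completely determined by `x`, so my worry is that a low value may only mean "no straight-line relation" rather than "no relation at all".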