Solved

Confused about the difference between Correlation and Rsquared?

Posted on 2011-03-20
Last Modified: 2013-11-13
Hi, I'm writing a stock analysis program and in it, I want to show whether 2 stocks are related (i.e. how much they move in sync). I'm already calculating the Correlation Coefficient between the 2 stocks and that seems to give a good, realistic figure, and now I'm tackling Rsquared. I'm starting to wonder if I even need to bother with Rsquared if I've already done the Correlation Coefficient. Isn't Rsquared just the Correlation Coefficient times itself? (If so, that info wouldn't be of much help.) I'm using this formula to calculate Rsquared (as found doing a web search):

Rsquared := Square (Covariance(Stock1,Stock2) / (StdDev(Stock1) * StdDev(Stock2)));

... but that formula does not yield the same value as the square of the Correlation Coefficient at all. Confused about that.

   Furthermore, from my googling, it seems that Rsquared is an indication of how well a LINEAR REGRESSION line "fits" the data. But in my program, I'm not really comparing one stock to a linear regression line, I'm comparing one stock to another stock, so I don't really see how Rsquared would apply in this case. Anyway, as you can probably tell, I'm not a stats whiz and I'm a little confused... maybe I just don't need to be bothering with Rsquared at all, eh? Doesn't seem like it's gonna tell me anything more than the Correlation Coefficient.

Thanks
    Shawn
Question by:shawn226
Accepted Solution

by: TommySzalapski (earned 50 total points)
ID: 35177790
Comparing how related two stocks are really is a sort of linear regression problem. It might not look like it, but you are looking at how well the first stock can predict the second stock. (If they are highly correlated, then you could use one to predict the other, so testing correlation is really mathematically the same as linear regression.)
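To make the "one stock predicts the other" idea concrete, here is a minimal Python sketch (not the poster's Pascal-style code) that fits a least-squares line from one series to the other and reports how well it fits. The function names and the price values are made up for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y ~ a + b*x. Returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b


def r_squared_of_fit(x, y):
    """R^2 of the fitted line, computed as 1 - SSE/SST."""
    a, b = fit_line(x, y)
    mean_y = sum(y) / len(y)
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained
    sst = sum((yi - mean_y) ** 2 for yi in y)                    # total
    return 1 - sse / sst


# Hypothetical values, just for illustration.
stock1 = [1.0, 2.0, 3.0, 4.0, 5.0]
stock2 = [2.1, 3.9, 6.2, 7.8, 10.1]

print(r_squared_of_fit(stock1, stock2))  # close to 1: stock1 predicts stock2 well
print(r_squared_of_fit(stock2, stock1))  # same value with the roles swapped
```

For a straight-line fit with an intercept, it doesn't matter which stock you treat as the predictor: the R^2 comes out the same either way.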

Yes, R square should be the square of the correlation coefficient. The correlation coefficient is often called R so R square is R^2.
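As a rough Python sketch of that relationship (the helper names and sample values are made up; the covariance and the standard deviations all divide by n here, which is an assumption about how your own routines work):

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(x, y):
    """Population covariance (divides by n)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def std_dev(values):
    """Population standard deviation (divides by n)."""
    return math.sqrt(covariance(values, values))


def correlation(x, y):
    """Correlation coefficient R."""
    return covariance(x, y) / (std_dev(x) * std_dev(y))


def r_squared(x, y):
    """The formula from the question: Square(Covariance / (StdDev * StdDev))."""
    return (covariance(x, y) / (std_dev(x) * std_dev(y))) ** 2


# Hypothetical values, just for illustration.
stock1 = [1.0, 2.0, 3.0, 4.0, 5.0]
stock2 = [2.1, 3.9, 6.2, 7.8, 10.1]

r = correlation(stock1, stock2)
print(r, r * r, r_squared(stock1, stock2))  # the last two numbers agree
```

Note that squaring throws away the sign: two stocks that move in opposite directions have a negative R but the same R^2 as two that move together.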

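There are a few possible reasons they aren't matching.
1. You've miscalculated something.
2. People use different definitions of the standard deviation. Often the "sample" standard deviation uses /(n-1) instead of /n to inflate the deviation slightly, to adjust for the fact that you only have a sample. The covariance has the same two variants. So if your covariance and standard deviation calculations use different methods, the results won't match.
3. If R^2 is being computed as 1 - SSE/SST from a regression fit, it might not match, depending on how the line was fit. (For an ordinary least-squares line with an intercept the two agree; for a line forced through the origin, for example, they generally don't.)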
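Reason 2 is easy to check in code. The sketch below is Python with made-up helper names and data; the `ddof` parameter is my own addition to switch between the /n (ddof=0) and /(n-1) (ddof=1) versions. It shows that using the same convention everywhere gives the same correlation, while mixing them does not.

```python
import math


def covariance(x, y, ddof):
    """Covariance dividing by (n - ddof): ddof=0 is /n, ddof=1 is /(n-1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - ddof)


def std_dev(values, ddof):
    return math.sqrt(covariance(values, values, ddof))


def correlation(x, y, cov_ddof, std_ddof):
    return covariance(x, y, cov_ddof) / (
        std_dev(x, std_ddof) * std_dev(y, std_ddof)
    )


# Hypothetical values, just for illustration.
stock1 = [1.0, 2.0, 3.0, 4.0, 5.0]
stock2 = [2.1, 3.9, 6.2, 7.8, 10.1]

all_n = correlation(stock1, stock2, cov_ddof=0, std_ddof=0)
all_n_minus_1 = correlation(stock1, stock2, cov_ddof=1, std_ddof=1)
mixed = correlation(stock1, stock2, cov_ddof=1, std_ddof=0)

print(all_n, all_n_minus_1)  # identical: the n or (n-1) factors cancel
print(mixed)                 # off by a factor of n/(n-1)
```

With the mixed version the "correlation" can even come out above 1, which is a good sign that two different conventions are in play. Squaring it makes the discrepancy larger still.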

In short, they do measure the same thing, so you really could just pick one. If you want to go deeper into it, you can post a sample and the two values you're getting, and I'll help you sort out why. If you are fine just knowing that someone with a Math degree says you're right and you can use the correlation coefficient to get R^2, then that's good too.

Author Closing Comment

by:shawn226
ID: 35192256
Righto Tommy, I'm just going to stick with the Correlation Coefficient and not bother with Rsquared. As you say, it's just a derivative of the Correlation Coefficient anyway, so it doesn't really supply any more information than Correlation alone. I'm not gonna waste any more time on it... thanks!

Cheers
   Shawn