Solved

Confused about the difference between Correlation and Rsquared?

Posted on 2011-03-20
Last Modified: 2013-11-13
Hi, I'm writing a stock analysis program and in it, I want to show if 2 stocks are related (i.e., how much they move in sync). I'm already calculating the Correlation Coefficient between the 2 stocks and that seems to give a good, realistic figure, and now I'm tackling Rsquared. I'm starting to wonder if I even need to bother with Rsquared if I've already done the Correlation Coefficient. Isn't Rsquared just the Correlation Coefficient times itself? (If so, that info wouldn't be of much help.) I'm using this formula to calculate Rsquared (found through a web search):

Rsquared := Square (Covariance(Stock1,Stock2) / (StdDev(Stock1) * StdDev(Stock2)));

... but that formula does not yield the same value as the square of the Correlation Coefficient at all. Confused about that.

   Furthermore, in my googling, it seems that Rsquared is an indication of how well a LINEAR REGRESSION line "fits" the data. But in my program, I'm not really comparing one stock to a linear regression line, I'm comparing one stock to another stock, so I don't really see how Rsquared would apply in this case. Anyway, as you can probably tell, I'm not a stats whiz and I'm a little confused... maybe I just don't need to be bothering with Rsquared at all, eh? Doesn't seem like it's gonna tell me anything more than the Correlation Coefficient.

Thanks
    Shawn
Question by:shawn226
2 Comments
 
Accepted Solution

by:
TommySzalapski earned 50 total points
ID: 35177790
Comparing how related two stocks are really is a linear regression problem. It might not look like it, but you are looking at how well the first stock can predict the second stock. (If they are highly correlated, then you could use one to predict the other, so testing correlation is mathematically equivalent to fitting a simple linear regression of one stock on the other.)

Yes, R square should be the square of the correlation coefficient. The correlation coefficient is often called R so R square is R^2.
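To make that concrete, here is a minimal Python sketch (not the asker's code, and the price series are made up purely for illustration). It computes the correlation coefficient directly, then separately fits a least-squares line with an intercept and computes R^2 as 1 - SSE/SST. Under those assumptions the two numbers should agree up to rounding error.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def regression_r_squared(xs, ys):
    """R^2 = 1 - SSE/SST for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # SSE: squared residuals around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # SST: squared deviations of y around its own mean.
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst

# Hypothetical closing prices, invented for this example only.
stock1 = [10.0, 10.5, 10.2, 11.0, 11.4, 11.1]
stock2 = [20.0, 20.8, 20.1, 21.5, 22.0, 21.9]

r = pearson_r(stock1, stock2)
r_squared = regression_r_squared(stock1, stock2)
print(r * r, r_squared)
```

The equality relies on the line being a least-squares fit that includes an intercept; with a different fitting method the two values can differ.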

There are a few possible reasons they aren't matching:
1. You've miscalculated something.
2. People use different definitions of the standard deviation. Often the "sample" standard deviation divides by (n-1) instead of n to inflate the deviation slightly, adjusting for the fact that you only have a sample. The covariance has the same two variants. If your covariance and standard deviation functions don't use the same divisor, or your two calculations use different standard deviation methods, the results won't match.
3. If R^2 is being defined as 1 - SSE/SST, then it might not match (for example, if the line being fitted is not an ordinary least-squares line with an intercept).
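Reason 2 is easy to reproduce. The sketch below uses Python's standard `statistics` module and made-up prices (both are assumptions for illustration, not the asker's actual program). The `covariance` helper and its `ddof` parameter are hypothetical names introduced here. Mixing a divide-by-n covariance with divide-by-(n-1) standard deviations shrinks the result by a factor of (n-1)/n.

```python
import statistics

def covariance(xs, ys, ddof):
    """Covariance dividing by (n - ddof): ddof=0 population, ddof=1 sample."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - ddof)

# Hypothetical closing prices, invented for this example only.
prices_a = [10.0, 10.5, 10.2, 11.0, 11.4, 11.1]
prices_b = [20.0, 20.8, 20.1, 21.5, 22.0, 21.9]
n = len(prices_a)

# Consistent divisors: sample covariance with sample standard deviations.
r_sample = covariance(prices_a, prices_b, 1) / (
    statistics.stdev(prices_a) * statistics.stdev(prices_b))

# Consistent divisors: population covariance with population deviations.
r_population = covariance(prices_a, prices_b, 0) / (
    statistics.pstdev(prices_a) * statistics.pstdev(prices_b))

# Mismatched divisors: population covariance with sample deviations.
r_mixed = covariance(prices_a, prices_b, 0) / (
    statistics.stdev(prices_a) * statistics.stdev(prices_b))

print(r_sample, r_population, r_mixed)
```

As long as the divisors are consistent they cancel, so `r_sample` and `r_population` agree. Only the mixed version is off, and squaring it makes the gap larger.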

In short, they do measure the same thing, so you really could just pick one. If you want to go deeper into it, you can post a sample and the two values you're getting, and I'll help you sort out why. If you are fine just knowing that someone with a Math degree says you're right and you can use the correlation coefficient to get R^2, then that's good too.

Author Closing Comment

by:shawn226
ID: 35192256
Righto Tommy, I'm just going to stick with the Correlation Coefficient and not bother with Rsquared. As you say, it's just a derivative of the Correlation Coefficient anyway, so it doesn't really supply any more information than Correlation alone. I'm not gonna waste any more time on it... thanks!

Cheers
   Shawn