Solved

Confused about the difference between Correlation and Rsquared?

Posted on 2011-03-20
Last Modified: 2013-11-13
Hi, I'm writing a stock analysis program, and in it I want to show whether 2 stocks are related (i.e., how much they move in sync). I'm already calculating the Correlation Coefficient between the 2 stocks and that seems to give a good, realistic figure, and now I'm tackling Rsquared. I'm starting to wonder if I even need to bother with Rsquared when I've already done the Correlation Coefficient. Isn't Rsquared just the Correlation Coefficient times itself? (If so, that info wouldn't be of much help.) I'm using this formula to calculate Rsquared (found through a web search):

Rsquared := Square (Covariance(Stock1,Stock2) / (StdDev(Stock1) * StdDev(Stock2)));

... but that formula does not yield the same value as the square of the Correlation Coefficient at all. Confused about that.

   Furthermore, from my googling, it seems that Rsquared is an indication of how well a LINEAR REGRESSION line "fits" the data. But in my program, I'm not really comparing one stock to a linear regression line, I'm comparing one stock to another stock, so I don't really see how Rsquared would apply in this case. Anyway, as you can probably tell, I'm not a stats whiz and I'm a little confused... maybe I just don't need to be bothering with Rsquared at all, eh? It doesn't seem like it's gonna tell me anything more than the Correlation Coefficient.

Thanks
    Shawn
Question by:shawn226
2 Comments
 

Accepted Solution

by:
TommySzalapski earned 200 total points
ID: 35177790
Comparing how related two stocks are really is a sort of linear regression problem. It might not look like it, but you are looking at how well the first stock can predict the second stock. (If they are highly correlated, then you could use one to predict the other, so testing correlation is mathematically the same as fitting a simple linear regression.)
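To make the "one stock predicts the other" idea concrete, here is a minimal Python sketch (Python rather than the Delphi-style code in the question). It fits a least-squares line with an intercept to predict the second stock from the first and compares 1 - SSE/SST with the squared correlation. The helper names and the price series are made up for illustration and are not from the original post.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    # Population covariance (divides by n).
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def correlation(xs, ys):
    # Pearson correlation; the divisor n cancels between numerator and denominator.
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

def r_squared_of_fit(xs, ys):
    # Least-squares line ys ~ intercept + slope * xs.
    slope = covariance(xs, ys) / covariance(xs, xs)
    intercept = mean(ys) - slope * mean(xs)
    my = mean(ys)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - my) ** 2 for y in ys)
    return 1 - sse / sst

# Hypothetical closing prices, for illustration only.
stock1 = [10.0, 10.5, 10.2, 11.0, 11.4, 11.1]
stock2 = [20.0, 20.8, 20.1, 21.5, 22.4, 21.9]

r = correlation(stock1, stock2)
r2_fit = r_squared_of_fit(stock1, stock2)

print("correlation squared:", r ** 2)
print("1 - SSE/SST        :", r2_fit)
```

The two printed values should agree up to floating-point rounding, which is why the regression view and the correlation view carry the same information for two series.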

Yes, R squared should be the square of the correlation coefficient. The correlation coefficient is often called R, so R squared is R^2.
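Written out, the relationship looks like this. This is a sketch for the simple case of a least-squares line with an intercept; the symbols (x for the first stock, y for the second, s for a standard deviation) are my own notation, not the question's.

```latex
r = \frac{\operatorname{cov}(x, y)}{s_x \, s_y},
\qquad
\hat{y}_i = a + b\,x_i,
\quad
b = \frac{\operatorname{cov}(x, y)}{s_x^2},
\quad
a = \bar{y} - b\,\bar{x}

\mathrm{SSE} = \sum_i (y_i - \hat{y}_i)^2,
\qquad
\mathrm{SST} = \sum_i (y_i - \bar{y})^2

R^2 = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}} = r^2
```

The covariance and both standard deviations must use the same divisor (all n or all n-1) for the first formula to give the usual correlation coefficient.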

There are a few possible reasons they don't match:
1. You've miscalculated something.
2. People use different definitions of the standard deviation. The "sample" standard deviation divides by (n-1) instead of n, which inflates the deviation slightly to adjust for the fact that you only have a sample. If your covariance and your standard deviations don't use the same divisor, the results won't match.
3. If R^2 is being defined as 1 - SSE/SST, then it might not match. It equals the squared correlation only for a least-squares line fitted with an intercept; for any other fit (for example, a line forced through the origin) it can differ.
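Reason 2 is easy to reproduce. The Python sketch below (again, not the question's Delphi-style code) computes the question's formula three ways: with everything divided by n, with everything divided by n-1, and with a covariance that divides by n mixed with standard deviations that divide by n-1. The function names, the `ddof` parameter, and the prices are assumptions made for illustration.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys, ddof=0):
    # ddof=0 divides by n (population); ddof=1 divides by n-1 (sample).
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - ddof)

def std_dev(xs, ddof=0):
    return math.sqrt(covariance(xs, xs, ddof))

def r_squared(xs, ys, cov_ddof, sd_ddof):
    # Same shape as the formula in the question:
    # Square(Covariance / (StdDev1 * StdDev2))
    ratio = covariance(xs, ys, cov_ddof) / (std_dev(xs, sd_ddof) * std_dev(ys, sd_ddof))
    return ratio ** 2

# Hypothetical closing prices, for illustration only.
stock1 = [10.0, 10.5, 10.2, 11.0, 11.4, 11.1]
stock2 = [20.0, 20.8, 20.1, 21.5, 22.4, 21.9]

consistent_n = r_squared(stock1, stock2, cov_ddof=0, sd_ddof=0)
consistent_n1 = r_squared(stock1, stock2, cov_ddof=1, sd_ddof=1)
mixed = r_squared(stock1, stock2, cov_ddof=0, sd_ddof=1)

print("all divided by n   :", consistent_n)
print("all divided by n-1 :", consistent_n1)
print("mixed divisors     :", mixed)
```

The two consistent versions agree with each other, while the mixed version comes out smaller by a factor of ((n-1)/n)^2. If the Covariance and StdDev routines in your program come from different sources, this is worth checking first.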

In short, they do measure the same thing, so you really could just pick one. If you want to go deeper into it, you can post a sample and the two values you're getting, and I'll help you sort out why. If you are fine just knowing that someone with a Math degree says you're right and you can use the correlation coefficient to get R^2, then that's good too.
 

Author Closing Comment

by:shawn226
ID: 35192256
Righto Tommy, I'm just going to stick with the Correlation Coefficient and not bother with Rsquared. As you say, it's just derived from the Correlation Coefficient anyway, so it doesn't really supply any more information than Correlation alone. I'm not gonna waste any more time on it... thanks!

Cheers
   Shawn