Link to home
Start Free TrialLog in
Avatar of JamesFrog
JamesFrog

asked on

How do I calculate confidence channels above and below a trend line

Hi,

I am using a spreadsheet in OpenOffice at the moment to work out the mathematics of this but don't want to use spread sheet functions because I will transfer this to code in the near future.

I have calculated a Linear Regression line (least squares fit/ line of best fit) through a series of datapoints and now want to calculate a a confidence line of 1.5 standard deviations either side of this trend line.

An example of my data is as follows

x      y      best fit line
1      383.75      337.52
2      420              336.94
3      431.5      336.36
4      422.5      335.78
5      385.5      335.2
6      400.25      334.62
7      360      334.04
8      383      333.46
9      361.25      332.88
10      393.75      332.3
11      435.25      331.72
12      426              331.14
13      418              330.56
14      441.75      329.98
15      427.25      329.4
16      406.75      328.82
17      372.25      328.24

The lines I am looking to calculate are commonly known as a 'Linear Regression Channel' used in share trading quite often.

These lines lie equidistant from the trend line by x standard deviations and I am looking to calculate these lines for 1.5 standard deviations.

A good description of what I need is here http://www.onlinetradingconcepts.com/TechnicalAnalysis/LinRegChannel.html

Could you explain how to do this please, I'm not brilliant at maths so please be gentle on me!

Please use my demo data for the explanation
Thanks,

James
Avatar of richdiesal
richdiesal
Flag of United States of America image

I'm not familiar with OO.org in particular, but my understanding is that it is very similar to Excel, so hopefully this will work for you.

If all you are after is a line 1.5 SD of y above and below the line you have, it's pretty straightforward.  
In cell D2, add:
=C2-(1.5*STDEV(B:B))
In cell D3, add:
=C2+(1.5*STDEV(B:B))
Then fill down the columns and add the two new columns as lines in your data.

From the pages you included however, it's not clear if you're talking about the standard deviation of y (how much does data deviate from the mean of y on average?) or the standard deviation of the residuals on the regression line (also called the standard error of the estimate or SEE - how much does the actual data on average deviate from the predicted values from the regression line?).  My guess is that you actually want the SEE, as that would conceptually make more sense.  

If you want to use the SEE, these formulas should be used instead:
In cell D2, add:
=C2-(1.5*(STDEV(B:B)*SQRT(1-CORREL(B:B,C:C))*SQRT(COUNT(B:B)/(COUNT(B:B)-2))))
In cell D3, add:
=C2+(1.5*(STDEV(B:B)*SQRT(1-CORREL(B:B,C:C))*SQRT(COUNT(B:B)/(COUNT(B:B)-2))))
Then fill down the columns and add the two new columns as lines in your data.

Make sure to use smoothed lines for each of your channel lines, or rounding error will make the line look like it has tiny steps.

Hopefully one of those is what you were looking for!  I have attached what the data looks like using the bottom two sets of formulae.

x	y	best fit line		
1	383.75	337.52	293.0760199	381.9639801
2	420	336.94	292.4960199	381.3839801
3	431.5	336.36	291.9160199	380.8039801
4	422.5	335.78	291.3360199	380.2239801
5	385.5	335.2	290.7560199	379.6439801
6	400.25	334.62	290.1760199	379.0639801
7	360	334.04	289.5960199	378.4839801
8	383	333.46	289.0160199	377.9039801
9	361.25	332.88	288.4360199	377.3239801
10	393.75	332.3	287.8560199	376.7439801
11	435.25	331.72	287.2760199	376.1639801
12	426	331.14	286.6960199	375.5839801
13	418	330.56	286.1160199	375.0039801
14	441.75	329.98	285.5360199	374.4239801
15	427.25	329.4	284.9560199	373.8439801
16	406.75	328.82	284.3760199	373.2639801
17	372.25	328.24	283.7960199	372.6839801

Open in new window

Avatar of JamesFrog
JamesFrog

ASKER

Hi,

Thanks for your answer but it doesn't answer my question - what I need is an explanation of the MATHS behind that as I just want to do this in OO with the pure maths as I intend to carry this out in a java program ultimately.

By using functions like CORREL, there is no way for me to know WHAT this actually does in mathematical terms so I need an explanation of the pure maths behind it, so that I can program this in Java/ PHP or whatever.

I've already worked out the maths for the trend line (least squares fit/ linear regression line), but now need the confidence bands as explained in the link I gave.

Please help,
Thanks
J
Further desciptions of what I'm after are here

http://www.traderslog.com/linear-regression-channel.htm
http://www.babypips.com/forexpedia/Linear_Regression_Channel

But I need the mathamatical explanation of how to do this, and be able to draw lines at 1, 1.5 and 2.0 standard deviations.

thanks
James



ozo,

that doesn't really answer my question because I don't understand the maths on that page. I'm after this in 'plain english'

thanks
J
SOLUTION
Avatar of moorhouselondon
moorhouselondon
Flag of United Kingdom of Great Britain and Northern Ireland image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
>use that value of X to work out the Std Dev at that point

Sorry, this might be ambiguous.  

use that same value of X with the Trend Line value at that point.  (If you were to graph this, you will see that the Trend Line and the Std Dev line are parallel).
To calculate Standard Deviation look at the "Basic Example" given here:-

http://en.wikipedia.org/wiki/Standard_deviation
The capita Greek Sigma basically translates to the plain english word "sum"
but your links don't look much like points from
y = a + b*x + gaussian random error
so the linear least square model may not be appropriate.
Do you have a model that you are trying to fit to the data?
When you say "Let us say that the distance between the latest trend point and the Standard Deviation at that same point is X." - do you mean the vertical distance from the trend line to the data point?
ozo,

The linear regression is correct as I've re-created the line of best fit and it looks exactly like the chart created in my trading platform. i'm just working on moorhouselondon's post to try and get this worked out.

J
The application of strict statistical concepts to share prices is probably not what jamesfrog is after.  I suspect the aim is to pragmatically work out when to buy shares (when the trend line value minus 1.5 Std Dev point is breached), and when to sell them (when the trend line value plus 1.5 Std Dev point is breached).  A lot of this "theory" is probably not going to work now that the current underlying economical situation is so unstable now.
Sorry, crossed posts.  X is the Standard Deviation, which in your case is fixed for any particular "snapshot" of data that is taken.
I see that I haven't been very consistent with what X is.  Calculate the Std Dev.  This is X.  Add this to the Trend Value to get +1*Std Dev, subtract to get -1*std Dev.  My apologies for confusion.
moorhouselondon,

Please could you write a demo of what you're explaining hear USING MY data as I'm coming out with strange figures and am a bit confused.

Thanks
James

(PS. The real dataset is around 200 days long but please assume the linear regression data (best fit line) is correct).

I would really appreciate you doing that as you're right in what you're explaining but I'm having difficulty getting it going. Thanks.
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
ozo,

So your value of 26.01 is 1 standard deviation?

So would I be correct if I ADD (1.5*26.01) to any point on the trend line this will plot an equidistant line 1.5 standard deviations from the trend line?

Thanks
James

Ozo, I think your solution is working absolutely fine, I just used your method on the real dataset and it's looking exactly like the chart I'm seeing from my trading platform... please can you just confirm my last questions.

Thanks
J
Thanks for all your help, very good!
Per my original comment, I am not sure this value is actually the one you want.  Are you absolutely sure that the SD of y is what you actually need?  The reason I ask is that over longer periods of time, the SD of y, given a linear trend over time (e.g. if there is a sizable gap between the beginning and end values of y), will be pushed quite far away from the trend line.  You are in that way biasing your ability toward finding "buy" and "sell" moments for data in which you don't have much data actually available - in other words, a line over 1 month with similar fluctuations around a trend line compared with a line over 3 months would have drastically different SDy.  The SEE (also called the standard deviation of the residuals) would not be subject to those kinds of fluctuations.

In your particular data, the two values are similar (SDy = 26.301, SEE = 26.969), but the higher the slope of the line, and the more x points you have, the more discrepant they are going to get.

Also, I believe ozo's value for the SDy is off a little - I am getting 26.301 in SPSS.  Perhaps rounding error?