asked on

How do I calculate confidence channels above and below a trend line

Hi,

I am using a spreadsheet in OpenOffice at the moment to work out the mathematics of this but don't want to use spread sheet functions because I will transfer this to code in the near future.

I have calculated a Linear Regression line (least squares fit/ line of best fit) through a series of datapoints and now want to calculate a a confidence line of 1.5 standard deviations either side of this trend line.

An example of my data is as follows

x      y      best fit line
1      383.75      337.52
2      420        336.94
3      431.5      336.36
4      422.5      335.78
5      385.5      335.2
6      400.25      334.62
7      360      334.04
8      383      333.46
9      361.25      332.88
10      393.75      332.3
11      435.25      331.72
12      426        331.14
13      418        330.56
14      441.75      329.98
15      427.25      329.4
16      406.75      328.82
17      372.25      328.24

The lines I am looking to calculate are commonly known as a 'Linear Regression Channel' used in share trading quite often.

These lines lie equidistant from the trend line by x standard deviations and I am looking to calculate these lines for 1.5 standard deviations.

A good description of what I need is here http://www.onlinetradingconcepts.com/TechnicalAnalysis/LinRegChannel.html

Could you explain how to do this please, I'm not brilliant at maths so please be gentle on me!

Please use my demo data for the explanation
Thanks,

James

richdiesal

I'm not familiar with OO.org in particular, but my understanding is that it is very similar to Excel, so hopefully this will work for you.

If all you are after is a line 1.5 SD of y above and below the line you have, it's pretty straightforward.
In cell D2, add:
=C2-(1.5*STDEV(B:B))
In cell D3, add:
=C2+(1.5*STDEV(B:B))
Then fill down the columns and add the two new columns as lines in your data.

From the pages you included however, it's not clear if you're talking about the standard deviation of y (how much does data deviate from the mean of y on average?) or the standard deviation of the residuals on the regression line (also called the standard error of the estimate or SEE - how much does the actual data on average deviate from the predicted values from the regression line?). My guess is that you actually want the SEE, as that would conceptually make more sense.

If you want to use the SEE, these formulas should be used instead:
In cell D2, add:
=C2-(1.5*(STDEV(B:B)*SQRT(1-CORREL(B:B,C:C))*SQRT(COUNT(B:B)/(COUNT(B:B)-2))))
In cell D3, add:
=C2+(1.5*(STDEV(B:B)*SQRT(1-CORREL(B:B,C:C))*SQRT(COUNT(B:B)/(COUNT(B:B)-2))))
Then fill down the columns and add the two new columns as lines in your data.

Make sure to use smoothed lines for each of your channel lines, or rounding error will make the line look like it has tiny steps.

Hopefully one of those is what you were looking for! I have attached what the data looks like using the bottom two sets of formulae.

x	y	best fit line		
1	383.75	337.52	293.0760199	381.9639801
2	420	336.94	292.4960199	381.3839801
3	431.5	336.36	291.9160199	380.8039801
4	422.5	335.78	291.3360199	380.2239801
5	385.5	335.2	290.7560199	379.6439801
6	400.25	334.62	290.1760199	379.0639801
7	360	334.04	289.5960199	378.4839801
8	383	333.46	289.0160199	377.9039801
9	361.25	332.88	288.4360199	377.3239801
10	393.75	332.3	287.8560199	376.7439801
11	435.25	331.72	287.2760199	376.1639801
12	426	331.14	286.6960199	375.5839801
13	418	330.56	286.1160199	375.0039801
14	441.75	329.98	285.5360199	374.4239801
15	427.25	329.4	284.9560199	373.8439801
16	406.75	328.82	284.3760199	373.2639801
17	372.25	328.24	283.7960199	372.6839801

Open in new window

JamesFrog

ASKER

Hi,

Thanks for your answer but it doesn't answer my question - what I need is an explanation of the MATHS behind that as I just want to do this in OO with the pure maths as I intend to carry this out in a java program ultimately.

By using functions like CORREL, there is no way for me to know WHAT this actually does in mathematical terms so I need an explanation of the pure maths behind it, so that I can program this in Java/ PHP or whatever.

I've already worked out the maths for the trend line (least squares fit/ linear regression line), but now need the confidence bands as explained in the link I gave.

Please help,
Thanks
J

ozo

the error in y is given by (35)
http://mathworld.wolfram.com/LeastSquaresFitting.html

JamesFrog

ASKER

Further desciptions of what I'm after are here

http://www.traderslog.com/linear-regression-channel.htm
http://www.babypips.com/forexpedia/Linear_Regression_Channel

But I need the mathamatical explanation of how to do this, and be able to draw lines at 1, 1.5 and 2.0 standard deviations.

thanks
James

JamesFrog

ASKER

ozo,

that doesn't really answer my question because I don't understand the maths on that page. I'm after this in 'plain english'

thanks
J

SOLUTION

moorhouselondon

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

moorhouselondon

>use that value of X to work out the Std Dev at that point

Sorry, this might be ambiguous.

use that same value of X with the Trend Line value at that point. (If you were to graph this, you will see that the Trend Line and the Std Dev line are parallel).

moorhouselondon

To calculate Standard Deviation look at the "Basic Example" given here:-

http://en.wikipedia.org/wiki/Standard_deviation

ozo

The capita Greek Sigma basically translates to the plain english word "sum"
but your links don't look much like points from
y = a + b*x + gaussian random error
so the linear least square model may not be appropriate.
Do you have a model that you are trying to fit to the data?

JamesFrog

ASKER

When you say "Let us say that the distance between the latest trend point and the Standard Deviation at that same point is X." - do you mean the vertical distance from the trend line to the data point?

JamesFrog

ASKER

ozo,

The linear regression is correct as I've re-created the line of best fit and it looks exactly like the chart created in my trading platform. i'm just working on moorhouselondon's post to try and get this worked out.

J

moorhouselondon

The application of strict statistical concepts to share prices is probably not what jamesfrog is after. I suspect the aim is to pragmatically work out when to buy shares (when the trend line value minus 1.5 Std Dev point is breached), and when to sell them (when the trend line value plus 1.5 Std Dev point is breached). A lot of this "theory" is probably not going to work now that the current underlying economical situation is so unstable now.

moorhouselondon

Sorry, crossed posts. X is the Standard Deviation, which in your case is fixed for any particular "snapshot" of data that is taken.

moorhouselondon

I see that I haven't been very consistent with what X is. Calculate the Std Dev. This is X. Add this to the Trend Value to get +1*Std Dev, subtract to get -1*std Dev. My apologies for confusion.

JamesFrog

ASKER

moorhouselondon,

Please could you write a demo of what you're explaining hear USING MY data as I'm coming out with strange figures and am a bit confused.

Thanks
James

(PS. The real dataset is around 200 days long but please assume the linear regression data (best fit line) is correct).

I would really appreciate you doing that as you're right in what you're explaining but I'm having difficulty getting it going. Thanks.

ASKER CERTIFIED SOLUTION

ozo

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

JamesFrog

ASKER

ozo,

So your value of 26.01 is 1 standard deviation?

So would I be correct if I ADD (1.5*26.01) to any point on the trend line this will plot an equidistant line 1.5 standard deviations from the trend line?

Thanks
James

JamesFrog

ASKER

Ozo, I think your solution is working absolutely fine, I just used your method on the real dataset and it's looking exactly like the chart I'm seeing from my trading platform... please can you just confirm my last questions.

Thanks
J

JamesFrog

ASKER

Thanks for all your help, very good!

richdiesal

Per my original comment, I am not sure this value is actually the one you want. Are you absolutely sure that the SD of y is what you actually need? The reason I ask is that over longer periods of time, the SD of y, given a linear trend over time (e.g. if there is a sizable gap between the beginning and end values of y), will be pushed quite far away from the trend line. You are in that way biasing your ability toward finding "buy" and "sell" moments for data in which you don't have much data actually available - in other words, a line over 1 month with similar fluctuations around a trend line compared with a line over 3 months would have drastically different SDy. The SEE (also called the standard deviation of the residuals) would not be subject to those kinds of fluctuations.

In your particular data, the two values are similar (SDy = 26.301, SEE = 26.969), but the higher the slope of the line, and the more x points you have, the more discrepant they are going to get.

Also, I believe ozo's value for the SDy is off a little - I am getting 26.301 in SPSS. Perhaps rounding error?