Henrik Klindt
asked on
Trend line problem, divide by zero
Hi
I Attached a query in which I need to calculate a linear trend line.
I found this solution, "Calculate linear trend line in TSQL", and hoped that I could use the very fine solution in my query.
Unfortunatly I get a "Divide by zero" error. See lines 83 and 84 i the attached query
Can anyone see what is wrong. And is this the way to make a trend line.
SQLQueryTrendLine.sql
I Attached a query in which I need to calculate a linear trend line.
I found this solution, "Calculate linear trend line in TSQL", and hoped that I could use the very fine solution in my query.
Unfortunatly I get a "Divide by zero" error. See lines 83 and 84 i the attached query
Can anyone see what is wrong. And is this the way to make a trend line.
SQLQueryTrendLine.sql
ASKER
I still get the divide by zero error.
Msg 8134, Level 16, State 1, Line 26
Divide by zero error encountered.
This is my result without the two trendline lines:
Msg 8134, Level 16, State 1, Line 26
Divide by zero error encountered.
This is my result without the two trendline lines:
Iyear Imonth Iweek Iday opened closed
2014 7 29 18 21 24
2014 7 29 14 8 23
2014 7 29 16 22 39
2014 7 29 15 26 19
2014 7 29 17 28 29
2014 7 30 23 25 40
2014 7 30 25 24 31
2014 7 30 24 17 22
2014 7 30 21 24 30
2014 7 30 22 23 22
2014 7 31 28 17 22
2014 7 31 31 21 25
2014 7 31 29 12 16
2014 7 31 30 24 22
2014 8 31 1 28 23
2014 8 32 4 19 24
2014 8 32 6 21 30
2014 8 32 8 15 12
2014 8 32 5 16 23
2014 8 32 7 21 26
2014 8 33 12 14 24
2014 8 33 11 14 17
2014 8 33 14 8 25
2014 8 33 15 10 15
2014 8 33 13 19 19
Try this:
SELECT
Iyear
, Imonth
, Iweek
, Iday
, opened
, closed
, VARP(Iweek * 1.0000) varp_iweek
--, (AVG(Iweek * 1.0000 * opened) - AVG(Iweek * 1.0000) * AVG(opened)) / VARP(Iweek * 1.0000) AS RegressCoeff
--, AVG(opened) - ((AVG(Iweek * 1.0000 * opened) - AVG(Iweek * 1.0000) * AVG(opened)) / VARP(Iweek * 1.0000)) * AVG(Iweek * 1.0000) AS Intercept
FROM tickets2
GROUP BY
Iyear
, Imonth
, Iweek
, Iday
, opened
, closed
It produces this from your sample:
| IYEAR | IMONTH | IWEEK | IDAY | OPENED | CLOSED | VARP_IWEEK |
|-------|--------|-------|------|--------|--------|------------|
| 2014 | 7 | 29 | 14 | 8 | 23 | 0 |
| 2014 | 7 | 29 | 15 | 26 | 19 | 0 |
| 2014 | 7 | 29 | 16 | 22 | 39 | 0 |
| 2014 | 7 | 29 | 17 | 28 | 29 | 0 |
| 2014 | 7 | 29 | 18 | 21 | 24 | 0 |
| 2014 | 7 | 30 | 21 | 24 | 30 | 0 |
| 2014 | 7 | 30 | 22 | 23 | 22 | 0 |
| 2014 | 7 | 30 | 23 | 25 | 40 | 0 |
| 2014 | 7 | 30 | 24 | 17 | 22 | 0 |
| 2014 | 7 | 30 | 25 | 24 | 31 | 0 |
| 2014 | 7 | 31 | 28 | 17 | 22 | 0 |
| 2014 | 7 | 31 | 29 | 12 | 16 | 0 |
| 2014 | 7 | 31 | 30 | 24 | 22 | 0 |
| 2014 | 7 | 31 | 31 | 21 | 25 | 0 |
| 2014 | 8 | 31 | 1 | 28 | 23 | 0 |
| 2014 | 8 | 32 | 4 | 19 | 24 | 0 |
| 2014 | 8 | 32 | 5 | 16 | 23 | 0 |
| 2014 | 8 | 32 | 6 | 21 | 30 | 0 |
| 2014 | 8 | 32 | 7 | 21 | 26 | 0 |
| 2014 | 8 | 32 | 8 | 15 | 12 | 0 |
| 2014 | 8 | 33 | 11 | 14 | 17 | 0 |
| 2014 | 8 | 33 | 12 | 14 | 24 | 0 |
| 2014 | 8 | 33 | 13 | 19 | 19 | 0 |
| 2014 | 8 | 33 | 14 | 8 | 25 | 0 |
| 2014 | 8 | 33 | 15 | 10 | 15 | 0 |
Hi
The issue is with the VAR / VARP function which is returning 0
Try this one
,
The issue is with the VAR / VARP function which is returning 0
Try this one
,
Case when VarP(Iweek * 1.0000) > 0 then (Avg(Iweek * 1.0000 * opened) - Avg(Iweek * 1.0000) * Avg(opened)) / VarP(Iweek * 1.0000) else 0 end AS RegressCoeff
,Case when VarP(Iweek * 1.0000) > 0 then Avg(opened) - ((Avg(Iweek * 1.0000 * opened) - Avg(Iweek * 1.0000) * Avg(opened)) / VarP(Iweek * 1.0000)) * Avg(Iweek * 1.0000)
else 0 end AS Intercept
FROM tickets2
ASKER
I got the same result. But why is VARP(Iweek * 1.000) always 0.
ASKER
Hi Vikas
With your solution the "divide by zero" error is gone.
But the regress and intercept is always 0.
I need to find out why VarP(Iweek * 1.000) is 0
The two trendline lines is taken from another solution. Maybe another way to calculate trendline is the solution?
With your solution the "divide by zero" error is gone.
But the regress and intercept is always 0.
I need to find out why VarP(Iweek * 1.000) is 0
The two trendline lines is taken from another solution. Maybe another way to calculate trendline is the solution?
ASKER
This is how it looks from where I took it.
I cannot see what is wrong with my query. Can anyone help?
CREATE TABLE afr (Month int, Value numeric(8, 2))
INSERT INTO afr (Month, Value) VALUES (1, 16.89)
INSERT INTO afr (Month, Value) VALUES (2, 15.87)
INSERT INTO afr (Month, Value) VALUES (3, 11.23)
INSERT INTO afr (Month, Value) VALUES (4, 14.82)
INSERT INTO afr (Month, Value) VALUES (5, 13.74)
INSERT INTO afr (Month, Value) VALUES (6, 7.62)
INSERT INTO afr (Month, Value) VALUES (7, 17.51)
INSERT INTO afr (Month, Value) VALUES (8, 13.96)
INSERT INTO afr (Month, Value) VALUES (9, 20.00)
INSERT INTO afr (Month, Value) VALUES (10, 11.38)
INSERT INTO afr (Month, Value) VALUES (11, 10.67)
INSERT INTO afr (Month, Value) VALUES (12, 11.76)
SELECT (Avg(Month * Value) - Avg(Month) * Avg(Value)) / VarP(Month) AS RegressCoeff,
Avg(Value) - ((Avg(Month * Value) - Avg(Month) * Avg(Value)) / VarP(Month)) * Avg(Month) AS Intercept
FROM afr
I cannot see what is wrong with my query. Can anyone help?
CREATE TABLE afr (Month int, Value numeric(8, 2))
INSERT INTO afr (Month, Value) VALUES (1, 16.89)
INSERT INTO afr (Month, Value) VALUES (2, 15.87)
INSERT INTO afr (Month, Value) VALUES (3, 11.23)
INSERT INTO afr (Month, Value) VALUES (4, 14.82)
INSERT INTO afr (Month, Value) VALUES (5, 13.74)
INSERT INTO afr (Month, Value) VALUES (6, 7.62)
INSERT INTO afr (Month, Value) VALUES (7, 17.51)
INSERT INTO afr (Month, Value) VALUES (8, 13.96)
INSERT INTO afr (Month, Value) VALUES (9, 20.00)
INSERT INTO afr (Month, Value) VALUES (10, 11.38)
INSERT INTO afr (Month, Value) VALUES (11, 10.67)
INSERT INTO afr (Month, Value) VALUES (12, 11.76)
SELECT (Avg(Month * Value) - Avg(Month) * Avg(Value)) / VarP(Month) AS RegressCoeff,
Avg(Value) - ((Avg(Month * Value) - Avg(Month) * Avg(Value)) / VarP(Month)) * Avg(Month) AS Intercept
FROM afr
there isn't sufficient statistical variation in Iweek
Let's face it Iweek isn't very interesting (except as an X axis perhaps)
Returns the statistical variance for the population for all values in the specified expression.http://msdn.microsoft.com/en-us/library/ms188735.aspx
Let's face it Iweek isn't very interesting (except as an X axis perhaps)
ASKER
What is sufficient statistical variation? I tried with other fields, always with the same result.
Have I missed something in this article ?
https://www.experts-exchange.com/questions/26713647/Calculate-linear-trend-line-in-TSQL.html
Have I missed something in this article ?
https://www.experts-exchange.com/questions/26713647/Calculate-linear-trend-line-in-TSQL.html
sorry, it was meant as a question "there isn't sufficient statistical variation in Iweek ?"
If you use VARP(Iweek) OVER(partition by Imonth) you get results, e.g.:
If you use VARP(Iweek) OVER(partition by Imonth) you get results, e.g.:
SELECT
Iyear
, Imonth
, Iweek
, Iday
, opened
, closed
, VARP(Iweek * 1.0) over(partition by Imonth)
, VARP(Iday * 1.0) over(partition by Iyear)
--, VARP(Iweek * 1.0000)
--, (AVG(Iweek * 1.0000 * opened) - AVG(Iweek * 1.0000) * AVG(opened)) / VARP(Iweek * 1.0000) AS RegressCoeff
--, AVG(opened) - ((AVG(Iweek * 1.0000 * opened) - AVG(Iweek * 1.0000) * AVG(opened)) / VARP(Iweek * 1.0000)) * AVG(Iweek * 1.0000) AS Intercept
FROM tickets2
;
| IYEAR | IMONTH | IWEEK | IDAY | OPENED | CLOSED | COLUMN_6 | COLUMN_7 |
|-------|--------|-------|------|--------|--------|----------------|----------|
| 2014 | 7 | 29 | 18 | 21 | 24 | 0.637755102041 | 71.4304 |
| 2014 | 7 | 29 | 14 | 8 | 23 | 0.637755102041 | 71.4304 |
| 2014 | 7 | 29 | 16 | 22 | 39 | 0.637755102041 | 71.4304 |
| 2014 | 7 | 29 | 15 | 26 | 19 | 0.637755102041 | 71.4304 |
| 2014 | 7 | 29 | 17 | 28 | 29 | 0.637755102041 | 71.4304 |
| 2014 | 7 | 30 | 23 | 25 | 40 | 0.637755102041 | 71.4304 |
| 2014 | 7 | 30 | 25 | 24 | 31 | 0.637755102041 | 71.4304 |
| 2014 | 7 | 30 | 24 | 17 | 22 | 0.637755102041 | 71.4304 |
| 2014 | 7 | 30 | 21 | 24 | 30 | 0.637755102041 | 71.4304 |
| 2014 | 7 | 30 | 22 | 23 | 22 | 0.637755102041 | 71.4304 |
| 2014 | 7 | 31 | 28 | 17 | 22 | 0.637755102041 | 71.4304 |
| 2014 | 7 | 31 | 31 | 21 | 25 | 0.637755102041 | 71.4304 |
| 2014 | 7 | 31 | 29 | 12 | 16 | 0.637755102041 | 71.4304 |
| 2014 | 7 | 31 | 30 | 24 | 22 | 0.637755102041 | 71.4304 |
| 2014 | 8 | 31 | 1 | 28 | 23 | 0.413223140496 | 71.4304 |
| 2014 | 8 | 32 | 4 | 19 | 24 | 0.413223140496 | 71.4304 |
| 2014 | 8 | 32 | 6 | 21 | 30 | 0.413223140496 | 71.4304 |
| 2014 | 8 | 32 | 8 | 15 | 12 | 0.413223140496 | 71.4304 |
| 2014 | 8 | 32 | 5 | 16 | 23 | 0.413223140496 | 71.4304 |
| 2014 | 8 | 32 | 7 | 21 | 26 | 0.413223140496 | 71.4304 |
| 2014 | 8 | 33 | 12 | 14 | 24 | 0.413223140496 | 71.4304 |
| 2014 | 8 | 33 | 11 | 14 | 17 | 0.413223140496 | 71.4304 |
| 2014 | 8 | 33 | 14 | 8 | 25 | 0.413223140496 | 71.4304 |
| 2014 | 8 | 33 | 15 | 10 | 15 | 0.413223140496 | 71.4304 |
| 2014 | 8 | 33 | 13 | 19 | 19 | 0.413223140496 | 71.4304 |
http://sqlfiddle.com/#!3/eb0ff/10
ASKER
Okay now we are getting somewhere :-)
But what does "over(partition by Imonth)" do? I dont know how it has to look like in my two trendline lines.
Can you build the two lines?
But what does "over(partition by Imonth)" do? I dont know how it has to look like in my two trendline lines.
Can you build the two lines?
well the evidence is in front of you isn't it?
column_6 and column_7 are BOTH calculated by an aggregate function VARP() but also using OVER()
( and there is no GROUP BY clause ! )
so the addition OVER() produces (and/or, repeats) an aggregating value for each row without needing a group by clause.
the PARTITION BY is similar to a group by, it controls the clusters of records that are included in the calculations.
column_6 and column_7 are BOTH calculated by an aggregate function VARP() but also using OVER()
( and there is no GROUP BY clause ! )
so the addition OVER() produces (and/or, repeats) an aggregating value for each row without needing a group by clause.
the PARTITION BY is similar to a group by, it controls the clusters of records that are included in the calculations.
ASKER
Yes im sure the evidence is right in front of me :-) I just can't see it.
I tried with this:
,tickets3 AS
(
SELECT
Iyear
,Imonth
--,Iweek
,Iday
,opened
,closed
, VARP(Iweek * 1.0) over(partition by Imonth) as Iweek
--, VARP(Iday * 1.0) over(partition by Iyear) as testday
,(Avg(Iweek * 1.0000 * opened) - Avg(Iweek * 1.0000) * Avg(opened)) / VarP(Iweek * 1.0000) AS RegressCoeff
,Avg(opened) - ((Avg(Iweek * 1.0000 * opened) - Avg(Iweek * 1.0000) * Avg(opened)) / VarP(Iweek * 1.0000)) * Avg(Iweek * 1.0000) AS Intercept
FROM tickets2
Group BY
Iyear,Imonth,Iweek,Iday,op ened,close d
)
select *
from tickets3
order by Iyear,Imonth
I still get the divide by zero.
Then I tried this version:
,tickets3 AS
(
SELECT
Iyear
,Imonth
--,Iweek
,Iday
,opened
,closed
--, VARP(Iweek * 1.0) over(partition by Imonth) as Iweek
--, VARP(Iday * 1.0) over(partition by Iyear) as testday
,(Avg(Iweek * 1.0000 * opened) - Avg(Iweek * 1.0000) * Avg(opened)) / VarP(Iweek * 1.0000) over(partition by Imonth) AS RegressCoeff
,Avg(opened) - ((Avg(Iweek * 1.0000 * opened) - Avg(Iweek * 1.0000) * Avg(opened)) / VarP(Iweek * 1.0000) over(partition by Imonth)) * Avg(Iweek * 1.0000) AS Intercept
FROM tickets2
Group BY
Iyear,Imonth,Iweek,Iday,op ened,close d
)
select *
from tickets3
order by Iyear,Imonth
Which returns this. Is this the wrong way to use over(partition by Imonth)
Iyear Imonth Iday opened closed RegressCoeff Intercept
2014 7 18 21 24 0 21
2014 7 24 17 22 0 17
2014 7 25 24 31 0 24
2014 7 30 24 22 0 24
2014 7 16 22 39 0 22
2014 7 21 24 30 0 24
2014 7 29 12 16 0 12
2014 7 31 21 25 0 21
2014 7 23 25 40 0 25
2014 7 17 28 29 0 28
2014 7 22 23 22 0 23
2014 7 15 22 13 0 22
2014 7 28 17 22 0 17
2014 8 4 19 24 0 19
2014 8 18 10 30 0 10
2014 8 1 28 23 0 28
2014 8 5 16 23 0 16
2014 8 7 23 26 0 23
2014 8 8 16 12 0 16
2014 8 12 14 24 0 14
2014 8 14 10 25 0 10
2014 8 13 21 19 0 21
2014 8 6 22 30 0 22
2014 8 11 15 17 0 15
2014 8 15 16 14 0 16
I tried with this:
,tickets3 AS
(
SELECT
Iyear
,Imonth
--,Iweek
,Iday
,opened
,closed
, VARP(Iweek * 1.0) over(partition by Imonth) as Iweek
--, VARP(Iday * 1.0) over(partition by Iyear) as testday
,(Avg(Iweek * 1.0000 * opened) - Avg(Iweek * 1.0000) * Avg(opened)) / VarP(Iweek * 1.0000) AS RegressCoeff
,Avg(opened) - ((Avg(Iweek * 1.0000 * opened) - Avg(Iweek * 1.0000) * Avg(opened)) / VarP(Iweek * 1.0000)) * Avg(Iweek * 1.0000) AS Intercept
FROM tickets2
Group BY
Iyear,Imonth,Iweek,Iday,op
)
select *
from tickets3
order by Iyear,Imonth
I still get the divide by zero.
Then I tried this version:
,tickets3 AS
(
SELECT
Iyear
,Imonth
--,Iweek
,Iday
,opened
,closed
--, VARP(Iweek * 1.0) over(partition by Imonth) as Iweek
--, VARP(Iday * 1.0) over(partition by Iyear) as testday
,(Avg(Iweek * 1.0000 * opened) - Avg(Iweek * 1.0000) * Avg(opened)) / VarP(Iweek * 1.0000) over(partition by Imonth) AS RegressCoeff
,Avg(opened) - ((Avg(Iweek * 1.0000 * opened) - Avg(Iweek * 1.0000) * Avg(opened)) / VarP(Iweek * 1.0000) over(partition by Imonth)) * Avg(Iweek * 1.0000) AS Intercept
FROM tickets2
Group BY
Iyear,Imonth,Iweek,Iday,op
)
select *
from tickets3
order by Iyear,Imonth
Which returns this. Is this the wrong way to use over(partition by Imonth)
Iyear Imonth Iday opened closed RegressCoeff Intercept
2014 7 18 21 24 0 21
2014 7 24 17 22 0 17
2014 7 25 24 31 0 24
2014 7 30 24 22 0 24
2014 7 16 22 39 0 22
2014 7 21 24 30 0 24
2014 7 29 12 16 0 12
2014 7 31 21 25 0 21
2014 7 23 25 40 0 25
2014 7 17 28 29 0 28
2014 7 22 23 22 0 23
2014 7 15 22 13 0 22
2014 7 28 17 22 0 17
2014 8 4 19 24 0 19
2014 8 18 10 30 0 10
2014 8 1 28 23 0 28
2014 8 5 16 23 0 16
2014 8 7 23 26 0 23
2014 8 8 16 12 0 16
2014 8 12 14 24 0 14
2014 8 14 10 25 0 10
2014 8 13 21 19 0 21
2014 8 6 22 30 0 22
2014 8 11 15 17 0 15
2014 8 15 16 14 0 16
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
It was the group by that caused the divide by zero error.
You can use CASE statement to avoid Divide by zero
Open in new window
It may be the case then if iweek value = 0 then this error can come