Solved

# Calculating Standard Normal Distribution in MS SQL

Posted on 2007-07-20
3,663 Views
I am trying to convert an Excel function to an MS SQL function. One of the methods used in the VBA code is the Standard Normal Distribution function NormSDist(arg1 as double) as double. Does anyone know how to calculate this using T-SQL?

Thanks!
0
Question by:garethmanuel

LVL 14

Expert Comment

STDEV function
0

LVL 14

Expert Comment

STDEV
Returns the statistical standard deviation of all values in the given expression.

Syntax
STDEV ( expression )

Arguments
expression

Is a numeric expression. Aggregate functions and subqueries are not permitted. expression is an expression of the exact numeric or approximate numeric data type category, except for the bit data type.

Return Types
float

Remarks
If STDEV is used on all items in a SELECT statement, each value in the result set is included in the calculation. STDEV can be used with numeric columns only. Null values are ignored.

Examples
This example returns the standard deviation for all royalty payments in the titles table.

USE pubs
SELECT STDEV(royalty)
FROM titles

0

LVL 14

Expert Comment

You also have STDEVP, VAR & VARP
0

LVL 92

Accepted Solution

Hello garethmanuel,

The formula for the cumulative normal distribution is:

result = (1 / sqrt(2 * pi)) * (e ^ (-(z ^ 2) / 2))

e is Euler's number, the basis for natural logarithms

z is:    (value being considered) / st dev

Incorporate that into your SQL statements.

Regards,

Patrick
0

Author Comment

Thanks matthewspatrick. Its been a while since i've done any statistics. This corresponds with the articles i just found:
Normal Distribution
http://davidmlane.com/hyperstat/A6929.html

Standard Normal Distribution
http://davidmlane.com/hyperstat/A75494.html

Where SND has mean of zero and standard deviation of 1 hence coming up with your simplified equation.

0

LVL 92

Expert Comment

gareth,

You're welcome, and a slight correction...

z is:    (value being considered) / st dev

should be...

z is:    (value being considered - mean) / st dev

Regards,

Patrick
0

Expert Comment

The above formula posted by matthewspatrick is an approximation.  The below functions will calculate the Standard Normal Distribution to 3 decimal places +/-.0002 at the 4th decimal place.

To get a value to match a value in a Table of Standard Normal Probabilites take 1 - Normal((value being considered - mean)/st dev, True).  The True/False entry tells the function if your value is in the left or right tail of the distribution curve.  True = Right Tail, False = Left Tail

Central Region Probability Function
The CRPF calculates values for -2.32 < x < 2.32 (values not in the tails of the distribution curve)

Normal Continued Fraction
The NCF calculates vaues for x < -2.32 or x > 2.32 (values in the tails of the distribution curve)

``````Public Function Normal(ByVal x As Double, ByVal upper As Boolean) As Double
'This procedure is adapted from Algorithm 304 Normal Curve Integral by I. D. Hill and S. A. Joyce
'which is published in Communications of the ACM, Volume 10, Number 6, June 1967 pages 374-375.  It has been
'adapted from the programming language ALGOL 60 to VBA by Philippe Perrault
Dim M As Double
Dim N As Double
Dim p1 As Double
Dim p2 As Double
Dim q1 As Double
Dim q2 As Double
Dim S As Double
Dim t As Double
Dim x2 As Double
Dim Y As Double
Const PI = 3.14159265359

If x = 0 Then
Normal = 0.5 'Area under half the curve
Exit Function
Else
x = Abs(x)
upper = upper Eqv x > 0
x2 = x * x
Y = 1 / (2 * PI) ^ 0.5 * exp(-0.5 * x2)
N = Y / x
End If

If upper = True And 1 - N = 1 Then
Normal = 0 'Area under whole curve since x is a large negative number
Exit Function
End If

If upper = False And N = 0 Then
Normal = 1 'None of the area under the curve since x is a large positive number
Exit Function
End If
q1 = x
p2 = Y * x
N = 1
p1 = Y
q2 = x2 + 1
If upper = True And x > 2.32 Then
M = 1 - p1 / p2
S = M
t = p2 / q2
Normal = NCF(M, N, p1, p2, q1, q2, t, x, upper)
Exit Function
End If
If upper = False And x > 2.32 Then
M = p1 / p2
S = M
Normal = NCF(M, N, p1, p2, q1, q2, t, x, upper)
Exit Function
End If
xy = x * Y
S = x
N = 3
t = 0
S = CRPF(x)
If upper = True Then
Normal = S
Else
Normal = S
End If

End Function

Public Function NCF(M, N, p1, p2, q1, q2, t, x As Double, upper As Boolean) As Double
'Uses the continued fraction method to calculate the Normal Curve Integral at the tails
Do
S = x * p2 + N * p1
p1 = p2
p2 = S
S = x * q2 + N * q1
q1 = q2
q2 = S
S = M
M = t
If upper = True Then
t = p2 / q2
Else
t = 1 - p2 / q2
End If
N = N + 1
Loop While M <> t And S <> t
NCF = t
End Function

Public Function CRPF(x As Double)
'This procedure was adapted from function 26.2.18
'Calculates the Normal Distribution in the central region of the probability function (CRPF)
'accurate to 3 decimal places
Dim B() As Variant
Const p = 0.2316419
Const PI = 3.14159265359
Dim t As Double
Dim z As Double

B = Array(0.196854, 0.115194, 0.000344, 0.019527)
CRPF = 1 - 0.5 * (1 + B(0) * x + B(1) * x ^ 2 + B(2) * x ^ 3 + B(3) * x ^ 4) ^ -4
CRPF = 1 - CRPF
End Function
``````
0

## Featured Post

JSON is being used more and more, besides XML, and you surely wanted to parse the data out into SQL instead of doing it in some Javascript. The below function in SQL Server can do the job for you, returning a quick table with the parsed data.
The Delta outage: 650 cancelled flights, more than 1200 delayed flights, thousands of frustrated customers, tens of millions of dollars in damages – plus untold reputational damage to one of the world’s most trusted airlines. All due to a catastroph…
The viewer will learn how to use a discrete random variable to simulate the return on an investment over a period of years, create a Monte Carlo simulation using the discrete random variable, and create a graph to represent the possible returns over…
This Micro Tutorial will demonstrate how to use longer labels with horizontal bar charts instead of the vertical column chart.