happy 1001

asked on

SQL Server - Query for Outlier

Hi Experts

I want to know the various ways in SQL Server to find the "Normal Value RANGE" and the "outliers" in a data table.

Here is the script that creates the sample database "TestDB1001" and the table "Table1001" used in this work.

CREATE DATABASE TestDB1001
GO

USE TestDB1001
GO




CREATE TABLE TestDB1001.dbo.Table1001 (
  xdate DATETIME2(0) NULL,
  sector NVARCHAR(255) NULL,
  symbol NVARCHAR(255) NULL,
  [Close Price] FLOAT NULL,
  Volume FLOAT NULL,
  [Volume Percent] FLOAT NULL,
  OpenInterest FLOAT NULL,
  [OpenInterest Percent] FLOAT NULL
) ON [PRIMARY]
GO




SET DATEFORMAT ymd
SET ARITHABORT, ANSI_PADDING, ANSI_WARNINGS, CONCAT_NULL_YIELDS_NULL, QUOTED_IDENTIFIER, ANSI_NULLS, NOCOUNT ON
SET NUMERIC_ROUNDABORT, IMPLICIT_TRANSACTIONS, XACT_ABORT OFF
GO

INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-01 00:00:00.0000000', N'Realty', N'DLF', 171.45, 236.57, 16.5, 657.59007, 6.88)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-01 00:00:00.0000000', N'Realty', N'HDIL', 55.6, 89.4, 61.43, 152.6776, 17.62)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-01 00:00:00.0000000', N'Realty', N'UNITECH', 15.9, 120.13, 45, 282.43848, 6.4)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-02 00:00:00.0000000', N'Realty', N'DLF', 166.45, 420.65, 77.81, 658.68621999999993, 0.17)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-02 00:00:00.0000000', N'Realty', N'HDIL', 51.85, 124.27, 39, 147.08239999999998, -3.67)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-02 00:00:00.0000000', N'Realty', N'UNITECH', 15.35, 119.73, -0.33, 273.67355999999995, -3.1)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-03 00:00:00.0000000', N'Realty', N'DLF', 169, 369.57, -12.14, 707.74436999999989, 7.45)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-03 00:00:00.0000000', N'Realty', N'HDIL', 53.5, 87.35, -29.71, 158.88595999999998, 8.03)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-03 00:00:00.0000000', N'Realty', N'UNITECH', 15.35, 57.75, -51.77, 279.11279999999994, 1.99)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-06 00:00:00.0000000', N'Realty', N'DLF', 167.25, 217.96, -41.02, 722.15043999999989, 2.04)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-06 00:00:00.0000000', N'Realty', N'HDIL', 52.95, 50.37, -42.34, 157.18616000000006, -1.07)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-06 00:00:00.0000000', N'Realty', N'UNITECH', 15.3, 38.19, -33.87, 286.6113, 2.69)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-07 00:00:00.0000000', N'Realty', N'DLF', 164.6, 265.73, 21.92, 710.63243, -1.5899999999999999)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-07 00:00:00.0000000', N'Realty', N'HDIL', 52.15, 108.35, 115.11, 167.99551999999997, 6.87)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-07 00:00:00.0000000', N'Realty', N'UNITECH', 15, 70.69, 85.1, 290.49611999999991, 1.3599999999999999)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-08 00:00:00.0000000', N'Realty', N'DLF', 162.9, 258.36, -2.77, 697.20230999999978, -1.8900000000000001)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-08 00:00:00.0000000', N'Realty', N'HDIL', 53.3, 70.91, -34.55, 174.64184, 3.95)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-08 00:00:00.0000000', N'Realty', N'UNITECH', 14.95, 55.43, -21.59, 291.71525999999994, 0.42)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-09 00:00:00.0000000', N'Realty', N'DLF', 159.5, 186.23, -27.92, 699.0476, 0.27)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-09 00:00:00.0000000', N'Realty', N'HDIL', 50.2, 103.81, 46.4, 182.34456, 4.41)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-09 00:00:00.0000000', N'Realty', N'UNITECH', 14.7, 50.19, -9.45, 294.89627999999988, 1.09)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-10 00:00:00.0000000', N'Realty', N'DLF', 156.85, 200.27, 7.54, 699.8979999999998, 0.12)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-10 00:00:00.0000000', N'Realty', N'HDIL', 49.55, 91.61, -11.75, 180.54768, -0.98)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-10 00:00:00.0000000', N'Realty', N'UNITECH', 14.25, 56.77, 13.11, 296.6907599999999, 0.61)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-13 00:00:00.0000000', N'Realty', N'DLF', 161, 197.51, -1.38, 703.72603999999978, 0.55)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-13 00:00:00.0000000', N'Realty', N'HDIL', 49.75, 72.79, -20.54, 183.90999999999997, 1.8599999999999999)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-13 00:00:00.0000000', N'Realty', N'UNITECH', 13.8, 61.04, 7.52, 300.79445999999996, 1.38)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-14 00:00:00.0000000', N'Realty', N'DLF', 157.8, 188.98, -4.32, 728.44625999999982, 3.51)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-14 00:00:00.0000000', N'Realty', N'HDIL', 49.6, 79.6, 9.36, 203.92476000000008, 10.88)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-14 00:00:00.0000000', N'Realty', N'UNITECH', 13.5, 51.78, -15.17, 305.99279999999982, 1.73)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-15 00:00:00.0000000', N'Realty', N'DLF', 160.5, 197.29, 4.4, 728.82591000000036, 0.05)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-15 00:00:00.0000000', N'Realty', N'HDIL', 51.3, 73.09, -8.18, 208.26824, 2.13)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-15 00:00:00.0000000', N'Realty', N'UNITECH', 14.2, 125.87, 143.09, 317.5926, 3.79)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-16 00:00:00.0000000', N'Realty', N'DLF', 162.4, 208.88, 5.87, 735.66176, 0.94)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-16 00:00:00.0000000', N'Realty', N'HDIL', 50.65, 68.36, -6.47, 208.24432000000002, -0.01)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-16 00:00:00.0000000', N'Realty', N'UNITECH', 13.95, 70.59, -43.92, 314.64828000000006, -0.93)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-17 00:00:00.0000000', N'Realty', N'DLF', 155.6, 284.44, 36.17, 756.6858900000002, 2.86)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-17 00:00:00.0000000', N'Realty', N'HDIL', 48.6, 86.94, 27.18, 210.18943999999993, 0.94)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-17 00:00:00.0000000', N'Realty', N'UNITECH', 13.65, 55.82, -20.92, 313.36584, -0.41)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-20 00:00:00.0000000', N'Realty', N'DLF', 156.1, 238.3, -16.22, 784.67971000000011, 3.7)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-20 00:00:00.0000000', N'Realty', N'HDIL', 49.55, 73.1, -15.92, 213.88799999999998, 1.76)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-20 00:00:00.0000000', N'Realty', N'UNITECH', 13.8, 76.18, 36.47, 318.67127999999997, 1.69)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-21 00:00:00.0000000', N'Realty', N'DLF', 157.8, 201.98, -15.24, 788.57473000000016, 0.5)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-21 00:00:00.0000000', N'Realty', N'HDIL', 50.9, 71.67, -1.96, 218.67824000000007, 2.24)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-21 00:00:00.0000000', N'Realty', N'UNITECH', 13.75, 44.2, -41.98, 318.27546, -0.12)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-22 00:00:00.0000000', N'Realty', N'DLF', 159.1, 194.93, -3.49, 794.0979600000004, 0.7)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-22 00:00:00.0000000', N'Realty', N'HDIL', 50.8, 68.03, -5.08, 217.37912000000003, -0.59)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-22 00:00:00.0000000', N'Realty', N'UNITECH', 13.75, 23.48, -46.88, 323.47625999999991, 1.63)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-23 00:00:00.0000000', N'Realty', N'DLF', 159.05, 199.54, 2.36, 801.66750000000025, 0.95)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-23 00:00:00.0000000', N'Realty', N'HDIL', 50.45, 61.91, -9, 218.71092000000002, 0.61)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-23 00:00:00.0000000', N'Realty', N'UNITECH', 13.7, 31.98, 36.2, 322.67046000000005, -0.25)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-24 00:00:00.0000000', N'Realty', N'DLF', 152.8, 396.39, 98.65, 882.00591000000031, 10.02)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-24 00:00:00.0000000', N'Realty', N'HDIL', 48.1, 105.47, 70.36, 226.20192000000003, 3.43)
INSERT TestDB1001.dbo.Table1001(xdate, sector, symbol, [Close Price], Volume, [Volume Percent], OpenInterest, [OpenInterest Percent]) VALUES ('2014-01-24 00:00:00.0000000', N'Realty', N'UNITECH', 13.5, 89.6, 180.18, 337.96794000000006, 4.74)
GO

                             



Suppose I want to find the outliers for each stock symbol in the above table, based on the [Volume Percent] column.

Suppose the [Volume Percent] for a stock normally stays between -20% and +20%, and then one day it suddenly has a value of 280%. That is certainly an outlier for this column for this stock.

Please suggest which methods could be used to:
1. Define the "Normal Value RANGE" for each stock over the past x days.
2. Find the outliers, i.e. the values OUTSIDE the "Normal Value RANGE" defined above.
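The two steps above could be sketched with window functions, roughly as follows. This is only a sketch under assumptions that are not part of the question: x is taken as 10 rows, the range is taken as mean plus or minus 2 standard deviations, there is one row per symbol per xdate, and the server is SQL Server 2012 or later (needed for ROWS frames on aggregates).

```sql
-- Sketch: rolling "normal range" per symbol, built from the PREVIOUS 10 rows
-- only, so that today's value does not influence its own range.
-- Assumptions: window of 10 rows and a factor of 2 are arbitrary choices.
WITH stats AS (
    SELECT
        xdate,
        symbol,
        [Volume Percent] AS vol_pct,
        AVG([Volume Percent]) OVER (
            PARTITION BY symbol ORDER BY xdate
            ROWS BETWEEN 10 PRECEDING AND 1 PRECEDING) AS avg_prev,
        STDEV([Volume Percent]) OVER (
            PARTITION BY symbol ORDER BY xdate
            ROWS BETWEEN 10 PRECEDING AND 1 PRECEDING) AS sd_prev
    FROM TestDB1001.dbo.Table1001
)
SELECT
    xdate,
    symbol,
    vol_pct,
    avg_prev - 2 * sd_prev AS range_low,   -- step 1: lower bound of the range
    avg_prev + 2 * sd_prev AS range_high   -- step 1: upper bound of the range
FROM stats
WHERE vol_pct < avg_prev - 2 * sd_prev     -- step 2: keep only the outliers
   OR vol_pct > avg_prev + 2 * sd_prev
ORDER BY symbol, xdate;
```

The first rows of each symbol have fewer than 10 earlier rows, so their range is based on less data; with fewer than two earlier rows STDEV returns NULL and the row is simply never flagged.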

Any ideas on how such an analysis could be done are welcome. I am not yet sure which options are available for such work, or whether SQL Server can do it or I will have to use Excel.

Thanks a lot for any help

PS: I guess we will need concepts such as
https://en.wikipedia.org/wiki/Standard_deviation
https://www.mathsisfun.com/data/standard-normal-distribution.html
for this work, but I am not sure exactly how it could be done.

We might need to add a few columns that show the 1, 2 and 3 standard deviation ranges; anything outside the chosen range might qualify as an outlier.
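A rough sketch of what such columns could look like is below. It assumes the mean and standard deviation are taken over each symbol's full history in the table (not a rolling window); the column aliases and band labels are made up for illustration.

```sql
-- Sketch: per-symbol 1, 2 and 3 standard deviation ranges as extra columns.
-- Assumption: statistics cover all rows of a symbol; aliases are illustrative.
WITH s AS (
    SELECT
        xdate,
        symbol,
        [Volume Percent] AS vol_pct,
        AVG([Volume Percent])   OVER (PARTITION BY symbol) AS mean,
        STDEV([Volume Percent]) OVER (PARTITION BY symbol) AS sd
    FROM TestDB1001.dbo.Table1001
)
SELECT
    xdate,
    symbol,
    vol_pct,
    mean - sd     AS low_1sd, mean + sd     AS high_1sd,
    mean - 2 * sd AS low_2sd, mean + 2 * sd AS high_2sd,
    mean - 3 * sd AS low_3sd, mean + 3 * sd AS high_3sd,
    CASE
        WHEN ABS(vol_pct - mean) <= sd     THEN 'within 1 SD'
        WHEN ABS(vol_pct - mean) <= 2 * sd THEN 'within 2 SD'
        WHEN ABS(vol_pct - mean) <= 3 * sd THEN 'within 3 SD'
        ELSE 'beyond 3 SD'   -- also reached when sd or vol_pct is NULL
    END AS band
FROM s
ORDER BY symbol, xdate;
```

Which band counts as an outlier is a judgment call; the query only labels the rows and leaves that decision to the reader.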

If it is not possible to be done in SQL Server, then I am open to doing it in Excel as well.
Olaf Doschke

Indeed, this is more a question about mathematical concepts than about SQL or databases.

These mathematical concepts have (more or less) simple formulas, and you can compute several things directly in SQL, e.g. SUM(field) and AVG(field), and build more complex calculations from those results. Even something as complex as the standard deviation is built in as STDEV(field). You're not the first one doing stock analysis or math in databases ;)

Bye, Olaf.
For example you can see some figures with the following query:

SELECT symbol,
       COUNT(*) AS cnt,
       SUM([Volume Percent]) AS total,
       AVG([Volume Percent]) AS mean,
       STDEV([Volume Percent]) AS sd
FROM TestDB1001.dbo.Table1001
GROUP BY symbol
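Building on those per-symbol figures, one possible next step is to join them back to the detail rows and keep only the rows far from the mean. This is a sketch, not a recommendation: the cutoff of 2 standard deviations and the aliases are assumptions chosen for illustration.

```sql
-- Sketch: join per-symbol statistics back to each row and flag rows whose
-- distance from the mean exceeds an assumed cutoff of 2 standard deviations.
SELECT
    t.xdate,
    t.symbol,
    t.[Volume Percent],
    s.mean,
    s.sd,
    -- z-score; NULLIF avoids a divide-by-zero error when sd is 0
    (t.[Volume Percent] - s.mean) / NULLIF(s.sd, 0) AS z_score
FROM TestDB1001.dbo.Table1001 AS t
JOIN (
    SELECT symbol,
           AVG([Volume Percent])   AS mean,
           STDEV([Volume Percent]) AS sd
    FROM TestDB1001.dbo.Table1001
    GROUP BY symbol
) AS s
    ON s.symbol = t.symbol
WHERE ABS(t.[Volume Percent] - s.mean) > 2 * s.sd
ORDER BY t.symbol, t.xdate;
```

Keep in mind that an extreme value also pulls up the mean and the standard deviation it is compared against, so with few rows per symbol a real outlier can partly hide itself.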



Bye, Olaf.
It may be better to post what you're trying to achieve in the Math / Science topic to determine a general approach and then move on to how to achieve that via SQL Server or Excel.
happy 1001

ASKER

Thanks for the comments Experts.


"It may be better to post what you're trying to achieve in the Math / Science topic to determine a general approach and then move on to how to achieve that via SQL Server or Excel."
@Doug, that is an excellent suggestion. I should first become clear about which method suits my need, and then post in the SQL Server section to see how it could be done.
How can I move this topic to the Math/Science section?

Thanks and regards
If you don't see a way to add this topic, request attention using the button at the bottom of the right side of your original post.
You could also just start a new question in that topic and include a link to this post.
Thank you so much for your help @Paul
:)
ASKER CERTIFIED SOLUTION
Olaf Doschke
This solution is only available to members.
This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.