Solved

Cumulative Beta Distribution function as an SQL Query?

Posted on 2007-03-26
14
2,672 Views
Last Modified: 2013-11-13
I'd like to implement a "Cumulative Beta Distribution" function as an SQL query.  Any ideas on how I'd do this?  Info on this function can be found here: http://en.wikipedia.org/wiki/Beta_distribution.

Thanks in advance!
0
Comment
Question by:nbb007
  • 6
  • 5
  • 2
  • +1
14 Comments
 
LVL 50

Expert Comment

by:Lowfatspread
ID: 18796366
what would the output look like?

on the face of it if it s "simply" mathematics then you just write the equation as a function (set of functions)
and add it your select statement...

you need to translate the math/method into something that a non mathematican.statistian cann understand...

0
 

Author Comment

by:nbb007
ID: 18796576
Well, the output would be a number.  Excel has a built-in function called "BETADISTRIBUTION" that I use frequently, but I now need to be able to perform this calculation on data in an SQL Server database.  I was hoping that somebody might have this function defined as a query, as I'm not sure how to do it myself.  Is there any other info/data I could provide?
0
 
LVL 27

Expert Comment

by:aburr
ID: 18797751
I would suggest that you move this question to the
Database section
under mySQL or Access
0
 
LVL 50

Expert Comment

by:Lowfatspread
ID: 18799331
Can you confirm which database system this will run in eventually?
0
 
LVL 22

Expert Comment

by:NovaDenizen
ID: 18800149
There's a method for calculating estimates of the alpha and beta parameters here:  http://en.wikipedia.org/wiki/Beta_distribution#Parameter_estimation

SELECT AVG(x) * (AVG(x)*(1 - AVG(x))/VAR_POP(x) - 1) AS est_alpha, (1 - AVG(x)) * ( AVG(x)*(1 - AVG(x))/VAR_POP(x) - 1) AS est_beta FROM thetable WHERE ....
0
 
LVL 22

Expert Comment

by:NovaDenizen
ID: 18800153
I assumed you're working with mysql in my solution.  No guarantees about other databases.
0
 

Author Comment

by:nbb007
ID: 18801108
Yes, I'm using MySQL, but I already have queries for both Alpha and Beta:

SELECT  ((1-(dbo_rdm_port.PERSPVALUE/dbo_rdm_port.EXPVALUE))/(((dbo_rdm_port.STDDEVI+dbo_rdm_port.STDDEVC)/dbo_rdm_port.PERSPVALUE)*((dbo_rdm_port.STDDEVI+dbo_rdm_port.STDDEVC)/dbo_rdm_port.PERSPVALUE)))-(dbo_rdm_port.PERSPVALUE/dbo_rdm_port.EXPVALUE) AS ALPHA, (((1-(dbo_rdm_port.PERSPVALUE/dbo_rdm_port.EXPVALUE))/(((dbo_rdm_port.STDDEVI+dbo_rdm_port.STDDEVC)/dbo_rdm_port.PERSPVALUE)*((dbo_rdm_port.STDDEVI+dbo_rdm_port.STDDEVC)/dbo_rdm_port.PERSPVALUE)))-(dbo_rdm_port.PERSPVALUE/dbo_rdm_port.EXPVALUE))*((dbo_rdm_port.EXPVALUE/dbo_rdm_port.PERSPVALUE)-1) AS BETA

this really translates into:
alpha = (1 - u/cv^2) - u
beta = alpha(1-u)/u

where u = loss/expvalu
and     cv = (stdevi + stdevc)/loss

the problem is that I need the "Cumulative Beta Distribution" function to plug these alpha and beta parameters into...
0
What Should I Do With This Threat Intelligence?

Are you wondering if you actually need threat intelligence? The answer is yes. We explain the basics for creating useful threat intelligence.

 
LVL 22

Expert Comment

by:NovaDenizen
ID: 18801587
Are you saying that you already have the alpha and beta values, and you want to calculate values of the cumulative distribution function for that alpha and beta?

 According to http://mathworld.wolfram.com/BetaDistribution.html you will need to implement the Regularized Beta Function (http://mathworld.wolfram.com/RegularizedBetaFunction.html).  That function is definied in terms of the Incomplete Beta Function (http://mathworld.wolfram.com/IncompleteBetaFunction.html) and the Beta Function (http://mathworld.wolfram.com/BetaFunction.html).  

The Incomplete Beta function is a nontrivial function to calculate, requiring an approximation to an infinite series.

The Beta function is easily defined in terms of factorials when alpha and beta are integers, but requires the use of the gamma function when either is not an integer.
0
 

Author Comment

by:nbb007
ID: 18801856
Yes, I need to calculate the cumulative beta distrubution for values of Alpha and Beta that I already have.  This can easily be done in Microsoft Excel with the BETADISTRIBUTION funtion, but I need to do this on a MySQL database instead of an Excel workbook.
0
 
LVL 22

Expert Comment

by:NovaDenizen
ID: 18801956
MySQL is a database server, not a numerical processing engine.  I think this sort of calculation belongs in a specialized application.

If you really must calculate the Beta Distribution function within sql, then you would be best off implementing it in C and linking your custom function into mysql.  See http://dev.mysql.com/doc/refman/5.0/en/adding-functions.html
0
 

Author Comment

by:nbb007
ID: 18802192
Well, I was hoping that I could make the query cross-platform, i.e. run my Cumulative Beta Distribution SQL Query on a MySQL database, SQL Server 200/2005 database, MS Access database, Oracle database, etc.  Perhaps there is an approximation function that could be implemented as an SQL query?
0
 
LVL 22

Expert Comment

by:NovaDenizen
ID: 18803025
You ask for something akin to plans for a scanning tunneling electron microscope that can be constructed in an afternoon using office supplies.  :)

I suggest you rethink the big picture about how you want your application to work, and put the difficult mathematical computations in a module that can handle them.
0
 

Author Comment

by:nbb007
ID: 18805245
Well, I actually have done that already.  I get the Alpha and Beta values from my database using the SQL query I previously indicated, and then I use a library I purchased from SyncFusion to calculate the Cumulative Beta Distribution for those Alpha and Beta values:
"Syncfusion.Windows.Forms.Chart.Statistics.UtilityFunctions.BetaCumulativeDistribution"

The problem is that the result of this Cumulative Beta Distribution calculation must be used in subsequent SQL queries, hence the wish to perform this calculation in SQL so I can keep everything on the database side.  My solution was to write the Cumulative Beta Distribution value calculated via my application back to the table I queried Alpha and Beta from.  This process is rather time consuming however, since the database has around 1 million records.

Does this sound like the best solution?
0
 
LVL 22

Accepted Solution

by:
NovaDenizen earned 500 total points
ID: 18807916
I'm not clear on the big picture of what you're doing.  If you need to calculate one distribution value for each value in your database, then it makes sense to automatically calculate it in an external module and store the value in the database.
0

Featured Post

Top 6 Sources for Identifying Threat Actor TTPs

Understanding your enemy is essential. These six sources will help you identify the most popular threat actor tactics, techniques, and procedures (TTPs).

Join & Write a Comment

Entity Framework is a powerful tool to help you interact with the DataBase but still doesn't help much when we have a Stored Procedure that returns more than one resultset. The solution takes some of out-of-the-box thinking; read on!
SQL Command Tool comes with APEX under SQL Workshop. It helps us to make changes on the database directly using a graphical user interface. This helps us writing any SQL/ PLSQL queries and execute it on the database and we can create any database ob…
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

747 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

13 Experts available now in Live!

Get 1:1 Help Now