Solved

C# SQL Query considerations against a large database

Posted on 2007-03-25
Last Modified: 2013-11-07
I am querying a database table and performing a calculation on each row returned.  I'd like to put the result of this calculation right back in the same table, but I'm unsure how to do this properly.  The table is very large (100,000 to 1,000,000 rows), so I can't bring it local; it must stay on the server.  Here is what I'm currently doing:

using (OleDbConnection accessConnect = new OleDbConnection())
{
   try
   {
      accessConnect.Open(); // Open the data connection

      OleDbDataReader tableReader = (new OleDbCommand(sqlQuery, accessConnect)).ExecuteReader();

      while (tableReader.Read()) // true while there are more rows
      {
         alpha = Convert.ToDouble(tableReader.GetValue(0));
         beta = Convert.ToDouble(tableReader.GetValue(1));

         distribution = CalcUtility.BetaCumulativeDistribution(alpha, beta);
      }
.............

I need to use the "distribution" value in another query against the same database, so I think the best approach is to add a "DISTRIBUTION" column to the table I'm querying and store the value there.  Would I write each value as soon as it is calculated, i.e.

SqlConnection custConn = new SqlConnection(...);
custConn.Open();
SqlCommand SqlUpdateCommand1 = new SqlCommand();
SqlUpdateCommand1.Connection = custConn;
// T-SQL uses ADD <column> <type> (no COLUMN keyword) and float for double precision
SqlUpdateCommand1.CommandText = "ALTER TABLE table ADD DISTRIBUTION float;";
SqlUpdateCommand1.ExecuteNonQuery();

using (OleDbConnection accessConnect = new OleDbConnection())
{
   try
   {
      accessConnect.Open(); // Open the data connection

      OleDbDataReader tableReader = (new OleDbCommand(sqlQuery, accessConnect)).ExecuteReader();

      while (tableReader.Read()) // true while there are more rows
      {
         alpha = Convert.ToDouble(tableReader.GetValue(0));
         beta = Convert.ToDouble(tableReader.GetValue(1));

         distribution = CalcUtility.BetaCumulativeDistribution(alpha, beta);

         // Invariant culture keeps "." as the decimal separator regardless of locale
         SqlUpdateCommand1.CommandText = "UPDATE table SET DISTRIBUTION = "
            + distribution.ToString(System.Globalization.CultureInfo.InvariantCulture) + " WHERE stuff;";
         SqlUpdateCommand1.ExecuteNonQuery();
      }
................

This seems pretty clunky, but I think it would work.  I think the best method would be to create a dataset based on the query, calculate the values, add and fill the column, and then update the server database from the dataset.  The problem is that there is too much data, so filling a local dataset isn't an option.  Is there a way I could add the "distribution" column in bulk, once all distribution values are calculated?

Thanks in advance for the help!!
Question by:nbb007

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 18787729
What kind of calculation is this distribution value?
If possible, you could implement it as a function directly in SQL Server...

Author Comment

by:nbb007
ID: 18789708
The calculation is a "Cumulative Beta Distribution", and is described here: http://en.wikipedia.org/wiki/Beta_distribution

I am using a library that contains an implementation of this function, so I'm not even sure how it is implemented.  Specifically, I am using a library that I bought from SyncFusion:
"Syncfusion.Windows.Forms.Chart.Statistics.UtilityFunctions.BetaCumulativeDistribution".  An example of its use can be found on their website: http://www.syncfusion.com/support/evalcenter/default.aspx?cNode=468.

I would love to be able to do this calculation directly in SQL, but I have no idea how to accomplish this.  I did a lot of searching on doing this calc directly in SQL before buying the SyncFusion library, so I don't know if it can be done in an SQL query...

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 18790986
>I would love to be able to do this calculation directly in SQL
Do you have SQL Server 2005? You could create a CLR function that wraps that library function, so it could be called directly from SQL.

Otherwise, I would suggest that you stay with your code, with a "minor" but effective change:
instead of submitting each UPDATE individually, put some 1000 updates together and submit them as a "batch". This avoids many server round trips and works a lot faster.

using System.Globalization;
using System.Text;

SqlConnection custConn = new SqlConnection(...);
custConn.Open();
SqlCommand SqlUpdateCommand1 = new SqlCommand();
SqlUpdateCommand1.Connection = custConn;
// T-SQL uses ADD <column> <type> (no COLUMN keyword) and float for double precision
SqlUpdateCommand1.CommandText = "ALTER TABLE table ADD DISTRIBUTION float;";
SqlUpdateCommand1.ExecuteNonQuery();

using (OleDbConnection accessConnect = new OleDbConnection())
{
   try
   {
      accessConnect.Open(); // Open the data connection

      OleDbDataReader tableReader = (new OleDbCommand(sqlQuery, accessConnect)).ExecuteReader();

      StringBuilder query = new StringBuilder();
      int query_count = 0;

      while (tableReader.Read()) // true while there are more rows
      {
         alpha = Convert.ToDouble(tableReader.GetValue(0));
         beta = Convert.ToDouble(tableReader.GetValue(1));

         distribution = CalcUtility.BetaCumulativeDistribution(alpha, beta);

         // Queue the statement instead of executing it right away;
         // invariant culture keeps "." as the decimal separator
         query_count++;
         query.Append("UPDATE table SET DISTRIBUTION = "
            + distribution.ToString(CultureInfo.InvariantCulture) + " WHERE stuff;");

         if (query_count >= 1000)
         {
            // Send the accumulated statements in one round trip
            SqlUpdateCommand1.CommandText = query.ToString();
            SqlUpdateCommand1.ExecuteNonQuery();
            query = new StringBuilder();
            query_count = 0;
         } // end if query_count
      } // while (tableReader.Read())

      // Send whatever is left over after the last full batch
      if (query_count > 0)
      {
         SqlUpdateCommand1.CommandText = query.ToString();
         SqlUpdateCommand1.ExecuteNonQuery();
      }
   }
   finally
   {
      custConn.Close();
   }
}
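
For illustration, the batching step can be pulled out into a small helper that has no database dependency and can be checked on its own. This is only a sketch under assumptions: the thread never says what "WHERE stuff" is, so the table name MyTable and the integer key column Id are hypothetical, as are the names UpdateBatcher and BuildBatches.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class UpdateBatcher
{
    // Turns (key, distribution) pairs into SQL batches of at most batchSize
    // UPDATE statements each. "MyTable" and "Id" are hypothetical names.
    public static IEnumerable<string> BuildBatches(
        IEnumerable<KeyValuePair<int, double>> rows, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");

        StringBuilder batch = new StringBuilder();
        int count = 0;

        foreach (KeyValuePair<int, double> row in rows)
        {
            // Invariant culture so the decimal separator is always "."
            batch.Append("UPDATE MyTable SET DISTRIBUTION = ")
                 .Append(row.Value.ToString("R", CultureInfo.InvariantCulture))
                 .Append(" WHERE Id = ")
                 .Append(row.Key.ToString(CultureInfo.InvariantCulture))
                 .Append(";\n");
            count++;

            if (count >= batchSize)
            {
                yield return batch.ToString();
                batch.Length = 0; // reuse the builder for the next batch
                count = 0;
            }
        }

        // Leftover statements that did not fill a whole batch
        if (count > 0) yield return batch.ToString();
    }
}
```

Each string returned by BuildBatches could then be assigned to SqlUpdateCommand1.CommandText and sent with ExecuteNonQuery(), one round trip per batch. The sketch assumes every value is a finite number; NaN or infinity would not produce valid SQL.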


Author Comment

by:nbb007
ID: 18794766
OK, good advice.  I am currently using SQL Server 2000, so the CLR route wouldn't work for me.  I do like the batch approach, however: SQL will execute a string of 1000 queries at once?  There is no way to use the current position of the TableReader object to update the particular row it is currently referring to, is there?

Accepted Solution

by:
Guy Hengel [angelIII / a3] earned 500 total points
ID: 18794795
>SQL will execute a string of 1000 queries at once?
Yes.

> There is no way to use the current position of the TableReader object to update the particular row it is currently referring to, is there?
There is, but it would either do what you are currently doing (update each row individually) or do what you are trying to avoid (read the entire data set at once).

Expert Comment

by:bhushanvinay
ID: 20067468
Just looking at your problem, you could try something wild.

Create a datatable_target with:
data column a
data column b
data column c -- calculated

while (datatable_source.Read())
{
   // add a row to the new table from your old table,
   // together with the calculated value
}

You could also make this type-safe by writing an XSD and creating the table from it.

I don't know if it helps you.

Regards
Vinay
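
One way to read the target-table idea above without holding the whole result set locally is to buffer the calculated values in small chunks, push each chunk to a staging table on the server, and finish with a single set-based UPDATE. The following is a rough sketch under assumptions, not a tested solution for this database: StagingBuffer, DistributionStaging, MyTable and the integer key column Id are all hypothetical names, and the flush delegate is where a bulk-load call such as SqlBulkCopy.WriteToServer(DataTable) would go.

```csharp
using System;
using System.Data;

public sealed class StagingBuffer
{
    private readonly DataTable _chunk = new DataTable();
    private readonly int _chunkSize;
    private readonly Action<DataTable> _flush;

    // Final set-based step, to be run once after every chunk is staged.
    // Table and column names here are hypothetical.
    public const string MergeSql =
        "UPDATE t SET t.DISTRIBUTION = s.Distribution " +
        "FROM MyTable AS t INNER JOIN DistributionStaging AS s ON s.Id = t.Id;";

    public StagingBuffer(int chunkSize, Action<DataTable> flush)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");
        if (flush == null) throw new ArgumentNullException("flush");
        _chunkSize = chunkSize;
        _flush = flush;
        _chunk.Columns.Add("Id", typeof(int));
        _chunk.Columns.Add("Distribution", typeof(double));
    }

    // Buffers one calculated value; hands the chunk over once it is full.
    public void Add(int id, double distribution)
    {
        _chunk.Rows.Add(id, distribution);
        if (_chunk.Rows.Count >= _chunkSize) Flush();
    }

    // Sends any buffered rows to the flush delegate, then empties the buffer.
    public void Flush()
    {
        if (_chunk.Rows.Count == 0) return;
        _flush(_chunk);
        _chunk.Clear();
    }
}
```

Inside the reader loop this would be called as buffer.Add(id, distribution), followed by one buffer.Flush() after the loop and a single execution of MergeSql. Only one chunk is ever held in memory, so the full table never has to be brought local.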