Which is the best approach: SQL CLR, T-SQL functions, or stored procedures?

Mr Knackered
I have a SQL Server 2014 database with a large transaction table (millions of records). A C# process currently evaluates balances for customers by scanning the transaction table for transactions between ranges of dates. The ranges required for each balance evaluation are set via parameters that are held in the database and vary for each customer.

The existing C# process loops through each customer record, retrieves the parameters from SQL Server via a stored procedure, calculates the required dates (among other things), and then uses another stored procedure to fetch the result from the transaction table (i.e. a sum or average of the records). The only things transmitted per customer are the parameter values and the resulting balance.

The existing process is performing OK, but I really need a faster result. Would I be better off implementing the balance-evaluation logic in SQL CLR or in T-SQL functions/stored procedures?

If using CLR, I will have to use a SqlDataReader (or similar, over the context connection) to get the various values required to calculate the dates, and then call a stored procedure to get the required value from the transaction table. I could avoid the stored procedure call from CLR by executing the statement directly, if that has a performance gain.

Ideally I would like to implement a function (CLR or T-SQL) so I can use something like

INSERT INTO ResultsTable (CustomerID, Balance)
SELECT CustomerID, GetBalance(parm1ID, parm2ID, parm3Date, etc)
FROM Somewhere
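If you do go the T-SQL route, an inline table-valued function used with CROSS APPLY is usually much faster than a scalar UDF, because the optimizer can inline it into the calling query. A minimal sketch of the idea — all table and column names here (Transactions, TranDate, Amount, Customers) are assumptions for illustration, not from the actual schema:

```sql
-- Hypothetical inline table-valued function; schema names are assumed.
CREATE FUNCTION dbo.GetBalance
(
    @CustomerID INT,
    @StartDate  DATE,
    @EndDate    DATE
)
RETURNS TABLE
AS
RETURN
    SELECT SUM(t.Amount) AS Balance
    FROM dbo.Transactions AS t
    WHERE t.CustomerID = @CustomerID
      AND t.TranDate >= @StartDate
      AND t.TranDate <  @EndDate;
GO

-- Set-based usage: one statement instead of a per-customer loop.
INSERT INTO dbo.ResultsTable (CustomerID, Balance)
SELECT c.CustomerID, b.Balance
FROM dbo.Customers AS c
CROSS APPLY dbo.GetBalance(c.CustomerID, c.StartDate, c.EndDate) AS b;
```

The date-derivation logic you currently run in C# would need to be folded into the query (or into the function's parameters) for this to work as a single pass.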

I am assuming that SQL Server will be much faster at crunching the numbers than a C# program that fetches lots of small pieces of the puzzle from SQL and then has to write the results back to a table afterwards.

Would really appreciate some constructive advice on the best direction to proceed down to get the best performance.
Éric Moreau, Senior .Net Consultant
Top Expert 2016
Commented:
I have written an article on that topic that you can read from http://emoreau.com/Entries/Articles/2015/04/SQL-CLR-Integration-in-2015-year-not-product-version.aspx.

As for your question on the best approach, the answer is: it depends! As you can read in my article:
I have always been one of those saying that, if the feature exists natively in SQL, you should never rewrite it. Lately, I was proven wrong on a specific feature. It is probably because XML is used, and that usage of XML is not at all optimized. I would never rewrite simple T-SQL features, but I might do some performance tests when dealing with more complex features.

I still believe that if you are dealing with set-based operation, pure SQL will win over CLR.
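To illustrate the set-based point: if the per-customer date ranges can be expressed as columns (or computed inline), the entire balance run collapses into one join-and-aggregate statement with no per-customer round trips at all. A sketch under assumed names (CustomerParms holding the derived ranges, Transactions holding the detail):

```sql
-- Entire balance run as one set-based statement (hypothetical schema).
-- CustomerParms is assumed to hold, or derive, each customer's date range.
INSERT INTO dbo.ResultsTable (CustomerID, Balance)
SELECT p.CustomerID,
       SUM(t.Amount) AS Balance
FROM dbo.CustomerParms AS p
JOIN dbo.Transactions  AS t
  ON  t.CustomerID = p.CustomerID
  AND t.TranDate  >= p.StartDate   -- range varies per customer
  AND t.TranDate  <  p.EndDate
GROUP BY p.CustomerID;
```

This is the shape of query the optimizer handles best; row-by-row processing (whether in C# or in a T-SQL cursor) gives it far less to work with.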
Scott Pletcher, Senior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
The key to performance is getting the best clustered index on every table.  Hint: This is most often not an identity column.

It's almost certain, based on the limited amount you've stated so far, that transaction date should be part of that index.  Whether customer id should appear first also, I can't tell yet.

Do you (almost) always process one customer at a time, or do you tend to process all customers over a given time period?  Those are the types of questions that help determine the best clustered index.  And, as always, we will also need to look at missing-index stats and index-usage stats.
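For example, if most queries filter one customer over a date range, the clustered index might look like the following sketch (table and column names are hypothetical, and the table's existing clustered index would need to be dropped or rebuilt first):

```sql
-- Hypothetical: cluster the transaction table on (CustomerID, TranDate)
-- so each customer's date range becomes a single contiguous range scan.
-- Drop or rebuild the existing clustered index (often on an identity
-- column) before creating this one.
CREATE CLUSTERED INDEX CIX_Transactions_Customer_Date
ON dbo.Transactions (CustomerID, TranDate);
```

If instead you usually process all customers over one period, leading with TranDate would be the better candidate.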

Author

Commented:
Indexes are not my question here, but thanks for your input; I take your point. I have the relevant indexes in place for retrieving the data from the transaction table, and that side of things is working great. I have been performance-monitoring the SQL Server, checking for missing indexes, and have added stats as well; retrieval speed from the transaction table is not an issue. It's really quite quick.

I guess my question is more along the lines of: are there performance advantages to using T-SQL natively for this type of processing over a coded solution running outside of SQL Server and using ADO.NET to access the database? I guess SQL Server can at least optimise whole T-SQL queries, whereas it can only deal with the individual requests coming from the program.

If others have found that running bulk balance style evaluations over large datasets is better to be done in SQL directly then I would invest the time to replicate the process in T-SQL to see what results I get compared to the external coded approach.

I think I will just have to bite the bullet and do it and see what results I get.
Scott Pletcher, Senior DBA
Commented:
I don't believe there are, as long as the .NET code is reasonable.  Calling a stored proc might have a minor edge, because it's more likely to cache a plan, but that's a small boost unless the code is really huge.
