[Webinar] Streamline your web hosting managementRegister Today

x
?
Solved

A computed column (formula) that contains a JOIN?

Posted on 2008-02-07
8
Medium Priority
?
1,202 Views
Last Modified: 2012-08-13
Hi there,

My application involves a multi-step algorithm that requires numerous "parent to children" lookups (states to counties, for example).

My question: is it possible to create a computed column, view, or UDF that will automatically yield the parent (or child) values for a given record without having to perform a costly "join, update and write to disc" operation - and that yields *significant* performance benefit?

Conceptually I need to do this:

   UPDATE A SET A.ThisRecord = SUM(B.ThisRecord_Kids) FROM ThisTable AS A JOIN ThisTable AS B ON A.ThisRowID = B.ThisParentID

Obviously, this can be achieved by carrying the records in a table, and stepping through each step of the algorithm and writing the results to disc before advancing to the next step... what I'm curious to do is to compute "ThisRecord" on-the-fly and pass it onto the next step in the algorithm without writing to disc.

I'm doing about 1.5 million records at a time.

Any ideas? Alternate approaches?

Greatly appreciated!

-ash
0
Comment
Question by:ashblack
  • 3
  • 2
  • 2
  • +1
8 Comments
 
LVL 5

Expert Comment

by:a1x
ID: 20845205
Are you updating a record or do you just need to do a select that summarizes data?

UDF's can work if you are working with only one record in your select.  UDF's seem to run for each record, so if your select has very many records, it is very expensive.

If you can get past the strict requirements, Indexed Views can be useful for summarization.  The overhead of the calculation is incurred on the INSERT rather than the SELECT, and SQL Server enforces the integrity of the calculation.
0
 

Author Comment

by:ashblack
ID: 20845780
a1x,

Thanks for the response. What I need to do is to compute the value, and make the freshly computed value available to the next step of the algorithm. I'm currently doing this by UPDATING before stepping forward. What I'm hoping to be able to do is to only SELECT that value (and thereby avoid the write and increase performance).

I need to do this on 1.5 million records simultaneously - not one at a time. Also, I have built a UDF to accomplish this, but it didn't perform faster than the original approach using the UPDATE statement.


Thanks for the help...

-ash
0
 
LVL 20

Expert Comment

by:ikework
ID: 20850239
hi ashblack,

you could use temporary tables to hold the data after the first step and pass that table to the next step of the algorithm ..

ike
0
Get your problem seen by more experts

Be seen. Boost your question’s priority for more expert views and faster solutions

 
LVL 20

Expert Comment

by:ikework
ID: 20850273
another idea is to hold the accumulated data in the parents table with a trigger on the child-table, so every update in the child updates the parent-table-sum field for that child as well.. then you have redundant data, but that might speed up your calculations
it depends on how your database is mainly used, if you have a lot of updates, this might slow down the usual work with the database a lot .. and it depends on how often you need to run that algotithm
0
 
LVL 15

Expert Comment

by:JimFive
ID: 20853344
It depends on what "the next step of the algorithm" is.
It seems to me that you have focused the question on one step of your process but the problem might be that there is a better process.  If you can combine your "steps" into a single SQL statement you might not need the intermediate update at all.
--
JimFive
0
 

Author Comment

by:ashblack
ID: 20878226
ikework, JimFive,

Thanks for your input, much appreciated.

ikework, over the weekend I designed an approach using a temporary table to hold the components of the algorithm, but found that in the long run there was no performance gain in that route... I think that the amount of data being processed by the algorithm is so large that it is impossible to avoid the interim write statements (iow, not enough physical memory). Please balk if that doesn't make sense to you!

Also, JimFive, I appreciate what you're saying about designing the algorithm to avoid writes, but in truth, each step of the algorithm has already been optimized in that direction.  Most steps involve the summation of values based on parent-child (row) relationships and then compuations occur based on relationships between those values . Iow, I'm using a big relationship-lookup during each step of the algorithm already...

If you still think I'm missing something, I'm looking forward to hearing about it.  Thanks again!
0
 
LVL 15

Expert Comment

by:JimFive
ID: 20879417
Ashblack,
Based purely on your example it is certainly possible to create a view:

CREATE VIEW PreCalc AS
SELECT ThisParentID, Sum(ThisRecord_Kids) as ThisRecord FROM B
GROUP BY ThisParentID

And then join this view into your process.

However, I suspect your problem is a bit more difficult than that solution provides for.  If so, could you sum up what data you have to start with and what you want to see at the end of your process.
--
JimFive
0
 

Accepted Solution

by:
ashblack earned 0 total points
ID: 20922146
JimFive, ikework, thanks again for the feedback.  These are valid insights that you've shared... however, in testing these techniques I have not found much long-term performance gain (and I must move onto other tasks for now).

Again, I think my issues are related in large measure to shear volume of rows being processed similtaneously, and that I might be encountering physical design problems.

Also, I need to close this question now (but don't know how! this is my first post ever...)

So thanks again for the insights, they were helpful, but not yet the cure for this issue.
Thanks again!

-ash
0

Featured Post

Keep up with what's happening at Experts Exchange!

Sign up to receive Decoded, a new monthly digest with product updates, feature release info, continuing education opportunities, and more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Article by: evilrix
Looking for a way to avoid searching through large data sets for data that doesn't exist? A Bloom Filter might be what you need. This data structure is a probabilistic filter that allows you to avoid unnecessary searches when you know the data defin…
We take a look at the fast-evolving changes in Search Engine Optimization rules and algorithms by Google.
This is a video describing the growing solar energy use in Utah. This is a topic that greatly interests me and so I decided to produce a video about it.
Stellar Phoenix SQL Database Repair software easily fixes the suspect mode issue of SQL Server database. It is a simple process to bring the database from suspect mode to normal mode. Check out the video and fix the SQL database suspect mode problem.

612 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question