A computed column (formula) that contains a JOIN?

Hi there,

My application involves a multi-step algorithm that requires numerous "parent to children" lookups (states to counties, for example).

My question: is it possible to create a computed column, view, or UDF that will automatically yield the parent (or child) values for a given record without having to perform a costly "join, update and write to disc" operation - and that yields *significant* performance benefit?

Conceptually I need to do this:

   UPDATE A SET A.ThisRecord = SUM(B.ThisRecord_Kids) FROM ThisTable AS A JOIN ThisTable AS B ON A.ThisRowID = B.ThisParentID

Obviously, this can be achieved by carrying the records in a table, and stepping through each step of the algorithm and writing the results to disc before advancing to the next step... what I'm curious to do is to compute "ThisRecord" on-the-fly and pass it onto the next step in the algorithm without writing to disc.

I'm doing about 1.5 million records at a time.

Any ideas? Alternate approaches?

Greatly appreciated!

-ash
ashblackAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

a1xCommented:
Are you updating a record or do you just need to do a select that summarizes data?

UDF's can work if you are working with only one record in your select.  UDF's seem to run for each record, so if your select has very many records, it is very expensive.

If you can get past the strict requirements, Indexed Views can be useful for summarization.  The overhead of the calculation is incurred on the INSERT rather than the SELECT, and SQL Server enforces the integrity of the calculation.
0
ashblackAuthor Commented:
a1x,

Thanks for the response. What I need to do is to compute the value, and make the freshly computed value available to the next step of the algorithm. I'm currently doing this by UPDATING before stepping forward. What I'm hoping to be able to do is to only SELECT that value (and thereby avoid the write and increase performance).

I need to do this on 1.5 million records simultaneously - not one at a time. Also, I have built a UDF to accomplish this, but it didn't perform faster than the original approach using the UPDATE statement.


Thanks for the help...

-ash
0
ikeworkCommented:
hi ashblack,

you could use temporary tables to hold the data after the first step and pass that table to the next step of the algorithm ..

ike
0
Get expert help—faster!

Need expert help—fast? Use the Help Bell for personalized assistance getting answers to your important questions.

ikeworkCommented:
another idea is to hold the accumulated data in the parents table with a trigger on the child-table, so every update in the child updates the parent-table-sum field for that child as well.. then you have redundant data, but that might speed up your calculations
it depends on how your database is mainly used, if you have a lot of updates, this might slow down the usual work with the database a lot .. and it depends on how often you need to run that algotithm
0
JimFiveCommented:
It depends on what "the next step of the algorithm" is.
It seems to me that you have focused the question on one step of your process but the problem might be that there is a better process.  If you can combine your "steps" into a single SQL statement you might not need the intermediate update at all.
--
JimFive
0
ashblackAuthor Commented:
ikework, JimFive,

Thanks for your input, much appreciated.

ikework, over the weekend I designed an approach using a temporary table to hold the components of the algorithm, but found that in the long run there was no performance gain in that route... I think that the amount of data being processed by the algorithm is so large that it is impossible to avoid the interim write statements (iow, not enough physical memory). Please balk if that doesn't make sense to you!

Also, JimFive, I appreciate what you're saying about designing the algorithm to avoid writes, but in truth, each step of the algorithm has already been optimized in that direction.  Most steps involve the summation of values based on parent-child (row) relationships and then compuations occur based on relationships between those values . Iow, I'm using a big relationship-lookup during each step of the algorithm already...

If you still think I'm missing something, I'm looking forward to hearing about it.  Thanks again!
0
JimFiveCommented:
Ashblack,
Based purely on your example it is certainly possible to create a view:

CREATE VIEW PreCalc AS
SELECT ThisParentID, Sum(ThisRecord_Kids) as ThisRecord FROM B
GROUP BY ThisParentID

And then join this view into your process.

However, I suspect your problem is a bit more difficult than that solution provides for.  If so, could you sum up what data you have to start with and what you want to see at the end of your process.
--
JimFive
0
ashblackAuthor Commented:
JimFive, ikework, thanks again for the feedback.  These are valid insights that you've shared... however, in testing these techniques I have not found much long-term performance gain (and I must move onto other tasks for now).

Again, I think my issues are related in large measure to shear volume of rows being processed similtaneously, and that I might be encountering physical design problems.

Also, I need to close this question now (but don't know how! this is my first post ever...)

So thanks again for the insights, they were helpful, but not yet the cure for this issue.
Thanks again!

-ash
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server 2005

From novice to tech pro — start learning today.