Link to home
Start Free TrialLog in
Avatar of mSchmidt
mSchmidt

asked on

Recursive Stored Procedure - nested tables, best practice.

I am currently defining a Calculation system, it will be used to calculate production cost for different products, the system will be running on SQLEXPRESS.

The idea is that i will have some ingredients.
These ingredients are then added to a calculation (defining how much of an ingredient and so on)

This of cause is very easy to do, my problem however is figuring our the best way to allow calculations to include other calculations which in turn can hold other calculations and so on.

When an ingredients price is changed the calculations using this ingredient should be recalculated, this however could result in calculations using these calculations also to be recalculated.

A very important issue here is that the calculations are calculated in the correct order (calculations holding other calculations should not be calculated before all included calculations are calculated).

My integration idea is to keep the price in the tbl_calculation in a nullable decimal field.
When an calculation or ingredient is changed i want to change all calculation prices which are affected by this change to null.
HOW SHOULD THIS BE DONE ? (recursively in a stored procedure until no more records are affected ?)

Then to calculate them again i am thinking the best idea would be to calculate all calculations which do not hold calculations with a non-calculated price (meaning price set to null)-
HOW SHOULD THIS BE DONE ? (recursively in a stored procedure until no more records are affected ?)


MY IDEA OF A TABLE STRUCTORE;
tbl_ingridient - holds ingridients
tbl_calculation - holds calculations
tbl_calculation_ingridient - holds information of which ingridients are included in the calculations.+ amount of ingridient. (this however is a simply multiplication)
tbl_calculation_calculation - holds information of which calculation are included in the calculations. + amount of calculation (this however is a simply multiplication)

As you can see i have some ideas, however i would like to do this as correctly as possible.

Secondly if my idea is good i would like to an example of how to do a stored procedure which recursively runs until no more records are affected.
Avatar of Daniel Wilson
Daniel Wilson
Flag of United States of America image

Can you show us sample data so we understand your scenario better?

I thought it was ingredients that contained other ingredients -- not the calculations that were fundamentally nested.  Am I right?

Is it that:
Product A consist of:
 25% Compound Z
 75% Compound Y ... which consists of
     90% sub-compound alpha
      5% sub-compound beta
     5% sub-compound gamma

If that's the case your table structure is not quite right -- but again, an example would help.
Avatar of mSchmidt
mSchmidt

ASKER

Sorry must not have specified the scenario quite enough.

A Calculation may consist of a number of ingredients and/or a number of calculations.

Example
Ingredients: IngrA, IngrB, IngrC
Calculations: Calc1,Calc2,Calc3,Calc4

Calc1
  - 2,5 x IngrA
  - 3,0 x IngrB

Calc2
  - 2,5 x IngrA
  - 3,0 x IngrB
  - 2,2 x Calc1

Calc3
  - 2,5 x IngrC
  - 1,5 x Calc2
  - 1,2 x Calc1

Calc4
  - 2,5 x Calc3
  - 10,0 x Calc2

As i hope you can see from this example the calculations can become quite entangled with each other.
I would of cause make some procedures to make sure no deadlocks appear.

Please reply if you need additional information.
OK ... those things where one table links to EITHER TableA OR TableB are unpleasant ... but sometimes they happen.

Here's what I see:

tbl_Ingredient -- as you said, store everyting about the ingredient.
tbl_Calculation
  CalcID
  Name
  Maybe something else ... but I don't see much here
tbl_CalcDetail
  DetailID
  CalcID -- FK to tbl_Calculation
  Component_IngredientID -- NULLable -- populate if this calculation item multiplies an ingredient
  Component_CalcID -- NULLable -- populate if this calculation item multiplies another calculation
  Multiplier -- the number by which to multiply

If you like the table structure, I'll try to write a procedure using a recursive CTE to solve this.  

Are you using SQL Server 2005?
okay as i see it you join these two tables into one table.

tbl_calculation_ingridient - holds information of which ingridients are included in the calculations.+ amount of ingridient. (this however is a simply multiplication)
tbl_calculation_calculation - holds information of which calculation are included in the calculations. + amount of calculation (this however is a simply multiplication)

This does of cause remove an extra table but it would also result in a lot of null values which you will then propably try to join on to.
Does this fact with alot of null values not make the database perform worse than it would with two tables where no null values would appear. ? (here you would ofcause need to union a result dependent on two queries)

However as i see it your table structure would be able to handle my problem, but as i said i would like to do it as correctly as possible.
Avatar of Zberteoc
The only way i see this being done is by using computed columns with functions to return the value of a calculation. You'll have to keep a clean and accurate record of all your functions.

Simply put, any product will have it's own functions for the calculations. You have to be careful because a computed column cannot be used in another computed column.
In some ways your 4th table does clarify things ... within the Calculation, is the sequence of the details important?  If so, putting those details in 2 tables gets tricky again ... if not, your idea is probably better.

As for computing the thing ... I can't see computed columns doing it.  The computations must reference different rows ... even different tables.

I'm still seeing a recursive CTE ... which brings us back to -- is this in SQL Server 2005?  CTE's are new for 2005.
Hi

Well if it isnt a performance issue to join tables on nullable values then i actually like your idea better (there is quite some extra information in the tables and currently i need to update two tables every time) and as you say ordering is actually important.

Iam using SQLEXPRESS which is 2005.
Product A consist of:
25% Compound Z
75% Compound Y ... which consists of
    90% sub-compound alpha
     5% sub-compound beta
    5% sub-compound gamma

What exactly this means? If i have 1000 units of product A then

A:
25% * 1000 of Z
75% * 1000 of Y
(
   75%(90%) * 1000 of alpha
   75%(5%) * 1000 of beta
   75%(5%) * 1000 of gamma
)

if yes then you need something like:

product table:
===========================
Products
ProdID        identity
Deescription  varchar(100)



composition table:

ProductComposition
===================================================================
ProdCompID  identity
ProdID      int (fk to Products)
SubProdID   (fk to products for the sub elements/products/compounds)
Proportion  float or decimal (18,5)

so you will have:

Products
ProdID / Description
1    A
...
100  Z
...
120  Y
...
150  alpha
151  beta
152  gamma

ProductCompozition
ProdCompID / ProdID / SubProdID / Proportion
1    1      100     0.25
2    1      120     0.75
2    120    152     0.90
3    120    152     0.05
4    120    153     0.05


and yes, in orser to calculate the final composition for product A for instance you will need a recursive query against ProductComposition becasuse product A points to Z, Y and Y points to alpha, beta, gamma

Don't worry about the NULL values, they are not an issue.
Hi

I have read up on using CTE and i can see where you are going with this.
However in my situation where i have 2000 or more calculations which i would like to show in a list where all there sums are shown, this seems quite ineffecient.

With a recursive CTE you would propably end up calculating many of calculation many times because they are included in quite some calculations.
The hiarchy could even be 10-20 steps long.

It seems to me that it would be a way more efficient to only update the prices if a ingredient or calculation which is used in a specific calculation is changed.
This does ofcause give me some redundant data, however a calculation would only be calculated once each time.
If i could create a Storedprocedure which first set all the prices of the changed calculations to NULL (or have a bit indicating is should be updated)

Afterwards i would the calculate then calculate the calculation from the lowest level and on each recursion only updating calculations which does not include calculations which are not calculated.


PSEUDO CODE -

INGREDIENTCHANGED(ID) {
    SET PRICE TO NULL ON ALL CALCULATIONS THAT HOLD THIS INGREDIENT.
    WHILE (RESULTS AFFECTED > 0)
        SET PRICE TO NULL ON ALL CALCULATIONS THAT HOLD CALCULATIONS WHERE PRICE IS NULL
    RECALCULATE_CALCULATIONS
}
CALCULATIONCHANGED(ID) {
    SET PRICE TO NULL ON ALL CALCULATIONS THAT HOLD THIS CALCULATION.
    WHILE (RESULTS AFFECTED > 0)
        SET PRICE TO NULL ON ALL CALCULATIONS THAT HOLD CALCULATIONS WHERE PRICE IS NULL
    RECALCULATE_CALCULATIONS
}
RECALCULATE_CALCULATIONS {
     WHILE (RESULTS AFFECTED > 0)
          UPDATE PRICE ON ALL CALCULATIONS WHERE PRICE IS NULL AND THAT HOLD CALCULATIONS WHERE PRICE IS NOT NULL
}

I think this pseudo code is as effective as it can get, because it only calculates what is really necesary and the fact that it calculates them recursively makes sure that the prices are updated in the right order meaning only once for each change.

Is this a very bad idea ?
Caching the values there would improve efficiency ... at the possible expense of accuracy.  It's awefully easy for cached values to get out-of-synch with reality.
I don't think you're listening and I don't really understand why you are obsessed with NULL. It is not NULL that will help you here but a good and simple design. I don't really understand that pesudocode and  where you're aiming at.

Sorry.
DanielWilson
True however there is only two interfaces which will allow users to change these, so making sure to run the stored procedure that recalculates them each time something is changed would work, correct ?

How would you write a stored procedure that could execute something like my PSEUDO code.
or have i not described it well enough ?


Zberteoc.
I think your approach is almost the same as Daniel is suggesting however
Calc1 could be used in 1000 other calculations, Calc1 could consist of 20 other calculations which in turn consists of others.
If i use CTE it will execute the entire Calc1 tree for each calculation which holds it.

If I calculate the calculation using the idea i described in my PSEUDO CODE i would only need to calculate them once.
The idea of my PSEUDO Code is to store the cost of a calculation, making sure that they are calculated in turn.
Which means that calculation using other calculations simply use the stored cost (meaning calculations holding other calculations dont need to iterate through ingredients/calculations used by those calculations they simply need to make use of the stored cost.)

I hope this describes the idea a little better
Daniel, is it possible to create this as a stored Procedure ?
Can you provide a script of the tables as you'll actually create them?  This will be sufficiently complex that I want to get it right the first time!
ASKER CERTIFIED SOLUTION
Avatar of Daniel Wilson
Daniel Wilson
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
I solved it, using several Stored procedures.

one of them are shown here, . thanks for all your help though.
ALTER PROCEDURE dbo.clearCalc
AS
      /* SET NOCOUNT ON */
      update tbl_calculation set calculation_cost_price=null where not SOMETHING
      IF @@ROWCOUNT > 0
            exec clearCalc
      ELSE
            RETURN 0