Solved

T-SQL recursive delete

Posted on 2013-01-15
4
304 Views
Last Modified: 2013-01-15
I have the following code segment which works, kinda.  

DELETE FROM PSExtract.temp.RJZ_ENTITY_F_130107
FROM PSExtract.temp.RJZ_ENTITY_F_130107 AS K
WHERE K.[Direct Parent identifier] IN
      (
      SELECT K.[Direct Parent identifier]
      FROM PSExtract.temp.RJZ_ENTITY_F_130107 AS K
      LEFT OUTER JOIN PSExtract.temp.RJZ_ENTITY_F_130107 AS C
            ON K.[Direct Parent identifier] = C.[Data provider ID]
            WHERE C.[Data provider ID] IS NULL
      );

What is going on here is that a row has two keys in it.   The primary key ([Data provider ID]) which is the first column and then a parent key ([Direct Parent identifier]) later on in a different column.  All parent keys must be made valid by having a different row exist with the parent key as the primary key.  The above code goes through all the parent keys and tries to find the row for which that key is the primary key (thus validating the parent key).   If it cannot find a primary key row for a given parent key (i.e. the parent key is invalid), it deletes the row with that parent key.  That whole process works.

The problem: When a row is deleted, in some cases the deleted row "validated" the parent key on a different row which previously was valid but is now invalid.  I run the above code segment once, it works.   BUT I run it again and it again finds rows for which the parent key is invalid.  The reason they are invalid is because the rows which validated their parent keys were deleted in the first run.  I have to execute the code several times before I finally run out of rows to delete.   At that point i know that all parent keys are validated because all parent keys are can be found as primary keys on some other row.

Question:  Is there a way to code the above code segment so that it recursively executes until it runs out of rows to delete?   I tried to set this up but wasn't able to do so in any valid manner.  You will have to be specific with your answer as I'm pretty dense.

I'm running under MS SQLServer 2008.
0
Comment
Question by:RichNH
  • 2
  • 2
4 Comments
 
LVL 1

Author Comment

by:RichNH
ID: 38779826
BTW, currently I'm getting around the problem by repeating the delete code 10X in the module.  Not very elegant but it gets me the data I need.
0
 
LVL 39

Expert Comment

by:lcohan
ID: 38779957
You could fill in a int variable with @@ROWCOUNT (Returns the number of rows affected by the last statement) and if is NOT = 0 then LOOP
0
 
LVL 39

Accepted Solution

by:
lcohan earned 500 total points
ID: 38779966
--something like:

declare @row_count int;
set @row_count = 1;

while @row_count > 0
begin
      DELETE FROM PSExtract.temp.RJZ_ENTITY_F_130107
      FROM PSExtract.temp.RJZ_ENTITY_F_130107 AS K
      WHERE K.[Direct Parent identifier] IN
              (
              SELECT K.[Direct Parent identifier]
              FROM PSExtract.temp.RJZ_ENTITY_F_130107 AS K
              LEFT OUTER JOIN PSExtract.temp.RJZ_ENTITY_F_130107 AS C
                        ON K.[Direct Parent identifier] = C.[Data provider ID]
                        WHERE C.[Data provider ID] IS NULL
              );
    set @row_count = @@ROWCOUNT;
end;
0
 
LVL 1

Author Closing Comment

by:RichNH
ID: 38780476
Man that is EXACTLY what I was looking for and works like a champ.   I've found it in the books I have too so obviously I have some reading to do.   A new area to explore.   tahnk you so much!
0

Featured Post

U.S. Department of Agriculture and Acronis Access

With the new era of mobile computing, smartphones and tablets, wireless communications and cloud services, the USDA sought to take advantage of a mobilized workforce and the blurring lines between personal and corporate computing resources.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Nowadays, some of developer are too much worried about data. Who is using data, who is updating it etc. etc. Because, data is more costlier in term of money and information. So security of data is focusing concern in days. Lets' understand the Au…
Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.
Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.

776 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question