asked on

Deterimine old rows to delete

Hi, I am currently building a data warehouse and have an issue with records to delete from the fact table.
The source data does not update records but rather creates a new row for the change and uses an audit number together with other columns to form a composite key. I am using the following code to get the most recent record as the row may have been changed several times since the last load:

SELECT * FROM (
SELECT * ROW_NUMBER() OVER (PARTITION BY [SampleCode] ORDER BY [AuditNumber] desc) rn
FROM [DataMart].[Dim_Sample]
) tmp WHERE rn = 1
AND AuditNumber IS NOT NULL;

(In the live code the column are listed)

If this returns the 4th version of the row, I need to be able to identify the previous 3 and remove them from the fact table.

Anyone any ideas?

Many thanks

ASKER CERTIFIED SOLUTION

Máté Farkas

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

SOLUTION

Olaf Doschke

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Olaf Doschke

A completely different idea is to keep audit data separate anyway, have one main table with current data and a history table with audit data. That way you don't need the initial query for rn=1 at all, you have a table with current rows and no others. For whatever reason you choose to not go that route, it'd enable you to keep much more audit data, as that will have no effect on the performance of queries of the main table with only current data.

I can understand a single audit table if you also sometimes want to query older states. SQL Server also offers a new feature of temporal tables since version 2016, they work with that very concept of a separate history table: https://www.mssqltips.com/sqlservertip/4674/managing-temporal-table-history-in-sql-server-2016/

You may consider using that replacing your own concept.

And last not least,you might like some new query clauses for temporal tables,especially querying data AS OF a certain date:
https://docs.microsoft.com/en-us/sql/relational-databases/tables/querying-data-in-a-system-versioned-temporal-table

Bye, Olaf.

SOLUTION

Pawan Kumar

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Pawan Kumar

Question abandoned
Provided solution