I am getting duplicate records because of NOLOCk , I understand that this is doing a uncommitted (or) dirty reads and ending up with duplicates.
Could you please suggest the Isloation level to be used as I am querying the OLTP Source table below (BASE_ASSIGNMENTS) which is continuously being updated all the time. I am reading this table and perform the ETL operations and store the aggregated data in the warehouse . Before I store in the warehouse , I do the ETL staging operation where I encounter the duplicate issue .
Please correct this query (used to load Staging from source table ) and help me with the solution.
SELECT DISTINCT 21 DATASOURCE_ID, B.KVISUMMARYINTID KVISUMMARYINTID, CAST(B.WAREHOUSE_ID AS NVARCHAR(32)) WH_ID,
FROM BASE_ASSIGNMENTS B WITH (NOLOCK)
LEFT OUTER JOIN
(SELECT ASSIGN_NUM, MIN(TRNDTE) TRNDTE FROM KVI_SUMMARY_HST WITH (NOLOCK) GROUP BY ASSIGN_NUM
ON B.ASSIGN_NUM = H.ASSIGN_NUM WHERE B.REPORT_DATE BETWEEN '2016-01-24T00:00:00' AND '2016-01-26T23:59:59'
DatabasesMicrosoft SQL ServerMicrosoft SQL Server 2005Microsoft SQL Server 2008SQL
In the ETL process there should be no need for NOLOCK at all, cause the E-T-L is an ordered process with discrete steps.