sas77
asked on
T SQL LOOKUP QUERY
Dear Experts,
I am loading a table from a source which is also a sql server 2008 table and it has following structure with sample rows.
OrderDetailID RecognitionDate RecognizedAmount
------------- ----------------------- ---------------------
4262695 2010-04-15 00:00:00 -0.0049
4262756 2010-04-15 00:00:00 -0.0049
4263732 2010-04-15 00:00:00 -0.0049
4263793 2010-04-15 00:00:00 -0.0049
4262389 2010-04-15 00:00:00 -0.0023
4215156 2010-04-15 00:00:00 -0.0023
4215157 2010-04-15 00:00:00 -0.0023
4215217 2010-04-15 00:00:00 -0.0023
4215218 2010-04-15 00:00:00 -0.0023
4261839 2010-04-15 00:00:00 -0.0023
i want to load only the records which is not existing in target that means it should strictly match complete row and see whether it is exisitng in the target or not.If not then load else do not load.
Note : as this is very huge table which consist of 7 million records please provide me with optimized query.
Thanks.
I am loading a table from a source which is also a sql server 2008 table and it has following structure with sample rows.
OrderDetailID RecognitionDate RecognizedAmount
------------- ----------------------- ---------------------
4262695 2010-04-15 00:00:00 -0.0049
4262756 2010-04-15 00:00:00 -0.0049
4263732 2010-04-15 00:00:00 -0.0049
4263793 2010-04-15 00:00:00 -0.0049
4262389 2010-04-15 00:00:00 -0.0023
4215156 2010-04-15 00:00:00 -0.0023
4215157 2010-04-15 00:00:00 -0.0023
4215217 2010-04-15 00:00:00 -0.0023
4215218 2010-04-15 00:00:00 -0.0023
4261839 2010-04-15 00:00:00 -0.0023
i want to load only the records which is not existing in target that means it should strictly match complete row and see whether it is exisitng in the target or not.If not then load else do not load.
Note : as this is very huge table which consist of 7 million records please provide me with optimized query.
Thanks.
The best solution if all the columns match is to use Set Logic which is what relational databases are best at. Use the EXCEPT operator to identify the records that are not already in the table and inset them like this
INSERT INTO NewTable (OrderDetailID, RecognitionDate, RecognizedAmount)
SELECT OrderDetailID, RecognitionDate, RecognizedAmount
FROM OldTable
EXCEPT
SELECT OrderDetailID, RecognitionDate, RecognizedAmount
FROM NewTable;
ASKER
Attached code which i am using to load fact table called fact revenue recognition, this is a incremental load and unfortunately i dont have any last_update date in source level where i can load only previous day records which should look on fact table if exisiting dont load if not load. Please provide me appropriate code which i can put in OLEDB SOURCE IN ssis package.
Thanks.
Thanks.
Target Table:
CREATE TABLE [dbo].[FactRevenueRecognition](
[RevenueRecognitionUID] [int] IDENTITY(1,1) NOT NULL,
[OrderDetailKey] [int] NOT NULL,
[OrderKey] [int] NOT NULL,
[ProductKey] [int] NOT NULL,
[AddressKey] [int] NOT NULL,
[RecognitionDateKey] [int] NOT NULL,
[RevenueRecognitionType] [char](2) NOT NULL,
[RecognizedAmount] [smallmoney] NOT NULL,
CONSTRAINT [RevenueRecognition_Fact_pk] PRIMARY KEY CLUSTERED
(
[RevenueRecognitionUID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
Source Query:
SELECT
ISNULL(RR.ORDERDETAILID,'0') AS OrderDetailID,
ISNULL(O.OrderID,'0') AS OrderID,
ISNULL(P.PRODUCTID,'0') AS ProductID,
ISNULL(A.AddressID,'0') AS AddressID,
ISNULL(Production.dbo.TruncToDay(RR.RECOGNITIONDATE),'1990-01-01')AS RecognitionDate,
ISNULL(RRT.REVENUERECOGNITIONTYPE,'Unknown') AS RevenueRecognitionType,
ISNULL(RR.RECOGNIZEDAMOUNT,'0.00')AS RecognizedAmount
FROM
Production.dbo.REVENUERECOGNITION AS RR LEFT JOIN Production.dbo.ORDERDETAIL OD
ON RR.ORDERDETAILID = OD.ORDERDETAILID
LEFT JOIN
Production.dbo.PRODUCT AS P ON OD.PRODUCTID = P.PRODUCTID
LEFT JOIN
Production.dbo.Orders AS O ON OD.ORDERID = O.ORDERID
LEFT JOIN
Production.dbo.OrderAddress AS OA ON O.ORDERID = OA.ORDERID
LEFT JOIN
Production.dbo.ADDRESS AS A ON OA.ADDRESSID = A.ADDRESSID
LEFT JOIN
Production.dbo.SEMINARINFO AS SI ON P.PRODUCTID = SI.PRODUCTID
LEFT JOIN
Production.dbo.REVENUERECOGNITIONTYPE RRT ON P.REVENUERECOGNITIONTYPE = RRT.REVENUERECOGNITIONTYPE
This should work logically but it would be much better if you could that last update date into your source table to avoid having to do this.
INSERT INTO dbo.FactRevenueRecognition
SELECT
ISNULL(RR.ORDERDETAILID,'0') AS OrderDetailID,
ISNULL(O.OrderID,'0') AS OrderID,
ISNULL(P.PRODUCTID,'0') AS ProductID,
ISNULL(A.AddressID,'0') AS AddressID,
ISNULL(Production.dbo.TruncToDay(RR.RECOGNITIONDATE),'1990-01-01')AS RecognitionDate,
ISNULL(RRT.REVENUERECOGNITIONTYPE,'Unknown') AS RevenueRecognitionType,
ISNULL(RR.RECOGNIZEDAMOUNT,'0.00')AS RecognizedAmount
FROM
Production.dbo.REVENUERECOGNITION AS RR LEFT JOIN Production.dbo.ORDERDETAIL OD
ON RR.ORDERDETAILID = OD.ORDERDETAILID
LEFT JOIN
Production.dbo.PRODUCT AS P ON OD.PRODUCTID = P.PRODUCTID
LEFT JOIN
Production.dbo.Orders AS O ON OD.ORDERID = O.ORDERID
LEFT JOIN
Production.dbo.OrderAddress AS OA ON O.ORDERID = OA.ORDERID
LEFT JOIN
Production.dbo.ADDRESS AS A ON OA.ADDRESSID = A.ADDRESSID
LEFT JOIN
Production.dbo.SEMINARINFO AS SI ON P.PRODUCTID = SI.PRODUCTID
LEFT JOIN
Production.dbo.REVENUERECOGNITIONTYPE RRT ON P.REVENUERECOGNITIONTYPE = RRT.REVENUERECOGNITIONTYPE
EXCEPT
SELECT OrderDetailKey,
OrderKey,
ProductKey,
AddressKey,
RecognitionDateKey,
RevenueRecognitionType,
RecognizedAmount
FROM dbo.FactRevenueRecognition
ASKER
Dear CGLuttrell:
That query is taking long time as it should look up on 7 million records can you please provide me any other way of doing this?
That query is taking long time as it should look up on 7 million records can you please provide me any other way of doing this?
If RR.RECOGNITIONDATE in your Production.dbo.REVENUERECO GNITION table is always increasing (you would never insert an older date or update existing data) they you could try something like below for the selection of new values to insert in leu of having a last_updated_date column. You have not given exact names of the RecognitionDateKey Dimension table or columns so those are my best guesses.
SELECT
ISNULL(RR.ORDERDETAILID,'0') AS OrderDetailID,
ISNULL(O.OrderID,'0') AS OrderID,
ISNULL(P.PRODUCTID,'0') AS ProductID,
ISNULL(A.AddressID,'0') AS AddressID,
ISNULL(Production.dbo.TruncToDay(RR.RECOGNITIONDATE),'1990-01-01')AS RecognitionDate,
ISNULL(RRT.REVENUERECOGNITIONTYPE,'Unknown') AS RevenueRecognitionType,
ISNULL(RR.RECOGNIZEDAMOUNT,'0.00')AS RecognizedAmount
FROM
Production.dbo.REVENUERECOGNITION AS RR LEFT JOIN Production.dbo.ORDERDETAIL OD
ON RR.ORDERDETAILID = OD.ORDERDETAILID
LEFT JOIN
Production.dbo.PRODUCT AS P ON OD.PRODUCTID = P.PRODUCTID
LEFT JOIN
Production.dbo.Orders AS O ON OD.ORDERID = O.ORDERID
LEFT JOIN
Production.dbo.OrderAddress AS OA ON O.ORDERID = OA.ORDERID
LEFT JOIN
Production.dbo.ADDRESS AS A ON OA.ADDRESSID = A.ADDRESSID
LEFT JOIN
Production.dbo.SEMINARINFO AS SI ON P.PRODUCTID = SI.PRODUCTID
LEFT JOIN
Production.dbo.REVENUERECOGNITIONTYPE RRT ON P.REVENUERECOGNITIONTYPE = RRT.REVENUERECOGNITIONTYPE
WHERE RR.RECOGNITIONDATE > (SELECT max(RecongnitionDateKeyValue) from RecognitionDateKeyDimensionTable)
ASKER
Attached is the schema diagram and the query i am using to populate fact table revenue recognition, recognized date will look up on dimdate and get datekey load as recognitiondatekey.
Please let me know if you need anything
RR-SCHEMA.pdf
Please let me know if you need anything
RR-SCHEMA.pdf
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks
select *
from oldTable t1 left outer join newTable t2 on
t1.OrderDetailID = t2.OrderDetailID
and t1.RecognitionDate = t2.RecognitionDate
and t1.RecognizedAmount = t2.RecognizedAmount