sam2929
asked on
SAS-Update history records type 2
Hi,
We need to fix some history data.
This is a type 2 dimension, and the issue is with the first 2 rows:
VALID_FROM_DTTM VALID_TO_DTTM emp_id emp_key
26AUG2011:23:59:59 12OCT2011:16:26:56 101 1
10OCT2011:23:59:59 07NOV2011:23:59:58 101 201
07NOV2011:23:59:59 12DEC2011:23:59:58 101 302
12DEC2011:23:59:59 13DEC2011:23:59:58 101 801
13DEC2011:23:59:59 22DEC2011:23:59:58 101 10001
22DEC2011:23:59:59 23DEC2011:23:59:58 101 1000005
23DEC2011:23:59:59 25DEC2011:23:59:58 101 1000008
25DEC2011:23:59:59 27DEC2011:23:59:58 101 10000011
27DEC2011:23:59:59 01JAN2012:23:59:58 101 10000013
01JAN2012:23:59:59 02JAN2012:23:59:58 101 10000022
02JAN2012:23:59:59 20JAN2012:23:59:58 101 10000045
20JAN2012:23:59:59 22JAN2012:23:59:58 101 10000067
22JAN2012:23:59:59 31DEC9999:00:00:00 101 100000987
The issue is that the second row's VALID_FROM_DTTM should be 12OCT2011:16:26:57,
so it should look like:
26AUG2011:23:59:59 12OCT2011:16:26:56 101 1
12OCT2011:16:26:57 07NOV2011:23:59:58 101 201
We have the same issue with some other emp_id values too, so we want to update
VALID_FROM_DTTM wherever it is not one second after the previous row's VALID_TO_DTTM,
based upon emp_id.
Hi
Iowa's code will work with a slight change - one needs to reset the value of PREV_TO_VAL:
proc sort data = [input]; by EMP_ID VALID_FROM_DTTM; run;
data [output];
set [input];
by EMP_ID;
retain PREV_TO_VAL;
if first.EMP_ID then do;
PREV_TO_VAL = VALID_TO_DTTM;
end;
else do;
VALID_FROM_DTTM = PREV_TO_VAL + 1;
PREV_TO_VAL = VALID_TO_DTTM;
end;
DROP PREV_TO_VAL;
run;
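I actually prefer using the LAG function for this type of code:
proc sort data = [input]; by EMP_ID VALID_FROM_DTTM; run;
DATA [output];
set [input];
by EMP_ID;
/* Call LAG on every row: its queue is only updated when the call executes */
PREV_TO_VAL = LAG(VALID_TO_DTTM);
IF NOT First.EMP_ID THEN
VALID_FROM_DTTM = PREV_TO_VAL + 1;
DROP PREV_TO_VAL;
RUN;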
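A note on LAG for anyone adapting this: LAG returns the value stored the last time that particular LAG call executed, not the value from the physically previous row, so a LAG inside an IF can pick up the wrong value. Here is a minimal sketch using three rows from the question. The names WORK.DEMO, WORK.LAG_DEMO, LAG_ALWAYS and LAG_COND are made up for illustration.

```sas
/* Three rows for one EMP_ID, copied from the question (hypothetical table name) */
data work.demo;
  input VALID_FROM_DTTM :datetime18. VALID_TO_DTTM :datetime18. EMP_ID EMP_KEY;
  format VALID_FROM_DTTM VALID_TO_DTTM datetime18.;
  datalines;
26AUG2011:23:59:59 12OCT2011:16:26:56 101 1
10OCT2011:23:59:59 07NOV2011:23:59:58 101 201
07NOV2011:23:59:59 12DEC2011:23:59:58 101 302
;
run;

data work.lag_demo;
  set work.demo;
  by EMP_ID;
  /* Executed on every row: always holds the previous row's VALID_TO_DTTM */
  LAG_ALWAYS = lag(VALID_TO_DTTM);
  /* Executed only on non-first rows: this queue never sees the first row */
  if not first.EMP_ID then LAG_COND = lag(VALID_TO_DTTM);
  format LAG_ALWAYS LAG_COND datetime18.;
run;

proc print data=work.lag_demo; run;
```

On the second row LAG_ALWAYS holds the first row's VALID_TO_DTTM, while LAG_COND is missing because that is the first time the conditional call executes. With several EMP_ID values, the conditional call would also carry a value over from the previous EMP_ID.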
oh yes, theartfuldazzler thanks for pointing that out :)
ASKER
Can't we modify this code to make changes only
where VALID_FROM_DTTM <> the previous row's VALID_TO_DTTM + 1 sec?
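One possible way to restrict the change to the rows that are actually out of line is sketched below. It assumes WORK.EMP_HIST stands in for [input] and WORK.EMP_HIST_FIXED for [output]; those names and the FIXED_FLAG variable are hypothetical.

```sas
proc sort data=work.emp_hist; by EMP_ID VALID_FROM_DTTM; run;

data work.emp_hist_fixed;
  set work.emp_hist;
  by EMP_ID;
  /* Previous row's end datetime; LAG is called on every row on purpose */
  PREV_TO = lag(VALID_TO_DTTM);
  /* Change only rows that do not start one second after the previous row ends */
  if not first.EMP_ID and VALID_FROM_DTTM ne PREV_TO + 1 then do;
    VALID_FROM_DTTM = PREV_TO + 1;
    FIXED_FLAG = 1;   /* hypothetical flag marking rows that were changed */
  end;
  else FIXED_FLAG = 0;
  drop PREV_TO;
run;
```

The resulting VALID_FROM_DTTM values are the same as with the unconditional version, since rows that already line up would just be assigned the value they already have. The condition is mainly useful for flagging or counting which rows were touched.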
Anyway, have you tried loading duplicate records at initial load, for example? I remember encountering this problem when I did this some time back; the SCD Type2 Generator has code which will adjust the date/datetime if it detects duplicate records being loaded...
I would do something like:
proc sort data = [input]; by EMP_ID VALID_FROM_DTTM; run;
data [output];
set [input];
by EMP_ID;
retain PREV_TO_VAL;
if first.EMP_ID then do;
PREV_TO_VAL = VALID_TO_DTTM;
end;
else do;
VALID_FROM_DTTM = PREV_TO_VAL + 1;
end;
run;
As usual, please back up your original data so that it can be restored if need be. Also, please test the code and check the output to ensure that this is what you really wanted.
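A rough sketch of the backup and the check, assuming WORK.EMP_HIST stands in for [input] and WORK.EMP_HIST_FIXED for [output] (both names, and the other dataset names here, are hypothetical):

```sas
/* Keep an untouched copy of the original data before running any fix */
data work.emp_hist_backup;
  set work.emp_hist;
run;

/* After the fix: collect rows that still do not start one second
   after the previous row for the same EMP_ID ends */
proc sort data=work.emp_hist_fixed; by EMP_ID VALID_FROM_DTTM; run;

data work.still_broken;
  set work.emp_hist_fixed;
  by EMP_ID;
  PREV_TO = lag(VALID_TO_DTTM);   /* called on every row */
  if not first.EMP_ID and VALID_FROM_DTTM ne PREV_TO + 1;
  format PREV_TO datetime18.;
run;

proc print data=work.still_broken; run;
```

If WORK.STILL_BROKEN comes out with zero rows, every history row starts exactly one second after the previous one ends within its EMP_ID.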