dates expand and flag set based upon two tables

Hi,
How can i get dates expanded and flag set based upon two tables:

S_POS_VACANCY
POSITION_TCD PV_START_DT PV_END_DT VACANCY_STATUS

1004305                   2003-01-01      2004-06-20      Y
1004305                   2003-06-21      999-12-31        N

and then i have  

D_POSITION:

POSITION_SK         POSITION_TCD     EFF_DT      EXP_DT

10053014            1004305            2003-01-01            2005-08-31
10053015            1004305             2005-09-01           2005-09-01
10052018           1004305              2005-09-02           9999-12-31    


For 10004305:
•      According to S_POS_VACANCY, Position 10004305 has been vacant from 2003-01-01 until 2004-06-20, and then filled from 2004-06-21 until 9999-12-31.
•      According to D_POSITION, Position 10004305 for this time (2003-01-01 until 9999-12-31) is represented by 3 entries
 in D_POSITION (POSITION_SK): 10053014, 10053015, 10052018.
•      So, the entries in F_POS_VACANCY or target should be as follows:


POSITION_SK      DATE_SK      VACANCY_FLAG


10053014      20030101      1
10053014      20030102      1
10053014      20030103      1
10053014      20030104      1
10053014      20030105      1
10053014      20030106      1
10053014      20030107      1
(and many more similar entries for EACH DATE until 2004-06-20…)
10053014      20040619      1
10053014      20040620      1
(then the VACANCY_FLAG value switches to 0 …)      
10053014      20040621      0
10053014      20040622      0
10053014      20040623      0
(the POSITION_SK changes after 2005-08-31 and again after 2005-09-01)
10053014      20050831      0
10053015      20060901      0
10052018      20060902      0
(until the current date…)            
10052018      20130911      0
10052018      20130912      0
10052018      20130913      0
sam2929Asked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

sdstuberCommented:
based on previous question can we assume 12c?
0
sdstuberCommented:
is  999-12-31   a mistake?

should that be 9999-12-31  ?
0
sam2929Author Commented:
yes should be 9999-12-31
0
Determine the Perfect Price for Your IT Services

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden with our free interactive tool and use it to determine the right price for your IT services. Download your free eBook now!

sdstuberCommented:
assuming yes to both of the previous questions then try this....

WITH FUNCTION number_list(s IN DATE, e IN DATE)
         RETURN ora_mining_number_nt
     IS
         v_list ora_mining_number_nt := ora_mining_number_nt();
         d      DATE := s;
     BEGIN
         v_list.EXTEND(LEAST(e, TRUNC(SYSDATE)) - s + 1);

         FOR i IN 1 .. LEAST(e, TRUNC(SYSDATE)) - s + 1
         LOOP
             v_list(i) := i;
         END LOOP;

         RETURN v_list;
     END;
SELECT position_sk,
       eff_dt + COLUMN_VALUE - 1 date_sk,
       NVL(
           (SELECT 1
              FROM s_pos_vacancy
             WHERE vacancy_status = 'Y'
               AND eff_dt + COLUMN_VALUE - 1 BETWEEN pv_start_dt AND pv_end_dt),
           0
       )
           vacancy_flag
  FROM d_position, TABLE(number_list(eff_dt, exp_dt));
SELECT *
  FROM d_position p, s_pos_vacancy v
 WHERE p.position_tcd = v.position_tcd
ORDER BY position_sk, date_sk

Open in new window



if version is not 12c, please specify which version you are using.

if the data for year 999 is correct, then please explain what that means
0
sam2929Author Commented:
Got below error while running function

Error starting at line 5 in command:
d      DATE := s
Error report:
Unknown Command

Error starting at line 6 in command:
BEGIN
         v_list.EXTEND(LEAST(e, TRUNC(SYSDATE)) - s + 1);

         FOR i IN 1 .. LEAST(e, TRUNC(SYSDATE)) - s + 1
         LOOP
             v_list(i) := i;
         END LOOP;

         Return V_List;
     END;
Error report:
ORA-06550: line 2, column 30:
PLS-00201: identifier 'E' must be declared
ORA-06550: line 2, column 10:
PL/SQL: Statement ignored
ORA-06550: line 4, column 30:
PLS-00201: identifier 'E' must be declared
ORA-06550: line 4, column 10:
PL/SQL: Statement ignored
ORA-06550: line 9, column 10:
PLS-00372: In a procedure, RETURN statement cannot contain an expression
ORA-06550: line 9, column 10:
PL/SQL: Statement ignored
06550. 00000 -  "line %s, column %s:\n%s"
*Cause:    Usually a PL/SQL compilation error.
*Action:
0
sdstuberCommented:
are you using version 12?
0
sdstuberCommented:
oops, I had multiple statements smooshed together and missed a condition on the where clause of the vacancy lookup

again, you need Oracle version 12 for this to work


WITH FUNCTION number_list(s IN DATE, e IN DATE)
         RETURN ora_mining_number_nt
     IS
         v_list ora_mining_number_nt := ora_mining_number_nt();
         d      DATE := s;
     BEGIN
         v_list.EXTEND(LEAST(e, TRUNC(SYSDATE)) - s + 1);

         FOR i IN 1 .. LEAST(e, TRUNC(SYSDATE)) - s + 1
         LOOP
             v_list(i) := i;
         END LOOP;

         RETURN v_list;
     END;
SELECT position_sk,
       eff_dt + COLUMN_VALUE - 1 date_sk,
       NVL(
           (SELECT 1
              FROM s_pos_vacancy v
             WHERE vacancy_status = 'Y'
               AND eff_dt BETWEEN pv_start_dt AND pv_end_dt
               AND p.position_tcd = v.position_tcd),
           0
       )
           vacancy_flag
  FROM d_position p, TABLE(number_list(eff_dt, exp_dt))
ORDER BY date_sk, position_sk;
0
sdstuberCommented:
if you've got 11gR2 you can try this.. (it will work on 12 too)


WITH x(position_sk,
       position_tcd,
       eff_dt,
       exp_dt)
     AS (SELECT position_sk,
                position_tcd,
                eff_dt,
                exp_dt
           FROM d_position
         UNION ALL
         SELECT position_sk,
                position_tcd,
                eff_dt + 1,
                exp_dt
           FROM x
          WHERE eff_dt + 1 <= LEAST(exp_dt, TRUNC(SYSDATE)))
SELECT position_sk,
       eff_dt date_sk,
       NVL(
           (SELECT 1
              FROM s_pos_vacancy v
             WHERE vacancy_status = 'Y'
               AND eff_dt BETWEEN pv_start_dt AND pv_end_dt
               AND x.position_tcd = v.position_tcd),
           0
       )
           vacancy_flag
  FROM x
ORDER BY date_sk,position_sk
0
sdstuberCommented:
and finally, for 10g

SELECT position_sk,
       eff_dt + COLUMN_VALUE - 1 date_sk,
       NVL(
           (SELECT 1
              FROM s_pos_vacancy v
             WHERE vacancy_status = 'Y'
               AND eff_dt BETWEEN pv_start_dt AND pv_end_dt
               AND p.position_tcd = v.position_tcd),
           0
       )
           vacancy_flag
  FROM d_position p,
       TABLE(
           SELECT COLLECT(LEVEL)
             FROM DUAL
           CONNECT BY LEVEL <= LEAST(exp_dt, TRUNC(SYSDATE)) - eff_dt + 1
       )
ORDER BY date_sk, position_sk
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
sam2929Author Commented:
getting below error:

ORA-01841: (full) year must be between -4713 and +9999, and not be 0
01841. 00000 -  "(full) year must be between -4713 and +9999, and not be 0"
*Cause:    Illegal year entered
*Action:   Input year in the specified range

WITH x(position_sk,
       position_tcd,
       effective_date,
       expiry_date)
     AS (SELECT position_sk,
                position_tcd,
                effective_date,
                expiry_date
           FROM d_position
         UNION ALL
         SELECT position_sk,
                position_tcd,
                effective_date + 1,
                expiry_date
           FROM x
          WHERE effective_date + 1 <= LEAST(expiry_date, TRUNC(SYSDATE)))
SELECT position_sk,
       effective_date date_sk,
       Nvl(
           (SELECT 1
              From Hrprstageadm.S_Pos_Vacancy V
             WHERE pos_vacancy_status = 'Y'
               AND effective_date BETWEEN pv_start_dt AND pv_end_dt
               AND x.position_tcd = v.position_tcd),
           0
       )
           vacancy_flag
  From X
ORDER BY date_sk,position_sk
0
sdstuberCommented:
what data causes this failure?  All of the queries run with the data above.

you still haven't answered  what version  is it a secret?
0
sam2929Author Commented:
its 11g ,
i think failure is due to effective_date + 1  what  if eff_date is April 30 then it will try to
so April 31 and i guess that will cause the error
0
sdstuberCommented:
>>>   if eff_date is April 30 then it will try to  so April 31

no, that's not a problem, oracle knows how to increment buy a day correctly
0
sdstuberCommented:
Do you have any values with an eff_dt of   9999-12-31  ?
0
sam2929Author Commented:
Hi,

After adding filter To_Char(effective_date,'YYYYMMDD')<>'99991231' query start running but taking forever so i stopped it after 1 hr.

Explain plan showing cost is really low 758

Count for table are as below :

D_POSITION count is :108426

S_POS_VACANCY count  is :149880

Can we make this run fast

Thanks.


WITH x(position_sk,
       position_tcd,
       effective_date,
       expiry_date)
     AS (SELECT position_sk,
                position_tcd,
                effective_date,
                expiry_date
           FROM d_position
         --  where position_tcd ='10004305'
where To_Char(effective_date,'YYYYMMDD')<>'99991231'
         UNION ALL
         SELECT position_sk,
                position_tcd,
                effective_date + 1,
                expiry_date
           FROM x
          WHERE effective_date + 1 <= LEAST(expiry_date, TRUNC(SYSDATE))
          and To_Char(effective_date,'YYYYMMDD')<>'99991231')
SELECT position_sk,
       effective_date date_sk,
       Nvl(
           (SELECT 1
              From Hrprstageadm.S_Pos_Vacancy V
             WHERE pos_vacancy_status = 'Y'
            ----  and position_tcd ='10004305'
              AND effective_date BETWEEN pv_start_dt AND pv_end_dt
              and To_Char(effective_date,'YYYYMMDD')<>'99991231'
              -- and position_tcd ='10004305'
               AND x.position_tcd = v.position_tcd),
           0
       )
           vacancy_flag
  From X
ORDER BY date_sk,position_sk
0
sam2929Author Commented:
any ideas?
0
sdstuberCommented:
don't do this...

and To_Char(effective_date,'YYYYMMDD')<>'99991231'

try

and effective_date <> to_date('99991231','yyyymmdd')

I'm making the assumption that your effective_dates are truncated


if not try  effective_date < to_date('99991231','yyyymmdd')
0
sdstuberCommented:
Explain plan showing cost is really low 758


cost has no meaning for performance.  Don't use it as a tuning guide
0
sdstuberCommented:
D_POSITION count is :108426


Think about this... if each position has just 10 days of data, that's a million records.

Your sample data has hundreds of days of data for some records.  So that's going to be many millions of rows returned,  and that's not even taking into account the lookups of vacancy to find the Y/N flags


The 5 rows of data you have above expand to 3912 rows returned.

Are you sure you want this query to return all data in this form?
0
sdstuberCommented:
another thing to consider
do you want positions that have an effective date in the future?

if not, then add this condition

and  eff_dt < trunc(sysdate)

doing this, you won't need the 9999-12-31 check
0
sam2929Author Commented:
yes you right roughly we will have 150 million rows with this logic

so any idea if we can select data in subset like 100000 rows and then insert and then do other subset?
0
sdstuberCommented:
you can wrap your current query in parentheses  and then increment through rows in chunks...

--- First 100k
select * from
(select  x.*, rownum rn from (
    your query
)x  where rownum <= 100000)
where rn >= 1


--- Second 100k
select * from
(select  x.*, rownum rn from (
    your query
)x where rownum <= 200000)
where rn >= 100001

Nesting the results like that helps the optimizer
You could do something simpler like this...


select * from
(select  x.*, rownum rn from (
    your query
)x ) where rn between 100001 and 200000

but it's less efficient
0
sam2929Author Commented:
Hi,
i am trying to break your sql in two parts as performance is really bad so i am thinking
having two stage tables for two sql below and then join them with
 position_tcd .

Problem is query 2) is not giving me levels

query 1)
SELECT v.*
From Hrprstageadm.S_Pos_Vacancy v
inner join d_position x
on x.position_tcd = v.position_tcd
AND x.effective_date BETWEEN v.pv_start_dt AND v.pv_end_dt and x.effective_date < TRUNC(SYSDATE) WHERE v.pos_vacancy_status = 'Y'
and v.position_tcd ='10004305'

query 2)

SELECT position_sk,
                position_tcd,
                effective_date,
                expiry_date
           FROM d_position
          where position_tcd ='10004305'
       and effective_date < TRUNC(SYSDATE)
      -- order by 3
         UNION ALL
         SELECT position_sk,
                position_tcd,
                effective_date + 1,
                expiry_date
           FROM d_position
          WHERE effective_date + 1 <= LEAST(expiry_date, TRUNC(SYSDATE))
           --and effective_date < TRUNC(SYSDATE)
           and position_tcd ='10004305'
0
sdstuberCommented:
you need the recursive with clause to get the levels

what kind of performance are you expecting with 150 million rows?


note: "fast as possible" isn't an answer
0
sam2929Author Commented:
after 11 hrs no row returned so no good so who can i break this in two sqls load in stage table and then do join
0
sdstuberCommented:
most likely result of loading into a staging table then querying the staging table is to make the process slower.

what kind of performance are you expecting with 150 million rows?
0
sam2929Author Commented:
i didn't get any rows select after 11 hrs for full load so sure we have issues but when i
filter position_tcd it run in no time so performance wise i want to select and load rows in 4-5 hrs
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Oracle Database

From novice to tech pro — start learning today.