fill in date gaps with data

Hello

My original table1 has the following data and my requirement is to fill in date gaps
along with customer_id and value - see table2 as required output



table1

Customer_ID    Value       Date
001            4.5         31/01/2011
001            3.5         31/03/2011
001            6.2         30/06/2011
002            0.2         30/04/2011
002            1.3         31/08/2011

table2
Customer_ID    Value       Date
001            4.5         31/01/2011
001            4.5         28/02/2011 -- added row
001            3.5         31/03/2011
001            6.2         30/04/2011 -- added row
001            6.2         31/05/2011 -- added row
001            6.2         30/06/2011
002            0.2         30/04/2011
002            0.2         31/05/2011 -- added row
002            0.2         30/06/2011 -- added row
002            1.3         31/08/2011

Regards
philsivyerAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Naveen KumarProduction Manager / Application Support ManagerCommented:
I do not quite understand your requirement.

why 31/07/2011 is not there for customer id 002 ?
0
philsivyerAuthor Commented:
Well spotted - it should be - sorry!
0
Naveen KumarProduction Manager / Application Support ManagerCommented:
simple stored procedure to take the min and max date for each customer and then add the missing months records within a loop can be done. Is it ok if i give this stored proc with this logic or you need a direct single insert statement for doing the same.
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

philsivyerAuthor Commented:
I need direct single insert statement please
0
skullnobrainsCommented:
you need a insert ... select with a subquery in order to do this
i do not believe it should be done without an external language or a stored procedure
it will be awfully complicated to write (though feasible) and catastrophic performance-wise
0
Naveen KumarProduction Manager / Application Support ManagerCommented:
i am just giving here procedure version of it.....

create or replace procedure proc1 as
cursor c1 is select customer_id cid, min(date1) d1, max(date1) d2  
from table_1
--WHERE CUSTOMER_ID = 1
group by customer_id;
temp_date date;
temp_value number;
DUMMY_VAR NUMBER;
counter number;

begin

for x in c1
loop

counter :=months_between(x.d2,x.d1);
DBMS_OUTPUT.PUT_LINE('COUNTER VALUE IS : '|| COUNTER );  
temp_date := x.d1;

for y in 1.. COUNTER
loop

DBMS_OUTPUT.PUT_LINE('VALUE OF LOOP COUNTER Y IS : '|| Y );
temp_date := add_months(x.d1,y-1);

dbms_output.put_line('value..:' || temp_date || '*' || X.CId );

begin

select value into temp_value from table_1 where customer_id = x.cid and date1 = Temp_date;
dbms_output.put_line('value..: IS ' || TEMP_VALUE );

EXCEPTION
WHEN NO_DATA_FOUND THEN
NULL;
END;

begin
select 1 INTO DUMMY_VAR from table_1 where customer_id = x.cid and date1 = Temp_date;
EXCEPTION
WHEN NO_DATA_FOUND THEN
insert into table_1 VALUES(X.CID ,TEMP_VALUE, TEMP_DATE );
dbms_output.put_line('inserted RECORD ' || temp_date || '*' || X.CId || '*' || TEMP_VALUE);
END;

COMMIT;

 
end loop;

end loop;

end;
/

EXEC PROC1;

I tested this and it works for me. I need to spend some time to come up with a direct insert...
0
Christoffer SwanströmPartnerCommented:
How about this (includes my testing table and sample data as per your example):
CREATE TABLE swc_tst (
customer_id VARCHAR2(32)
,val NUMBER
,dt DATE
);

INSERT INTO swc_tst VALUES('001', 4.5, TO_DATE('31.01.2011', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('001', 3.5, TO_DATE('31.03.2011', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('001', 6.2, TO_DATE('30.06.2011', 'dd.mm.yyyy'));

INSERT INTO swc_tst VALUES('002', 0.2, TO_DATE('30.04.2011', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('002', 1.3, TO_DATE('31.08.2011', 'dd.mm.yyyy'));

SELECT
  customer_id
  ,month_end
  ,FIRST_VALUE(val) OVER(PARTITION BY customer_id, grp_nb ORDER BY month_end) AS val
FROM
(SELECT
  asd.*
  ,SUM(flg_beg_grp) OVER(PARTITION BY customer_id ORDER BY month_end) AS grp_nb
FROM
(SELECT
  cus_mt.customer_id
  ,all_months.month_end
  ,val.val
  ,CASE WHEN val.val IS NOT NULL THEN 1 ELSE 0 END AS flg_beg_grp
FROM
  (SELECT
    ADD_MONTHS(max_dt, -level + 1) AS month_end
  FROM
    (SELECT
      MAX(dt) AS max_dt
      ,MIN(dt) AS min_dt
      ,(EXTRACT(YEAR FROM MAX(dt)) - EXTRACT(YEAR FROM MIN(dt))) * 12 + (EXTRACT(MONTH FROM MAX(dt)) - EXTRACT(MONTH FROM MIN(dt))) + 1 AS nb_months
    FROM
      swc_tst
    )
  CONNECT BY level <= nb_months
  ) all_months
INNER JOIN
  (SELECT
    customer_id
    ,MIN(dt) AS cus_min_dt
    ,MAX(dt) AS cus_max_dt
  FROM
    swc_tst
  GROUP BY
    customer_id
  ) cus_mt
ON
  all_months.month_end BETWEEN cus_mt.cus_min_dt AND cus_mt.cus_max_dt
LEFT JOIN
  swc_tst val
ON
  val.customer_id = cus_mt.customer_id
AND
  all_months.month_end = val.dt
) asd
)qwe
ORDER BY
  customer_id
  ,month_end
;

Open in new window

0
philsivyerAuthor Commented:
tosse
Thanks for this - question.

My actual customer_id is made up as per the following..
LPAD(PT_ID,6)||LPAD(Fc_ID,6)||LPAD(CL_ID,6)||LPAD(S_ID,6)||LPAD(FT_ID,6) - see example 2 rows below

     8     1    95     0     1
    37   487   184   184     1

The PT_ID is my product id and if I select say PT_ID = 37 as per row 2 then your query works fine - however,
if I allow all PT_IDs then the dates work but no val is returned.
I apologise as I should have been clearer as to what the customer_id should have appeared like.
If it's a big change then happy to award points as per my original question - if not, then what is the change in your query to accomodate new customer_id.

Regards
0
Christoffer SwanströmPartnerCommented:
Not sure what exactly you mean. Could you show an example how your original data looks like and how the corresponding output should look like?
0
philsivyerAuthor Commented:
ORIGINAL DATA
Customer_ID           Date                         Value      
37487123761      30/04/2004      0.28      
37487123761      31/12/2007      0.28      
37487123761      31/12/2010      0.28      
37487184171      31/12/2009      0.28      
2696111261      31/01/2010      0.5      
3014511201      31/08/2007      0.33      

OUTPUT
Customer_ID           Date                         Value      
37487123761      30/04/2004            
37487123761      31/12/2007            
37487123761      31/12/2010            
37487184171      31/12/2009            
2696111261      31/01/2010            
3014511201      31/08/2007            

If I am selective in terms of choosing a Product_ID eg 37 then everything works OK and returns the value.
The Product_ID is first 2 digits in customer_id
0
Naveen KumarProduction Manager / Application Support ManagerCommented:
To me it looks like the below needs a tweak in the tosse insert query because the query is not taking care of the customer id when looking at the number of months between max/min dates. The below is taking max / min for all the records in the table which should not be the case and should look at the same logic but for each customer id.

(
SELECT
      MAX(dt) AS max_dt
      ,MIN(dt) AS min_dt
      ,(EXTRACT(YEAR FROM MAX(dt)) - EXTRACT(YEAR FROM MIN(dt))) * 12 + (EXTRACT(MONTH FROM MAX(dt)) - EXTRACT(MONTH FROM MIN(dt))) + 1 AS nb_months
    FROM
      swc_tst )

Ignore my comment if i am wrong here.....
0
philsivyerAuthor Commented:
What is the change please - I added
Group by customer_id - but the query then went to 15mins and I stopped it
0
Naveen KumarProduction Manager / Application Support ManagerCommented:
no - not just adding group by there is not the fix otherwise i would have just told that right. This query needs to be modified to tweak that logic into this which means the outer query ( join ) should be modified accordingly to take care of this grouping logic into it otherwise it is not going to work for you.

I hope tosse is already working on that bit for you. Thanks,
0
Christoffer SwanströmPartnerCommented:
nav_kum_v:

the months per customer are taken into account by joining cus_mt to all_months. There might be a more elegant way to do this, but it should work --> in my opinion no need to change that part
0
Christoffer SwanströmPartnerCommented:
philsivyer:

I actually don't understand the problem with your customer id. The logic of my query should work with any customer id, whether it is of the format 001, 1234456 or whatever.

Can you show the complete query which you are running which gives you wrong behavior when taking all customer_id's
0
Naveen KumarProduction Manager / Application Support ManagerCommented:
Tosse, philsivyer : if that is the case, then ignore my comments.
0
philsivyerAuthor Commented:
My query to create initial data set ...lets say I create a table from this called swc_tst as per your query
SELECT
CUSTOMER_ID,
FEE_TYPE_ID,
FEE_TYPE,
DT,
VAL,
FPML_FEE
FROM
(
SELECT
CAST(PRODUCT_ID||FINAL_CLIENT_ID||CLASS_ID||SERIES_ID||FEE_TYPE_ID AS INT) AS CUSTOMER_ID,
FEE_TYPE_ID,
FEE_TYPE,
VALID_FROM AS DT,
SUM(VALUE) AS VAL,
CAST(CASE WHEN SUM(VALUE) = 0 THEN 0 ELSE 1-SUM(VALUE) END AS DEC(14,4)) AS FPML_FEE
FROM
(
SELECT
PRODUCT_ID,
FINAL_CLIENT_ID,
CLASS_ID,
NVL(SERIES_ID,0) AS SERIES_ID,
FEE_TYPE_ID,
FEE_TYPE,
VALID_FROM,
CAST((VALUE/100) AS DEC(3,2)) AS VALUE
FROM
V_REBATE_RULES REB_RULES
INNER JOIN FEE_RULE_DETAIL  FRD ON REB_RULES.FEE_ASSOCIATION_RULE_ID = FRD.ID
INNER JOIN VAL_TYPE VT ON VT.ID =  FRD.VALUE_TYPE_ID
--WHERE PRODUCT_ID = 37
WHERE FEE_TYPE_ID = 1
)P1
GROUP BY
CAST(PRODUCT_ID||FINAL_CLIENT_ID||CLASS_ID||SERIES_ID||FEE_TYPE_ID AS INT),
FEE_TYPE_ID,
FEE_TYPE,
VALID_FROM
)

..................

I then run your query which works if I select use the "WHERE PRODUCT_ID = 37
" - if I include all products then the val column is all null
0
Christoffer SwanströmPartnerCommented:
I'm going to be offline for the next day or two, I'll try to look at it on Saturday or Sunday.
0
philsivyerAuthor Commented:
Thanks
0
Christoffer SwanströmPartnerCommented:
I'm not sure if I misunderstood something, but it seems to work regardless of the id. I tried the following, using the id´s yoiu used in one example.
CREATE TABLE swc_tst (
customer_id VARCHAR2(32)
,val NUMBER
,dt DATE
);

INSERT INTO swc_tst VALUES('37487123761', 4.5, TO_DATE('31.01.2011', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('37487123761', 3.5, TO_DATE('31.03.2011', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('37487123761', 6.2, TO_DATE('30.06.2011', 'dd.mm.yyyy'));

INSERT INTO swc_tst VALUES('2696111261', 0.2, TO_DATE('30.04.2011', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('2696111261', 1.2, TO_DATE('30.09.2011', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('3014511201', 1.3, TO_DATE('31.08.2011', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('3014511201', 1.0, TO_DATE('31.10.2011', 'dd.mm.yyyy'));

SELECT
  customer_id
  ,month_end
  ,FIRST_VALUE(val) OVER(PARTITION BY customer_id, grp_nb ORDER BY month_end) AS val
FROM
(SELECT
  asd.*
  ,SUM(flg_beg_grp) OVER(PARTITION BY customer_id ORDER BY month_end) AS grp_nb
FROM
(SELECT
  cus_mt.customer_id
  ,all_months.month_end
  ,val.val
  ,CASE WHEN val.val IS NOT NULL THEN 1 ELSE 0 END AS flg_beg_grp
FROM
  (SELECT
    ADD_MONTHS(max_dt, -level + 1) AS month_end
  FROM
    (SELECT
      MAX(dt) AS max_dt
      ,MIN(dt) AS min_dt
      ,(EXTRACT(YEAR FROM MAX(dt)) - EXTRACT(YEAR FROM MIN(dt))) * 12 + (EXTRACT(MONTH FROM MAX(dt)) - EXTRACT(MONTH FROM MIN(dt))) + 1 AS nb_months
    FROM
      swc_tst
    )
  CONNECT BY level <= nb_months
  ) all_months
INNER JOIN
  (SELECT
    customer_id
    ,MIN(dt) AS cus_min_dt
    ,MAX(dt) AS cus_max_dt
  FROM
    swc_tst
  GROUP BY
    customer_id
  ) cus_mt
ON
  all_months.month_end BETWEEN cus_mt.cus_min_dt AND cus_mt.cus_max_dt
LEFT JOIN
  swc_tst val
ON
  val.customer_id = cus_mt.customer_id
AND
  all_months.month_end = val.dt
) asd
)qwe
ORDER BY
  customer_id
  ,month_end
;

Open in new window

0
philsivyerAuthor Commented:
Thanks
For some reason my end it does not work (only if I select a given ID) - I have attached exel file as the output table if this helps.
Regards

DATA.xls
0
philsivyerAuthor Commented:
Just noticed that it is returning one value aganst the MAX DATE of column dt !

nav kum v  did mention .....
"To me it looks like the below needs a tweak in the tosse insert query because the query is not taking care of the customer id when looking at the number of months between max/min dates. The below is taking max / min for all the records in the table which should not be the case and should look at the same logic but for each customer id. "

Regards

0
Christoffer SwanströmPartnerCommented:
I already commented on that, the all_months subquery is then joined to the max and min months per customer. Maybe this could have been done more elegantly in one go, but this works also (unless I'm missing something).
0
Christoffer SwanströmPartnerCommented:
When I run the script I posted earlier I get the following output:

2696111261	30.04.2011	0,2
2696111261	31.05.2011	0,2
2696111261	30.06.2011	0,2
2696111261	31.07.2011	0,2
2696111261	31.08.2011	0,2
2696111261	30.09.2011	1,2
3014511201	31.08.2011	1,3
3014511201	30.09.2011	1,3
3014511201	31.10.2011	1
37487123761	31.01.2011	4,5
37487123761	28.02.2011	4,5
37487123761	31.03.2011	3,5
37487123761	30.04.2011	3,5
37487123761	31.05.2011	3,5
37487123761	30.06.2011	6,2

Open in new window


Is that what you would expect? Do you get a different result?
0
philsivyerAuthor Commented:
The best way I can answer this is that - if you open attached file at 10:42 am today you will see a data set whereby the first two characters in the column "customer_id" are product ID's. Assume this data is in a table called table1 -  I then run your query againsta table1 and the data returned is correct in so far as filling in the gap months but there is only one row returned with a value populated in column val where as I would expect to see every row have a value in column val. However, it does return a single value in column val and that is against the maximum date.

Now, if I run my original data set selecting just the one product_id (now all first two characters in column customer_id are all the same) and then re-populate  table1 and run your query it runs fine and all the values in column val are returned.

The weird thing is that when I run the test data as provided by you and the query it looks ok.

Regards
0
Christoffer SwanströmPartnerCommented:
I think I found the problem. In your original example all dates were at the end of the month and I had made the assumption that this will always be the case. Try the following (with data from the Excel file you posted):


DROP TABLE swc_tst;
CREATE TABLE swc_tst (
customer_id VARCHAR2(32)
,val NUMBER
,dt DATE
);

INSERT INTO swc_tst VALUES('131018101XXXXXXXXXXX', 0.1, TO_DATE('30.11.2011', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('161188401XXXXXXXXXXX', 0.5, TO_DATE('30.06.2011', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('16158401XXXXXXXXXXXX', 0.25, TO_DATE('30.06.2011', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('161938401XXXXXXXXXXX', 0.24, TO_DATE('31.01.2008', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('162258401XXXXXXXXXXX', 0.25, TO_DATE('28.02.2005', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('162258401XXXXXXXXXXX', 0.25, TO_DATE('30.09.2008', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('163018401XXXXXXXXXXX', 0.24, TO_DATE('31.12.2008', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('163118301XXXXXXXXXXX', 0.1, TO_DATE('14.12.2011', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('163728401XXXXXXXXXXX', 0.24, TO_DATE('31.12.2009', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('163828301XXXXXXXXXXX', 0.24, TO_DATE('31.12.2008', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('164168301XXXXXXXXXXX', 0.1, TO_DATE('24.11.2011', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('164278401XXXXXXXXXXX', 0.24, TO_DATE('30.06.2007', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('164278401XXXXXXXXXXX', 0.24, TO_DATE('31.01.2008', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('164278401XXXXXXXXXXX', 0.24, TO_DATE('31.03.2009', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('164428401XXXXXXXXXXX', 0.5, TO_DATE('31.05.2006', 'dd.mm.yyyy'));
INSERT INTO swc_tst VALUES('164548401XXXXXXXXXXX', 0.24, TO_DATE('30.06.2007', 'dd.mm.yyyy'));



SELECT
  customer_id
  ,month_end
  ,FIRST_VALUE(val) OVER(PARTITION BY customer_id, grp_nb ORDER BY month_end) AS val
FROM
(SELECT
  asd.*
  ,SUM(flg_beg_grp) OVER(PARTITION BY customer_id ORDER BY month_end) AS grp_nb
FROM
(SELECT
  cus_mt.customer_id
  ,all_months.month_end
  ,val.val
  ,CASE WHEN val.val IS NOT NULL THEN 1 ELSE 0 END AS flg_beg_grp
FROM
  (SELECT
    ADD_MONTHS(max_dt, -level + 1) AS month_end
  FROM
    (SELECT
      MAX(TRUNC((dt), 'mm')) AS max_dt
      ,MIN(TRUNC((dt), 'mm'))  AS min_dt
      ,(EXTRACT(YEAR FROM MAX(TRUNC((dt), 'mm'))) - EXTRACT(YEAR FROM MIN(TRUNC((dt), 'mm')))) * 12 + (EXTRACT(MONTH FROM MAX(TRUNC((dt), 'mm'))) - EXTRACT(MONTH FROM MIN(TRUNC((dt), 'mm')))) + 1 AS nb_months
    FROM
      swc_tst
    )
  CONNECT BY level <= nb_months
  ) all_months
INNER JOIN
  (SELECT
    customer_id
    ,MIN(TRUNC((dt), 'mm')) AS cus_min_dt
    ,MAX(TRUNC((dt), 'mm')) AS cus_max_dt
  FROM
    swc_tst
  GROUP BY
    customer_id
  ) cus_mt
ON
  all_months.month_end BETWEEN cus_mt.cus_min_dt AND cus_mt.cus_max_dt
LEFT JOIN
  swc_tst val
ON
  val.customer_id = cus_mt.customer_id
AND
  all_months.month_end = TRUNC(val.dt, 'mm')
) asd
)qwe
ORDER BY
  customer_id
  ,month_end
;

Open in new window

0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
philsivyerAuthor Commented:
Thanks for this - will try later
0
philsivyerAuthor Commented:
tosse

Thanks for this - now works fine.
Sorry as to late confirmation - Christmas New Year and all that.
Good work!

Regards
0
philsivyerAuthor Commented:
Thanks
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Oracle Database

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.