pm620wh
asked on
Oracle 11g How to insert missing rows in to a sequence based on a date and a group
Hi
I'm wondering if the following is possible and if so, can you show me please:
I have a set of rows grouped by a Unit ID and Date and sorted by a date e.g.
Date UnitID Distance Travelled
15/11/2012 509 2005
16/11/2012 509 1920
19/11/2012 509 1815
15/11/2012 510 2122
16/11/2012 510 1999
18/11/2012 510 2121
The SQL to retreive the rows look for records between two dates e.g. 15/11/2012 and 19/11/2012
Now you might have noticed that the dates don't run sequentially, that is to say, there are some dates missing, for example Unid 509 is missing the 17th and 18th and Unit 510 is missing the 17th and 19th
What I want to do is display a result set that includes the missing rows between the date range specified in the query. For each missing row, we would add in the Date and UnitID and a zero value for the distance travelled. So, if our query were to ask for rows between the 15th November 2012 and the 19th November 2012 we would get something like the following:
Date UnitID Distance Travelled
15/11/2012 509 2005
16/11/2012 509 1920
17/11/2012 509 0
18/11/2012 509 0
19/11/2012 509 1815
15/11/2012 510 2122
16/11/2012 510 1999
17/11/2012 510 0
18/11/2012 510 2121
19/11/2012 510 0
The rows in bold are those there were inserted by the SQL code.
In reality, my query will reteive data between two date ranges that span a couple of months.
I'm wondering if the following is possible and if so, can you show me please:
I have a set of rows grouped by a Unit ID and Date and sorted by a date e.g.
Date UnitID Distance Travelled
15/11/2012 509 2005
16/11/2012 509 1920
19/11/2012 509 1815
15/11/2012 510 2122
16/11/2012 510 1999
18/11/2012 510 2121
The SQL to retreive the rows look for records between two dates e.g. 15/11/2012 and 19/11/2012
Now you might have noticed that the dates don't run sequentially, that is to say, there are some dates missing, for example Unid 509 is missing the 17th and 18th and Unit 510 is missing the 17th and 19th
What I want to do is display a result set that includes the missing rows between the date range specified in the query. For each missing row, we would add in the Date and UnitID and a zero value for the distance travelled. So, if our query were to ask for rows between the 15th November 2012 and the 19th November 2012 we would get something like the following:
Date UnitID Distance Travelled
15/11/2012 509 2005
16/11/2012 509 1920
17/11/2012 509 0
18/11/2012 509 0
19/11/2012 509 1815
15/11/2012 510 2122
16/11/2012 510 1999
17/11/2012 510 0
18/11/2012 510 2121
19/11/2012 510 0
The rows in bold are those there were inserted by the SQL code.
In reality, my query will reteive data between two date ranges that span a couple of months.
Another one just about as ugly as the one above.
From the simple test below, the execution plan on this one appears to be not as bad on resources but your mileage may vary.
From the simple test below, the execution plan on this one appears to be not as bad on resources but your mileage may vary.
drop table tab1 purge;
create table tab1(myDate date, UnitID number, Distance number);
insert into tab1 values(to_Date('15/11/2012','DD/MM/YYYY'),509,2005);
insert into tab1 values(to_Date('16/11/2012','DD/MM/YYYY'),509,1920);
insert into tab1 values(to_Date('19/11/2012','DD/MM/YYYY'),509,1815);
insert into tab1 values(to_Date('15/11/2012','DD/MM/YYYY'),510,2122);
insert into tab1 values(to_Date('16/11/2012','DD/MM/YYYY'),510,1999);
insert into tab1 values(to_Date('18/11/2012','DD/MM/YYYY'),510,2121);
commit;
with mydata as (
select myDate,
UnitID,
Distance,
min(mydate) over() min_date_in_range,
(max(mydate) over()-min(mydate) over()) days_in_range
from tab1
where mydate between to_date('15/11/2012','DD/MM/YYYY') and to_date('19/11/2012','DD/MM/YYYY')
)
select mydate, unitid, sum(distance) from
(
select mydate, unitid, distance from mydata
union all
select min_date_in_range+level, unitid, 0 from mydata connect by level <= days_in_range
)
group by mydate, unitid
order by unitid, mydate
/
ASKER
Hi Everyone
Thank you very much for your feedback; I've made a slight error in my approach :-(
Really sorry but would you kindly take another look at let me know your views on this please?
The result set will will saved as a view and I won't know the starting date until the query runs (that is to say, I won't actually specifiy a date range in the SQL, I will just retrieve all the records in the table that meet some other criteria).
What will happen is that users will view the data via a report created in Business Objects and it is Business Objects that will connect to the view. Users will then be able to filter the data and reduce the number of records displayed.
For my purposes. I need to find the first date in the result set and work forwards from there inserting the missing rows for each Unit ID.
Thank you for your patience. Any help would be greatly appreciated.
Thank you very much for your feedback; I've made a slight error in my approach :-(
Really sorry but would you kindly take another look at let me know your views on this please?
The result set will will saved as a view and I won't know the starting date until the query runs (that is to say, I won't actually specifiy a date range in the SQL, I will just retrieve all the records in the table that meet some other criteria).
What will happen is that users will view the data via a report created in Business Objects and it is Business Objects that will connect to the view. Users will then be able to filter the data and reduce the number of records displayed.
For my purposes. I need to find the first date in the result set and work forwards from there inserting the missing rows for each Unit ID.
Thank you for your patience. Any help would be greatly appreciated.
ASKER
Hi
Is anyone able to help me refine the approach please so that the query works without a start or end date being specified please?
Thank you in advance.
Is anyone able to help me refine the approach please so that the query works without a start or end date being specified please?
Thank you in advance.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Hi Everyone
Thank you very much for you feedback and help; the SQL provided by sdstuber is a great learning tool and I'm playing with the SQL to see how it works altough the idea provided by slightwv gave us food for thought and our Business Objects team and now looking in to trying to fix the issue of missing entries in the report itself as opposed to use placing the overhead on Oracle.
Thank you all once again.
Thank you very much for you feedback and help; the SQL provided by sdstuber is a great learning tool and I'm playing with the SQL to see how it works altough the idea provided by slightwv gave us food for thought and our Business Objects team and now looking in to trying to fix the issue of missing entries in the report itself as opposed to use placing the overhead on Oracle.
Thank you all once again.
WITH basedata
AS (put your query here),
units AS (SELECT DISTINCT unitid FROM basedata),
days
AS (SELECT TO_DATE('15/11/2012', 'dd/mm/yyyy') + LEVEL - 1 d
FROM DUAL
CONNECT BY TO_DATE('15/11/2012', 'dd/mm/yyyy') + LEVEL - 1 <=
TO_DATE('19/11/2012', 'dd/mm/yyyy'))
SELECT days.d, units.unitid, nvl(basedata.distance,0) distance
FROM days
CROSS JOIN units
LEFT JOIN basedata ON days.d = basedata.yourdate AND units.unitid = basedata.unitid
ORDER BY unitid, d