Steve Berger
asked on
How to remove the overlapping data?
Hi,
I have data like below.
MODEL PRICE START_DATE_ACTIVE END_DATE_ACTIVE
SONY 100 1/1/1901 12/31/2005
SONY 200 1/1/2006 12/31/2007
SONY 300 1/1/2008 12/31/2008
SONY 400 2/1/2008 12/31/2009
SONY 500 1/1/2009 12/31/2010
The above the model SONY is having the price range that depends on the start_date_active and end_date_active column. For example in the first row, the model SONY price is 100 from 1/1/2006 to 12/31/2007. The same model SONY price is 200 from 1/1/2008 to 12/31/2008. So the model price will come like this. But in some cases the date will be overlapping. See the row number 3 and 4. I should remove the rownumber 4. How can i remove the overlapping date data?
Thanks
I have data like below.
MODEL PRICE START_DATE_ACTIVE END_DATE_ACTIVE
SONY 100 1/1/1901 12/31/2005
SONY 200 1/1/2006 12/31/2007
SONY 300 1/1/2008 12/31/2008
SONY 400 2/1/2008 12/31/2009
SONY 500 1/1/2009 12/31/2010
The above the model SONY is having the price range that depends on the start_date_active and end_date_active column. For example in the first row, the model SONY price is 100 from 1/1/2006 to 12/31/2007. The same model SONY price is 200 from 1/1/2008 to 12/31/2008. So the model price will come like this. But in some cases the date will be overlapping. See the row number 3 and 4. I should remove the rownumber 4. How can i remove the overlapping date data?
Thanks
ASKER
Hi thanks for your reply. It is returning no rows.
Can you clarify:
Row number 3 and 4 are overlapping but rownumber 4 exceeds the end dat of rownumber 3. Also rownumber 4 and 5 are overlapping but rownumber 5 end date exceeds the end date of row number4.
What is your ouptut if you don't have rownumber 5 in your sample set?
ASKER
Hi sdstuber, i added one more condition in this query. It is not fetching any rows. In my table, i have 10000 rows. I want to see only the particular item. So i have added one AND condition in that query like below. Please verify my below query is correct or not.
SELECT * FROM sample_table
WHERE ROWID IN
(SELECT rid
FROM (SELECT ROWID rid,
new_start_date sda,
new_end_date eda,
LAG(new_end_date)
OVER (
PARTITION BY ITEM_NAME
ORDER BY new_start_date, new_end_date
)
prevend,
LEAD(new_start_date)
OVER (
PARTITION BY ITEM_NAME
ORDER BY new_start_date, new_end_date
)
nextstart
FROM sample_table)
WHERE sda < prevend AND eda > nextstart)
AND ITEM_NAME = 'PHILIPS'
SELECT * FROM sample_table
WHERE ROWID IN
(SELECT rid
FROM (SELECT ROWID rid,
new_start_date sda,
new_end_date eda,
LAG(new_end_date)
OVER (
PARTITION BY ITEM_NAME
ORDER BY new_start_date, new_end_date
)
prevend,
LEAD(new_start_date)
OVER (
PARTITION BY ITEM_NAME
ORDER BY new_start_date, new_end_date
)
nextstart
FROM sample_table)
WHERE sda < prevend AND eda > nextstart)
AND ITEM_NAME = 'PHILIPS'
that will work or you can put it in the inner query
SELECT *
FROM sample_table
WHERE ROWID IN
(SELECT rid
FROM (SELECT ROWID rid,
new_start_date sda,
new_end_date eda,
LAG(new_end_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
prevend,
LEAD(new_start_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
nextstart
FROM sample_table
WHERE item_name = 'PHILIPS')
WHERE sda < prevend AND eda > nextstart)
ASKER
Hi Sharath,
you are correct. In my case overlapping means, it should not start with same year again with another row. For the above scenario, rownum 3 is ended with 12/31/2008. So that particular model price is deactivated after this date. So it has to start the new price from 1/1/2009 date. So nownum 5 is valid. In between rownum 3 and 5, again the start date started from the year which is already mentioned in rownum 3. So we should not consider this right. I thing now you can understand right.
Thanks
you are correct. In my case overlapping means, it should not start with same year again with another row. For the above scenario, rownum 3 is ended with 12/31/2008. So that particular model price is deactivated after this date. So it has to start the new price from 1/1/2009 date. So nownum 5 is valid. In between rownum 3 and 5, again the start date started from the year which is already mentioned in rownum 3. So we should not consider this right. I thing now you can understand right.
Thanks
ASKER
Hi sdstuber,
If i change the condition like below, it is giving the overlapping data for me. See my attached file for my actual data.
WHERE sda < prevend AND eda < nextstart). Can you tell me what is the reason?
TESTING-DATA.xls
If i change the condition like below, it is giving the overlapping data for me. See my attached file for my actual data.
WHERE sda < prevend AND eda < nextstart). Can you tell me what is the reason?
TESTING-DATA.xls
the problem was your original sample violated both ends
so, sda < prevend AND eda > nextstart found it.
but your new sample only violates on one end.
since all "good data will have eda < nextstart that's not really a helpful condition.
so sda < prevend AND eda < nextstart
would fail to find bad data like your original sample
so, sda < prevend AND eda > nextstart found it.
but your new sample only violates on one end.
since all "good data will have eda < nextstart that's not really a helpful condition.
so sda < prevend AND eda < nextstart
would fail to find bad data like your original sample
changing the AND to OR will work except then you'll get 2 rows returns for each overlap
because a violation on either end will guarantee the other row will fail on the other end.
from your last example
2008/12/31 (end) on line 6 > 2008/12/01 (start) on line 7 so that's an overlap
but by the same logic
2008/12/01 (start) on line 7 < 2008/12/31 (end) on line 6 so that's an overlap
because a violation on either end will guarantee the other row will fail on the other end.
from your last example
2008/12/31 (end) on line 6 > 2008/12/01 (start) on line 7 so that's an overlap
but by the same logic
2008/12/01 (start) on line 7 < 2008/12/31 (end) on line 6 so that's an overlap
we could check on just one end, but then if you have cases like in your original example then you'll get false positives around the problem row.
To illustrate, from your first example, the 400 line is the problem.
if we check only the end then the 300 line will show up as a problem as well as the 400 line.
if we check only the start then the 500 line will show up as a problem as well as the 400 line
SONY 300 1/1/2008 12/31/2008
SONY 400 2/1/2008 12/31/2009
SONY 500 1/1/2009 12/31/2010
To illustrate, from your first example, the 400 line is the problem.
if we check only the end then the 300 line will show up as a problem as well as the 400 line.
if we check only the start then the 500 line will show up as a problem as well as the 400 line
SONY 300 1/1/2008 12/31/2008
SONY 400 2/1/2008 12/31/2009
SONY 500 1/1/2009 12/31/2010
ASKER
I don't want to remove 2 overlapping records. I want remove only the line number 7. becasue that start date is not ended to other rows. For example, if the end_date is 2008/12/31, the next row should start from 1/1/2009. But line number 6 is starting with 2008/12/01 right. So we need to consider this as a overlap. So i need to remove only those rows. But my first example also applicable for some cases. It should ended with after the end date of previous row. We need to consider both the cases.
ASKER
Unless i can give only like below right.
WHERE sda < prevend
WHERE sda < prevend
easiest solution is probably to run two deletes...
first delete all rows that have a double violation
then delete rows where the new_start_date < previous row's end date
first delete all rows that have a double violation
then delete rows where the new_start_date < previous row's end date
DELETE FROM sample_table
WHERE ROWID IN
(SELECT rid
FROM (SELECT ROWID rid,
new_start_date,
new_end_date,
LAG(new_start_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
prevstart,
LAG(new_end_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
prevend,
LEAD(new_start_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
nextstart,
LEAD(new_end_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
nextend
FROM sample_table
WHERE item_name = 'SONY')
WHERE new_start_date < prevend AND new_end_date > nextstart);
DELETE FROM sample_table
WHERE ROWID IN
(SELECT rid
FROM (SELECT ROWID rid,
new_start_date,
new_end_date,
LAG(new_start_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
prevstart,
LAG(new_end_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
prevend,
LEAD(new_start_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
nextstart,
LEAD(new_end_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
nextend
FROM sample_table
WHERE item_name = 'SONY')
WHERE new_start_date < prevend );
you can't "just" do the delete where next_start_date < prevend because rows with double violation will cause a double row delete. themselves plus the following row.
that's why I suggest doing it in two steps.
that's why I suggest doing it in two steps.
ASKER
Ya it was great. I am putting those two queries in union and it is working for both condition. But if i make it as UNION, how should i delete?
don't combine them.
do two deletes
if you do both at once then you'll get extra deletes like I described above.
first delete the rows that violate on both ends
then delete the rows that violate on one end.
do two deletes
if you do both at once then you'll get extra deletes like I described above.
first delete the rows that violate on both ends
then delete the rows that violate on one end.
ASKER
I mean how should i delete those records from the table?
if you want to try to combine them then you'll have to write a much more complicated set of filters in order to determine if rows that violate on only one end should be kept anyway because their violation is caused by another row's double ended violation.
much simpler to just do two deletes
much simpler to just do two deletes
oh sorry, didn't see you posted again.
how to delete?
with the two deletes I posted above.
how to delete?
with the two deletes I posted above.
can you try this?
select t1.MODEL,t1.PRICE,t1.START_DATE_ACTIVE,t1.END_DATE_ACTIVE
from (select *,ROW_NUMBER() over (partition by Model order by Price) RN from your_table) t1
left join (select *,ROW_NUMBER() over (partition by Model order by Price) RN from your_table) t2
on t1.Model = t2.Model and t1.RN - 1 = t2.RN
left join (select *,ROW_NUMBER() over (partition by Model order by Price) RN from your_table ) t3
on t1.Model = t3.Model and t1.RN + 1 = t3.RN
where case when (t1.START_DATE_ACTIVE < t2.END_DATE_ACTIVE) and
(t1.END_DATE_ACTIVE > t3.START_DATE_ACTIVE) and
(t1.END_DATE_ACTIVE < t3.END_DATE_ACTIVE) then 1 else 0 end = 0
order by t1.Model,t1.RN
why query the table 3 times?
I don't understand the strategy behind the query,
I do understand the functionality of it, just not the why.
but, speaking of "why", in my previously posted deletes, I included a couple of extra analytics that aren't really necessary.
I was experimenting and didn't take them out before posting. They aren't harmful but aren't helpful either.
here are cleaner versions. I'll admit, still "bigger" code than the previous post, but it should be much more efficient.
I don't understand the strategy behind the query,
I do understand the functionality of it, just not the why.
but, speaking of "why", in my previously posted deletes, I included a couple of extra analytics that aren't really necessary.
I was experimenting and didn't take them out before posting. They aren't harmful but aren't helpful either.
here are cleaner versions. I'll admit, still "bigger" code than the previous post, but it should be much more efficient.
DELETE FROM sample_table
WHERE ROWID IN
(SELECT rid
FROM (SELECT ROWID rid,
new_start_date,
new_end_date,
LAG(new_end_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
prevend,
LEAD(new_start_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
nextstart
FROM sample_table
WHERE item_name = 'SONY')
WHERE new_start_date < prevend AND new_end_date > nextstart);
DELETE FROM sample_table
WHERE ROWID IN
(SELECT rid
FROM (SELECT ROWID rid,
new_start_date,
new_end_date,
LAG(new_end_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
prevend
FROM sample_table
WHERE item_name = 'SONY')
WHERE new_start_date < prevend );
<<I don't understand the strategy behind the query,
I do understand the functionality of it, just not the why. >>
Do you want explanation?
I do understand the functionality of it, just not the why. >>
Do you want explanation?
select t1.MODEL,t1.PRICE,t1.START_DATE_ACTIVE,t1.END_DATE_ACTIVE
from (select *,ROW_NUMBER() over (partition by Model order by Price) RN from your_table) t1
left join (select *,ROW_NUMBER() over (partition by Model order by Price) RN from your_table) t2
on t1.Model = t2.Model and t1.RN - 1 = t2.RN
left join (select *,ROW_NUMBER() over (partition by Model order by Price) RN from your_table ) t3
on t1.Model = t3.Model and t1.RN + 1 = t3.RN
where case when (t1.START_DATE_ACTIVE < t2.END_DATE_ACTIVE) and
(t1.END_DATE_ACTIVE > t3.START_DATE_ACTIVE) and
(t1.END_DATE_ACTIVE < t3.END_DATE_ACTIVE) then 1 else 0 end = 0
order by t1.Model,t1.RN
/* query result
MODEL PRICE START_DATE_ACTIVE END_DATE_ACTIVE
SONY 100 1901-01-01 00:00:00.000 2005-12-31 00:00:00.000
SONY 200 2006-01-01 00:00:00.000 2007-12-31 00:00:00.000
SONY 300 2008-01-01 00:00:00.000 2008-12-31 00:00:00.000
SONY 500 2009-01-01 00:00:00.000 2010-12-31 00:00:00.000
*/
not an explanation of the functionality (results) no. I understand what it does and how it works.
but what are you trying to do? it clearly produces different results than what the asker is looking for.
I assume you are doing something different intentionally, so what is it/why that approach?
Particularly since it's so "heavy" with 3 outer joins.
So, I know what it does, and even if I didn't, I ran it myself (after fixing small syntax errors) and saw the results.
I'm just not sure what direction you're going.
It might be a good approach but "as is" I don't know how to use what you've posted.
but what are you trying to do? it clearly produces different results than what the asker is looking for.
I assume you are doing something different intentionally, so what is it/why that approach?
Particularly since it's so "heavy" with 3 outer joins.
So, I know what it does, and even if I didn't, I ran it myself (after fixing small syntax errors) and saw the results.
I'm just not sure what direction you're going.
It might be a good approach but "as is" I don't know how to use what you've posted.
>> it clearly produces different results than what the asker is looking for.
Is it? Can you provide some example set where you are getting different results.
Asker - can you test this on your whole sample data and let me know where its failing. I don' say that its the perfect solution, i just tested this with the sample set provided but check against your complete table.
>> Particularly since it's so "heavy" with 3 outer joins.
JOINing 3 times may be over head, but to achieve the required output, i used this approach.
>> I assume you are doing something different intentionally, so what is it/why that approach?
I don't understand your meaning of "intentionally" here. I am trying to achieve the asker's expected result set.
To remove the overlapping records, i am checking previous record and next records Start /End dates with current record's Start/End dates. (The condition in my CASE statement).
Is it? Can you provide some example set where you are getting different results.
Asker - can you test this on your whole sample data and let me know where its failing. I don' say that its the perfect solution, i just tested this with the sample set provided but check against your complete table.
>> Particularly since it's so "heavy" with 3 outer joins.
JOINing 3 times may be over head, but to achieve the required output, i used this approach.
>> I assume you are doing something different intentionally, so what is it/why that approach?
I don't understand your meaning of "intentionally" here. I am trying to achieve the asker's expected result set.
To remove the overlapping records, i am checking previous record and next records Start /End dates with current record's Start/End dates. (The condition in my CASE statement).
Try this...
I've modified your query to fit the most recent structure posted by the asker
I populated it with the data from the spreadsheet, but changed the first row to not go back to 1901 then added more historical data
the goal is to delete the rows with overlaps
with the sample data below there are 2 overlaps
the row with a price of 400 and the second row with a price of 17500
If you run the 2 deletes I posted earlier you'll be left with 10 non-overlapping rows.
Your query doesn't return the 2 rows that should be deleted
I tried looking at in reverse, maybe you were trying to return the data that would be left after deletes; but it doesn't do that either.
That's what I meant by "doing something intentionally". It didn't appear to be a small mistake pursuing the same goal I had ben and simply got some unexpected results. Your approach appeared so radically different from where the thread had been going that it seemed like you must be doing something different and on purpose and I simply don't understand.
I've modified your query to fit the most recent structure posted by the asker
I populated it with the data from the spreadsheet, but changed the first row to not go back to 1901 then added more historical data
the goal is to delete the rows with overlaps
with the sample data below there are 2 overlaps
the row with a price of 400 and the second row with a price of 17500
If you run the 2 deletes I posted earlier you'll be left with 10 non-overlapping rows.
Your query doesn't return the 2 rows that should be deleted
I tried looking at in reverse, maybe you were trying to return the data that would be left after deletes; but it doesn't do that either.
That's what I meant by "doing something intentionally". It didn't appear to be a small mistake pursuing the same goal I had ben and simply got some unexpected results. Your approach appeared so radically different from where the thread had been going that it seemed like you must be doing something different and on purpose and I simply don't understand.
drop table sample_table;
CREATE TABLE sample_table
AS
SELECT price_list_name,
item_name,
price,
TO_DATE(nsd, 'yyyy/mm/dd') new_start_date,
TO_DATE(ned, 'yyyy/mm/dd') new_end_date
FROM (SELECT 'Global Price List All' price_list_name,
'SONY' item_name,
15500 price,
'2005/01/01' nsd,
'2005/05/14' ned
FROM DUAL
UNION ALL
SELECT 'Global Price List All', 'SONY', 16000, '2005/05/15', '2006/02/04' FROM DUAL
UNION ALL
SELECT 'Global Price List All', 'SONY', 16800, '2006/02/05', '2007/06/09' FROM DUAL
UNION ALL
SELECT 'Global Price List All', 'SONY', 17000, '2007/06/10', '2007/12/31' FROM DUAL
UNION ALL
SELECT 'Global Price List All', 'SONY', 17500, '2008/01/01', '2008/12/31' FROM DUAL
UNION ALL
SELECT 'Global Price List All', 'SONY', 17500, '2008/12/01', '2008/12/31' FROM DUAL
UNION ALL
SELECT 'Global Price List All', 'SONY', 18000, '2009/01/01', '2010/12/31' FROM DUAL
union all
SELECT 'Global Price List All', 'SONY', 100, '1901/01/01', '1995/12/31' FROM DUAL
UNION ALL
SELECT 'Global Price List All', 'SONY', 200, '1996/01/01', '1997/12/31' FROM DUAL
UNION ALL
SELECT 'Global Price List All', 'SONY', 300, '1998/01/01', '1998/12/31' FROM DUAL
UNION ALL
SELECT 'Global Price List All', 'SONY', 400, '1998/02/01', '1999/12/31' FROM DUAL
UNION ALL
SELECT 'Global Price List All', 'SONY', 500, '1999/01/01', '2000/12/31' FROM DUAL);
select t1.item_name,t1.PRICE,t1.new_START_DATE,t1.new_END_DATE
from (select sample_table.*,ROW_NUMBER() over (partition by item_name order by Price) RN from sample_table) t1
left join (select sample_table.*,ROW_NUMBER() over (partition by item_name order by Price) RN from sample_table) t2
on t1.item_name = t2.item_name and t1.RN - 1 = t2.RN
left join (select sample_table.*,ROW_NUMBER() over (partition by item_name order by Price) RN from sample_table ) t3
on t1.item_name = t3.item_name and t1.RN + 1 = t3.RN
where case when (t1.new_START_DATE < t2.new_END_DATE) and
(t1.new_END_DATE > t3.new_START_DATE) and
(t1.new_END_DATE < t3.new_END_DATE) then 1 else 0 end = 0
order by t1.item_name,t1.RN
I missed one more condition in my previous post.
yes, I am trying to return the data that would be left after deletes.
yes, I am trying to return the data that would be left after deletes.
select t1.item_name,t1.PRICE,t1.new_START_DATE,t1.new_END_DATE
from (select sample_table.*,ROW_NUMBER() over (partition by item_name order by Price) RN from sample_table) t1
left join (select sample_table.*,ROW_NUMBER() over (partition by item_name order by Price) RN from sample_table) t2
on t1.item_name = t2.item_name and t1.RN - 1 = t2.RN
left join (select sample_table.*,ROW_NUMBER() over (partition by item_name order by Price) RN from sample_table ) t3
on t1.item_name = t3.item_name and t1.RN + 1 = t3.RN
where case when t1.new_START_DATE <= t2.new_END_DATE
and ((t1.new_END_DATE <= t2.new_END_DATE) or
(t1.new_END_DATE >= t2.new_END_DATE and t1.new_END_DATE >= t3.new_START_DATE)) then 1 else 0 end = 0
order by t1.item_name,t1.RN
ok, maybe I'm misunderstanding the request
I thought the goal was to actually "do" the deletes of the invalid rows
Suriyaraj_Sudalaiappan, can you please clarify?
I thought the goal was to actually "do" the deletes of the invalid rows
Suriyaraj_Sudalaiappan, can you please clarify?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Hi,
Shall i use the query like below. In my table, i have 9000 rows. I did not give any where condition to select the particular model.
SELECT rid
FROM (SELECT ROWID rid,
price_list_name,
item_name,
uom_code,
new_start_date,
new_end_date,
formula_name,
LAG(new_end_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
prevend,
LEAD(new_start_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
nextstart
FROM SAMPLE_TABLE)
WHERE (new_start_date < prevend)
OR (new_start_date < prevend AND new_end_date > nextstart);
Shall i use the query like below. In my table, i have 9000 rows. I did not give any where condition to select the particular model.
SELECT rid
FROM (SELECT ROWID rid,
price_list_name,
item_name,
uom_code,
new_start_date,
new_end_date,
formula_name,
LAG(new_end_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
prevend,
LEAD(new_start_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
nextstart
FROM SAMPLE_TABLE)
WHERE (new_start_date < prevend)
OR (new_start_date < prevend AND new_end_date > nextstart);
what do you want to do with that query?
as you can see from the threads above there is some confusion as to what you are really trying to accomplish.
I have two guesses as to your goal and I've posted answers to both of them.
if you want to do a delete of data from your table, then no
use the two deletes I posted before in 24034314, run them in that same order
if you want to show records that will be kept if you were to do a delete then use the query I posted in 24035849 (you could also use Sharath's but it's less efficient)
if neither of those goals is really what you're trying to do, please try explaining again.
as you can see from the threads above there is some confusion as to what you are really trying to accomplish.
I have two guesses as to your goal and I've posted answers to both of them.
if you want to do a delete of data from your table, then no
use the two deletes I posted before in 24034314, run them in that same order
if you want to show records that will be kept if you were to do a delete then use the query I posted in 24035849 (you could also use Sharath's but it's less efficient)
if neither of those goals is really what you're trying to do, please try explaining again.
you can remove the model or item_name condition, of course
ASKER
essdbdu33 $cp l3109359.dbg $XXOM_TOP/log
essdbdu33 $ls -l $XXOM_TOP/log
total 800
-rw-r--r-- 1 rhuggi aosimp 401791 Apr 1 12:53 l3109359.dbg
essdbdu33 $>l3109359.dbg
bash: l3109359.dbg: Permission denied
essdbdu33 $pwd
/apps/betsyp1/betsyp1db/10 .2.0/temp/ betsyp1_es sdbdu33
Hi,
Actually i want to update my custom table if any overlapping data are there.
Actually my requirement is, we need to identify the overlapping data from the table and update the same. Here overlapping means, we have to consider only the start date and end date columns. We have 'n' number of items in my table and it contains different prices. Price is different for same item with different start and end date. That means, price is applicable for only with in that period. So we should be start from one date and end with other date. That date should not be repeated multiple times. For example my sample data, it start with 1/1/2000 and end with 12/31/2001. Then the next date should start with 1/1/2002. It should not start 1/2/2002 and more than that. because my data is like that. If any start like 1/2/2002 and more than that we should update that row in the table as overlapping.
And i got two scenarios from my data sheet like below. The start date is same for two rows and end date is different. Some data start date is different and end date is same. I will give the example. In this we don't know which one we should remove. For that discussion is going on right now. I will let you know after they confirmed.
For first scenario:
START_DATE END_DATE
10/28/2006 12/31/2006
10/28/2006 12/31/2010
For second scenario:
START_DATE END_DATE
1/1/2008 12/31/2010
1/2/2008 12/31/2010
Below query i modified from your query. Please verify that and let me know if any changes. And i included one more condition here like pps1.new_end_date <> prevend. It will helpful for my second scenario to remove same end date and consider only one. But not decided this one.
UPDATE /*+index(pps) */
XXQP_PAM_PRICE_STG pps
SET
pps.process_flag = 3,
pps.error_msg = 'Overlapping data, not consider for Import'
WHERE
pps.process_flag = 1
AND pps.ROWID IN (
SELECT /*+index(pps1) */ rid
FROM (
SELECT ROWID rid,
price_list_name,
item_name,
uom_code,
new_start_date,
new_end_date,
formula_name,
LAG(new_end_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
prevend,
LEAD(new_start_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
nextstart
FROM XXQP_PAM_PRICE_STG
) pps1
WHERE ((pps1.new_start_date < prevend AND pps1.new_end_date <> prevend) OR (pps1.new_start_date < prevend AND pps1.new_end_date > nextstart))
AND pps1.item_name = pps.item_name
);
Thanks
essdbdu33 $ls -l $XXOM_TOP/log
total 800
-rw-r--r-- 1 rhuggi aosimp 401791 Apr 1 12:53 l3109359.dbg
essdbdu33 $>l3109359.dbg
bash: l3109359.dbg: Permission denied
essdbdu33 $pwd
/apps/betsyp1/betsyp1db/10
Hi,
Actually i want to update my custom table if any overlapping data are there.
Actually my requirement is, we need to identify the overlapping data from the table and update the same. Here overlapping means, we have to consider only the start date and end date columns. We have 'n' number of items in my table and it contains different prices. Price is different for same item with different start and end date. That means, price is applicable for only with in that period. So we should be start from one date and end with other date. That date should not be repeated multiple times. For example my sample data, it start with 1/1/2000 and end with 12/31/2001. Then the next date should start with 1/1/2002. It should not start 1/2/2002 and more than that. because my data is like that. If any start like 1/2/2002 and more than that we should update that row in the table as overlapping.
And i got two scenarios from my data sheet like below. The start date is same for two rows and end date is different. Some data start date is different and end date is same. I will give the example. In this we don't know which one we should remove. For that discussion is going on right now. I will let you know after they confirmed.
For first scenario:
START_DATE END_DATE
10/28/2006 12/31/2006
10/28/2006 12/31/2010
For second scenario:
START_DATE END_DATE
1/1/2008 12/31/2010
1/2/2008 12/31/2010
Below query i modified from your query. Please verify that and let me know if any changes. And i included one more condition here like pps1.new_end_date <> prevend. It will helpful for my second scenario to remove same end date and consider only one. But not decided this one.
UPDATE /*+index(pps) */
XXQP_PAM_PRICE_STG pps
SET
pps.process_flag = 3,
pps.error_msg = 'Overlapping data, not consider for Import'
WHERE
pps.process_flag = 1
AND pps.ROWID IN (
SELECT /*+index(pps1) */ rid
FROM (
SELECT ROWID rid,
price_list_name,
item_name,
uom_code,
new_start_date,
new_end_date,
formula_name,
LAG(new_end_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
prevend,
LEAD(new_start_date)
OVER (
PARTITION BY item_name
ORDER BY new_start_date, new_end_date
)
nextstart
FROM XXQP_PAM_PRICE_STG
) pps1
WHERE ((pps1.new_start_date < prevend AND pps1.new_end_date <> prevend) OR (pps1.new_start_date < prevend AND pps1.new_end_date > nextstart))
AND pps1.item_name = pps.item_name
);
Thanks
ASKER
don't consider from the staring in my previous comment. some copy pase mistake. Please read from "HI"
so, this entire question has been wrong then it seems.
you're not trying to do a delete at all
and, you're not just trying to find overlaps, but you're also trying to find holes?
your description of one record ending 12/31/2001 but the next record should not start 1/2/2002
is that because it would leave a "hole" of 1/1/2001?
please post complete example of data (not a query)
please use a create table like I posted above with your entire table structure and enough sample data to show all scenarios (good and bad) you want to consider.
then post what you want the table to look like afterward. Don't worry about trying to guess what query might be used, just manually create the data in its final form and we'll try to figure out what query will produce the results you are looking for
you're not trying to do a delete at all
and, you're not just trying to find overlaps, but you're also trying to find holes?
your description of one record ending 12/31/2001 but the next record should not start 1/2/2002
is that because it would leave a "hole" of 1/1/2001?
please post complete example of data (not a query)
please use a create table like I posted above with your entire table structure and enough sample data to show all scenarios (good and bad) you want to consider.
then post what you want the table to look like afterward. Don't worry about trying to guess what query might be used, just manually create the data in its final form and we'll try to figure out what query will produce the results you are looking for
by the way, please close your old questions, you have open questions, including apparent repeats (I just posted to one of them) over 2 months old.
ASKER
GOOD ONE
glad I could help
Open in new window