Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 436
  • Last Modified:

Retrieve duplicates based on multiple columns and assign a variable

Greetings Experts

The situation I´m trying to solve is that I have an Oracle DB table (11g) which has duplicates (a row is considered duplicate if the DB already contains an exact combination of Product_Name,Product_Code and Product_ID the rows at the point when they are created are assigned a “VALID_FROM”date.

Now I want to retrieve these duplicates rows and assign a VALID_UNTIL (sysdate) to the row whichever has the older date for “VALID_FROM”.  (and in the process I want to check that the “VALID_UNTIL” would be null so I wouldn´t be overwriting anything.

I tried the below code which retrieves the duplicates but how do I indicate that insert sysdate to the row where the date is the earliest...and also do I need to group them as well, because I suspect that If I simply tell it to choose the earliest date and assing the date it will do so acrosse the whole select, whereas I want it to be assigned within each set of duplicates.

SELECT A.* FROM  DATA_TABLE A
 INNER
 JOIN 
  (
SELECT 
PRODUCT_NAME
, PRODUCT_CODE
, PRODUCT_ID
 FROM   DATA_TABLE 
 
GROUP
 BY 
PRODUCT_NAME
, PRODUCT_CODE
, PRODUCT_ID
  HAVING Count(*) >1
  ) B
    
ON a. PRODUCT_NAME = b. PRODUCT_NAME
AND a. PRODUCT_CODE = b. PRODUCT_CODE
AND a. PRODUCT_ID = b. PRODUCT_ID
ORDER BY A.VALID_FROM

Open in new window

0
Daniel993
Asked:
Daniel993
1 Solution
 
lwadwellCommented:
I would start with something like
SELECT *
  FROM (SELECT PRODUCT_NAME
             , PRODUCT_CODE
             , PRODUCT_ID
             , VALID_FROM
             , VALID_UNTIL 
             , row_number() over(partition by PRODUCT_NAME, PRODUCT_CODE, PRODUCT_ID ORDER BY VALID_FROM DESC) rn
          FROM DATA_TABLE
         WHERE VALID_UNTIL IS NULL) v
 WHERE rn > 1

Open in new window

0
 
flow01Commented:
2 questions :  
   is it possible that the duplicate row with the oldest VALID_FROM has already filled VALID_UNTIL
  is it possible  that the duplicate row contains also the same VALID_FROM

else

update DATA_TABLE U
set valid_until = SYSDATE
where (product_name, product_code, product_id) in  
(
select product_name, product_code, product_id
from DATA_TABLE A
group by product_name, product_code, product_id
having count(*) > 1                   -- select duplicates
and count(valid_until) < count(*) -1  -- for which not all but one row are ended
)
and valid_until IS NULL
and valid_from > (SELECT MIN(VALID_FROM)
                  FROM DATA_TABLE B
                  WHERE B.PRODUCT_NAME = U.PRODUCT_NAME
                  AND   B.PRODUCT_CODE = U.PRODUCT_CODE
                  AND   B.PRODUCT_ID = U.PRODUCT_ID
                 )
/
0
 
gatorvipCommented:
I'm not fully clear on what you're trying to do, but try the following code. What it does is update all non-null valid_until entries with the most recent valid_from value for that duplicate record.

merge into <your_table> t
using (  select product_name, product_code, product_id, max(valid_from) last_date
        from <your_table>
        group by product_name, product_code, product_id
) q
on (t.product_name = q.product_name
  and t.product_code = q.product_code
  and t.product_id = q.product_id)
when matched then
update
set t.valid_until = q.last_date
where t.valid_until is null

Open in new window


If you don't want to update the most recent entry, then you can add the following exclusion at the end

and t.valid_from <> q.last_date

Open in new window

0
Prepare for your VMware VCP6-DCV exam.

Josh Coen and Jason Langer have prepared the latest edition of VCP study guide. Both authors have been working in the IT field for more than a decade, and both hold VMware certifications. This 163-page guide covers all 10 of the exam blueprint sections.

 
namethisCommented:
merge into your_table a using ( 
   select PRODUCT_NAME, PRODUCT_CODE, PRODUCT_ID, max(valid_from) valid_from
   from your_table group by PRODUCT_NAME, PRODUCT_CODE, PRODUCT_ID
)b on ( 
   a.PRODUCT_NAME=b.PRODUCT_NAME and a.PRODUCT_CODE=b.PRODUCT_CODE 
   and a.PRODUCT_ID=b.PRODUCT_ID and a.valid_from<b.valid_from and a.valid_until is null
)
when matched then set valid_until=sysdate;

Open in new window

0
 
Daniel993Author Commented:
Gator, your script works perfectly, I tested it in various ways and every time it updates the "Valid_Until" with the correct date...but one more question and then I'm set to go, is there an easy way to put it into an after trigger? the below trigger gives me nasty errors during compilation.

create or replace TRIGGER "BI_TEST_TABLE" 
AFTER INSERT or UPDATE ON "TEST_TABLE"
FOR EACH ROW
BEGIN 

merge into TEST_TABLE t
using (  select product_name, product_code, product_id, max(valid_from) last_date
        from TEST_TABLE
        group by product_name, product_code, product_id
) q
on (t.product_name = q.product_name
  and t.product_code = q.product_code
  and t.product_id = q.product_id)
when matched then
update
set t.valid_until = q.last_date
where t.valid_until is null
and t.valid_from <> q.last_date

END;

Open in new window


PL/SQL: ORA-00933: SQL command not properly endedCompilation failed, line 3 (16:59:34) The line numbers associated with compilation errors are relative to the first BEGIN statement. This only affects the compilation of database triggers.

PL/SQL: SQL Statement ignoredCompilation failed, line 20 (16:59:34) The line numbers associated with compilation errors are relative to the first BEGIN statement. This only affects the compilation of database triggers.

PLS-00103: Encountered the symbol "end-of-file" when expecting one of the following: ( begin case declare end exception exit for goto if loop mod null pragma raise return select update while with << continue close current delete fetch lock insert open rollback savepoint set sql execute commit forall merge pipe purge
0
 
gatorvipCommented:
First of all, you're missing a ";" at the end of the merge statement (on line 18 in your code). I did not include it in the previous post because it was a self-standing SQL statement.

Second, even after you add the ; you will probably get a mutating table error because you'd be trying to select/update the table while at the same time inserting into it.

Third, what is the logic behind what you want the trigger to do? The valid_until fields have already been filled in, so you'd only have to worry about new records.
0
 
Daniel993Author Commented:
1) Added the semi-colon.
2) Again you were exactly correct, ran the code and got a mutating table error when trying to insert data
3) The goal was to get the code running so that every time someone enters a new row then after that the script would run, and I was able to do that by
removing "For each Row" and
setting the trigger for "After Insert" not "After Insert or Update"

So now it works perfectly, thanks Gator.

P.S. Flow01 thanks for your comments as well, the questions you asked made me rethink the approach a little bit...indeed I have to restrict users from inserting two identical rows at the same time
0

Featured Post

Free Tool: SSL Checker

Scans your site and returns information about your SSL implementation and certificate. Helpful for debugging and validating your SSL configuration.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now