Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 178
  • Last Modified:

Need to get rid off some duplicate records from the table under specific conditions

I have two tables in the database:
ProductGroup with fields: GroupID, GroupName
Product with fields: ProdID, GroupID, ProdName, DistribCode, ExpiryDate

I have to write a store procedure of cursor (I am not sure what will fit the best) that will get rid of unneeded records.
It must first look for ProdName, if there are some duplicates then must check DistribCode, if can find the same code for this ProdName then must look in GroupName for word "French".If can find it , delete all occurences of the same ProductName with the same DistribCode except one that have "French" in parent table.
If can't find "French" then it must keep just a record with the latest expiry date.

I will give you some data to better describe what I am looking for:

ProductGroup table:

GroupID                           GroupName
-----------------------------------------------------------------------------------------------------------------
5000                                Cheese
5001                                Valley cheese
5002                                French cheese
5003                                Village cheese
5004                                Milk

Product table:

ProdID             GroupID              ProdName                     DistribCode                      ExpiryDate
------------------------------------------------------------------------------------------------------------------------
1000                5000                  Country cheese                  30                              01-09-2004
1001                5001                  Country cheese                  30                              01-09-2004
1002                5001                  Country cheese                  30                              01-10-2004
1003                5000                  Country cheese                100                              01-10-2004
1004                5000                  Country cheese                150                              01-11-2004
1005                5002                  Country cheese                150                              01-09-2004


After running the code I need to have just 3 rows.

ProdID             GroupID              ProdName                     DistribCode                      ExpiryDate
---------------------------------------------------------------------------------------------------------------
1002                5001                  Country cheese                  30                              01-10-2004

because there is no "French" word in GroupName, so record with the latest expiry date was kept in the table

ProdID             GroupID              ProdName                     DistribCode                      ExpiryDate
---------------------------------------------------------------------------------------------------------------
1003                5000                  Country cheese                100                              01-10-2004
because there is only one record for this ProdName and this DistribCode

ProdID             GroupID              ProdName                     DistribCode                      ExpiryDate
---------------------------------------------------------------------------------------------------------------
1005                5002                  Country cheese                150                              01-09-2004
because "French"  word from GroupName has the higher priority than expirydate.

Thx
Iwona


0
iwona62
Asked:
iwona62
  • 5
  • 5
1 Solution
 
MartinCMSCommented:
Don't know if this will work for you or not since I don't have the data to test.  Give it a try..... This is a select Statement without dups, not deleting your dups.

select ProdID,pg.GroupID,ProdName,DistribCode,ExpiryDate
From      (select * From       ProductGroup pg join Product p on pg.GroupID = p.GroupID) t1
Where      EXISTS (SELECT 1
            FROM   (select * From       ProductGroup pg join Product p on pg.GroupID = p.GroupID) AS yt2
              WHERE  yt2.ProdName = t1.ProdName AND
                 yt2.GroupName like '%French%' AND
                     yt2.ProdID < t1.ProdID)
0
 
Jan FranekCommented:
I would do it this way:

1. gather important data:

select
p.ProdName,
p.DistribCode,
count(1) as duplicates,
max(p.ExpiryDate) as maxdate,
case when exists ( select 1 from ProductGroup pg1, Product p1 where pg1.GroupID = p1.GroupID and p1.ProdName = p.ProdName and p1.DistribCode = p.DistribCode and pg1.GroupName like "French%" ) then 1 else 0 end as french
from Product p
group by p.ProdName, p.DistribCode

this query should show, whether for given ProdName and DistribCode there are duplicate rows, it show maximum ExpiryDate and also it shows where there exists group begining with French.

2. browse through this data using cursor and do appropriate action:

- if there is no duplicate - no action
- else if there is French - delete non French records
- else if there is no French - delete rows with lower expiry date

0
 
iwona62Author Commented:
Sorry,
I made mistake.
ExpiryDate column  also exist in ProductGroup table, not in Product table.
How this change will affect sql code?
0
VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

 
iwona62Author Commented:
Hi Janek,

I used your query actually modifying to be able retrieve also max(expirydate).

select
p.ProdName,
p.DistribCode,
count(1) as duplicates,
max(pg.ExpiryDate) as maxdate,
case when exists ( select 1 from ProductGroup pg1, Product p1 where pg1.GroupID = p1.GroupID and p1.ProdName = p.ProdName and p1.DistribCode = p.DistribCode and pg1.GroupName like "French%" ) then 1 else 0 end as french
from Product p, ProductGroup pg
where p.GroupID= pg.GroupID
group by p.ProdName, p.DistribCode


I have additional question.
How I can handle the situation if I will have more than one record with the same max(expirydate) and/or with word "French".
Actually after running the query I have this issue, even worse because I have "French" and the same dates.
Take a look:

ProductName        DistribCode Duplicates  ExpiryDate               French
----------------------------------------------------------------------------------------
Pills                         30             2           2002-05-24 00:00:00.000     1
Pills                         30             3           2002-05-24 00:00:00.000     1
0
 
Jan FranekCommented:
First - it's strangem that you have 2 rows with the same ProductName and DistribCode in the result of that query - there should be just one (because of GROUP BY).

Second - if you really have the problem with more than one record with the same max(expirydate) or French, you have to define more specific criteria - for example - if there are two rows with French, then keep the one with max(expirydate), if there are two with the same expirydate, keep the one with maximum ProdID. But this has to be defined by you or somebody, who assigned you this task.
0
 
iwona62Author Commented:
Yes, it make sense.

Now when I am trying to delete records from Product table (child table) I have a problem because I have to check conditions in parent table ProductGroup like expirydate.
How I can combine those two tables in delete statement, but delete only from Product.
0
 
Jan FranekCommented:
delete Product
from Product p, ProductGroup pg
where ....
0
 
iwona62Author Commented:
Yes, it works.
If I need to update the child table and have ProductName to be a concatanation of GroupName (from ProductGroup table) and ProductID (from Product table) how the statement will look like?
0
 
Jan FranekCommented:
update Products
set ProductName = pg.GroupName + p.ProductID -- may be you need some type conversion here
from Product p, ProductGroup pg
where ....
0
 
iwona62Author Commented:
Thanks Jan,

Actually I found out.

My statement is:

update Products
set Products.ProductName = Products.ProductName + ProductGroup.DistribCode
from Products
inner join ProductGroup
on ProductGroup.GroupID = Products.GroupID

I am sure that your statement will work too.

Thx for your help and quick response

Iwona
0
 
Jan FranekCommented:
Glad to help you :-)
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

  • 5
  • 5
Tackle projects and never again get stuck behind a technical roadblock.
Join Now