Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people, just like you, are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
Solved

Need to get rid off some duplicate records from the table under specific conditions

Posted on 2004-09-20
11
166 Views
Last Modified: 2010-05-18
I have two tables in the database:
ProductGroup with fields: GroupID, GroupName
Product with fields: ProdID, GroupID, ProdName, DistribCode, ExpiryDate

I have to write a store procedure of cursor (I am not sure what will fit the best) that will get rid of unneeded records.
It must first look for ProdName, if there are some duplicates then must check DistribCode, if can find the same code for this ProdName then must look in GroupName for word "French".If can find it , delete all occurences of the same ProductName with the same DistribCode except one that have "French" in parent table.
If can't find "French" then it must keep just a record with the latest expiry date.

I will give you some data to better describe what I am looking for:

ProductGroup table:

GroupID                           GroupName
-----------------------------------------------------------------------------------------------------------------
5000                                Cheese
5001                                Valley cheese
5002                                French cheese
5003                                Village cheese
5004                                Milk

Product table:

ProdID             GroupID              ProdName                     DistribCode                      ExpiryDate
------------------------------------------------------------------------------------------------------------------------
1000                5000                  Country cheese                  30                              01-09-2004
1001                5001                  Country cheese                  30                              01-09-2004
1002                5001                  Country cheese                  30                              01-10-2004
1003                5000                  Country cheese                100                              01-10-2004
1004                5000                  Country cheese                150                              01-11-2004
1005                5002                  Country cheese                150                              01-09-2004


After running the code I need to have just 3 rows.

ProdID             GroupID              ProdName                     DistribCode                      ExpiryDate
---------------------------------------------------------------------------------------------------------------
1002                5001                  Country cheese                  30                              01-10-2004

because there is no "French" word in GroupName, so record with the latest expiry date was kept in the table

ProdID             GroupID              ProdName                     DistribCode                      ExpiryDate
---------------------------------------------------------------------------------------------------------------
1003                5000                  Country cheese                100                              01-10-2004
because there is only one record for this ProdName and this DistribCode

ProdID             GroupID              ProdName                     DistribCode                      ExpiryDate
---------------------------------------------------------------------------------------------------------------
1005                5002                  Country cheese                150                              01-09-2004
because "French"  word from GroupName has the higher priority than expirydate.

Thx
Iwona


0
Comment
Question by:iwona62
  • 5
  • 5
11 Comments
 
LVL 8

Expert Comment

by:MartinCMS
ID: 12103868
Don't know if this will work for you or not since I don't have the data to test.  Give it a try..... This is a select Statement without dups, not deleting your dups.

select ProdID,pg.GroupID,ProdName,DistribCode,ExpiryDate
From      (select * From       ProductGroup pg join Product p on pg.GroupID = p.GroupID) t1
Where      EXISTS (SELECT 1
            FROM   (select * From       ProductGroup pg join Product p on pg.GroupID = p.GroupID) AS yt2
              WHERE  yt2.ProdName = t1.ProdName AND
                 yt2.GroupName like '%French%' AND
                     yt2.ProdID < t1.ProdID)
0
 
LVL 14

Accepted Solution

by:
Jan_Franek earned 500 total points
ID: 12104230
I would do it this way:

1. gather important data:

select
p.ProdName,
p.DistribCode,
count(1) as duplicates,
max(p.ExpiryDate) as maxdate,
case when exists ( select 1 from ProductGroup pg1, Product p1 where pg1.GroupID = p1.GroupID and p1.ProdName = p.ProdName and p1.DistribCode = p.DistribCode and pg1.GroupName like "French%" ) then 1 else 0 end as french
from Product p
group by p.ProdName, p.DistribCode

this query should show, whether for given ProdName and DistribCode there are duplicate rows, it show maximum ExpiryDate and also it shows where there exists group begining with French.

2. browse through this data using cursor and do appropriate action:

- if there is no duplicate - no action
- else if there is French - delete non French records
- else if there is no French - delete rows with lower expiry date

0
 

Author Comment

by:iwona62
ID: 12104906
Sorry,
I made mistake.
ExpiryDate column  also exist in ProductGroup table, not in Product table.
How this change will affect sql code?
0
Optimizing Cloud Backup for Low Bandwidth

With cloud storage prices going down a growing number of SMBs start to use it for backup storage. Unfortunately, business data volume rarely fits the average Internet speed. This article provides an overview of main Internet speed challenges and reveals backup best practices.

 

Author Comment

by:iwona62
ID: 12105344
Hi Janek,

I used your query actually modifying to be able retrieve also max(expirydate).

select
p.ProdName,
p.DistribCode,
count(1) as duplicates,
max(pg.ExpiryDate) as maxdate,
case when exists ( select 1 from ProductGroup pg1, Product p1 where pg1.GroupID = p1.GroupID and p1.ProdName = p.ProdName and p1.DistribCode = p.DistribCode and pg1.GroupName like "French%" ) then 1 else 0 end as french
from Product p, ProductGroup pg
where p.GroupID= pg.GroupID
group by p.ProdName, p.DistribCode


I have additional question.
How I can handle the situation if I will have more than one record with the same max(expirydate) and/or with word "French".
Actually after running the query I have this issue, even worse because I have "French" and the same dates.
Take a look:

ProductName        DistribCode Duplicates  ExpiryDate               French
----------------------------------------------------------------------------------------
Pills                         30             2           2002-05-24 00:00:00.000     1
Pills                         30             3           2002-05-24 00:00:00.000     1
0
 
LVL 14

Expert Comment

by:Jan_Franek
ID: 12109809
First - it's strangem that you have 2 rows with the same ProductName and DistribCode in the result of that query - there should be just one (because of GROUP BY).

Second - if you really have the problem with more than one record with the same max(expirydate) or French, you have to define more specific criteria - for example - if there are two rows with French, then keep the one with max(expirydate), if there are two with the same expirydate, keep the one with maximum ProdID. But this has to be defined by you or somebody, who assigned you this task.
0
 

Author Comment

by:iwona62
ID: 12113079
Yes, it make sense.

Now when I am trying to delete records from Product table (child table) I have a problem because I have to check conditions in parent table ProductGroup like expirydate.
How I can combine those two tables in delete statement, but delete only from Product.
0
 
LVL 14

Expert Comment

by:Jan_Franek
ID: 12113116
delete Product
from Product p, ProductGroup pg
where ....
0
 

Author Comment

by:iwona62
ID: 12115321
Yes, it works.
If I need to update the child table and have ProductName to be a concatanation of GroupName (from ProductGroup table) and ProductID (from Product table) how the statement will look like?
0
 
LVL 14

Expert Comment

by:Jan_Franek
ID: 12116893
update Products
set ProductName = pg.GroupName + p.ProductID -- may be you need some type conversion here
from Product p, ProductGroup pg
where ....
0
 

Author Comment

by:iwona62
ID: 12116974
Thanks Jan,

Actually I found out.

My statement is:

update Products
set Products.ProductName = Products.ProductName + ProductGroup.DistribCode
from Products
inner join ProductGroup
on ProductGroup.GroupID = Products.GroupID

I am sure that your statement will work too.

Thx for your help and quick response

Iwona
0
 
LVL 14

Expert Comment

by:Jan_Franek
ID: 12116990
Glad to help you :-)
0

Featured Post

The Eight Noble Truths of Backup and Recovery

How can IT departments tackle the challenges of a Big Data world? This white paper provides a roadmap to success and helps companies ensure that all their data is safe and secure, no matter if it resides on-premise with physical or virtual machines or in the cloud.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

When you hear the word proxy, you may become apprehensive. This article will help you to understand Proxy and when it is useful. Let's talk Proxy for SQL Server. (Not in terms of Internet access.) Typically, you'll run into this type of problem w…
In this article we will learn how to fix  “Cannot install SQL Server 2014 Service Pack 2: Unable to install windows installer msi file” error ?
Via a live example, show how to set up a backup for SQL Server using a Maintenance Plan and how to schedule the job into SQL Server Agent.
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.

808 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question