Solved

Need to get rid off some duplicate records from the table under specific conditions

Posted on 2004-09-20
11
164 Views
Last Modified: 2010-05-18
I have two tables in the database:
ProductGroup with fields: GroupID, GroupName
Product with fields: ProdID, GroupID, ProdName, DistribCode, ExpiryDate

I have to write a store procedure of cursor (I am not sure what will fit the best) that will get rid of unneeded records.
It must first look for ProdName, if there are some duplicates then must check DistribCode, if can find the same code for this ProdName then must look in GroupName for word "French".If can find it , delete all occurences of the same ProductName with the same DistribCode except one that have "French" in parent table.
If can't find "French" then it must keep just a record with the latest expiry date.

I will give you some data to better describe what I am looking for:

ProductGroup table:

GroupID                           GroupName
-----------------------------------------------------------------------------------------------------------------
5000                                Cheese
5001                                Valley cheese
5002                                French cheese
5003                                Village cheese
5004                                Milk

Product table:

ProdID             GroupID              ProdName                     DistribCode                      ExpiryDate
------------------------------------------------------------------------------------------------------------------------
1000                5000                  Country cheese                  30                              01-09-2004
1001                5001                  Country cheese                  30                              01-09-2004
1002                5001                  Country cheese                  30                              01-10-2004
1003                5000                  Country cheese                100                              01-10-2004
1004                5000                  Country cheese                150                              01-11-2004
1005                5002                  Country cheese                150                              01-09-2004


After running the code I need to have just 3 rows.

ProdID             GroupID              ProdName                     DistribCode                      ExpiryDate
---------------------------------------------------------------------------------------------------------------
1002                5001                  Country cheese                  30                              01-10-2004

because there is no "French" word in GroupName, so record with the latest expiry date was kept in the table

ProdID             GroupID              ProdName                     DistribCode                      ExpiryDate
---------------------------------------------------------------------------------------------------------------
1003                5000                  Country cheese                100                              01-10-2004
because there is only one record for this ProdName and this DistribCode

ProdID             GroupID              ProdName                     DistribCode                      ExpiryDate
---------------------------------------------------------------------------------------------------------------
1005                5002                  Country cheese                150                              01-09-2004
because "French"  word from GroupName has the higher priority than expirydate.

Thx
Iwona


0
Comment
Question by:iwona62
  • 5
  • 5
11 Comments
 
LVL 8

Expert Comment

by:MartinCMS
ID: 12103868
Don't know if this will work for you or not since I don't have the data to test.  Give it a try..... This is a select Statement without dups, not deleting your dups.

select ProdID,pg.GroupID,ProdName,DistribCode,ExpiryDate
From      (select * From       ProductGroup pg join Product p on pg.GroupID = p.GroupID) t1
Where      EXISTS (SELECT 1
            FROM   (select * From       ProductGroup pg join Product p on pg.GroupID = p.GroupID) AS yt2
              WHERE  yt2.ProdName = t1.ProdName AND
                 yt2.GroupName like '%French%' AND
                     yt2.ProdID < t1.ProdID)
0
 
LVL 14

Accepted Solution

by:
Jan_Franek earned 500 total points
ID: 12104230
I would do it this way:

1. gather important data:

select
p.ProdName,
p.DistribCode,
count(1) as duplicates,
max(p.ExpiryDate) as maxdate,
case when exists ( select 1 from ProductGroup pg1, Product p1 where pg1.GroupID = p1.GroupID and p1.ProdName = p.ProdName and p1.DistribCode = p.DistribCode and pg1.GroupName like "French%" ) then 1 else 0 end as french
from Product p
group by p.ProdName, p.DistribCode

this query should show, whether for given ProdName and DistribCode there are duplicate rows, it show maximum ExpiryDate and also it shows where there exists group begining with French.

2. browse through this data using cursor and do appropriate action:

- if there is no duplicate - no action
- else if there is French - delete non French records
- else if there is no French - delete rows with lower expiry date

0
 

Author Comment

by:iwona62
ID: 12104906
Sorry,
I made mistake.
ExpiryDate column  also exist in ProductGroup table, not in Product table.
How this change will affect sql code?
0
 

Author Comment

by:iwona62
ID: 12105344
Hi Janek,

I used your query actually modifying to be able retrieve also max(expirydate).

select
p.ProdName,
p.DistribCode,
count(1) as duplicates,
max(pg.ExpiryDate) as maxdate,
case when exists ( select 1 from ProductGroup pg1, Product p1 where pg1.GroupID = p1.GroupID and p1.ProdName = p.ProdName and p1.DistribCode = p.DistribCode and pg1.GroupName like "French%" ) then 1 else 0 end as french
from Product p, ProductGroup pg
where p.GroupID= pg.GroupID
group by p.ProdName, p.DistribCode


I have additional question.
How I can handle the situation if I will have more than one record with the same max(expirydate) and/or with word "French".
Actually after running the query I have this issue, even worse because I have "French" and the same dates.
Take a look:

ProductName        DistribCode Duplicates  ExpiryDate               French
----------------------------------------------------------------------------------------
Pills                         30             2           2002-05-24 00:00:00.000     1
Pills                         30             3           2002-05-24 00:00:00.000     1
0
 
LVL 14

Expert Comment

by:Jan_Franek
ID: 12109809
First - it's strangem that you have 2 rows with the same ProductName and DistribCode in the result of that query - there should be just one (because of GROUP BY).

Second - if you really have the problem with more than one record with the same max(expirydate) or French, you have to define more specific criteria - for example - if there are two rows with French, then keep the one with max(expirydate), if there are two with the same expirydate, keep the one with maximum ProdID. But this has to be defined by you or somebody, who assigned you this task.
0
Control application downtime with dependency maps

Visualize the interdependencies between application components better with Applications Manager's automated application discovery and dependency mapping feature. Resolve performance issues faster by quickly isolating problematic components.

 

Author Comment

by:iwona62
ID: 12113079
Yes, it make sense.

Now when I am trying to delete records from Product table (child table) I have a problem because I have to check conditions in parent table ProductGroup like expirydate.
How I can combine those two tables in delete statement, but delete only from Product.
0
 
LVL 14

Expert Comment

by:Jan_Franek
ID: 12113116
delete Product
from Product p, ProductGroup pg
where ....
0
 

Author Comment

by:iwona62
ID: 12115321
Yes, it works.
If I need to update the child table and have ProductName to be a concatanation of GroupName (from ProductGroup table) and ProductID (from Product table) how the statement will look like?
0
 
LVL 14

Expert Comment

by:Jan_Franek
ID: 12116893
update Products
set ProductName = pg.GroupName + p.ProductID -- may be you need some type conversion here
from Product p, ProductGroup pg
where ....
0
 

Author Comment

by:iwona62
ID: 12116974
Thanks Jan,

Actually I found out.

My statement is:

update Products
set Products.ProductName = Products.ProductName + ProductGroup.DistribCode
from Products
inner join ProductGroup
on ProductGroup.GroupID = Products.GroupID

I am sure that your statement will work too.

Thx for your help and quick response

Iwona
0
 
LVL 14

Expert Comment

by:Jan_Franek
ID: 12116990
Glad to help you :-)
0

Featured Post

Enterprise Mobility and BYOD For Dummies

Like “For Dummies” books, you can read this in whatever order you choose and learn about mobility and BYOD; and how to put a competitive mobile infrastructure in place. Developed for SMBs and large enterprises alike, you will find helpful use cases, planning, and implementation.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
SQL Query 17 30
MS SQLK Server multi-part identifier cannot be bound 5 25
sql server query from excel 3 57
Need a starter for ETL protocol? 4 32
Having an SQL database can be a big investment for a small company. Hardware, setup and of course, the price of software all add up to a big bill that some companies may not be able to absorb.  Luckily, there is a free version SQL Express, but does …
Why is this different from all of the other step by step guides?  Because I make a living as a DBA and not as a writer and I lived through this experience. Defining the name: When I talk to people they say different names on this subject stuff l…
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
Using examples as well as descriptions, and references to Books Online, show the different Recovery Models available in SQL Server and explain, as well as show how full, differential and transaction log backups are performed

932 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now