Solved

Need to get rid off some duplicate records from the table under specific conditions

Posted on 2004-09-20
11
163 Views
Last Modified: 2010-05-18
I have two tables in the database:
ProductGroup with fields: GroupID, GroupName
Product with fields: ProdID, GroupID, ProdName, DistribCode, ExpiryDate

I have to write a store procedure of cursor (I am not sure what will fit the best) that will get rid of unneeded records.
It must first look for ProdName, if there are some duplicates then must check DistribCode, if can find the same code for this ProdName then must look in GroupName for word "French".If can find it , delete all occurences of the same ProductName with the same DistribCode except one that have "French" in parent table.
If can't find "French" then it must keep just a record with the latest expiry date.

I will give you some data to better describe what I am looking for:

ProductGroup table:

GroupID                           GroupName
-----------------------------------------------------------------------------------------------------------------
5000                                Cheese
5001                                Valley cheese
5002                                French cheese
5003                                Village cheese
5004                                Milk

Product table:

ProdID             GroupID              ProdName                     DistribCode                      ExpiryDate
------------------------------------------------------------------------------------------------------------------------
1000                5000                  Country cheese                  30                              01-09-2004
1001                5001                  Country cheese                  30                              01-09-2004
1002                5001                  Country cheese                  30                              01-10-2004
1003                5000                  Country cheese                100                              01-10-2004
1004                5000                  Country cheese                150                              01-11-2004
1005                5002                  Country cheese                150                              01-09-2004


After running the code I need to have just 3 rows.

ProdID             GroupID              ProdName                     DistribCode                      ExpiryDate
---------------------------------------------------------------------------------------------------------------
1002                5001                  Country cheese                  30                              01-10-2004

because there is no "French" word in GroupName, so record with the latest expiry date was kept in the table

ProdID             GroupID              ProdName                     DistribCode                      ExpiryDate
---------------------------------------------------------------------------------------------------------------
1003                5000                  Country cheese                100                              01-10-2004
because there is only one record for this ProdName and this DistribCode

ProdID             GroupID              ProdName                     DistribCode                      ExpiryDate
---------------------------------------------------------------------------------------------------------------
1005                5002                  Country cheese                150                              01-09-2004
because "French"  word from GroupName has the higher priority than expirydate.

Thx
Iwona


0
Comment
Question by:iwona62
  • 5
  • 5
11 Comments
 
LVL 8

Expert Comment

by:MartinCMS
ID: 12103868
Don't know if this will work for you or not since I don't have the data to test.  Give it a try..... This is a select Statement without dups, not deleting your dups.

select ProdID,pg.GroupID,ProdName,DistribCode,ExpiryDate
From      (select * From       ProductGroup pg join Product p on pg.GroupID = p.GroupID) t1
Where      EXISTS (SELECT 1
            FROM   (select * From       ProductGroup pg join Product p on pg.GroupID = p.GroupID) AS yt2
              WHERE  yt2.ProdName = t1.ProdName AND
                 yt2.GroupName like '%French%' AND
                     yt2.ProdID < t1.ProdID)
0
 
LVL 14

Accepted Solution

by:
Jan_Franek earned 500 total points
ID: 12104230
I would do it this way:

1. gather important data:

select
p.ProdName,
p.DistribCode,
count(1) as duplicates,
max(p.ExpiryDate) as maxdate,
case when exists ( select 1 from ProductGroup pg1, Product p1 where pg1.GroupID = p1.GroupID and p1.ProdName = p.ProdName and p1.DistribCode = p.DistribCode and pg1.GroupName like "French%" ) then 1 else 0 end as french
from Product p
group by p.ProdName, p.DistribCode

this query should show, whether for given ProdName and DistribCode there are duplicate rows, it show maximum ExpiryDate and also it shows where there exists group begining with French.

2. browse through this data using cursor and do appropriate action:

- if there is no duplicate - no action
- else if there is French - delete non French records
- else if there is no French - delete rows with lower expiry date

0
 

Author Comment

by:iwona62
ID: 12104906
Sorry,
I made mistake.
ExpiryDate column  also exist in ProductGroup table, not in Product table.
How this change will affect sql code?
0
 

Author Comment

by:iwona62
ID: 12105344
Hi Janek,

I used your query actually modifying to be able retrieve also max(expirydate).

select
p.ProdName,
p.DistribCode,
count(1) as duplicates,
max(pg.ExpiryDate) as maxdate,
case when exists ( select 1 from ProductGroup pg1, Product p1 where pg1.GroupID = p1.GroupID and p1.ProdName = p.ProdName and p1.DistribCode = p.DistribCode and pg1.GroupName like "French%" ) then 1 else 0 end as french
from Product p, ProductGroup pg
where p.GroupID= pg.GroupID
group by p.ProdName, p.DistribCode


I have additional question.
How I can handle the situation if I will have more than one record with the same max(expirydate) and/or with word "French".
Actually after running the query I have this issue, even worse because I have "French" and the same dates.
Take a look:

ProductName        DistribCode Duplicates  ExpiryDate               French
----------------------------------------------------------------------------------------
Pills                         30             2           2002-05-24 00:00:00.000     1
Pills                         30             3           2002-05-24 00:00:00.000     1
0
 
LVL 14

Expert Comment

by:Jan_Franek
ID: 12109809
First - it's strangem that you have 2 rows with the same ProductName and DistribCode in the result of that query - there should be just one (because of GROUP BY).

Second - if you really have the problem with more than one record with the same max(expirydate) or French, you have to define more specific criteria - for example - if there are two rows with French, then keep the one with max(expirydate), if there are two with the same expirydate, keep the one with maximum ProdID. But this has to be defined by you or somebody, who assigned you this task.
0
How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

 

Author Comment

by:iwona62
ID: 12113079
Yes, it make sense.

Now when I am trying to delete records from Product table (child table) I have a problem because I have to check conditions in parent table ProductGroup like expirydate.
How I can combine those two tables in delete statement, but delete only from Product.
0
 
LVL 14

Expert Comment

by:Jan_Franek
ID: 12113116
delete Product
from Product p, ProductGroup pg
where ....
0
 

Author Comment

by:iwona62
ID: 12115321
Yes, it works.
If I need to update the child table and have ProductName to be a concatanation of GroupName (from ProductGroup table) and ProductID (from Product table) how the statement will look like?
0
 
LVL 14

Expert Comment

by:Jan_Franek
ID: 12116893
update Products
set ProductName = pg.GroupName + p.ProductID -- may be you need some type conversion here
from Product p, ProductGroup pg
where ....
0
 

Author Comment

by:iwona62
ID: 12116974
Thanks Jan,

Actually I found out.

My statement is:

update Products
set Products.ProductName = Products.ProductName + ProductGroup.DistribCode
from Products
inner join ProductGroup
on ProductGroup.GroupID = Products.GroupID

I am sure that your statement will work too.

Thx for your help and quick response

Iwona
0
 
LVL 14

Expert Comment

by:Jan_Franek
ID: 12116990
Glad to help you :-)
0

Featured Post

Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

Join & Write a Comment

Suggested Solutions

Title # Comments Views Activity
Dynamics crm 2011 8 43
Temporarily disable SQL Replication 7 21
SQL Server Log File Space 6 33
SQL 2012 and SQL 2014 always on 9 24
When you hear the word proxy, you may become apprehensive. This article will help you to understand Proxy and when it is useful. Let's talk Proxy for SQL Server. (Not in terms of Internet access.) Typically, you'll run into this type of problem w…
JSON is being used more and more, besides XML, and you surely wanted to parse the data out into SQL instead of doing it in some Javascript. The below function in SQL Server can do the job for you, returning a quick table with the parsed data.
Via a live example, show how to backup a database, simulate a failure backup the tail of the database transaction log and perform the restore.
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

16 Experts available now in Live!

Get 1:1 Help Now