Avatar of hazemfadl
hazemfadlFlag for Egypt

asked on 

Removing duplicates

Hello,

I have 2 tables:

District:
DistrictID      DistrictName
1                  Cairo
23                Cairo
44                Cairo
60                Alex
62                Alex
70                Aswan
74                Aswan
78                Aswan

Map:
MapID    DistrictID  MapName
1            1               ABC
2            44             XYZ
3            23             DEF
4            60             GHI
5            62             JKL
6            70             MNO
7            74             PQR
8            78             STU

As described above you will note that the MapID (1,2,3) has different ditrictID (and 4,5 and 6,7,8 have the same problem) though they are the same city, I need to do the following:

1- Update the "Maps" table to eliminate the duplicate districtid based on the duplication of the district table (I mean to make all the same district refer to a single DistrictID) -- the maps table have to be as follows:

MapID    DistrictID  MapName
1            1               ABC
2            1             XYZ
3            1             DEF
4            60             GHI
5            60             JKL
6            70             MNO
7            70             PQR
8            70             STU

2- Delete all the duplicates of the District table, districts table have to be as follows:
DistrictID      DistrictName
1                  Cairo
60                Alex
70                Aswan


Any ideas.

Thanks ...
Microsoft SQL Server 2008

Avatar of undefined
Last Comment
hazemfadl
ASKER CERTIFIED SOLUTION
Avatar of twol
twol
Flag of United States of America image

Blurred text
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
Avatar of Kevin Cross
Kevin Cross
Flag of United States of America image

It looks like this is already answered.  Definitely above is simple way to go.  I was wanting to play with analytical function way versus the multiple sub queries, so here is my approach just for reference.  Please award twol as that should work perfectly fine for you.

-- ============
-- setup of test environment
-- ============
CREATE TABLE #District(DistrictID INT PRIMARY KEY, 
                       DistrictName VARCHAR(50)
);

INSERT INTO #District(DistrictID, DistrictName)
SELECT 1, 'Cairo'
UNION SELECT 23, 'Cairo'
UNION SELECT 44, 'Cairo'
UNION SELECT 60, 'Alex'
UNION SELECT 62, 'Alex'
UNION SELECT 70, 'Aswan'
UNION SELECT 74, 'Aswan'
UNION SELECT 78, 'Aswan'
;

CREATE TABLE #Map(MapID INT PRIMARY KEY, 
                  DistrictID INT FOREIGN KEY REFERENCES #District(DistrictID), 
				  MapName VARCHAR(3)
);

INSERT INTO #Map
SELECT 1, 1, 'ABC'
UNION SELECT 2, 44, 'XYZ'
UNION SELECT 3, 23, 'DEF'
UNION SELECT 4, 60, 'GHI'
UNION SELECT 5, 62, 'JKL'
UNION SELECT 6, 70, 'MNO'
UNION SELECT 7, 74, 'PQR'
UNION SELECT 8, 78, 'STU'
;
-- ============

-- update Map table first to min DistrictID for a given DistrictName
UPDATE tto
SET tto.DistrictID = tfr.MIN_DistrictID
FROM #Map tto
JOIN (
   SELECT DistrictID
        , MIN(DistrictID) OVER(PARTITION BY DistrictName) MIN_DistrictID
   FROM #District
) tfr ON tfr.DistrictID = tto.DistrictID
-- verify updat of maps
SELECT * FROM #Map;

-- remove duplicates from districts based on min DistrictID for given DistrictName
;WITH districts AS (
   SELECT DistrictID
        , MIN(DistrictID) OVER(PARTITION BY DistrictName) MIN_DistrictID
   FROM #District
)
DELETE 
FROM districts 
WHERE DistrictID <> MIN_DistrictID
;
-- verify update of district list
SELECT * FROM #District;

-- ============
-- cleanup test environment
-- ============
DROP TABLE #Map;

DROP TABLE #District;
-- ============

Open in new window

Avatar of hazemfadl
hazemfadl
Flag of Egypt image

ASKER

Thank you ...
Microsoft SQL Server 2008
Microsoft SQL Server 2008

Microsoft SQL Server 2008 is a suite of relational database management system (RDBMS) products providing multi-user database access functionality.Component services include integration (SSIS), reporting (SSRS), analysis (SSAS), data quality, master data, T-SQL and performance tuning. Major improvements include the Always On technologies and support for unstructured data types.

50K
Questions
--
Followers
--
Top Experts
Get a personalized solution from industry experts
Ask the experts
Read over 600 more reviews

TRUSTED BY

IBM logoIntel logoMicrosoft logoUbisoft logoSAP logo
Qualcomm logoCitrix Systems logoWorkday logoErnst & Young logo
High performer badgeUsers love us badge
LinkedIn logoFacebook logoX logoInstagram logoTikTok logoYouTube logo