Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

How to remove duplicate records in SQL2008r2 where there is a column with identical values except one is UPPER case and the other is LOWER case ?

Posted on 2015-01-13
8
Medium Priority
?
88 Views
Last Modified: 2015-01-14
I have inherited tables with duplicate records and I'm able to remove most except in one table I have a column that have records with identical values except one is UPPER case and the other is LOWER case.  I verify with the client that the two identical values are correct, one UPPER and the other LOWER.  How do I remove the duplicates?

I have included code to create a test table, load the table and the query that will search for the duplicates.
GRP-ID.sql
0
Comment
Question by:wdbates
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 2
  • 2
8 Comments
 
LVL 70

Expert Comment

by:Scott Pletcher
ID: 40547649
Instead of a SELECT, make the final statement a DELETE:

DELETE
  FROM CTE
WHERE DuplicateCount > 1
0
 
LVL 24

Expert Comment

by:Phillip Burton
ID: 40547654
You can use case insensitivity using the COLLATE function - see http://msdn.microsoft.com/en-us/library/ms184391.aspx for more details.
0
 

Author Comment

by:wdbates
ID: 40547671
Scott, you missed the point, yes the query will be my delete query, but create and load the table and execute the CTE starting at line 57 through 81 and see the results.

Hello Phillip, glad to hear from you again.  I will give your suggestion alook and get back to the group.
0
Windows Server 2016: All you need to know

Learn about Hyper-V features that increase functionality and usability of Microsoft Windows Server 2016. Also, throughout this eBook, you’ll find some basic PowerShell examples that will help you leverage the scripts in your environments!

 
LVL 70

Expert Comment

by:Scott Pletcher
ID: 40547728
Ok; good luck.
0
 

Author Comment

by:wdbates
ID: 40547736
Hello Phillip, I must be missing something and below is what I have tried;

SELECT SERVERPROPERTY ('Collation')  --To determine collation
GO

SQL_Latin1_General_CP1_CI_AS  

--Test Query

SELECT *
       FROM [dbo].[UPPERlower]
       WHERE [GRP_ID] LIKE '%954.Atx%' Collate SQL_Latin1_General_CP1_CI_AS

GRP_ID      GRP_DESC      ORD_ID      INTERFACE_DATE
954.Atx      NULL             OUTSIDE      NULL
954.Atx      NULL             OUTSIDE      NULL
954.Atx      NULL             OUTSIDE      NULL
954.ATX      NULL            OUTSIDE      NULL
954.ATX      NULL            OUTSIDE      NULL
954.ATX      NULL            OUTSIDE      NULL

I thought that the returning results would only be the records where GRP_ID = '954.Atx'
0
 

Author Comment

by:wdbates
ID: 40547748
I may have it.  Going to a meeting will try later tonight or early tomorrow.

  SELECT *
       FROM [dbo].[UPPERlower]
       WHERE [GRP_ID] Collate Latin1_General_CS_AS = '954.Atx'

GRP_ID      GRP_DESC      ORD_ID      INTERFACE_DATE
954.Atx      NULL      OUTSIDE      NULL
954.Atx      NULL      OUTSIDE      NULL
954.Atx      NULL      OUTSIDE      NULL
0
 
LVL 24

Accepted Solution

by:
Phillip Burton earned 2000 total points
ID: 40548450
What about

SELECT DISTINCT *
       FROM [dbo].[UPPERlower]
       WHERE [GRP_ID] LIKE '%954.Atx%' Collate SQL_Latin1_General_CP1_CI_AS

Open in new window


or

SELECT GRP_ID, GRP_DESC, ORD_ID, INTERFACE_DATE
       FROM [dbo].[UPPERlower]
       GROUP BY [GRP_ID] Collate SQL_Latin1_General_CP1_CI_AS,
       , GRP_DESC, ORD_ID, INTERFACE_DATE

Open in new window

0
 

Author Closing Comment

by:wdbates
ID: 40549024
Hello Phillip;

That was the fix.  Code and results below.

Thank you very much.

WITH CTE
      (
       GRP_ID
      ,GRP_DESC
      ,ORD_ID
      ,DuplicateCount
      )
AS
   (
      SELECT CASE SUBSTRING([GRP_ID], 1, 6)
                    WHEN 't' THEN '954-Atx' Collate Latin1_General_CS_AS
                    ELSE GRP_ID
               END
            ,GRP_DESC
            ,ORD_ID
              ,ROW_NUMBER()
                  OVER(PARTITION BY CASE SUBSTRING([GRP_ID], 1, 6)
                                                  WHEN 't' THEN '954-Atx' Collate Latin1_General_CS_AS
                                                  ELSE GRP_ID
                                            END+ISNULL(GRP_DESC, '')+ISNULL(ORD_ID, '')
                          ORDER BY CASE SUBSTRING([GRP_ID], 1, 6)
                                                  WHEN 't' THEN '954-Atx' Collate Latin1_General_CS_AS
                                                  ELSE GRP_ID
                                            END+ISNULL(GRP_DESC, '')+ISNULL(ORD_ID, '')) AS DuplicateCount
       FROM [dbo].[UPPERlower] --ORDER BY GRP_ID Collate Latin1_General_CS_AS, DuplicateCount DESC
   )
SELECT *
  FROM CTE
 WHERE DuplicateCount > 1
  ORDER BY GRP_ID
GO

GRP_ID      GRP_DESC      ORD_ID      DuplicateCount
954.Atx      NULL      OUTSIDE      2
954.Atx      NULL      OUTSIDE      3
954.ATX      NULL      OUTSIDE      2
954.ATX      NULL      OUTSIDE      3
0

Featured Post

The Eight Noble Truths of Backup and Recovery

How can IT departments tackle the challenges of a Big Data world? This white paper provides a roadmap to success and helps companies ensure that all their data is safe and secure, no matter if it resides on-premise with physical or virtual machines or in the cloud.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

I have written a PowerShell script to "walk" the security structure of each SQL instance to find:         Each Login (Windows or SQL)             * Its Server Roles             * Every database to which the login is mapped             * The associated "Database User" for this …
In this article I will describe the Backup & Restore method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
In this brief tutorial Pawel from AdRem Software explains how you can quickly find out which services are running on your network, or what are the IP addresses of servers responsible for each service. Software used is freeware NetCrunch Tools (https…
Sometimes it takes a new vantage point, apart from our everyday security practices, to truly see our Active Directory (AD) vulnerabilities. We get used to implementing the same techniques and checking the same areas for a breach. This pattern can re…

722 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question