Solved

Find duplicate/case sensitive records in SQL database

Posted on 2016-07-21
Last Modified: 2016-07-21
Hi,
Our application is tied to a database on SQL Server. The application upgrade is failing because duplicate objects are detected in a table: they share the same ParentID and have the same Name, differing only in letter case. It appears the collation was changed on the Name column; the collation settings are listed below.

Example:

select * from TREE
where ParentID=4178385

Results: [image: results]
I'm looking for a way to locate all objects in the TREE table that have:
1. The same ParentID
2. The same Name, regardless of case sensitivity

Is there a SQL query that can find the rows matching the requirements above?  I'm doing my best with Google but haven't found what I'm looking for.

Collation:
SQL 2014 instance: SQL_Latin1_General_CI_AS.
ACME database: Latin1_General_CI_AS.
TREE table: Latin1_General_CI_AS.
NAME column:  SQL_Latin1_General_CP1_CS_AS
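
For anyone reproducing this, the settings listed above can be read back from the catalog. This is a minimal sketch, assuming the table lives in the `dbo` schema (the thread does not say); `sys.columns`, `SERVERPROPERTY`, and `DATABASEPROPERTYEX` are standard SQL Server features.

```sql
-- Column-level collation for every column of TREE.
-- Assumption: the table is dbo.TREE; adjust the schema name if it differs.
SELECT c.name AS column_name,
       c.collation_name
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.TREE');

-- Instance-level and current-database default collations.
SELECT SERVERPROPERTY('Collation')                AS instance_collation,
       DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS database_collation;
```

A column's own collation takes precedence over the database default in comparisons, which is why a case-sensitive Name column can hold two values that differ only in case.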

Thank you.
Question by:bmsande
10 Comments
 
LVL 65

Expert Comment

by:Jim Horn
ID: 41723459
The TREE table is defined with collation Latin1_General_CI_AS; the CI means case-insensitive, so, as you are experiencing, lowercase is treated the same as uppercase in comparisons.

You can specify the collation in a SELECT clause, without changing the collation of the entire database, to identify differences.

<total air code, not abundantly guaranteed>
SELECT ParentId, COUNT(DISTINCT Name COLLATE Latin1_General_CS_AS) as count_name
FROM TREE
GROUP BY ParentId
ORDER BY ParentId

 
LVL 1

Expert Comment

by:Petr J
ID: 41723498
You could use the following select statement:
SELECT COUNT(*) AS cnt, ParentId, UPPER(Name) AS upper_name
FROM TREE
GROUP BY ParentId, UPPER(Name)
-- T-SQL does not accept a column alias in HAVING, so the aggregate is repeated
HAVING COUNT(*) > 1


Hopefully the syntax is ok, I didn't use MSSQL for a while but the idea should be clear.
 

Author Comment

by:bmsande
ID: 41723504
Thanks.  I think that gives me a start but I'm having trouble interpreting the results:

[image: results]
When I investigate the first record, one object is returned but this does not indicate a duplicate record.
[image: results2]
Not sure if my explanation is confusing.  Is it possible to list duplicate Names, regardless of case, by distinct ParentID?  I wouldn't expect anything to return if only ONE row is returned for that ParentID, since we're looking for duplicate Names.

Hope I'm making sense.....  Thanks.
 
LVL 65

Expert Comment

by:Jim Horn
ID: 41723525
>When I investigate the first record, one object is returned but this does not indicate a duplicate record.
You can filter out the rows where the count is 1 with a HAVING clause, which works like WHERE but filters on aggregate values.

SELECT ParentId, COUNT(DISTINCT Name COLLATE Latin1_General_CS_AS) as count_name
FROM TREE
GROUP BY ParentId
HAVING COUNT(DISTINCT Name COLLATE Latin1_General_CS_AS) > 1
ORDER BY ParentId


>Is it possible to list duplicate Names, regardless of case, by distinct ParentID?
I would suspect that the code I provided does that.  If it doesn't please spell out where it doesn't, and we'll work from there.
 

Author Comment

by:bmsande
ID: 41723527
@PetrJ
Thanks.  This only shows uppercase results.  I need to identify rows with a duplicate Name (regardless of case) that share the same ParentID.

So if the ParentID=1212 and there are three rows, each with the following Name:
SpreadSheet1.xls
spreadsheet1.xls
EmployeeSchedule.xls

I would expect TWO results: SpreadSheet1.xls and spreadsheet1.xls
 
LVL 1

Expert Comment

by:Petr J
ID: 41723529
The key is the value of count_name. It indicates the number of occurrences; you need to be concerned when it's more than 1.

You can filter it using the following statement:
SELECT *
FROM (
SELECT ParentId, COUNT(DISTINCT Name COLLATE Latin1_General_CS_AS) as count_name
FROM TREE
GROUP BY ParentId
ORDER BY ParentId)
WHERE count_name > 1


I hope the syntax is OK for MSSQL.
 

Author Comment

by:bmsande
ID: 41723536
Having issues with the syntax in MSSQL.  Trying to convert.
 

Author Comment

by:bmsande
ID: 41723538
Msg 1033, Level 15, State 1, Line 6
The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified.
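
Msg 1033 is raised because the ORDER BY sits inside the derived table. A sketch of one way to adjust the query: move the ORDER BY to the outer SELECT and give the derived table an alias, which T-SQL also requires. The alias `d` is an arbitrary name chosen here for illustration.

```sql
SELECT d.ParentId, d.count_name
FROM (
    SELECT ParentId,
           COUNT(DISTINCT Name COLLATE Latin1_General_CS_AS) AS count_name
    FROM TREE
    GROUP BY ParentId
) AS d                     -- derived tables must be aliased in T-SQL
WHERE d.count_name > 1
ORDER BY d.ParentId;       -- ordering belongs on the outermost query
```

Like the HAVING version earlier in the thread, this counts distinct names per ParentId, so it flags any parent with more than one distinct Name rather than only names that differ by case.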
 
LVL 1

Accepted Solution

by:
Petr J earned 500 total points
ID: 41723560
OK, to return both SpreadSheet1.xls and spreadsheet1.xls, use the following:
SELECT *
FROM TREE t1
WHERE EXISTS (
SELECT *
FROM TREE t2
WHERE t1.ParentId = t2.ParentId
AND UPPER(t1.Name) = UPPER(t2.Name)
AND t1.Name != t2.Name
)
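
A possible variant, shown only as a sketch: instead of UPPER(), compare under a case-insensitive collation and count rows per group with a window function. The table variable, its column types, and the sample rows are hypothetical and mirror the ParentID=1212 example from the question; SQL_Latin1_General_CP1_CI_AS is assumed as the case-insensitive counterpart of the column's collation.

```sql
-- Hypothetical sample data; the real TREE table has more columns.
DECLARE @TREE TABLE (
    ParentID int,
    Name     nvarchar(255) COLLATE SQL_Latin1_General_CP1_CS_AS
);

INSERT INTO @TREE (ParentID, Name) VALUES
    (1212, N'SpreadSheet1.xls'),
    (1212, N'spreadsheet1.xls'),
    (1212, N'EmployeeSchedule.xls');

-- Count rows per (ParentID, case-insensitive Name) and keep the
-- rows whose group contains more than one member.
SELECT d.ParentID, d.Name
FROM (
    SELECT ParentID,
           Name,
           COUNT(*) OVER (
               PARTITION BY ParentID,
                            Name COLLATE SQL_Latin1_General_CP1_CI_AS
           ) AS cnt
    FROM @TREE
) AS d
WHERE d.cnt > 1;
```

With the sample rows this should return the two spreadsheet rows and omit EmployeeSchedule.xls. Unlike the EXISTS query, it would also return exact duplicates (same Name in the same case), which may or may not be wanted.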

 

Author Closing Comment

by:bmsande
ID: 41723603
Thank you!!!