?
Solved

sql duplicate rows from concatenated columns

Posted on 2014-11-17
9
Medium Priority
?
332 Views
Last Modified: 2014-12-04
I have a table with 2 columns, code and country. Code is 9 characters. I would like to find all the rows in the table where the first 8 char plus the country are duplicates.
e.g.
123456781  AG
123456782  AG
123456781  BB
987654321  JM

would have concatentated values
12345678AG
12345678AG
12345678BB
98765432JM


which has the duplicate value 12345678AG
so I want to get only the records where the concatenated values are duplicated.
i.e.

123456781  AG
123456782  AG

I have used concatenation only to describe the problem.
I am fine if the solution can select these same record without concatenation.
0
Comment
Question by:DaveChoiceTech
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 2
  • 2
  • +1
9 Comments
 
LVL 66

Expert Comment

by:Jim Horn
ID: 40447835
Give this a whirl..
SELECT LEFT(code, 8), country, COUNT(Country) as the_count
FROM YourTable
GROUP BY LEFT(code, 8), country
HAVING COUNT(Country) > 1

Open in new window

If you'd like some more reading on GROUP BY I have an image and code-heavy tutorial out there called SQL Server GROUP BY Solutions
0
 
LVL 42

Expert Comment

by:pcelba
ID: 40447910
More accurate solution could be this one:
SELECT * FROM YourTable
 WHERE LEFT(code, 8)+country IN (
      SELECT LEFT(code, 8)+country
        FROM YourTable
       GROUP BY LEFT(code, 8)+country
      HAVING COUNT(*) > 1)

Open in new window


MS SQL Server is OK with it. DB2? Who knows?
0
 
LVL 32

Expert Comment

by:awking00
ID: 40450026
I think DB2 does not use the left function but you should be able to use the substr function in its stead.
select substr(code,1,8), country, count(*)
from yourtable
group by substr(code,1,8), country
having count(*) > 1
0
Moving data to the cloud? Find out if you’re ready

Before moving to the cloud, it is important to carefully define your db needs, plan for the migration & understand prod. environment. This wp explains how to define what you need from a cloud provider, plan for the migration & what putting a cloud solution into practice entails.

 
LVL 32

Expert Comment

by:awking00
ID: 40450041
I missed where youi wanted to retrieve the complete code column
select y.code, y.country
from yourtable y,
(select substr(code,1,8), country, count(*)
 from yourtable
 group by substr(code,1,8), country
 having count(*) > 1) x
where x.country = y.country
and x.code = substr(y.code,1,8);
0
 

Author Comment

by:DaveChoiceTech
ID: 40450154
Perhaps I wasn't clear. The issue is to find the rows with duplicates of the combined columns LEFT(code,8) and country.
i.e. given
123456781  AG
123456782  AG
123456781  BB
987654321  JM
987654321  BB
987654322  BB

now adding a column for the first 8 digits of the code gives.

12345678  123456781  AG
12345678  123456782  AG
12345678  123456781  BB
98765432  987654321  JM
98765432  987654321  BB
98765432  987654322  BB

Wherever the combined value of the 8 digit code plus the country are duplicated the row  from the original table should be provided.

So what I want is

123456781 AG
123456782 AG
987654321 BB
987654322 BB

Note that rows 123456781  BB, 987654321  BB and 987654321  JM are omitted since the first 8 digits of the code combined with the country code are unique.
0
 

Author Comment

by:DaveChoiceTech
ID: 40450182
Oops that final note should have been

Note that rows 123456781  BB and 987654321  JM are omitted since the first 8 digits of the code combined with the country code are unique.
0
 
LVL 32

Accepted Solution

by:
awking00 earned 2000 total points
ID: 40450201
I left out the alias, should be -
select y.code, y.country
from yourtable y,
(select substr(code,1,8) as code, country, count(*)
 from yourtable
 group by substr(code,1,8), country
 having count(*) > 1) x
where x.country = y.country
and x.code = substr(y.code,1,8)
order by 1, 2;
0
 
LVL 42

Expert Comment

by:pcelba
ID: 40450279
What query provided this result:
12345678  123456781  AG
12345678  123456782  AG
12345678  123456781  BB
98765432  987654321  JM
98765432  987654321  BB
98765432  987654322  BB
Both my and awking00's code should work without problems on MS SQL.
0
 
LVL 32

Expert Comment

by:awking00
ID: 40450296
Actually, my code won't work on MS SQL as is, but could be easily modified to do so by using the substring function (i.e. not substr which is DB2) and adding the AS keyword for the alias of the table and subquery.
0

Featured Post

Does Your Cloud Backup Use Blockchain Technology?

Blockchain technology has already revolutionized finance thanks to Bitcoin. Now it's disrupting other areas, including the realm of data protection. Learn how blockchain is now being used to authenticate backup files and keep them safe from hackers.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Lotus Notes has been used since a very long time as an e-mail client and is very popular because of it's unmatched security. In this article we are going to learn about  RRV Bucket corruption and understand various methods to Fix "RRV Bucket Corrupt…
In today's business world, data is more important than ever for informing marketing campaigns. Accessing and using data, however, may not come naturally to some creative marketing professionals. Here are four tips for adapting to wield data for insi…
Via a live example, show how to setup several different housekeeping processes for a SQL Server.
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.

765 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question