Solved

Query to look for duplicates entries in a rows

Posted on 2014-10-22
15
128 Views
Last Modified: 2014-10-27
Hello

I have a mysql database, in which one of the columns is called "Aliases".

I need a query that I can run in phpmyadmin which will search for any exact duplicates of those aliases and present them to me so I can edit them.

For example if there are two aliases named  mickey-mouse    then that could be considered duplicates and I want to just see those duplicates in the results of the query.  if there was also a mickey-mouse-club then that would not be an exact match and I don't want to see that included in the mickey-mouse match results.

Editing them is no problem. I just want to easy find those duplicates so I can make the appropriate fix by hand.

I'm using phpmyadmin Version information: 4.2.10,

I have a backup of my database.  all aliases are lower case and have no spaces -- just-dashes-between-words.

I have about 1000 records and just a few of them have duplicate aliases -- so a query would help me take care of this task.

I've used phpmyadmin for years, but am not a database pro like you people are.

The search is all within one table, which we can call bananas_content  ....

Thanks

Rowby
duplicate-alias.jpg
0
Comment
Question by:Rowby Goren
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 7
  • 6
  • 2
15 Comments
 
LVL 58

Expert Comment

by:Gary
ID: 40398237
SELECT * FROM tablename
INNER JOIN (SELECT * FROM tablename
GROUP BY column_with_duplicate_values_name HAVING COUNT(id) > 1) dupes
ON dupes.id= tablename.id

Open in new window

0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 40398244
Thanks Gary

Ran it but it apparently is looking for the database name too

MySQL said: Documentation

#1046 - No database selected

So I guess I need to include the database name in the query???
0
 
LVL 58

Expert Comment

by:Gary
ID: 40398249
In PHPMyAdmin select your db on the left then run the sql
0
Ransomware-A Revenue Bonanza for Service Providers

Ransomware – malware that gets on your customers’ computers, encrypts their data, and extorts a hefty ransom for the decryption keys – is a surging new threat.  The purpose of this eBook is to educate the reader about ransomware attacks.

 
LVL 9

Author Comment

by:Rowby Goren
ID: 40398255
Still getting an error

Just want to make sure I am replacing the placeholders in your example with the correct item:

SELECT * FROM com_content
INNER JOIN (SELECT * FROM com_content
GROUP BY column_with_duplicate_values_name HAVING COUNT(id) > 1) dupes
ON dupes.id= tablename.id

Note that I am using com_content in both places  I think I have to use "alias" in one of the lines....
0
 
LVL 58

Expert Comment

by:Gary
ID: 40398261
column_with_duplicate_values_name

Should be the name of the column containing the duplicate values.
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 40398272
Getting closer

See screen captureerror
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 250 total points
ID: 40398273
don't use "select *" in the group by subquery, e.g.

SELECT * FROM com_content
INNER JOIN (
                        SELECT
                                                column(s)_with_duplicate_values_name(s)
                                              , MIN(id)  as ID
                        FROM com_content
                        GROUP BY
                                                column(s)_with_duplicate_values_name(s)
                        HAVING COUNT(id) > 1
                    ) dupes
ON dupes.id= tablename.id

Use MIN(id) or MAX(id) but you need one of these, and group by the column or columns that you consider form a unique record, don't include any more columns than these in the select clause unless they are within an aggregate function such as min() or max()
0
 
LVL 58

Expert Comment

by:Gary
ID: 40398274
tablename should be com_content

And
Good point by Paul...
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 40398275
Oops.
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 40398277
SELECT * FROM m???_content
INNER JOIN (
                        SELECT
                                                column(s)_with_duplicate_values_name(s)
                                              , MIN(id)  as ID
                        FROM m???_content
                        GROUP BY
                                                column(s)_with_duplicate_values_name(s)
                        HAVING COUNT(id) > 1
                    ) dupes
ON dupes.id= m???_content.id
0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 40398287
Not quite there

SELECT * FROM com_content
INNER JOIN (SELECT * FROM com_content
GROUP BY alias HAVING COUNT(id) > 1) dupes
ON dupes.id= tablename.id

Result

mzsah_content is the table name
0
 
LVL 58

Expert Comment

by:Gary
ID: 40398289
ON dupes.id= tablename.id

com_content

and fix the sql per Paul's recommendation
0
 
LVL 58

Assisted Solution

by:Gary
Gary earned 250 total points
ID: 40398294
SELECT * FROM com_content
INNER JOIN (
	SELECT
		alias, MIN(id) as ID
	FROM com_content
	GROUP BY
		alias 
	HAVING COUNT(id) > 1
) dupes
ON dupes.id= com_content.id

Open in new window

0
 
LVL 9

Author Comment

by:Rowby Goren
ID: 40398305
Thanks

SELECT * FROM mxxxx_content
INNER JOIN (
                        SELECT
                                                alias
                                              , MIN(id)  as ID
                        FROM mxxxx_content
                        GROUP BY
                                                alias
                        HAVING COUNT(id) > 1
                    ) dupes
ON dupes.id= mxxxx_content.id

Worked!

I'll award points later.

Thanks!!!
0
 
LVL 9

Author Closing Comment

by:Rowby Goren
ID: 40406544
Thanks and sorry for delay in  awarding the points.

Rowby
0

Featured Post

NEW Veeam Agent for Microsoft Windows

Backup and recover physical and cloud-based servers and workstations, as well as endpoint devices that belong to remote users. Avoid downtime and data loss quickly and easily for Windows-based physical or public cloud-based workloads!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Composite queries are used to retrieve the results from joining multiple queries after applying any filters. UNION, INTERSECT, MINUS, and UNION ALL are some of the operators used to get certain desired results.​
Password hashing is better than message digests or encryption, and you should be using it instead of message digests or encryption.  Find out why and how in this article, which supplements the original article on PHP Client Registration, Login, Logo…
Learn several ways to interact with files and get file information from the bash shell. ls lists the contents of a directory: Using the -a flag displays hidden files: Using the -l flag formats the output in a long list: The file command gives us mor…
This demo shows you how to set up the containerized NetScaler CPX with NetScaler Management and Analytics System in a non-routable Mesos/Marathon environment for use with Micro-Services applications.

740 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question