Solved

SQL for duplicate rows in a table

Posted on 2013-06-04
Last Modified: 2013-06-05
Hi experts,

I have a table, Customer, with 4 columns and 100 rows of data.
Is there a handy SQL query to check whether it contains duplicate rows?

Thanks
Question by:royjayd
 
LVL 65

Accepted Solution

by:
Jim Horn earned 325 total points
ID: 39220305
Give this a whirl < air code > ...

SELECT Column1, Column2, Column3, Column4, Count(Column4) as duplicate_rows
FROM YourTable
GROUP BY Column1, Column2, Column3, Column4
HAVING COUNT(Column4) > 1
ORDER BY COUNT(Column4) DESC
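
One caveat with the query above: COUNT(Column4) skips NULLs, so duplicate rows whose Column4 is NULL would be missed, while COUNT(*) counts every row in the group. The sketch below is only an illustration of that difference. It runs the same GROUP BY / HAVING pattern through Python's built-in sqlite3 module (SQLite, not SQL Server), and the sample rows and the find_duplicates helper are made up for the example.

```python
import sqlite3

# Hypothetical sample data: a 4-column Customer table with repeated rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (Column1 TEXT, Column2 TEXT, Column3 TEXT, Column4 TEXT)"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?, ?)",
    [
        ("Ann", "Lee", "NY", "A1"),
        ("Ann", "Lee", "NY", "A1"),
        ("Ann", "Lee", "NY", "A1"),
        ("Bob", "Ray", "LA", None),  # duplicated row with a NULL in Column4
        ("Bob", "Ray", "LA", None),
        ("Cy", "Fox", "SF", "C3"),   # unique row
    ],
)


def find_duplicates(count_expr):
    """Group on all four columns and keep groups with more than one row."""
    sql = f"""
        SELECT Column1, Column2, Column3, Column4, {count_expr} AS duplicate_rows
        FROM Customer
        GROUP BY Column1, Column2, Column3, Column4
        HAVING {count_expr} > 1
        ORDER BY duplicate_rows DESC
    """
    return conn.execute(sql).fetchall()


by_column = find_duplicates("COUNT(Column4)")  # ignores NULLs in Column4
by_star = find_duplicates("COUNT(*)")          # counts every row in the group

print(by_column)
print(by_star)
```

With this sample data, only the COUNT(*) variant reports the Bob rows, because their Column4 is NULL.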
 
LVL 16

Expert Comment

by:Surendra Nath
ID: 39220307
Yes, try the following:
;with cte as
(
 select row_number() over(partition by <all your columns> order by (select 1)) rn,* from <your table>
)
select * from cte where rn>1



In the above code, replace
<all your columns> with the column list of your table, and
<your table> with your table name.
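
As a rough illustration of the row_number() approach, the sketch below runs the same idea against an in-memory SQLite database from Python. It assumes SQLite 3.25 or later (needed for window functions); the Customer rows are invented sample data, and ORDER BY Column1 stands in for the order by (select 1) trick, since the order within a partition of identical rows does not matter.

```python
import sqlite3

# Hypothetical sample data: Ann appears three times, Bob twice, Cy once.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (Column1 TEXT, Column2 TEXT, Column3 TEXT, Column4 TEXT)"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?, ?)",
    [
        ("Ann", "Lee", "NY", "A1"),
        ("Ann", "Lee", "NY", "A1"),
        ("Ann", "Lee", "NY", "A1"),
        ("Bob", "Ray", "LA", "B2"),
        ("Bob", "Ray", "LA", "B2"),
        ("Cy", "Fox", "SF", "C3"),
    ],
)

# Number the rows inside each group of identical rows; rn = 1 is the first
# copy, so rn > 1 selects only the extra copies.
sql = """
    WITH cte AS (
        SELECT ROW_NUMBER() OVER (
                   PARTITION BY Column1, Column2, Column3, Column4
                   ORDER BY Column1
               ) AS rn,
               *
        FROM Customer
    )
    SELECT * FROM cte WHERE rn > 1
"""

# Sort in Python because the query itself does not guarantee an output order.
extra_copies = sorted(conn.execute(sql).fetchall())
for row in extra_copies:
    print(row)
```

Unlike the GROUP BY version, this returns one row per extra copy, which is convenient if the next step is to delete everything except the first occurrence.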
 

Author Comment

by:royjayd
ID: 39220466
<<SELECT Column1, Column2, Column3, Column4, Count(Column4) as duplicate_rows
FROM YourTable
GROUP BY Column1, Column2, Column3, Column4
HAVING COUNT(Column4) > 1
ORDER BY COUNT(Column4) DESC >>

Looks good. I am trying to get a total sum of the duplicate rows,
but this doesn't seem to work:

SELECT Column1, Column2, Column3, Column4, Count(Column4) as duplicate_rows,
SUM(duplicate_rows)
FROM YourTable
GROUP BY Column1, Column2, Column3, Column4
HAVING COUNT(Column4) > 1
ORDER BY COUNT(Column4) DESC

thanks
 
LVL 40

Expert Comment

by:Sharath
ID: 39220552
try this.
SELECT Column1,Column2,Column3,Column4,duplicate_rows,
      SUM(duplicate_rows) OVER ( partition BY (SELECT 1)) sum_duplicate_rows
  FROM (SELECT Column1,Column2,Column3,Column4,COUNT(Column4) AS duplicate_rows 
          FROM YourTable 
         GROUP BY Column1,Column2,Column3,Column4 
        HAVING COUNT(Column4) > 1) t1 
 ORDER BY duplicate_rows DESC 
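
To show what the windowed SUM produces, here is a small sketch that runs an equivalent query in SQLite through Python's sqlite3 module. It assumes SQLite 3.25 or later, uses made-up sample rows, and writes the window as OVER () (an empty window covering all rows) with COUNT(*) in place of COUNT(Column4); those substitutions are choices made for this example, not part of the original answer.

```python
import sqlite3

# Hypothetical sample data: Ann appears three times, Bob twice, Cy once.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (Column1 TEXT, Column2 TEXT, Column3 TEXT, Column4 TEXT)"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?, ?)",
    [
        ("Ann", "Lee", "NY", "A1"),
        ("Ann", "Lee", "NY", "A1"),
        ("Ann", "Lee", "NY", "A1"),
        ("Bob", "Ray", "LA", "B2"),
        ("Bob", "Ray", "LA", "B2"),
        ("Cy", "Fox", "SF", "C3"),
    ],
)

# The inner query finds the duplicated groups; the outer query keeps one line
# per group and repeats the grand total on every line via a window over all rows.
sql = """
    SELECT Column1, Column2, Column3, Column4, duplicate_rows,
           SUM(duplicate_rows) OVER () AS sum_duplicate_rows
    FROM (SELECT Column1, Column2, Column3, Column4, COUNT(*) AS duplicate_rows
          FROM Customer
          GROUP BY Column1, Column2, Column3, Column4
          HAVING COUNT(*) > 1) t1
    ORDER BY duplicate_rows DESC
"""

result = conn.execute(sql).fetchall()
for row in result:
    print(row)
```

Each output line keeps its per-group count and carries the same grand total in the last column, so the detail rows and the total come back in a single result set.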

 
LVL 65

Expert Comment

by:Jim Horn
ID: 39220573
<somewhat redundant with the above comments>

>i am trying to get a total sum of the duplicate rows
Throw the whole thing in a subquery, and sum.  
Replace the SUM with COUNT if you want the number of distinct duplicated rows instead.

SELECT SUM(a.duplicate_rows)
FROM (
   SELECT Column1, Column2, Column3, Column4, Count(Column4) as duplicate_rows
   FROM YourTable
   GROUP BY Column1, Column2, Column3, Column4
   HAVING COUNT(Column4) > 1 ) a
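
"Total of the duplicate rows" can mean a few different numbers, so the sketch below computes three of them from the same subquery: all rows that belong to a duplicated group (SUM), the number of duplicated groups (COUNT), and the number of extra copies beyond the first (SUM of count minus one). It is only an illustration run in SQLite via Python's sqlite3 module, with invented sample data and COUNT(*) used inside the subquery.

```python
import sqlite3

# Hypothetical sample data: Ann appears three times, Bob twice, Cy once.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (Column1 TEXT, Column2 TEXT, Column3 TEXT, Column4 TEXT)"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?, ?)",
    [
        ("Ann", "Lee", "NY", "A1"),
        ("Ann", "Lee", "NY", "A1"),
        ("Ann", "Lee", "NY", "A1"),
        ("Bob", "Ray", "LA", "B2"),
        ("Bob", "Ray", "LA", "B2"),
        ("Cy", "Fox", "SF", "C3"),
    ],
)

# Aggregate over the per-group counts produced by the subquery.
sql = """
    SELECT SUM(a.duplicate_rows)     AS rows_in_duplicate_groups,
           COUNT(*)                  AS duplicated_groups,
           SUM(a.duplicate_rows - 1) AS extra_copies
    FROM (SELECT Column1, Column2, Column3, Column4, COUNT(*) AS duplicate_rows
          FROM Customer
          GROUP BY Column1, Column2, Column3, Column4
          HAVING COUNT(*) > 1) a
"""

rows_in_groups, duplicated_groups, extra_copies = conn.execute(sql).fetchone()
print(rows_in_groups, duplicated_groups, extra_copies)
```

The extra_copies figure is the number of rows that would be deleted if only one copy of each duplicated row were kept.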
 

Author Closing Comment

by:royjayd
ID: 39223358
 
LVL 65

Expert Comment

by:Jim Horn
ID: 39223431
Thanks for the grade.  Good luck with your project.  -Jim