Solved

Count Duplicate Records in SQL 2008 Table

Posted on 2014-07-24
Last Modified: 2014-07-25
I have a table that has "repeating" rows of data.

For example:

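| col1 | col2 | col3 | col4 |
|------|------|------|------|
|    A |    B |    B |    C |
|    A |    B |    B |    C |
|    A |    B |    B |    C |
|    A |    C |    C |    D |
|    B |    C |    C |    D |
|    B |    D |    E |    F |
|    B |    D |    E |    F |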

I'm trying to select the records where all values for all columns are the same (the first three and last two rows in my example).  

Any help will be greatly appreciated.  

TIA!
Question by:ttist25
3 Comments
 
LVL 66

Accepted Solution

by:
Jim Horn earned 1000 total points
ID: 40217283
Give this a whirl..
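-- One row per combination of values that occurs more than once
SELECT col1, col2, col3, col4
FROM your_table
GROUP BY col1, col2, col3, col4
-- COUNT(*) rather than COUNT(col4): COUNT(col4) skips NULLs and would miss duplicates where col4 is NULL
HAVING COUNT(*) > 1
ORDER BY col1, col2, col3, col4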
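
The GROUP BY query returns one row per duplicated combination. If you want every duplicated row itself (the first three and last two rows in the question's example) along with how many times it occurs, a windowed count is one option. This is only a sketch: it assumes the same `your_table` placeholder name and SQL Server 2005 or later, and `dup_count` is just an illustrative alias.

```sql
-- Sketch: return every row that belongs to a duplicated group,
-- together with the size of that group.
-- your_table is a placeholder name; dup_count is an illustrative alias.
SELECT col1, col2, col3, col4, dup_count
FROM (
    SELECT col1, col2, col3, col4,
           -- number of rows sharing the same values in all four columns
           COUNT(*) OVER (PARTITION BY col1, col2, col3, col4) AS dup_count
    FROM your_table
) AS t
WHERE dup_count > 1
ORDER BY col1, col2, col3, col4;
```

Against the sample data in the question this should give the three A/B/B/C rows with a count of 3 and the two B/D/E/F rows with a count of 2.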

 
LVL 49

Assisted Solution

by:PortletPaul
PortletPaul earned 1000 total points
ID: 40218549
To actually specify which rows are redundant, you need a unique identifier for each record. For example, the following identifies rows 2, 3, and 7 as redundant:
| ID | COL1 | COL2 | COL3 | COL4 |
|----|------|------|------|------|
|  2 |    A |    B |    B |    C |
|  3 |    A |    B |    B |    C |
|  7 |    B |    D |    E |    F |

SELECT
      *
FROM YourTable
WHERE id NOT IN (
            SELECT
                  MIN(id) AS min_id
            FROM YourTable
            GROUP BY
                  col1
                , col2
                , col3
                , col4
      )
;

CREATE TABLE YourTable
	( ID int identity(1,1), [col1] varchar(1), [col2] varchar(1), [col3] varchar(1), [col4] varchar(1))
;
	
INSERT INTO YourTable
	([col1], [col2], [col3], [col4])
VALUES
	('A', 'B', 'B', 'C'),
	('A', 'B', 'B', 'C'),
	('A', 'B', 'B', 'C'),
	('A', 'C', 'C', 'D'),
	('B', 'C', 'C', 'D'),
	('B', 'D', 'E', 'F'),
	('B', 'D', 'E', 'F')
;

http://sqlfiddle.com/#!3/6692a/3
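
An alternative to the NOT IN / MIN(id) approach is to number the rows within each group of identical values and keep everything after the first. The sketch below assumes the same `YourTable` with its `ID` identity column as defined above; `rn` is just an illustrative alias.

```sql
-- Sketch: flag redundant rows with ROW_NUMBER() (SQL Server 2005 and later).
-- Within each group of identical col1..col4 values, the row with the lowest ID
-- gets rn = 1 and is treated as the "original"; the rest are redundant.
SELECT ID, col1, col2, col3, col4
FROM (
    SELECT ID, col1, col2, col3, col4,
           ROW_NUMBER() OVER (PARTITION BY col1, col2, col3, col4
                              ORDER BY ID) AS rn
    FROM YourTable
) AS numbered
WHERE rn > 1
ORDER BY ID;
```

With the seven sample rows inserted above, this picks out IDs 2, 3 and 7, the same rows as the NOT IN query. Unlike GROUP BY, it does not need a second pass over the table to find the minimum ID per group.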


This PAQ may be helpful
 
LVL 1

Author Comment

by:ttist25
ID: 40220406
Hey guys,

Happy Friday!  Thanks for the responses.  I actually figured this out myself (kind of) by using a script I found in this great article by Gregory Larsen.

Here is the code I used from that article:
declare @cmd varchar(4000)
declare @table varchar(100)
declare @curr_col varchar(100)
declare @old_col varchar(100)
declare @column_names varchar(4000)
-- Set the table to look for duplicates
set @table = 'YourTableNameHere'
set @curr_col = ''
-- Get name of first column 
select top 1 @curr_col=column_name   
  from information_schema.columns 
  where table_name = @table order by column_name
set @column_names = @curr_col
set @old_col = @curr_col
-- Get name of second column
select top 1 @curr_col=column_name 
  from information_schema.columns 
  where table_name = @table 
                  and 
        column_name > @old_col 
  order by column_name
-- Process all columns
while @curr_col <> @old_col
begin
  set @column_names = rtrim(@column_names) + ',' + rtrim(@curr_col)
  set @old_col = @curr_col
  -- Get next column 
  select top 1 @curr_col=column_name 
    from information_schema.columns 
    where table_name = @table 
                    and 
          column_name > @old_col 
    order by column_name
end
-- build the command to search for duplicates
set @cmd = 'select * from ' + rtrim(@table) +
           ' group by ' + rtrim(@column_names) + 
           ' having count(*) > 1'
-- Find duplicates
exec (@cmd)
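
For anyone adapting the script, here is a shorter sketch of the same idea: build the column list from `information_schema.columns` and run a dynamic GROUP BY. It is not from the article. It assumes SQL Server 2008 (for `DECLARE` with an initial value), a table name that is unique across schemas, and columns of types that can be grouped (so no `text`, `ntext`, `image` or `xml`). `dup_count` is an illustrative alias.

```sql
-- Sketch: build the column list in one statement instead of a WHILE loop.
DECLARE @table sysname = N'YourTableNameHere';  -- placeholder table name
DECLARE @cols  nvarchar(max);
DECLARE @cmd   nvarchar(max);

-- Concatenate the column names in table order; QUOTENAME brackets each name
-- so names containing spaces or reserved words still work.
SELECT @cols = STUFF((
        SELECT N',' + QUOTENAME(column_name)
        FROM information_schema.columns
        WHERE table_name = @table
        ORDER BY ordinal_position
        FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'), 1, 1, N'');

-- Group by every column and keep the groups that occur more than once.
SET @cmd = N'SELECT ' + @cols + N', COUNT(*) AS dup_count'
         + N' FROM ' + QUOTENAME(@table)
         + N' GROUP BY ' + @cols
         + N' HAVING COUNT(*) > 1';

EXEC sp_executesql @cmd;
```

If the table name does not exist, `@cols` stays NULL and so does `@cmd`, so it is worth checking `@cols` before executing.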



This is a great little script for the more dimly witted of us (namely me).  

Thanks again for the responses - you guys both are always a huge help to me and I thank you for it.  Have a great weekend!!!