Solved

Macro needed to extract duplicate data

Posted on 2010-09-09
30
178 Views
Last Modified: 2012-05-10
Hi all,

I need a macro that extracts duplicate data from a spreadsheet where two rows have the same text string in column A &  D

For example
A5 - Client 1 D5 04/06/08
A99 - Client 1 D5 04/06/08

The data is not in a sorted order.

Hope this is cear
0
Comment
Question by:reportmania
  • 11
  • 10
  • 4
  • +2
30 Comments
 
LVL 33

Expert Comment

by:jppinto
ID: 33636155
Here's an example that will highlight the duplicates on your sheet:

http://excel-user.blogspot.com/2009/10/highlight-duplicates.html

jppinto
0
 
LVL 33

Expert Comment

by:jppinto
ID: 33636166
0
 
LVL 24

Expert Comment

by:StephenJR
ID: 33636173
Do you want the duplicate row(s) deleted or moved elsewhere so that you have only uniques left?
0
Active Directory Webinar

We all know we need to protect and secure our privileges, but where to start? Join Experts Exchange and ManageEngine on Tuesday, April 11, 2017 10:00 AM PDT to learn how to track and secure privileged users in Active Directory.

 
LVL 33

Expert Comment

by:jppinto
ID: 33636182
Here's another technique:

http://www.mrexcel.com/tip138.shtml

jppinto
0
 
LVL 33

Expert Comment

by:jppinto
ID: 33636196
If you want a Macro, take a look at this one:

http://www.ozgrid.com/VBA/RemoveDuplicates.htm

jppinto
0
 

Author Comment

by:reportmania
ID: 33636377
moved to a new sheet not deleted i need to analyse them
0
 

Author Comment

by:reportmania
ID: 33636387
but i need the whole row of where the duplicates occured to be copied across
0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33636442

I'm assuming you mean if 2 rows have the same value in A AND D then delete the dup.

sub delete_dups()
dim c1 as range
dim c2 as range

for each c1 in range("A1:A" & range("A" & range("A:A").rows.count).end(xlup).row)
       for each c2 in range("A" & c1.row & ":A" & range("A" & range("A:A").rows.count).end(xlup).row)
            if c1.value = c2.value and c1.offset(0,3).value = c2.offset(0,3).value  then
                    c2.delete
            end if
       next
next
end sub

Open in new window

0
 

Author Comment

by:reportmania
ID: 33636464
no, if two rows have the same value in "a" & "d" then extract both rows into a new sheet
0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33636482
just saw your last post.....this will copy the 2 matching rows on sheet1 to sheet2
ub delete_dups()  
dim c1 as range  
dim c2 as range  
dim cnt as long
cnt = 1 
dim ws1 as worksheet
dim ws 2 as worksheet
set ws1 = sheet1
set ws2 = sheet2
for each c1 in ws1.range("A1:A" & ws1.range("A" & ws1.range("A:A").rows.count).end(xlup).row)  
       for each c2 in ws1.range("A" & c1.row & ":A" & ws1.range("A" & ws1.range("A:A").rows.count).end(xlup).row)  
            if c1.value = c2.value and c1.offset(0,3).value = c2.offset(0,3).value  then  
                    c1.entirerow.copy ws2.range("A" & cnt)
                    cnt = cnt + 1
                    c2.entirerow.copy ws2.range("A" & cnt)
                    cnt = cnt + 1
            end if  
       next  
next  
end sub

Open in new window

0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33636491
remove that space between ws 2......
should be
dim ws2
0
 

Author Comment

by:reportmania
ID: 33636663
this copied the data across but it wasnt just the duplicates.

Example attached of what i need the macro to do
Book2.xlsm
0
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 33636915
0
 

Author Comment

by:reportmania
ID: 33636934
thanks matt but i need to enbed this code ito the worksheet as I will be distributing it.
0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33636962
i don't understand.  In your example the after copy rows aren't the same in A and D.  I thought you wanted to copy only rows where A and D were the same?
0
 
LVL 24

Expert Comment

by:StephenJR
ID: 33636970
This perhaps?
Sub x()

Dim rData As Range

Application.ScreenUpdating = False

With Sheets("data")
    With .Range("A1").CurrentRegion
        .Offset(1, .Columns.Count).Resize(.Rows.Count - 1, 1).Formula = "=SUMPRODUCT((" & .Columns(1).Address & "=A2)*(" & .Columns(4).Address & "=D2))"
    End With
    .AutoFilterMode = False
    .Range("A1").AutoFilter Field:=.Range("A1").CurrentRegion.Columns.Count, Criteria1:=2
    With .AutoFilter.Range
        On Error Resume Next
        Set rData = .Offset(1).Resize(.Rows.Count - 1).SpecialCells(xlCellTypeVisible)
        On Error GoTo 0
        If Not rData Is Nothing Then
            rData.Resize(, rData.Columns.Count - 1).Copy Sheets("after copy").Range("A1")
            rData.EntireRow.Delete
        End If
    End With
    .AutoFilterMode = False
    With .Range("A1").CurrentRegion
        .Columns(.Columns.Count).Clear
    End With
End With

Application.ScreenUpdating = True

End Sub

Open in new window

0
 

Author Comment

by:reportmania
ID: 33637087
MWGainesJR: - on the example ive used columns A & B  - basically where column A & C (in this case) are duplicated in more than 1 row i want a macro which copies the duplicate rows (along with leaving leaving the original where it is) into a new sheet.
0
 

Author Comment

by:reportmania
ID: 33637095
Stephen - i receive an invalid outside procedure error when running this
0
 
LVL 24

Expert Comment

by:StephenJR
ID: 33637138
Works for me. Do you have any text before line 1 or after line 30? And have you definitely copied over those two lines of code?

I should have mentioned you need to add a header row to your data in row 1, anything will do.
0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33637216
I see where my problem was......here's the corrected code.....works with columns A and D

Sub delete_dups()
Dim c1 As Range
Dim c2 As Range
Dim cnt As Long
cnt = 1
Dim ws1 As Worksheet
Dim ws2 As Worksheet
Set ws1 = Sheet1
Set ws2 = Sheet2

For Each c1 In ws1.Range("A1:A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)
    If c1.Row = ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row Then
        Exit Sub
    End If
       For Each c2 In ws1.Range("A" & c1.Row + 1 & ":A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)
            If c1.Value = c2.Value And c1.Offset(0, 3).Value = c2.Offset(0, 3).Value Then
                    c1.EntireRow.Copy ws2.Range("A" & cnt)
                    cnt = cnt + 1
                    c2.EntireRow.Copy ws2.Range("A" & cnt)
                    cnt = cnt + 1
                    GoTo nxt
            End If
       Next
nxt:
Next
End Sub

Open in new window

0
 

Author Comment

by:reportmania
ID: 33637425
Excellent - it works - only issue is it seems to duplicate the occasional row it has extracted - any ideas?
0
 
LVL 13

Accepted Solution

by:
MWGainesJR earned 500 total points
ID: 33637590
It copies both matching rows.......it goes through each row and then looks for duplicates below it.  If it find one below it, it copies the current row plus the duplicate below it.  If you don't want that and only want rows with duplicates use this:
 

Sub delete_dups()  
Dim c1 As Range  
Dim c2 As Range  
Dim cnt As Long  
cnt = 1  
Dim ws1 As Worksheet  
Dim ws2 As Worksheet  
Set ws1 = Sheet1  
Set ws2 = Sheet2  
  
For Each c1 In ws1.Range("A1:A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)  
    If c1.Row = ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row Then  
        Exit Sub  
    End If  
       For Each c2 In ws1.Range("A" & c1.Row + 1 & ":A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)  
            If c1.Value = c2.Value And c1.Offset(0, 3).Value = c2.Offset(0, 3).Value Then  
                    c1.EntireRow.Copy ws2.Range("A" & cnt)  
                    cnt = cnt + 1  
                    GoTo nxt  
            End If  
       Next  
nxt:  
Next  
End Sub

Open in new window

0
 

Author Comment

by:reportmania
ID: 33637809
this works :) ideally i woud have liked it to copy the original and duplicate, but it seems it writes the duplicates twice occasionally rather than pairing them and copying them once

Thanks.
0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33637849
Well if you have:
A     B     C     D
1     A     3      C
2     B     4      D
1     F     5      C
1     G     6      C
On sheet2 you'll have:
A     B     C     D
1     A     3      C
1     F     5      C
1     G     6      C
This is what my original code did.
0
 

Author Comment

by:reportmania
ID: 33637877
i got resuts like

A     B     C     D
1     A     3      C
1     A     3      C
1     F     5      C
1     G     6      C
1     G     6      C

unsure why - but thanks I'll use the second version without it copying the original :)
0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33637992
wait you're right.  I see why.....that's actually the way I intended it.  Tricked my self.....If the seond works for you great!  If not, I'll get back to work and get you what you need.
0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33638022
Here, this should do it:
sub delete_dups()  
Dim c1 As Range  
Dim c2 As Range  
Dim cnt As Long  
cnt = 1  
Dim ws1 As Worksheet  
Dim ws2 As Worksheet
dim origcopy as boolean
Set ws1 = Sheet1  
Set ws2 = Sheet2  
origcopy = false  
For Each c1 In ws1.Range("A1:A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)  
    If c1.Row = ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row Then  
        Exit Sub  
    End If  
       For Each c2 In ws1.Range("A" & c1.Row + 1 & ":A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)  
            If c1.Value = c2.Value And c1.Offset(0, 3).Value = c2.Offset(0, 3).Value Then
                if origcopy = false then
                    c1.EntireRow.Copy ws2.Range("A" & cnt)
                    origcopy = true  
                    cnt = cnt + 1
                end if  
                    c2.EntireRow.Copy ws2.Range("A" & cnt)  
                    cnt = cnt + 1  
            End If  
       Next
origcopy = false
Next  
End Sub

Open in new window

0
 

Author Comment

by:reportmania
ID: 33640268
tried this one and it doesnt work  - does the same as the other and the 2nd one i thought work seems to drag non duplicates across too :(
0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33640359
ok, this time I used a check.....might not be the best way but seems to work:
 

Sub delete_dups()
Dim c1 As Range
Dim c2 As Range
Dim cnt As Long
cnt = 1
Dim ws1 As Worksheet
Dim ws2 As Worksheet
Dim origcopy As Boolean
Set ws1 = Sheet1
Set ws2 = Sheet2
origcopy = False
For Each c1 In ws1.Range("A1:A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)
    If c1.EntireRow.Interior.Color = vbRed Then
        c1.EntireRow.Interior.Color = xlNone
        GoTo nxt
    End If
    If c1.Row = ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row Then
        Exit Sub
    End If
       For Each c2 In ws1.Range("A" & c1.Row + 1 & ":A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)
            If c1.Value = c2.Value And c1.Offset(0, 3).Value = c2.Offset(0, 3).Value Then
                If origcopy = False Then
                    c1.EntireRow.Copy ws2.Range("A" & cnt)
                    origcopy = True
                    cnt = cnt + 1
                End If
                    c2.EntireRow.Copy ws2.Range("A" & cnt)
                    c2.EntireRow.Interior.Color = vbRed
                    cnt = cnt + 1
            End If
       Next
origcopy = False
nxt:
Next
End Sub

Open in new window

0

Featured Post

Free Tool: SSL Checker

Scans your site and returns information about your SSL implementation and certificate. Helpful for debugging and validating your SSL configuration.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Text box keydown event does not work 38 42
Cost allcocation ... 10 23
Countdown Timer 2 17
how to ignore blank cells from the data validation list? 16 33
This article is the result of a quest to better understand Task Scheduler 2.0 and all the newer objects available in vbscript in this version over  the limited options we had scripting in Task Scheduler 1.0.  As I started my journey of knowledge I f…
This code takes an Excel list of URL’s and adds a header titled “URL List”. It then searches through all URL’s in column “A”, looking for duplicates. When a duplicate is found, it is moved to the top of the list. The duplicate URL’s are then highlig…
The viewer will learn how to create a normally distributed random variable in Excel, use a normal distribution to simulate the return on an investment over a period of years, Create a Monte Carlo simulation using a normal random variable, and calcul…
This Micro Tutorial demonstrates in Microsoft Excel how to consolidate your marketing data by creating an interactive charts using form controls. This creates cool drop-downs for viewers of your chart to choose from.

820 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question