Solved

Macro needed to extract duplicate data

Posted on 2010-09-09
30
175 Views
Last Modified: 2012-05-10
Hi all,

I need a macro that extracts duplicate data from a spreadsheet where two rows have the same text string in column A &  D

For example
A5 - Client 1 D5 04/06/08
A99 - Client 1 D5 04/06/08

The data is not in a sorted order.

Hope this is cear
0
Comment
Question by:reportmania
  • 11
  • 10
  • 4
  • +2
30 Comments
 
LVL 33

Expert Comment

by:jppinto
Comment Utility
Here's an example that will highlight the duplicates on your sheet:

http://excel-user.blogspot.com/2009/10/highlight-duplicates.html

jppinto
0
 
LVL 33

Expert Comment

by:jppinto
Comment Utility
0
 
LVL 24

Expert Comment

by:StephenJR
Comment Utility
Do you want the duplicate row(s) deleted or moved elsewhere so that you have only uniques left?
0
 
LVL 33

Expert Comment

by:jppinto
Comment Utility
Here's another technique:

http://www.mrexcel.com/tip138.shtml

jppinto
0
 
LVL 33

Expert Comment

by:jppinto
Comment Utility
If you want a Macro, take a look at this one:

http://www.ozgrid.com/VBA/RemoveDuplicates.htm

jppinto
0
 

Author Comment

by:reportmania
Comment Utility
moved to a new sheet not deleted i need to analyse them
0
 

Author Comment

by:reportmania
Comment Utility
but i need the whole row of where the duplicates occured to be copied across
0
 
LVL 13

Expert Comment

by:MWGainesJR
Comment Utility

I'm assuming you mean if 2 rows have the same value in A AND D then delete the dup.

sub delete_dups()

dim c1 as range

dim c2 as range



for each c1 in range("A1:A" & range("A" & range("A:A").rows.count).end(xlup).row)

       for each c2 in range("A" & c1.row & ":A" & range("A" & range("A:A").rows.count).end(xlup).row)

            if c1.value = c2.value and c1.offset(0,3).value = c2.offset(0,3).value  then

                    c2.delete

            end if

       next

next

end sub

Open in new window

0
 

Author Comment

by:reportmania
Comment Utility
no, if two rows have the same value in "a" & "d" then extract both rows into a new sheet
0
 
LVL 13

Expert Comment

by:MWGainesJR
Comment Utility
just saw your last post.....this will copy the 2 matching rows on sheet1 to sheet2
ub delete_dups()  

dim c1 as range  

dim c2 as range  

dim cnt as long

cnt = 1 

dim ws1 as worksheet

dim ws 2 as worksheet

set ws1 = sheet1

set ws2 = sheet2

for each c1 in ws1.range("A1:A" & ws1.range("A" & ws1.range("A:A").rows.count).end(xlup).row)  

       for each c2 in ws1.range("A" & c1.row & ":A" & ws1.range("A" & ws1.range("A:A").rows.count).end(xlup).row)  

            if c1.value = c2.value and c1.offset(0,3).value = c2.offset(0,3).value  then  

                    c1.entirerow.copy ws2.range("A" & cnt)

                    cnt = cnt + 1

                    c2.entirerow.copy ws2.range("A" & cnt)

                    cnt = cnt + 1

            end if  

       next  

next  

end sub

Open in new window

0
 
LVL 13

Expert Comment

by:MWGainesJR
Comment Utility
remove that space between ws 2......
should be
dim ws2
0
 

Author Comment

by:reportmania
Comment Utility
this copied the data across but it wasnt just the duplicates.

Example attached of what i need the macro to do
Book2.xlsm
0
 
LVL 92

Expert Comment

by:Patrick Matthews
Comment Utility
0
 

Author Comment

by:reportmania
Comment Utility
thanks matt but i need to enbed this code ito the worksheet as I will be distributing it.
0
How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

 
LVL 13

Expert Comment

by:MWGainesJR
Comment Utility
i don't understand.  In your example the after copy rows aren't the same in A and D.  I thought you wanted to copy only rows where A and D were the same?
0
 
LVL 24

Expert Comment

by:StephenJR
Comment Utility
This perhaps?
Sub x()

Dim rData As Range

Application.ScreenUpdating = False

With Sheets("data")
    With .Range("A1").CurrentRegion
        .Offset(1, .Columns.Count).Resize(.Rows.Count - 1, 1).Formula = "=SUMPRODUCT((" & .Columns(1).Address & "=A2)*(" & .Columns(4).Address & "=D2))"
    End With
    .AutoFilterMode = False
    .Range("A1").AutoFilter Field:=.Range("A1").CurrentRegion.Columns.Count, Criteria1:=2
    With .AutoFilter.Range
        On Error Resume Next
        Set rData = .Offset(1).Resize(.Rows.Count - 1).SpecialCells(xlCellTypeVisible)
        On Error GoTo 0
        If Not rData Is Nothing Then
            rData.Resize(, rData.Columns.Count - 1).Copy Sheets("after copy").Range("A1")
            rData.EntireRow.Delete
        End If
    End With
    .AutoFilterMode = False
    With .Range("A1").CurrentRegion
        .Columns(.Columns.Count).Clear
    End With
End With

Application.ScreenUpdating = True

End Sub

Open in new window

0
 

Author Comment

by:reportmania
Comment Utility
MWGainesJR: - on the example ive used columns A & B  - basically where column A & C (in this case) are duplicated in more than 1 row i want a macro which copies the duplicate rows (along with leaving leaving the original where it is) into a new sheet.
0
 

Author Comment

by:reportmania
Comment Utility
Stephen - i receive an invalid outside procedure error when running this
0
 
LVL 24

Expert Comment

by:StephenJR
Comment Utility
Works for me. Do you have any text before line 1 or after line 30? And have you definitely copied over those two lines of code?

I should have mentioned you need to add a header row to your data in row 1, anything will do.
0
 
LVL 13

Expert Comment

by:MWGainesJR
Comment Utility
I see where my problem was......here's the corrected code.....works with columns A and D

Sub delete_dups()

Dim c1 As Range

Dim c2 As Range

Dim cnt As Long

cnt = 1

Dim ws1 As Worksheet

Dim ws2 As Worksheet

Set ws1 = Sheet1

Set ws2 = Sheet2



For Each c1 In ws1.Range("A1:A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)

    If c1.Row = ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row Then

        Exit Sub

    End If

       For Each c2 In ws1.Range("A" & c1.Row + 1 & ":A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)

            If c1.Value = c2.Value And c1.Offset(0, 3).Value = c2.Offset(0, 3).Value Then

                    c1.EntireRow.Copy ws2.Range("A" & cnt)

                    cnt = cnt + 1

                    c2.EntireRow.Copy ws2.Range("A" & cnt)

                    cnt = cnt + 1

                    GoTo nxt

            End If

       Next

nxt:

Next

End Sub

Open in new window

0
 

Author Comment

by:reportmania
Comment Utility
Excellent - it works - only issue is it seems to duplicate the occasional row it has extracted - any ideas?
0
 
LVL 13

Accepted Solution

by:
MWGainesJR earned 500 total points
Comment Utility
It copies both matching rows.......it goes through each row and then looks for duplicates below it.  If it find one below it, it copies the current row plus the duplicate below it.  If you don't want that and only want rows with duplicates use this:
 

Sub delete_dups()  

Dim c1 As Range  

Dim c2 As Range  

Dim cnt As Long  

cnt = 1  

Dim ws1 As Worksheet  

Dim ws2 As Worksheet  

Set ws1 = Sheet1  

Set ws2 = Sheet2  

  

For Each c1 In ws1.Range("A1:A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)  

    If c1.Row = ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row Then  

        Exit Sub  

    End If  

       For Each c2 In ws1.Range("A" & c1.Row + 1 & ":A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)  

            If c1.Value = c2.Value And c1.Offset(0, 3).Value = c2.Offset(0, 3).Value Then  

                    c1.EntireRow.Copy ws2.Range("A" & cnt)  

                    cnt = cnt + 1  

                    GoTo nxt  

            End If  

       Next  

nxt:  

Next  

End Sub

Open in new window

0
 

Author Comment

by:reportmania
Comment Utility
this works :) ideally i woud have liked it to copy the original and duplicate, but it seems it writes the duplicates twice occasionally rather than pairing them and copying them once

Thanks.
0
 
LVL 13

Expert Comment

by:MWGainesJR
Comment Utility
Well if you have:
A     B     C     D
1     A     3      C
2     B     4      D
1     F     5      C
1     G     6      C
On sheet2 you'll have:
A     B     C     D
1     A     3      C
1     F     5      C
1     G     6      C
This is what my original code did.
0
 

Author Comment

by:reportmania
Comment Utility
i got resuts like

A     B     C     D
1     A     3      C
1     A     3      C
1     F     5      C
1     G     6      C
1     G     6      C

unsure why - but thanks I'll use the second version without it copying the original :)
0
 
LVL 13

Expert Comment

by:MWGainesJR
Comment Utility
wait you're right.  I see why.....that's actually the way I intended it.  Tricked my self.....If the seond works for you great!  If not, I'll get back to work and get you what you need.
0
 
LVL 13

Expert Comment

by:MWGainesJR
Comment Utility
Here, this should do it:
sub delete_dups()  

Dim c1 As Range  

Dim c2 As Range  

Dim cnt As Long  

cnt = 1  

Dim ws1 As Worksheet  

Dim ws2 As Worksheet

dim origcopy as boolean

Set ws1 = Sheet1  

Set ws2 = Sheet2  

origcopy = false  

For Each c1 In ws1.Range("A1:A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)  

    If c1.Row = ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row Then  

        Exit Sub  

    End If  

       For Each c2 In ws1.Range("A" & c1.Row + 1 & ":A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)  

            If c1.Value = c2.Value And c1.Offset(0, 3).Value = c2.Offset(0, 3).Value Then

                if origcopy = false then

                    c1.EntireRow.Copy ws2.Range("A" & cnt)

                    origcopy = true  

                    cnt = cnt + 1

                end if  

                    c2.EntireRow.Copy ws2.Range("A" & cnt)  

                    cnt = cnt + 1  

            End If  

       Next

origcopy = false

Next  

End Sub

Open in new window

0
 

Author Comment

by:reportmania
Comment Utility
tried this one and it doesnt work  - does the same as the other and the 2nd one i thought work seems to drag non duplicates across too :(
0
 
LVL 13

Expert Comment

by:MWGainesJR
Comment Utility
ok, this time I used a check.....might not be the best way but seems to work:
 

Sub delete_dups()

Dim c1 As Range

Dim c2 As Range

Dim cnt As Long

cnt = 1

Dim ws1 As Worksheet

Dim ws2 As Worksheet

Dim origcopy As Boolean

Set ws1 = Sheet1

Set ws2 = Sheet2

origcopy = False

For Each c1 In ws1.Range("A1:A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)

    If c1.EntireRow.Interior.Color = vbRed Then

        c1.EntireRow.Interior.Color = xlNone

        GoTo nxt

    End If

    If c1.Row = ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row Then

        Exit Sub

    End If

       For Each c2 In ws1.Range("A" & c1.Row + 1 & ":A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)

            If c1.Value = c2.Value And c1.Offset(0, 3).Value = c2.Offset(0, 3).Value Then

                If origcopy = False Then

                    c1.EntireRow.Copy ws2.Range("A" & cnt)

                    origcopy = True

                    cnt = cnt + 1

                End If

                    c2.EntireRow.Copy ws2.Range("A" & cnt)

                    c2.EntireRow.Interior.Color = vbRed

                    cnt = cnt + 1

            End If

       Next

origcopy = False

nxt:

Next

End Sub

Open in new window

0

Featured Post

How to improve team productivity

Quip adds documents, spreadsheets, and tasklists to your Slack experience
- Elevate ideas to Quip docs
- Share Quip docs in Slack
- Get notified of changes to your docs
- Available on iOS/Android/Desktop/Web
- Online/Offline

Join & Write a Comment

Enums (shorthand for ‘enumerations’) are not often used by programmers but they can be quite valuable when they are.  What are they? An Enum is just a type of variable like a string or an Integer, but in this case one that you create that contains…
Background What I'm presenting in this article is the result of 2 conditions in my work area: We have a SQL Server production environment but no development or test environment; andWe have an MS Access front end using tables in SQL Server but we a…
The viewer will learn how to use the =DISCRINV command to create a discrete random variable, use this command to model a set of probabilities and outcomes in a Monte Carlo simulation, and learn how to find the standard deviation of a set of probabil…
This Micro Tutorial will demonstrate on a Mac how to change the sort order for chart legend values and decrpyt the intimidating chart menu.

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

16 Experts available now in Live!

Get 1:1 Help Now