Solved

Macro needed to extract duplicate data

Posted on 2010-09-09
30
176 Views
Last Modified: 2012-05-10
Hi all,

I need a macro that extracts duplicate data from a spreadsheet where two rows have the same text string in column A &  D

For example
A5 - Client 1 D5 04/06/08
A99 - Client 1 D5 04/06/08

The data is not in a sorted order.

Hope this is cear
0
Comment
Question by:reportmania
  • 11
  • 10
  • 4
  • +2
30 Comments
 
LVL 33

Expert Comment

by:jppinto
ID: 33636155
Here's an example that will highlight the duplicates on your sheet:

http://excel-user.blogspot.com/2009/10/highlight-duplicates.html

jppinto
0
 
LVL 33

Expert Comment

by:jppinto
ID: 33636166
0
 
LVL 24

Expert Comment

by:StephenJR
ID: 33636173
Do you want the duplicate row(s) deleted or moved elsewhere so that you have only uniques left?
0
 
LVL 33

Expert Comment

by:jppinto
ID: 33636182
Here's another technique:

http://www.mrexcel.com/tip138.shtml

jppinto
0
 
LVL 33

Expert Comment

by:jppinto
ID: 33636196
If you want a Macro, take a look at this one:

http://www.ozgrid.com/VBA/RemoveDuplicates.htm

jppinto
0
 

Author Comment

by:reportmania
ID: 33636377
moved to a new sheet not deleted i need to analyse them
0
 

Author Comment

by:reportmania
ID: 33636387
but i need the whole row of where the duplicates occured to be copied across
0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33636442

I'm assuming you mean if 2 rows have the same value in A AND D then delete the dup.

sub delete_dups()

dim c1 as range

dim c2 as range



for each c1 in range("A1:A" & range("A" & range("A:A").rows.count).end(xlup).row)

       for each c2 in range("A" & c1.row & ":A" & range("A" & range("A:A").rows.count).end(xlup).row)

            if c1.value = c2.value and c1.offset(0,3).value = c2.offset(0,3).value  then

                    c2.delete

            end if

       next

next

end sub

Open in new window

0
 

Author Comment

by:reportmania
ID: 33636464
no, if two rows have the same value in "a" & "d" then extract both rows into a new sheet
0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33636482
just saw your last post.....this will copy the 2 matching rows on sheet1 to sheet2
ub delete_dups()  

dim c1 as range  

dim c2 as range  

dim cnt as long

cnt = 1 

dim ws1 as worksheet

dim ws 2 as worksheet

set ws1 = sheet1

set ws2 = sheet2

for each c1 in ws1.range("A1:A" & ws1.range("A" & ws1.range("A:A").rows.count).end(xlup).row)  

       for each c2 in ws1.range("A" & c1.row & ":A" & ws1.range("A" & ws1.range("A:A").rows.count).end(xlup).row)  

            if c1.value = c2.value and c1.offset(0,3).value = c2.offset(0,3).value  then  

                    c1.entirerow.copy ws2.range("A" & cnt)

                    cnt = cnt + 1

                    c2.entirerow.copy ws2.range("A" & cnt)

                    cnt = cnt + 1

            end if  

       next  

next  

end sub

Open in new window

0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33636491
remove that space between ws 2......
should be
dim ws2
0
 

Author Comment

by:reportmania
ID: 33636663
this copied the data across but it wasnt just the duplicates.

Example attached of what i need the macro to do
Book2.xlsm
0
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 33636915
0
 

Author Comment

by:reportmania
ID: 33636934
thanks matt but i need to enbed this code ito the worksheet as I will be distributing it.
0
Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33636962
i don't understand.  In your example the after copy rows aren't the same in A and D.  I thought you wanted to copy only rows where A and D were the same?
0
 
LVL 24

Expert Comment

by:StephenJR
ID: 33636970
This perhaps?
Sub x()

Dim rData As Range

Application.ScreenUpdating = False

With Sheets("data")
    With .Range("A1").CurrentRegion
        .Offset(1, .Columns.Count).Resize(.Rows.Count - 1, 1).Formula = "=SUMPRODUCT((" & .Columns(1).Address & "=A2)*(" & .Columns(4).Address & "=D2))"
    End With
    .AutoFilterMode = False
    .Range("A1").AutoFilter Field:=.Range("A1").CurrentRegion.Columns.Count, Criteria1:=2
    With .AutoFilter.Range
        On Error Resume Next
        Set rData = .Offset(1).Resize(.Rows.Count - 1).SpecialCells(xlCellTypeVisible)
        On Error GoTo 0
        If Not rData Is Nothing Then
            rData.Resize(, rData.Columns.Count - 1).Copy Sheets("after copy").Range("A1")
            rData.EntireRow.Delete
        End If
    End With
    .AutoFilterMode = False
    With .Range("A1").CurrentRegion
        .Columns(.Columns.Count).Clear
    End With
End With

Application.ScreenUpdating = True

End Sub

Open in new window

0
 

Author Comment

by:reportmania
ID: 33637087
MWGainesJR: - on the example ive used columns A & B  - basically where column A & C (in this case) are duplicated in more than 1 row i want a macro which copies the duplicate rows (along with leaving leaving the original where it is) into a new sheet.
0
 

Author Comment

by:reportmania
ID: 33637095
Stephen - i receive an invalid outside procedure error when running this
0
 
LVL 24

Expert Comment

by:StephenJR
ID: 33637138
Works for me. Do you have any text before line 1 or after line 30? And have you definitely copied over those two lines of code?

I should have mentioned you need to add a header row to your data in row 1, anything will do.
0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33637216
I see where my problem was......here's the corrected code.....works with columns A and D

Sub delete_dups()

Dim c1 As Range

Dim c2 As Range

Dim cnt As Long

cnt = 1

Dim ws1 As Worksheet

Dim ws2 As Worksheet

Set ws1 = Sheet1

Set ws2 = Sheet2



For Each c1 In ws1.Range("A1:A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)

    If c1.Row = ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row Then

        Exit Sub

    End If

       For Each c2 In ws1.Range("A" & c1.Row + 1 & ":A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)

            If c1.Value = c2.Value And c1.Offset(0, 3).Value = c2.Offset(0, 3).Value Then

                    c1.EntireRow.Copy ws2.Range("A" & cnt)

                    cnt = cnt + 1

                    c2.EntireRow.Copy ws2.Range("A" & cnt)

                    cnt = cnt + 1

                    GoTo nxt

            End If

       Next

nxt:

Next

End Sub

Open in new window

0
 

Author Comment

by:reportmania
ID: 33637425
Excellent - it works - only issue is it seems to duplicate the occasional row it has extracted - any ideas?
0
 
LVL 13

Accepted Solution

by:
MWGainesJR earned 500 total points
ID: 33637590
It copies both matching rows.......it goes through each row and then looks for duplicates below it.  If it find one below it, it copies the current row plus the duplicate below it.  If you don't want that and only want rows with duplicates use this:
 

Sub delete_dups()  

Dim c1 As Range  

Dim c2 As Range  

Dim cnt As Long  

cnt = 1  

Dim ws1 As Worksheet  

Dim ws2 As Worksheet  

Set ws1 = Sheet1  

Set ws2 = Sheet2  

  

For Each c1 In ws1.Range("A1:A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)  

    If c1.Row = ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row Then  

        Exit Sub  

    End If  

       For Each c2 In ws1.Range("A" & c1.Row + 1 & ":A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)  

            If c1.Value = c2.Value And c1.Offset(0, 3).Value = c2.Offset(0, 3).Value Then  

                    c1.EntireRow.Copy ws2.Range("A" & cnt)  

                    cnt = cnt + 1  

                    GoTo nxt  

            End If  

       Next  

nxt:  

Next  

End Sub

Open in new window

0
 

Author Comment

by:reportmania
ID: 33637809
this works :) ideally i woud have liked it to copy the original and duplicate, but it seems it writes the duplicates twice occasionally rather than pairing them and copying them once

Thanks.
0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33637849
Well if you have:
A     B     C     D
1     A     3      C
2     B     4      D
1     F     5      C
1     G     6      C
On sheet2 you'll have:
A     B     C     D
1     A     3      C
1     F     5      C
1     G     6      C
This is what my original code did.
0
 

Author Comment

by:reportmania
ID: 33637877
i got resuts like

A     B     C     D
1     A     3      C
1     A     3      C
1     F     5      C
1     G     6      C
1     G     6      C

unsure why - but thanks I'll use the second version without it copying the original :)
0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33637992
wait you're right.  I see why.....that's actually the way I intended it.  Tricked my self.....If the seond works for you great!  If not, I'll get back to work and get you what you need.
0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33638022
Here, this should do it:
sub delete_dups()  

Dim c1 As Range  

Dim c2 As Range  

Dim cnt As Long  

cnt = 1  

Dim ws1 As Worksheet  

Dim ws2 As Worksheet

dim origcopy as boolean

Set ws1 = Sheet1  

Set ws2 = Sheet2  

origcopy = false  

For Each c1 In ws1.Range("A1:A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)  

    If c1.Row = ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row Then  

        Exit Sub  

    End If  

       For Each c2 In ws1.Range("A" & c1.Row + 1 & ":A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)  

            If c1.Value = c2.Value And c1.Offset(0, 3).Value = c2.Offset(0, 3).Value Then

                if origcopy = false then

                    c1.EntireRow.Copy ws2.Range("A" & cnt)

                    origcopy = true  

                    cnt = cnt + 1

                end if  

                    c2.EntireRow.Copy ws2.Range("A" & cnt)  

                    cnt = cnt + 1  

            End If  

       Next

origcopy = false

Next  

End Sub

Open in new window

0
 

Author Comment

by:reportmania
ID: 33640268
tried this one and it doesnt work  - does the same as the other and the 2nd one i thought work seems to drag non duplicates across too :(
0
 
LVL 13

Expert Comment

by:MWGainesJR
ID: 33640359
ok, this time I used a check.....might not be the best way but seems to work:
 

Sub delete_dups()

Dim c1 As Range

Dim c2 As Range

Dim cnt As Long

cnt = 1

Dim ws1 As Worksheet

Dim ws2 As Worksheet

Dim origcopy As Boolean

Set ws1 = Sheet1

Set ws2 = Sheet2

origcopy = False

For Each c1 In ws1.Range("A1:A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)

    If c1.EntireRow.Interior.Color = vbRed Then

        c1.EntireRow.Interior.Color = xlNone

        GoTo nxt

    End If

    If c1.Row = ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row Then

        Exit Sub

    End If

       For Each c2 In ws1.Range("A" & c1.Row + 1 & ":A" & ws1.Range("A" & ws1.Range("A:A").Rows.Count).End(xlUp).Row)

            If c1.Value = c2.Value And c1.Offset(0, 3).Value = c2.Offset(0, 3).Value Then

                If origcopy = False Then

                    c1.EntireRow.Copy ws2.Range("A" & cnt)

                    origcopy = True

                    cnt = cnt + 1

                End If

                    c2.EntireRow.Copy ws2.Range("A" & cnt)

                    c2.EntireRow.Interior.Color = vbRed

                    cnt = cnt + 1

            End If

       Next

origcopy = False

nxt:

Next

End Sub

Open in new window

0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

A little background as to how I came to I design this code: Around 5 years ago I designed an add-in that formatted Excel files to a corporate standard, applying different cell colours and font type depending on whether the cells contained inputs,…
Workbook link problems after copying tabs to a new workbook? David Miller (dlmille) Intro Have you either copied sheets to a new workbook, and after having saved and opened that workbook, you find that there are links back to the original sou…
The viewer will learn how to create a normally distributed random variable in Excel, use a normal distribution to simulate the return on an investment over a period of years, Create a Monte Carlo simulation using a normal random variable, and calcul…
This Micro Tutorial will demonstrate how to use longer labels with horizontal bar charts instead of the vertical column chart.

929 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now