Avatar of james
james
Flag for United Kingdom of Great Britain and Northern Ireland asked on

how do I delete data if rows if certain criteria match met

 


I have a workbook that has 2 sheets. What im trying to do is if data matches in a row on columns B,D,G on both sheets then I need the rows deleting from both sheets.

They must be an exact match.


Any Ideas?

Microsoft ExcelMicrosoft Office

Avatar of undefined
Last Comment
Cédric Tielemans

8/22/2022 - Mon
byundt

Manual approach: use an auxiliary column with COUNTIFS formulas to determine if rows have matches. You must put such formulas on both worksheets and then replace the formulas with the values returned, Having done so, you may filter for values greater than 0 and delete those rows.
=COUNTIFS(Sheet2!B:B,B2,Sheet2!D:D,D2,Sheet2!G:G,G2)

Open in new window

byundt

If you want a macro, try customizing this as needed. As written, the code assumes you have header labels in row 1 of Sheet1 and Sheet2, with data in rows below that.
Sub DeleteBDGduplicateRows()
Dim aux1 As Range, aux11 As Range, aux2 As Range, aux22 As Range, rg11 As Range, rg22 As Range
Dim frmla1 As String, frmla2 As String, sh1 As String, sh2 As String
Application.ScreenUpdating = False
sh1 = "Sheet1"  'Name of worksheet to match
sh2 = "Sheet2"  'Name of other worksheet to match
frmla1 = "=COUNTIFS(" & sh2 & "!B:B,B2," & sh2 & "!D:D,D2," & sh2 & "!G:G,G2)"
frmla2 = "=COUNTIFS(" & sh1 & "!B:B,B2," & sh1 & "!D:D,D2," & sh1 & "!G:G,G2)"

Set rg11 = Worksheets(sh1).UsedRange
Set rg22 = Worksheets(sh2).UsedRange
Set rg11 = rg11.Resize(, rg11.Columns.Count + 1)
Set rg22 = rg22.Resize(, rg22.Columns.Count + 1)
Set aux11 = rg11.Columns(rg11.Columns.Count)
Set aux22 = rg22.Columns(rg22.Columns.Count)
Set aux1 = aux11.Offset(1).Resize(aux11.Rows.Count - 1)
Set aux2 = aux22.Offset(1).Resize(aux22.Rows.Count - 1)

aux1.Cells(1).Formula = frmla1
aux1.FillDown       '.AutoFill  Destination:=aux1, Type:=xlFillDefault
aux1.Copy
aux1.PasteSpecial Paste:=xlPasteValues

aux2.Cells(1).Formula = frmla1
aux2.FillDown       'AutoFill Destination:=aux2, Type:=xlFillDefault
aux2.Copy
aux2.PasteSpecial Paste:=xlPasteValues

rg11.AutoFilter
rg11.AutoFilter Field:=(aux11.Column - rg11.Column + 1), Criteria1:=">0"
aux1.SpecialCells(xlCellTypeVisible).EntireRow.Delete Shift:=xlUp
rg11.AutoFilter
aux11.EntireColumn.Delete

rg22.AutoFilter
rg22.AutoFilter Field:=(aux22.Column - rg22.Column + 1), Criteria1:=">0"
aux2.SpecialCells(xlCellTypeVisible).EntireRow.Delete Shift:=xlUp
rg22.AutoFilter
aux22.EntireColumn.Delete

End Sub

Open in new window


DeleteDuperowsBDG.xlsm

Cédric Tielemans

  1. Load both sheets to arrays
  2. Load 1 array in a "dictionary" with b d g as key
  3. Setup 2 target arrays
  4. If key exist skip row when populating target arrays
  5. Clear sheets and Dump arrays to sheets
Alternatively, depending on how you update the source data you could do this in "powerquery" which mainly if you have no other vba would be the preferred solution imho:

1. Load source sheets to query ( data > from table )
2. Add custom col to merge b d g in both query
3. Merge on cost col
4. Expand
5. Filter
6. Load to sheet
7. Hide source sheets
All of life is about relationships, and EE has made a viirtual community a real community. It lifts everyone's boat
William Peck
james

ASKER
Sorry can I change it a bit as confusing myself.

In the attached workbook I have a difference of  £30.35 of which im trying to find out what makes it up.
I think what I need to do is look in column B first and then in column G . If there is a matching positive and negative value in column G and the number in column B is the same for both entries then I need the lines deleting.

If you look at lines 3 &4 there is a negative and a positive £1410.00 and the digits in column B are the same so need them lines deleting. I need this to look all over the workbook and delete matching as required

https://1drv.ms/x/s!AlFxNGHiZueYgZwIbJ8O3XYajYpoTQ?e=hXfgiA

Any Ideas?

byundt

I put the following formula in cell H2 and copied it down. This formula returns 0 if there are no matches for columns B and D with both positive and negative values in column G. Filter the results of this formula for values greater than 0. Those are the ones that need to be deleted. The formula took quite a while to copy down on my Mac running Excel in a Parallels virtual machine.
=COUNTIFS(B:B,B2,D:D,D2,G:G,-G2)

Open in new window

If you don't need to match column D, then you may simplify the formula to:
=COUNTIFS(B:B,B2,G:G,-G2)

Open in new window

ASKER CERTIFIED SOLUTION
byundt

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question
Cédric Tielemans

Unfortunately for you, your data is more complex than just matching the opposites as the sum of the difference is more subtile, split over multiple smaller amounts who when summed-up compensate other amounts.
Based on this I don't think above formulas will work for you.

Nevertheless for future reference I added an alternative approach with Powerquery. This has the advantage to be much more scalable and more efficient for more complex calculations on big datasets.

I added 2 sheets:
=> "Grouped", very simple query that shows you the ID's with the difference => 30,35 difference comes from ID 1022 & 1457. doesn't give you the detail by line but maybe good enough. it's also very fast.

=> "Filtered table", will do what you asked (in a few minutes) but you'll notice that for ID 1022 the difference gives you the expected result but for 1457 the 26,55 difference comes from more numbers, not just the positive/neg matches.

Depending on the number of sheets on which you want to perform these calculations we could make a "function" from the query, add some vba to first export to csv etc.. as a rule of thumb, I try to keep processing time of a model under 10 min.

To see the queries use menu data > queries and connections > right click any query and choose edit
To refresh the queries just hit the refresh all button in the data menu or right click on a table and choose refresh.

tuesday2 .xlsb

Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.