james

asked on

# how do I delete data if rows if certain criteria match met

I have a workbook that has 2 sheets. What im trying to do is if data matches in a row on columns B,D,G on both sheets then I need the rows deleting from both sheets.

They must be an exact match.

Any Ideas?

If you want a macro, try customizing this as needed. As written, the code assumes you have header labels in row 1 of Sheet1 and Sheet2, with data in rows below that.

DeleteDuperowsBDG.xlsm

```
Sub DeleteBDGduplicateRows()
Dim aux1 As Range, aux11 As Range, aux2 As Range, aux22 As Range, rg11 As Range, rg22 As Range
Dim frmla1 As String, frmla2 As String, sh1 As String, sh2 As String
Application.ScreenUpdating = False
sh1 = "Sheet1" 'Name of worksheet to match
sh2 = "Sheet2" 'Name of other worksheet to match
frmla1 = "=COUNTIFS(" & sh2 & "!B:B,B2," & sh2 & "!D:D,D2," & sh2 & "!G:G,G2)"
frmla2 = "=COUNTIFS(" & sh1 & "!B:B,B2," & sh1 & "!D:D,D2," & sh1 & "!G:G,G2)"
Set rg11 = Worksheets(sh1).UsedRange
Set rg22 = Worksheets(sh2).UsedRange
Set rg11 = rg11.Resize(, rg11.Columns.Count + 1)
Set rg22 = rg22.Resize(, rg22.Columns.Count + 1)
Set aux11 = rg11.Columns(rg11.Columns.Count)
Set aux22 = rg22.Columns(rg22.Columns.Count)
Set aux1 = aux11.Offset(1).Resize(aux11.Rows.Count - 1)
Set aux2 = aux22.Offset(1).Resize(aux22.Rows.Count - 1)
aux1.Cells(1).Formula = frmla1
aux1.FillDown '.AutoFill Destination:=aux1, Type:=xlFillDefault
aux1.Copy
aux1.PasteSpecial Paste:=xlPasteValues
aux2.Cells(1).Formula = frmla1
aux2.FillDown 'AutoFill Destination:=aux2, Type:=xlFillDefault
aux2.Copy
aux2.PasteSpecial Paste:=xlPasteValues
rg11.AutoFilter
rg11.AutoFilter Field:=(aux11.Column - rg11.Column + 1), Criteria1:=">0"
aux1.SpecialCells(xlCellTypeVisible).EntireRow.Delete Shift:=xlUp
rg11.AutoFilter
aux11.EntireColumn.Delete
rg22.AutoFilter
rg22.AutoFilter Field:=(aux22.Column - rg22.Column + 1), Criteria1:=">0"
aux2.SpecialCells(xlCellTypeVisible).EntireRow.Delete Shift:=xlUp
rg22.AutoFilter
aux22.EntireColumn.Delete
End Sub
```

DeleteDuperowsBDG.xlsm

- Load both sheets to arrays
- Load 1 array in a "dictionary" with b d g as key
- Setup 2 target arrays
- If key exist skip row when populating target arrays
- Clear sheets and Dump arrays to sheets

1. Load source sheets to query ( data > from table )

2. Add custom col to merge b d g in both query

3. Merge on cost col

4. Expand

5. Filter

6. Load to sheet

7. Hide source sheets

ASKER

Sorry can I change it a bit as confusing myself.

In the attached workbook I have a difference of £30.35 of which im trying to find out what makes it up.

I think what I need to do is look in column B first and then in column G . If there is a matching positive and negative value in column G and the number in column B is the same for both entries then I need the lines deleting.

If you look at lines 3 &4 there is a negative and a positive £1410.00 and the digits in column B are the same so need them lines deleting. I need this to look all over the workbook and delete matching as required

https://1drv.ms/x/s!AlFxNGHiZueYgZwIbJ8O3XYajYpoTQ?e=hXfgiA

Any Ideas?

In the attached workbook I have a difference of £30.35 of which im trying to find out what makes it up.

I think what I need to do is look in column B first and then in column G . If there is a matching positive and negative value in column G and the number in column B is the same for both entries then I need the lines deleting.

If you look at lines 3 &4 there is a negative and a positive £1410.00 and the digits in column B are the same so need them lines deleting. I need this to look all over the workbook and delete matching as required

https://1drv.ms/x/s!AlFxNGHiZueYgZwIbJ8O3XYajYpoTQ?e=hXfgiA

Any Ideas?

I put the following formula in cell H2 and copied it down. This formula returns 0 if there are no matches for columns B and D with both positive and negative values in column G. Filter the results of this formula for values greater than 0. Those are the ones that need to be deleted. The formula took quite a while to copy down on my Mac running Excel in a Parallels virtual machine.

`=COUNTIFS(B:B,B2,D:D,D2,G:G,-G2)`

If you don't need to match column D, then you may simplify the formula to:`=COUNTIFS(B:B,B2,G:G,-G2)`

ASKER CERTIFIED SOLUTION

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Unfortunately for you, your data is more complex than just matching the opposites as the sum of the difference is more subtile, split over multiple smaller amounts who when summed-up compensate other amounts.

Based on this I don't think above formulas will work for you.

Nevertheless for future reference I added an alternative approach with Powerquery. This has the advantage to be much more scalable and more efficient for more complex calculations on big datasets.

I added 2 sheets:

=> "Grouped", very simple query that shows you the ID's with the difference => 30,35 difference comes from ID 1022 & 1457. doesn't give you the detail by line but maybe good enough. it's also very fast.

=> "Filtered table", will do what you asked (in a few minutes) but you'll notice that for ID 1022 the difference gives you the expected result but for 1457 the 26,55 difference comes from more numbers, not just the positive/neg matches.

Depending on the number of sheets on which you want to perform these calculations we could make a "function" from the query, add some vba to first export to csv etc.. as a rule of thumb, I try to keep processing time of a model under 10 min.

To see the queries use menu data > queries and connections > right click any query and choose edit

To refresh the queries just hit the refresh all button in the data menu or right click on a table and choose refresh.

tuesday2 .xlsb

Based on this I don't think above formulas will work for you.

Nevertheless for future reference I added an alternative approach with Powerquery. This has the advantage to be much more scalable and more efficient for more complex calculations on big datasets.

I added 2 sheets:

=> "Grouped", very simple query that shows you the ID's with the difference => 30,35 difference comes from ID 1022 & 1457. doesn't give you the detail by line but maybe good enough. it's also very fast.

=> "Filtered table", will do what you asked (in a few minutes) but you'll notice that for ID 1022 the difference gives you the expected result but for 1457 the 26,55 difference comes from more numbers, not just the positive/neg matches.

Depending on the number of sheets on which you want to perform these calculations we could make a "function" from the query, add some vba to first export to csv etc.. as a rule of thumb, I try to keep processing time of a model under 10 min.

To see the queries use menu data > queries and connections > right click any query and choose edit

To refresh the queries just hit the refresh all button in the data menu or right click on a table and choose refresh.

tuesday2 .xlsb

Open in new window