Link to home
Start Free TrialLog in
Avatar of andreyman3d2k
andreyman3d2k

asked on

Excel consistency checker

I have a table of 4 columns. The first two are categories, and the second two are data for the categories, like this:

Category1    Category2    Column1    Column2
A                1                 X              Y
A                1                 X              Y
A                2                 W             Y
A                2                 W             Y
B                1                 X              Z
B                3                 X              Y
B                3                 X              Y
B                3                 X              Y

For any given combination in Category 1 and Category 2, data in Column 1 and Column 2 Should be the same. So:

Correct:
Category1    Category2    Column1    Column2
A                1                 X              Y
A                1                 X              Y

Incorrect:
Category1    Category2    Column1    Column2
A                1                 X              Y
A                1                 X              Z

Incorrect:
Category1    Category2    Column1    Column2
A                1                 X              Y
A                1                 W             Y

What I need is a macro or method that will identify all the incorrect blocks of data, preferably by changing their background color.

So if it came across something like this:
Category1    Category2    Column1    Column2
A                1                 X              Y
A                1                 X              Y
A                2                 W             Y
A                2                 W             Z
B                1                 X              Z
B                3                 X              Y
B                3                 X              Y
B                3                 S              Y

It would highlight rows 3:4 and 6:8

Please let me know if this is unclear. I have attached an example file, to illustrate the point better. I would greatly appreciate any help. Thanks!

Andrey

Example.xls
Avatar of alainbryden
alainbryden
Flag of Canada image

Using an array formula as follows:
=SUM((A2=A:A)*(B2=B:B)*((C2<>C:C)+(D2<>D:D)))
You can count the number of inconsistencies for a given row in a 5th column. You can then use conditional formatting to highlight all rows with inconsistencies. I've attached your example back.

Note that here my array formula refers to the entire column. This works in 2007 but not 2003. For 2003, make your array formula refer only to range specified by the entire list.

--
Alain
result.PNG
Solution-Example.xls
Avatar of andreyman3d2k
andreyman3d2k

ASKER

Hi, sorry for the belated reply. This is some seriously amazing Excel voodoo! I have no idea how it works, but it works brilliantly -- on the sample data. I am encountering a problem because my data has over 3000 rows, and the cells have many characters, so it takes like 15-20 min to process the sheet on my reasonably fast computer. And any time I breathe on it, it tries to recalculate everything (I know I can turn this off and do it manually whenever I want). Is there a faster method than this, or is this the best approach? Thanks again, this already saves me a ton of time, just wondering if it can be expedited.
ASKER CERTIFIED SOLUTION
Avatar of alainbryden
alainbryden
Flag of Canada image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Absolutely genius, really saved me a tremendous amount of work. Thanks very much, and thanks also for the great explanation -- I tried reading up on array formulas, but your explanation was much clearer than what I could find. Truly grateful.
Hi Alain,

If you are still monitoring this, a quick question:

If I want to modify your formula to add more data columns that have to be consistent, and to make the combination for which to check 3-columned (A,B and now G) would this be the correct way to do it?  :

=SUM((A2=A$2:A$4000)*(B2=B$2:B$4000)*(G2<>G$2:G$4000)*((F2<>F$2:F$4000)+(I2<>I$2:I$4000)+(J2<>J$2:J$4000)+(K2<>K$2:K$4000)+(L2<>L$2:L$4000)+(M2<>M$2:M$4000)))

Want to make sure that for a given column A,B,G combination, the values in columns F,I,J,K,L,M are the same.

Thanks very much again, sorry for the hassle.

Andrey
Close, except you want G to be the same right? So instead of G2<>G2:G4000 you want G2=G2:G4000. I'm sure you understood this and it was just a typo on your part. You seem to get the concept perfectly.

=SUM((A2=A$2:A$4000)*(B2=B$2:B$4000)*(G2=G$2:G$4000)*((F2<>F$2:F$4000)+(I2<>I$2:I$4000)+(J2<>J$2:J$4000)+(K2<>K$2:K$4000)+(L2<>L$2:L$4000)+(M2<>M$2:M$4000)))

--
Alain
Thanks very much, Alain. Indeed that was a typo, and it explains the surprising (now not-so-surprising) results that had me baffled.

Thank you for teaching me this.

Andrey