Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 587
  • Last Modified:

Excel consistency checker

I have a table of 4 columns. The first two are categories, and the second two are data for the categories, like this:

Category1    Category2    Column1    Column2
A                1                 X              Y
A                1                 X              Y
A                2                 W             Y
A                2                 W             Y
B                1                 X              Z
B                3                 X              Y
B                3                 X              Y
B                3                 X              Y

For any given combination in Category 1 and Category 2, data in Column 1 and Column 2 Should be the same. So:

Correct:
Category1    Category2    Column1    Column2
A                1                 X              Y
A                1                 X              Y

Incorrect:
Category1    Category2    Column1    Column2
A                1                 X              Y
A                1                 X              Z

Incorrect:
Category1    Category2    Column1    Column2
A                1                 X              Y
A                1                 W             Y

What I need is a macro or method that will identify all the incorrect blocks of data, preferably by changing their background color.

So if it came across something like this:
Category1    Category2    Column1    Column2
A                1                 X              Y
A                1                 X              Y
A                2                 W             Y
A                2                 W             Z
B                1                 X              Z
B                3                 X              Y
B                3                 X              Y
B                3                 S              Y

It would highlight rows 3:4 and 6:8

Please let me know if this is unclear. I have attached an example file, to illustrate the point better. I would greatly appreciate any help. Thanks!

Andrey

Example.xls
0
andreyman3d2k
Asked:
andreyman3d2k
  • 4
  • 3
1 Solution
 
alainbrydenCommented:
Using an array formula as follows:
=SUM((A2=A:A)*(B2=B:B)*((C2<>C:C)+(D2<>D:D)))
You can count the number of inconsistencies for a given row in a 5th column. You can then use conditional formatting to highlight all rows with inconsistencies. I've attached your example back.

Note that here my array formula refers to the entire column. This works in 2007 but not 2003. For 2003, make your array formula refer only to range specified by the entire list.

--
Alain
result.PNG
Solution-Example.xls
0
 
andreyman3d2kAuthor Commented:
Hi, sorry for the belated reply. This is some seriously amazing Excel voodoo! I have no idea how it works, but it works brilliantly -- on the sample data. I am encountering a problem because my data has over 3000 rows, and the cells have many characters, so it takes like 15-20 min to process the sheet on my reasonably fast computer. And any time I breathe on it, it tries to recalculate everything (I know I can turn this off and do it manually whenever I want). Is there a faster method than this, or is this the best approach? Thanks again, this already saves me a ton of time, just wondering if it can be expedited.
0
 
alainbrydenCommented:
3000 rows shouldn't be a big problem. It will probably work faster if you change the range in the array formula just to include your rows, so instead of
=SUM((A2=A:A)*(B2=B:B)*((C2<>C:C)+(D2<>D:D)))
try to use
=SUM((A2=A$2:A$4000)*(B2=B$2:B$4000)*((C2<>C$2:C$4000)+(D2<>D$2:D$4000)))

I can explain how it works too.
A2=A:A returns an array of true or false values. It will be true for every row where the value in column A for this row matches the value for column A for all other rows.
So you might get {false, true, true, true, true, false, false....} as an array result.
The same thing is done for column B to get an array of values where column B matches, same with C and D.
The * and + are used as boolean operators. * means AND and + means OR. It works on an array element by element basis, so basically I've asked it to count the number of rows where A and B are the same, but either C or D (or both) don't match.

--
Alain
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
andreyman3d2kAuthor Commented:
Absolutely genius, really saved me a tremendous amount of work. Thanks very much, and thanks also for the great explanation -- I tried reading up on array formulas, but your explanation was much clearer than what I could find. Truly grateful.
0
 
andreyman3d2kAuthor Commented:
Hi Alain,

If you are still monitoring this, a quick question:

If I want to modify your formula to add more data columns that have to be consistent, and to make the combination for which to check 3-columned (A,B and now G) would this be the correct way to do it?  :

=SUM((A2=A$2:A$4000)*(B2=B$2:B$4000)*(G2<>G$2:G$4000)*((F2<>F$2:F$4000)+(I2<>I$2:I$4000)+(J2<>J$2:J$4000)+(K2<>K$2:K$4000)+(L2<>L$2:L$4000)+(M2<>M$2:M$4000)))

Want to make sure that for a given column A,B,G combination, the values in columns F,I,J,K,L,M are the same.

Thanks very much again, sorry for the hassle.

Andrey
0
 
alainbrydenCommented:
Close, except you want G to be the same right? So instead of G2<>G2:G4000 you want G2=G2:G4000. I'm sure you understood this and it was just a typo on your part. You seem to get the concept perfectly.

=SUM((A2=A$2:A$4000)*(B2=B$2:B$4000)*(G2=G$2:G$4000)*((F2<>F$2:F$4000)+(I2<>I$2:I$4000)+(J2<>J$2:J$4000)+(K2<>K$2:K$4000)+(L2<>L$2:L$4000)+(M2<>M$2:M$4000)))

--
Alain
0
 
andreyman3d2kAuthor Commented:
Thanks very much, Alain. Indeed that was a typo, and it explains the surprising (now not-so-surprising) results that had me baffled.

Thank you for teaching me this.

Andrey
0

Featured Post

[Webinar] Database Backup and Recovery

Does your company store data on premises, off site, in the cloud, or a combination of these? If you answered “yes”, you need a data backup recovery plan that fits each and every platform. Watch now as as Percona teaches us how to build agile data backup recovery plan.

  • 4
  • 3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now