• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1073
  • Last Modified:

Comparing Strings Between Two Columns and Highlight difference in RED

Hi  Excel Experts,
I would like to compare column A of Sheet 1 with the column A of Sheet2 and highlight the word difference in RED.
The final results should look like in the attached file. Any assistance would be appreciated….


Sheet1
Column A

StringID
Angelina27-11 Florida123Order has been processed.
Michael345Order has been processed
23ScottGeorgia


Sheet2
Column A
Michael345Order has been processed
Tanisha78notProcessed234
23ScottGeorgia
Angelina27-11 Florida123Order has been processed.


FinalResultExample.bmp
0
smc1234
Asked:
smc1234
  • 12
  • 7
1 Solution
 
DaveCommented:
Is your data layout identical, ie in the case above the first String is 4 lines on sheet 1, 5 lines on sheet. Does the remaining data "line up" or given the 'missing' line in sheet 1?

An actual Excel sample would help

If your data layout is indentical then if you have a one off need then you can use SpreadSheet Advantage to produce the non-match output, you can get a 30 day free trial from http://www.spreadsheetadvantage.com/

Cheers

Dave
0
 
dlmilleCommented:
This link gets you very close:  http://excel.bigresource.com/Track/excel-SzmrTZeT/

Dave
0
 
dlmilleCommented:
I ran the code supplied from the tip (above) with the data in two successive columns as an experiment.

Here are the results.  If this is "sufficient", I can modify to work with two workbooks.
  sampleDave  
0
Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

 
dlmilleCommented:
Let me try that again...

Example below:

sampleDave
0
 
dlmilleCommented:
Ok.  I adapted the code from the link I suggested.  It is about as good as I can get without spending significant time back and forth with you to understand what is or isn't a match (you might have gleaned that quandry from the support thread on that link) :)

I have attached two examples.  the r1 example compares two columns side-by-side so you can quickly see how it works.  the r2 example compares column A in Sheet1 with column A in Sheet2

Methodology:  The app compares from left to right, WORD compare to WORD (continuous text separated by a space), then character by character.  In a WORD comparison, if "shipped" is matched with "has", the "h" in the "shipped" word is deemed a match.  Once done, if you've selected the option, the algorithm will process by WORD from right to left.  In this way, if there is a gap or garbage in the middle of a text string, the ending words have a chance to match with the other dataset.

For example "rain in asdfdjfd Spain" compared with "rain in Spain" would only match "rain in" on the left to right, but would match also "Spain" processing right to left.  This may be the preferred option, though the default is Left To Right, just select NO at the prompt to process both ways.

I hope this gets you close enough for what you're trying to do (as your purpose was not stated) - if its to highlight where edits need to be, then close may be "good enough"?

Finally, as the other Dave mentioned, this isn't an easy exercise, unless the data is aligned correctly.  It would take a completely additional algorithm which may not be trivial to first somehow MATCH a row in Sheet1 with a row in Sheet2 to do the compare/highlight.  I'm assuming you can manage this.

So, here are both attachments.  Again r1 is just a side-by-side comparison for you to quickly test the veracity of the app.   r2 is the final product with data in Sheet1 and Sheet2

Dave
CompareAndHighlight-r1.xlsm
CompareAndHighlight-r2.xlsm
0
 
smc1234Author Commented:
Thank you so much for your suggestions. I will try it tomorrow.
0
 
smc1234Author Commented:
0
 
dlmilleCommented:
@smc - you have a comment to go with that file?

:)

Dave
0
 
smc1234Author Commented:
Thank you Dave so much for your help.
 r2.xlsm is very close to what I am looking for except that is comparing row 1 sheet 1 with row 2 sheet 2 etc.   I need to compare StringID sheet 1 with the closest StringID on sheet2 and highlight the difference.  The attached excelSample3 contains how the final results should look like.  Can you please let me know if that would be possible to accomplish via vb code?

excelSample3.xls
0
 
dlmilleCommented:
Do you mind if I reorder the list, if done programmatically?
0
 
dlmilleCommented:
This was quite a challenge and generally more than you might expect at E-E, except as experts we like to take on a challenge from time to time and this is one of those challenges.  So here goes!

Ok - I found a fuzzy search algorithm for string comparison here: http://excellerando.blogspot.com/2010/03/vlookup-with-fuzzy-matching-to-get.html
Some neat fuzzyVlookup, fuzzyHlookup, and fuzzymatchscore routines that helped me match as best as possible rows in Sheet 1 with rows in Sheet 2.  Once I had a match, I used the HIGHLIGHT routine by WORD to highlight the differences.

I believe I found an error in this public domain code, in the subroutine SumOfCommonStrings() as it didn't carry through the iScore tally on recursive call - potentially an errant mistake that was somehow not caught.  When I corrected for it, my test data and yours performed MUCH better - actually, the row matches were exactly correct.  I also posted back to my source to ensure the original blogger concurs, or at least has a chance to concur at this point.

For the sample data you provide, the solution works quite well.  Why don't you run it on a more intensive dataset and advise if it serves?

PS - if you can sort/align the two datasets, then this fuzzy search routine is not really necessary, but your prompts lead me to believe that that's not necessarily going to happen...  So, I sought a method for matching that I believe could work a high % of the time...

Let me know how it works out on a larger dataset!

There's significant amount of code here (re: fuzzy match routines, and highlight routines), so I hesitate to just post it in the code below.  In both cases, I believe I improved on the public sources I found that gave me a good start.  Hopefully, anyone interested enough will open the attached spreadsheet and see for his/her self!

Dave
CompareAndHighlight-r3.xlsm
0
 
smc1234Author Commented:
Hi expert,

That was exactly what I was looking for.
You did a great job! :)
However, it does not work on a large sample of data. In my case, I have more than 1500 rows of data. When I run the code it freezes everything. Do you know how to solve that?  Thank you…
0
 
dlmilleCommented:
provide me some/all of your data, if confidential then fudge it up, and I can take a look
0
 
dlmilleCommented:
Also, I created some messages in this version. While it runs, look at the statusbar at the bottom.  Make note on what its doing when it crashes - it will tell us which steps its on, etc...


Dave
CompareAndHighlight-r4.xlsm
0
 
smc1234Author Commented:
Thank you so much for all your help and for the status bar message. It is very helpful. :)
I really appreciate it.
0
 
dlmilleCommented:
So it wasn't freezing?  and the statusbar helped you understand that?

Let me know - and that's great.

Dave
0
 
smc1234Author Commented:
It only starts freezing if Sheets have more than 500 rows. I have reduced my data into subcategories, and now macro works great. :) Thank you.
0
 
dlmilleCommented:
If you can send me a sample of > 500 rows I can test to see where the error is - I like to ensure my tools work in all cases, if possible.  I want to see if its me or the fuzzymatch function that needs adaptation.

Appreciated,

Dave
0
 
smc1234Author Commented:
Hi,
Sorry for not being able to answer to you sooner. While running your code for two-three days, I realized that screen does not always freeze after 500 rows. The screen will freeze as soon as I touch the computer mouse, or if I want to work in a different application.  To unfreeze the screen, I press ESC button and then Continue on the message” The code has been interrupted, do you want to continue?” It would be nice to be able to work in any other application, while code is running. Do you know how to update code to allow multitasking? Thanks again for all your help. I really appreciate it.
0
 
dlmilleCommented:
I believe if we disable interrupt key while the macro is running then Excel should run unperturbed while you do other things...
Of course disabling interrupt means you can't stop the algorithm till it ends :)

Try this.
Private Sub CommandButton1_Click()
    Application.EnableCancelKey = xlDisabled
    'Application.CalculationInterruptKey = xlEscKey
    Application.Interactive = False

    Call CheckAgainstColumnA
    
    Application.Interactive = True
    Application.EnableCancelKey = xlInterrupt

End Sub

Open in new window

CompareAndHighlight-r5.xlsm
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

  • 12
  • 7
Tackle projects and never again get stuck behind a technical roadblock.
Join Now