Excel 2010 - compare two columns of data with names not consistently entered

I am trying to find the quickest way to complete the following task:
 
I am using Excel 2010 - the spreadsheet I am working on has two columns of information (names).
Column A has 200 names listed and Column B has 1500 names listed.

I am trying to find out which names listed in Column A also appear in Column B. The issue is that the names are not entered consistently in both columns.

For example, Column A may show "ABC Computer Company" while Column B lists the same name as "ABC Computer Co Inc".

Any input would be appreciated.
Thanks
Asked by: mmj1

[ fanpages ] (IT Services Consultant) commented:
How much difference between a name in one list and a name in the other list is acceptable?

Two methods of interrogation immediately come to mind that approach the same "fuzzy logic" task in different ways:

Soundex (a phonetic algorithm), primarily used for surnames (& tailored towards a region, because of the similar surnames that occur in a set of data),

...and...

Levenshtein distance (or "edit distance"), which records the number of individual character changes (replacements, removals, or insertions) required to change one value into another value.
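
To make the two methods above concrete, here is a minimal sketch written in Python rather than in Excel itself (an assumption on my part; the same logic could be ported to a VBA user-defined function). The function names `levenshtein` and `soundex` are illustrative only, and `soundex` follows the common American Soundex rules, which may differ from other regional variants.

```python
def levenshtein(a, b):
    """Number of single-character insertions, removals, or
    replacements needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # removal
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # replacement
        prev = cur
    return prev[-1]


def soundex(name):
    """Four-character American Soundex code (a letter plus three digits)."""
    codes = {}
    for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6")):
        for ch in letters:
            codes[ch] = digit
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate same-coded consonants
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]
```

Soundex was designed for single surnames, so on multi-word company names it is likely to be less useful than the edit distance; which tolerance to accept for either measure is still a decision for whoever owns the data.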


Alternatively, certain frequent sub-strings ("Company"/"Co.", "Limited"/"Ltd.", "Incorporated"/"Inc.") could be removed from, OR replaced with an alias (or aliases) in, one or both columns during the comparison.
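
As a rough illustration of the sub-string approach, the sketch below (Python, not Excel; the `ALIASES` table and the `normalize` function are hypothetical names, and the list of suffixes is only a starting point) lower-cases each name, strips punctuation, and then either drops the common suffix words or replaces them with a single alias before the two columns are compared.

```python
import re

# Hypothetical alias table: every spelling maps to one agreed alias.
ALIASES = {
    "company": "co", "co": "co",
    "limited": "ltd", "ltd": "ltd",
    "incorporated": "inc", "inc": "inc",
}


def normalize(name, drop_suffixes=True):
    """Return a simplified form of a company name for comparison.

    With drop_suffixes=True the suffix words are removed entirely;
    otherwise each one is replaced with its alias.
    """
    words = re.findall(r"[a-z0-9]+", name.lower())
    kept = []
    for word in words:
        alias = ALIASES.get(word)
        if alias is None:
            kept.append(word)    # ordinary word: keep as-is
        elif not drop_suffixes:
            kept.append(alias)   # suffix word: keep the alias only
    return " ".join(kept)
```

With suffixes dropped, "ABC Computer Company" and "ABC Computer Co Inc" both reduce to the same value, so an exact lookup between the two normalized columns would then find the pair.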

You will need to determine how similar one value can be to another value before any reasonable attempt can be made to address your requirement.

A sample workbook attached to this thread may also help any contributing Expert, but please obscure any details that may uniquely identify a third party.
mmj1 (Author) commented:
Unfortunately I am unable to provide a sample of the list. I suppose I was looking for a comparison based on the "key" words. For example, if the company name is "Sampson Brothers Lumber Co" in Column A and "Sampson Brothers Lumber Co. Inc." in Column B, I was looking to partially match on the key word "Sampson" or "Sampson Brothers". I know this may not be possible.
[ fanpages ] (IT Services Consultant) commented:
It is possible, but depending on the number of discrete "words" within a single cell (value), the time taken to compare each "word" of the values within the first list with every other "word" of the comparison (second) list will obviously increase.
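
One way to keep that word-by-word cost down is to index the second list by word once, then look up only the words of each name from the first list. This is a sketch in Python under my own assumptions (the helper names `words`, `build_index`, and `candidates` are hypothetical, and a "word" is taken to be any run of letters or digits); it is not something Excel provides out of the box.

```python
import re
from collections import defaultdict


def words(name):
    """Set of lower-cased words (runs of letters or digits) in a name."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))


def build_index(names):
    """Map each word to the set of names (e.g. from Column B) containing it."""
    index = defaultdict(set)
    for name in names:
        for word in words(name):
            index[word].add(name)
    return index


def candidates(name, index):
    """All indexed names that share at least one word with the given name."""
    found = set()
    for word in words(name):
        found |= index.get(word, set())
    return found
```

Note that generic words such as "Co" or "Inc" would link many unrelated names, so they would probably need to be removed before indexing.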

Also, if a match is found ("Sampson" is within "Sampson Brothers", for instance), it may not be the most appropriate (or "best") match.  Should a search for one particular name (or sub-division of that name, on a word-by-word basis) stop searching the comparison list if a potential match is found, or should it continue to find a "better" match?

That is, if "Sampson" is within "Sampson Brothers", but there is also a "Sampson Limited", or a "C.Sampson & Sons", which is the correct value?
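
To show why the rules matter, the sketch below scores every candidate instead of stopping at the first hit. It is Python, and the scoring rule (shared words divided by total distinct words, sometimes called Jaccard similarity) is purely my assumption for illustration; `rank_matches` is a hypothetical name and a different rule may suit your data better.

```python
import re


def words(name):
    """Set of lower-cased words (runs of letters or digits) in a name."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))


def rank_matches(name, candidates):
    """Return (score, candidate) pairs, best match first.

    The score is the number of shared words divided by the number of
    distinct words across both names; candidates scoring 0 are omitted.
    """
    target = words(name)
    scored = []
    for candidate in candidates:
        other = words(candidate)
        union = target | other
        score = len(target & other) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, candidate))
    # Highest score first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored
```

Searching on "Sampson" alone gives every one of these names a non-zero score, which is exactly the ambiguity described above: the ranking is only as good as the rule that produces it.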

As I said above, you will need to determine the rules for the matching process, then Experts may propose their suggestions to address these.

I appreciate provision of a sample is not an option, but (fictitious) examples of both lists will certainly help those wishing to contribute (& will also mean that a known set of test data is used throughout, so you can then compare all available solutions for accuracy, or the closest proximity to your desired result).
mmj1 (Author) commented:
I really appreciate your help with my question.  As I am working on some other projects at this time, I am going to have to put this one on the back burner, and I didn't want to leave the question abandoned.  Thank you again for trying to assist me.

Question has a verified solution.
