Solved

percent matching of string 1 against string 2 in excel

Posted on 2013-11-11
Last Modified: 2013-11-16
I have two columns in Excel, and I'm trying to figure out how closely they match each other.

Let's say I have a column of strings (column A), and another column (column B) that I want to match against. I want to take each cell in column A and get a "percent match" against every cell in column B, so I can find the closest match in column B.

Also, the strings would be something like "Cell reconfiguration engineering AA145-78XR" against "Setup AA145-78XR", which should be a close match.

Any ideas?
Question by:k1ng87
Expert Comment

by:guswebb
ID: 39640323
This could get messy... First of all, you need to split the cell contents in column A into an array, and then check for the presence of each array item in the cell contents in column B. This assumes that your comparison treats blocks of characters (delimited by a space or another character, e.g. a hyphen) as the units to match. So in your example above, the cell from column A would have 5 strings in its array. When searching column B, you would check for the presence of each of these within each column B cell and, for example, increment a counter for every match. This is a very basic approach; it could be made more sophisticated by looking for strings in a particular sequence, or for strings that are adjacent to each other. Again taking your example, finding 'reconfiguration' and then 'engineering' in the same cell in column B might increment your counter by 1 each, but if the two strings are adjacent to each other the count might be higher (a stronger match), and if they appear in the same order as in the cell from column A, higher again. For example...

A1='Cell reconfiguration engineering AA145-78XR'
B1='Setup AA145-78XR' (score 1 for finding AA145, score 1 for finding 78XR, score 1 because they appear in the same order as in A1, and score 1 because they are adjacent to each other). Total 'matching score' = 4.

Conversely, if B1 contained '78XR whatever AA145', the score might be only 2: 1 point for finding AA145 and 1 point for finding 78XR. No points for them being adjacent and no points for them appearing in the same order.

Now for the hard bit... you need to code this up in VBA! I'm a bit rusty in that department, but no doubt someone else can rattle off that code in a few minutes.
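
The scoring idea described in this comment can be sketched roughly as follows. This is written in Python rather than VBA, purely to illustrate the logic before porting it. The function names (`tokenize`, `match_score`, `best_match`) are made up for this sketch, and the exact bonus rules are an assumption: 1 point per token of A found in B, plus, for each pair of neighbouring tokens in A that both occur in B, 1 point if they keep the same order and 1 point if they sit next to each other in B. Comparison is assumed to be case-insensitive.

```python
import re


def tokenize(text):
    """Split on whitespace and hyphens; lower-case so matching ignores case."""
    return [t for t in re.split(r"[\s\-]+", text.lower()) if t]


def match_score(a, b):
    """Score how well string b matches string a (assumed scoring rules)."""
    a_tokens = tokenize(a)

    # Position of the first occurrence of each token in b.
    b_pos = {}
    for i, tok in enumerate(tokenize(b)):
        b_pos.setdefault(tok, i)

    # 1 point for every distinct token of a that is present in b.
    score = sum(1 for tok in dict.fromkeys(a_tokens) if tok in b_pos)

    # Bonuses for neighbouring tokens of a that are both present in b.
    for x, y in zip(a_tokens, a_tokens[1:]):
        if x != y and x in b_pos and y in b_pos:
            if b_pos[x] < b_pos[y]:            # same order as in a
                score += 1
            if abs(b_pos[x] - b_pos[y]) == 1:  # adjacent in b
                score += 1
    return score


def best_match(a, candidates):
    """Return the candidate with the highest score (candidates must be non-empty)."""
    return max(candidates, key=lambda b: match_score(a, b))


a1 = "Cell reconfiguration engineering AA145-78XR"
print(match_score(a1, "Setup AA145-78XR"))      # 4
print(match_score(a1, "78XR whatever AA145"))   # 2
```

With these rules the two examples above come out as 4 and 2. The raw score is not a percentage; dividing by the maximum score a string can reach against itself would be one possible way to normalise it.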
Accepted Solution

by:byundt
ID: 39640515
What you are requesting is called "fuzzy matching". If you have Excel 2010 or later, Microsoft has an add-in for that purpose. http://www.microsoft.com/en-us/download/details.aspx?id=15011 "Fuzzy Lookup Add-In for Excel"

If you have an earlier version of Excel, then consider the code suggested by al_b_cnu in http://www.mrexcel.com/forum/excel-questions/195635-fuzzy-matching-new-version-plus-explanation.html#post955137
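
For a rough feel of what a "percent match" looks like, here is a minimal sketch using `difflib.SequenceMatcher` from the Python standard library. It is not the algorithm used by the Microsoft add-in or by the linked VBA code, just one common way to get a similarity ratio between two strings. The helper names `percent_match` and `closest_match` are hypothetical, and ignoring case is an assumption.

```python
from difflib import SequenceMatcher


def percent_match(a, b):
    """Similarity of two strings as a percentage (0 to 100), ignoring case."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def closest_match(a, candidates):
    """Return (best candidate, its percent match) for string a.

    candidates stands in for the cells of column B and must be non-empty.
    """
    best = max(candidates, key=lambda b: percent_match(a, b))
    return best, percent_match(a, best)


column_b = ["Setup AA145-78XR", "Setup BB200-11ZZ", "Unrelated text"]
best, pct = closest_match("Cell reconfiguration engineering AA145-78XR", column_b)
print(best, round(pct, 1))
```

A whole-string ratio like this penalises the extra words in the longer string, so for data such as part numbers embedded in free text, a token-based comparison may rank candidates more sensibly.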