Solved

Percent matching of string 1 against string 2 in Excel

Posted on 2013-11-11
Last Modified: 2013-11-16
I have two columns in Excel, and I'm trying to figure out how closely they match each other.

Let's say I have a column of strings (column A) and another column (column B) that I want to match against. I want to take each cell in column A and get a "percent match" against every cell in column B, so I can find the closest match in column B.

Also, the strings would be something like "Cell reconfiguration engineering AA145-78XR" against "Setup AA145-78XR", which should be a close match.

Any ideas?
Question by:k1ng87
2 Comments
 
LVL 9

Expert Comment

by:guswebb
ID: 39640323
This could get messy. First, split the contents of each column A cell into an array of substrings, then check whether each array item is present in the column B cell. This assumes your comparison treats blocks of characters (delimited by a space or another character, e.g. a hyphen) as the units to match. So in your example above, the cell from column A would have 5 strings in its array. When searching column B, you would check for each of these within each column B cell and, for example, increment a counter for every one you find.

This approach is very basic and could be made more sophisticated by looking for strings in a particular sequence, or for strings that are adjacent to each other. Taking your example again, finding 'reconfiguration' and then 'engineering' in the same cell in column B might add 1 to the counter for each string found, but the score might be higher (a stronger match) if the two strings are adjacent to each other, and higher again if they appear in the same order as in the column A cell. For example...

A1='Cell reconfiguration engineering AA145-78XR'
B1='Setup AA145-78XR' (score 1 for finding AA145, score 1 for finding 78XR, score 1 because they appear in the same order as in A1, and score 1 because they are adjacent to each other). Total 'matching score' = 4.

Conversely, if B1 contained '78XR whatever AA145' the score might be only 2: 1 point for finding AA145 and 1 point for finding 78XR. No points for adjacency and none for order, since the two strings are neither next to each other nor in the same order as in A1.

Now for the hard bit....you need to code this up in VBA! I'm a bit rusty in that department but no doubt someone else can rattle off that code in a few mins.
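
The scoring scheme described above can be pinned down with a short sketch before porting it to VBA. This one is in Python rather than VBA, and the function names (tokens, match_score) are made up for illustration. It assumes tokens are split on spaces and hyphens, that matching ignores case, and that the order and adjacency bonuses are checked between consecutive found tokens, which is one possible reading of the description and not the only one.

```python
import re


def tokens(text):
    """Split on whitespace and hyphens; lowercase so matching ignores case."""
    return [t for t in re.split(r"[\s\-]+", text.lower()) if t]


def match_score(a, b):
    """Score how well cell b matches cell a (hypothetical scoring rules)."""
    # Remember the first position of each token in b.
    b_pos = {}
    for i, tok in enumerate(tokens(b)):
        b_pos.setdefault(tok, i)

    # Positions in b of a's tokens, in a's order, skipping repeats and misses.
    seen = set()
    found = []
    for tok in tokens(a):
        if tok in b_pos and tok not in seen:
            seen.add(tok)
            found.append(b_pos[tok])

    score = len(found)  # 1 point per token of a that appears in b
    for prev, cur in zip(found, found[1:]):
        if cur > prev:
            score += 1  # the pair keeps the same relative order as in a
        if abs(cur - prev) == 1:
            score += 1  # the pair sits side by side in b
    return score


a1 = "Cell reconfiguration engineering AA145-78XR"
print(match_score(a1, "Setup AA145-78XR"))     # 4
print(match_score(a1, "78XR whatever AA145"))  # 2
```

Under these assumptions the two worked examples come out at 4 and 2. The raw score is not a percentage; dividing by the best score the column A cell could achieve would be one way to normalise it.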
 
LVL 81

Accepted Solution

by:
byundt earned 2000 total points
ID: 39640515
What you are requesting is called "fuzzy matching". If you have Excel 2010 or later, Microsoft has an add-in for that purpose, the "Fuzzy Lookup Add-In for Excel": http://www.microsoft.com/en-us/download/details.aspx?id=15011

If you have an earlier version of Excel, then consider the code suggested by al_b_cnu in http://www.mrexcel.com/forum/excel-questions/195635-fuzzy-matching-new-version-plus-explanation.html#post955137
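
To give a feel for what a "percent match" between two strings can look like, here is a minimal Python sketch using the standard library's difflib.SequenceMatcher. This is a stand-in technique, not the algorithm used by the Fuzzy Lookup Add-In or by the code in the linked thread, and the helper names (percent_match, closest_match) and the sample column B values are invented for illustration.

```python
from difflib import SequenceMatcher


def percent_match(a, b):
    """Similarity of two strings as a percentage (0 to 100), ignoring case."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def closest_match(a, candidates):
    """Return the candidate (a column B value) most similar to a."""
    return max(candidates, key=lambda b: percent_match(a, b))


# Hypothetical sample data standing in for one column A cell and column B.
cell_a = "Cell reconfiguration engineering AA145-78XR"
column_b = ["Setup BB200-11QZ", "Setup AA145-78XR", "Invoice"]

best = closest_match(cell_a, column_b)
print(best, round(percent_match(cell_a, best), 1))
```

Because the ratio compares whole strings, a long description matched against a short one scores well below 100 even when the part number matches exactly. If the part number is what matters, a token-based comparison may rank candidates more sensibly.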