I need to match inconsistent data in a SQL Server database.
When people fill in their company name on a registration form, they all enter different names for the same company.
...... so many other different ways
So the column stores many different names for the same company. Later, when I want to find the total number of people who attended from Intel, my pattern-matching string cannot consolidate all the different variants. I am using Trim(substring(company_name,.....)) in the WHERE clause of a SELECT statement, and I am capturing barely 40% of the people who should be counted.
There is really no consistent pattern to the way people write their company names. I have 40,000 different company names in my list, and with these inconsistent names the list goes up to 60,000. I cannot stop people from giving me inconsistent names, so I will have to come up with a pattern-matching query that can pull out company synonyms 80-90% of the time.
How can I solve this issue?
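
To make the goal concrete, here is a rough Python sketch of the kind of consolidation I am after. It is not my current query and not a proposed solution. The suffix list, the 0.85 similarity threshold, the function names, and the sample names are all made up for illustration. The idea is to reduce each name to a normalized key (lowercase, no punctuation, no trailing legal suffix) and then put names whose keys are nearly identical into the same group.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical list of legal suffixes to ignore; a real list would be longer.
LEGAL_SUFFIXES = {"inc", "incorporated", "corp", "corporation",
                  "co", "company", "ltd", "llc"}

def normalize(name):
    """Lowercase, replace punctuation with spaces, drop trailing legal suffixes."""
    words = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while words and words[-1] in LEGAL_SUFFIXES:
        words.pop()
    return " ".join(words)

def group_names(names, threshold=0.85):
    """Group raw names whose normalized keys are equal or nearly equal.

    The first key seen for a group becomes its canonical key. The threshold
    is an assumed value and would need tuning on real data.
    """
    groups = defaultdict(list)  # canonical key -> list of raw names
    for raw in names:
        key = normalize(raw)
        # Reuse the first existing key that is similar enough, else start a new group.
        match = next(
            (k for k in groups
             if SequenceMatcher(None, key, k).ratio() >= threshold),
            key,
        )
        groups[match].append(raw)
    return dict(groups)

# Made-up sample values, not taken from my table.
sample = ["Intel", "Intel Corp.", "INTEL Corporation", "intel, inc",
          "Intell", "Microsoft"]
for canonical, variants in group_names(sample).items():
    print(canonical, len(variants), variants)
```

This compares every name against every existing group key, so it would be far too slow as written for 60,000 names. It is only meant to show the result I want, ideally from a query inside SQL Server.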