• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 317
  • Last Modified:

Advanced Comparison of two varchar fields

I have two fields that I need to compare and DIFFERENCE isn't quite doing the job.  Here's what I have so far and what I'd like to do with it:

1. A full text search is performed to find words that match in the two fields
2. Both the found and reference fields are converted to lower case and stripped of all non-character data
3. The fields are then compared using DIFFERENCE

DIFFERENCE isn't quite doing the job, so if anyone has any suggestions for a replacement?  I was thinking about adding the CHAR value of each character in the string of both fields and ranking the results based on similarity in overall number (e.g. If the reference field's overall value was 54 and the full text search match was 52, this result would appear near the top.  If the reference's value was 54 and the full text match was 187, that would appear towards the bottom.).  

I'm not sure, however, how I would do this as I'm not vastly experienced in string functions.  I do realize that I could do a more exact match with the full text search, however, for this application I need all available matches...just using better ranking.  

1 Solution
lcohanDatabase AnalystCommented:
In my opinion this is because you do
"A full text search is performed to find words that match in the two fields"
therefore full text does
"[...] search columns containing character-based data types for values that match the meaning and not just the exact wording of the words in the search condition. When FREETEXT is used, the full-text query engine internally performs the following actions on the freetext_string, assigns each term a weight, and then finds the matches.

Separates the string into individual words based on word boundaries (word-breaking).
Generates inflectional forms of the words (stemming).
Identifies a list of expansions or replacements for the terms based on matches in the thesaurus."

What I'm trying to say is that result sets may be different from start therefor may not be useful in a comparison for that matter.

If the data in the two fields to compare is NON-UNICODE I would use ASCII not CHAR and do the comparison after all strigs were converted to ASCII codes - you could follow the example from link below for doing that:

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Cloud Class® Course: Certified Penetration Testing

This CPTE Certified Penetration Testing Engineer course covers everything you need to know about becoming a Certified Penetration Testing Engineer. Career Path: Professional roles include Ethical Hackers, Security Consultants, System Administrators, and Chief Security Officers.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now