Solved

Soundex function in Delphi

Posted on 2000-04-13
Last Modified: 2010-04-04
Hi,

A long time ago I saw a sample somewhere showing how to convert text into Soundex codes.  Now that I need such a function (for Delphi), I can't find that article anymore.

Does anybody know of such a function, with which I could find e.g. surnames that sound similar (e.g. Worthwald and Worthwood)?

Also a long time ago, I had a tool that I used to scan my database for duplicate or similar records.  It was a commercial product and I had downloaded a demo of it, but I can't find it anymore, nor the link to the website it came from.

If anybody can help me find that tool again, I'm willing to award 100 points for it.  I need it because I have to convert an Access DB to SQL Server this week, and the old DB contains many duplicate and near-duplicate records.  With that tool I could scan my tables and get a list of the matching or similar records.

Please help me.


Stefaan
Question by:Stefaan
 
LVL 4

Accepted Solution

by:
jeurk earned 100 total points
ID: 2710990
Is this the sample you were looking for?
======================
{ ****************************************************************** }
{                                                                    }
{   Delphi component TSoundex                                        }
{                                                                    }
{   Copyright © 1995 by Indigo Software                              }
{                                                                    }
{ ****************************************************************** }

(*---------------------------------------------------------------------|
Description:
The Soundex component uses the Soundex algorithm to determine if two
words sound similar.  Useful in database applications where the
operator may not know the exact spelling of a search string, for
example a last name.

Properties:

FirstWord/SecondWord: String
      The FirstWord and SecondWord properties define the two words that
      are to be compared.  The SoundAlike and SoundAlikePlus properties
      will state whether the words sound similar, depending on which
      method you choose.

SoundexValue: String
      The SoundexValue property is a string consisting of the first
      letter of the word specified in the FirstWord property, followed
      by a series of digits that encode the sound of its other letters.

      This value can be stored in a hidden field of a database for
      future searches.  When the operator searches for a given string
      (for example, a last name), it can be converted to a SoundexValue,
      and compared to the values in the hidden field, thereby returning
      all records which match the sound of the search string.

SoundAlike: Boolean
      The SoundAlike property states whether the words defined by
      FirstWord and SecondWord sound similar according to the Soundex
      algorithm.

SoundexPlusValue: String
      The SoundexPlusValue property is a string consisting of a series
      of numbers that depicts the unique sound of the word specified in
      the FirstWord property.

      This value can be stored in a hidden field of a database for future
      searches.  When the operator searches for a given string
      (for example, a last name), it can be converted to a SoundexPlusValue,
      and compared to the values in the hidden field, thereby returning all
      records which match the sound of the search string.

      In the Soundex algorithm the first letter is kept as-is, so words
      that begin with different letters never match.  Therefore, the words
      phish and fish, or sell and cell, return different SoundexValues.
      Because of this, a new algorithm, SoundexPlus, was developed.  It
      encodes the first letter as a digit like every other letter, so for
      the above examples SoundAlikePlus returns true.

SoundAlikePlus: Boolean
      The SoundAlikePlus property states whether the words defined by
      FirstWord and SecondWord sound similar according to the SoundexPlus
      algorithm.

Methods:

Soundex(CheckWord:string):string;
      The Soundex method is a function which returns the SoundexValue
      for the CheckWord.

SoundexPlus(CheckWord:string):string;
      The SoundexPlus method is a function which returns the
      SoundexPlusValue for the CheckWord.
|---------------------------------------------------------------------*)
unit Soundex;

interface

{$IFDEF WIN32}
uses Messages, Windows, SysUtils, Classes, Controls,
     Forms, Menus, Graphics;
{$ELSE}
uses WinTypes, WinProcs, Messages, SysUtils, Classes, Controls,
     Forms, Menus, Graphics;
{$ENDIF}


type
  TSoundex = class(TComponent)
    private
      { Private fields of TSoundex }
        { Storage for property FirstWord }
        FFirstWord : String;
        { Storage for property SecondWord }
        FSecondWord : String;
        { Storage for property SoundexValue }
        FSoundexValue : String;
        { Storage for property SoundAlike }
        FSoundAlike : Boolean;
        { Storage for property SoundexPlusValue }
        FSoundexPlusValue : String;
        { Storage for property SoundAlikePlus }
        FSoundAlikePlus : Boolean;

      { Private methods of TSoundex }
        { Method to set variable and property values and create objects }
        procedure AutoInitialize;
        { Method to free any objects created by AutoInitialize }
        procedure AutoDestroy;
        { Read method for property SoundexValue }
        function GetSoundexValue : String;
        { Write method for property SoundexValue }
        procedure SetSoundexValue(Value : String);
        { Read method for property SoundAlike }
        function GetSoundAlike : Boolean;
        { Write method for property SoundAlike }
        procedure SetSoundAlike(Value : Boolean);
        { Read method for property SoundexPlusValue }
        function GetSoundexPlusValue : String;
        { Write method for property SoundexPlusValue }
        procedure SetSoundexPlusValue(Value : String);
        { Read method for property SoundAlikePlus }
        function GetSoundAlikePlus : Boolean;
        { Write method for property SoundAlikePlus }
        procedure SetSoundAlikePlus(Value : Boolean);

    protected
      { Protected fields of TSoundex }

      { Protected methods of TSoundex }

    public
      { Public fields of TSoundex }

      { Public methods of TSoundex }
        constructor Create(AOwner: TComponent); override;
        destructor Destroy; override;
        function Soundex(OriginalWord:string):string;
        function SoundexPlus(OriginalWord:string):string;

    published
      { Published properties of the component }
        property FirstWord : String read FFirstWord write FFirstWord;
        property SecondWord : String read FSecondWord write FSecondWord;
        property SoundexValue : String
             read GetSoundexValue write SetSoundexValue;
        property SoundAlike : Boolean
             read GetSoundAlike write SetSoundAlike
             default false;
        property SoundexPlusValue : String
             read GetSoundexPlusValue write SetSoundexPlusValue;
        property SoundAlikePlus : Boolean
             read GetSoundAlikePlus write SetSoundAlikePlus;

  end;

procedure Register;

implementation

procedure Register;
begin
     { Register TSoundex with Indigo Widgets as its
       default page on the Delphi component palette }
     RegisterComponents('Indigo Widgets', [TSoundex]);
end;

{ Method to set variable and property values and create objects }
procedure TSoundex.AutoInitialize;
begin
     FSoundAlike := false;
end; { of AutoInitialize }

{ Method to free any objects created by AutoInitialize }
procedure TSoundex.AutoDestroy;
begin
     { No objects from AutoInitialize to free }
end; { of AutoDestroy }

{ Read method for property SoundexValue }
function TSoundex.GetSoundexValue : String;
begin
   fsoundexvalue:=soundex(firstword);
   getsoundexvalue:=fsoundexvalue;
end;

{ Write method for property SoundexValue.  Value is ignored: the read
  method always recomputes the result from FirstWord. }
procedure TSoundex.SetSoundexValue(Value : String);
begin
     FSoundexValue := fsoundexvalue;
end;

{ Read method for property SoundAlike }
function TSoundex.GetSoundAlike : Boolean;
begin
  { True when both words produce the same Soundex value }
  FSoundAlike := Soundex(FirstWord) = Soundex(SecondWord);
  GetSoundAlike := FSoundAlike;
end;

{ Write method for property SoundAlike.  Value is ignored: the read
  method always recomputes the result from FirstWord and SecondWord. }
procedure TSoundex.SetSoundAlike(Value : Boolean);
begin
     FSoundAlike := FSoundAlike;
end;

{ Read method for property SoundexPlusValue }
function TSoundex.GetSoundexPlusValue : String;
begin
     fsoundexplusvalue:=soundexplus(firstword);
     GetSoundexPlusValue := FSoundexPlusValue
end;

{ Write method for property SoundexPlusValue.  Value is ignored: the read
  method always recomputes the result from FirstWord. }
procedure TSoundex.SetSoundexPlusValue(Value : String);
begin
     FSoundexPlusValue := FSoundexPlusValue;
end;

{ Read method for property SoundAlikePlus }
function TSoundex.GetSoundAlikePlus : Boolean;
begin
  { True when both words produce the same SoundexPlus value }
  FSoundAlikePlus := SoundexPlus(FirstWord) = SoundexPlus(SecondWord);
  GetSoundAlikePlus := FSoundAlikePlus;
end;

{ Write method for property SoundAlikePlus.  Value is ignored: the read
  method always recomputes the result from FirstWord and SecondWord. }
procedure TSoundex.SetSoundAlikePlus(Value : Boolean);
begin
     FSoundAlikePlus := FSoundAlikePlus;
end;

constructor TSoundex.Create(AOwner: TComponent);
begin
     inherited Create(AOwner);
     AutoInitialize;
end;

destructor TSoundex.Destroy;
begin
     AutoDestroy;
     inherited Destroy;
end;

function TSoundex.Soundex(OriginalWord:string):string;
var
  Tempstring1,Tempstring2:string;
  Count:integer;
begin
  Tempstring1:='';
  Tempstring2:='';
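  OriginalWord:=Uppercase(OriginalWord); {Make original word uppercase}
  if OriginalWord='' then
  begin
    Soundex:=''; {Nothing to encode; also avoids indexing an empty string}
    Exit;
  end;
  Appendstr(Tempstring1,OriginalWord[1]); {Use the first letter of the word}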
  for Count:=2 to length(OriginalWord) do
      {Assign a numeric value to each letter, except the first}
      case OriginalWord[Count] of
        'B','F','P','V':
          Appendstr(Tempstring1,'1');
        'C','G','J','K','Q','S','X','Z':
          Appendstr(Tempstring1,'2');
        'D','T':
          Appendstr(Tempstring1,'3');
        'L':
          Appendstr(Tempstring1,'4');
        'M','N':
          Appendstr(Tempstring1,'5');
        'R':
          Appendstr(Tempstring1,'6');
        {All other letters, punctuation and numbers are ignored}
      end;

  Appendstr(Tempstring2,OriginalWord[1]);

  {Go through the result, and remove any consecutive numeric values
   that are duplicates}
  for Count:=2 to length(Tempstring1) do
    if Tempstring1[Count-1]<>Tempstring1[Count] then
        Appendstr(Tempstring2,Tempstring1[Count]);

  Soundex:=Tempstring2; {This is the soundex value}

end;

function TSoundex.SoundexPlus(OriginalWord:string):string;
var
  Tempstring1,Tempstring2:string;
  Count:integer;
begin
  Tempstring1:='';
  Tempstring2:='';
  OriginalWord:=Uppercase(OriginalWord); {Make original word uppercase}

  for Count:=1 to length(OriginalWord) do
      {Assign a numeric value to each letter}
      case OriginalWord[Count] of
        'B','F','P','V':
          Appendstr(Tempstring1,'1');
        'C','G','J','K','Q','S','X','Z':
          Appendstr(Tempstring1,'2');
        'D','T':
          Appendstr(Tempstring1,'3');
        'L':
          Appendstr(Tempstring1,'4');
        'M','N':
          Appendstr(Tempstring1,'5');
        'R':
          Appendstr(Tempstring1,'6');
        {All other letters, punctuation and numbers are ignored}
      end;

  {Go through the result, and remove any consecutive numeric values
   that are duplicates.  The first digit has no predecessor, so it is
   always kept; Tempstring1[0] must not be read.}
  for Count:=1 to length(Tempstring1) do
    if (Count=1) or (Tempstring1[Count-1]<>Tempstring1[Count]) then
        Appendstr(Tempstring2,Tempstring1[Count]);

  Soundexplus:=Tempstring2; {This is the soundexplus value}

end;



end.
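
For comparison, here is a minimal sketch of the classic four-character Soundex as a plain function, in case you do not want to install a component. It is not part of the Indigo unit above: the names ClassicSoundex and SoundexDigit are my own. It differs from TSoundex.Soundex in three ways: the result is padded or cut to exactly four characters, vowels separate repeated codes, and H and W do not. Treat it as an illustration under those assumptions rather than a drop-in replacement.

```pascal
program ClassicSoundexDemo;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

{ Digit for one uppercase letter; '0' means "carries no code" }
function SoundexDigit(C: Char): Char;
begin
  case C of
    'B','F','P','V': Result := '1';
    'C','G','J','K','Q','S','X','Z': Result := '2';
    'D','T': Result := '3';
    'L': Result := '4';
    'M','N': Result := '5';
    'R': Result := '6';
  else
    Result := '0'; { vowels, H, W, Y and anything that is not a letter }
  end;
end;

{ Hypothetical helper: first letter plus three digits, zero-padded }
function ClassicSoundex(const Name: string): string;
var
  S: string;
  I: Integer;
  Code, Last: Char;
begin
  Result := '';
  S := UpperCase(Name);
  Last := '0';
  for I := 1 to Length(S) do
  begin
    if not (S[I] in ['A'..'Z']) then
      Continue; { skip spaces, digits and punctuation }
    Code := SoundexDigit(S[I]);
    if Result = '' then
      Result := S[I] { keep the first letter as-is }
    else if (Code <> '0') and (Code <> Last) then
      Result := Result + Code;
    { H and W do not break a run of identical codes; vowels do }
    if not (S[I] in ['H', 'W']) then
      Last := Code;
    if Length(Result) = 4 then
      Break;
  end;
  if Result <> '' then
    while Length(Result) < 4 do
      Result := Result + '0';
end;

begin
  WriteLn(ClassicSoundex('Robert'));    { R163 }
  WriteLn(ClassicSoundex('Rupert'));    { R163 }
  WriteLn(ClassicSoundex('Worthwald')); { W634 }
  WriteLn(ClassicSoundex('Worthwood')); { W633 }
end.
```

Note that Worthwald and Worthwood come out close but not equal here (W634 against W633), and the component above gives W6343 against W63 for the same pair. If names like these have to match, comparing only the first three characters of the code, or using a different phonetic algorithm, may be worth trying.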
 
LVL 27

Expert Comment

by:kretzschmar
ID: 2711000
 
LVL 4

Expert Comment

by:jeurk
ID: 2711009
I also have a
Metaphone Phonetic Hash Algorithm

    Version 2 BETA

This is an algorithm expressed in Object Pascal to do the Metaphone
phonetic hash. It's kind of like soundex, but a little more specific.

I can send it to you if you want...
just send me an e-mail...
 
LVL 4

Expert Comment

by:jeurk
ID: 2711012
Hello Meikl,
I'm lucky this time...
What I pasted is what is in your zip file ;)
except for the .dcr...
Regards...
 
LVL 3

Author Comment

by:Stefaan
ID: 2711027
Hi Jeurk,

You can send it to Stefaan_Lesage@peopleware.be

I'll leave the question open until tomorrow, because I need as much input as possible, and I still need to find the tool I'm looking for.

You and Meikl will both receive some points from me by tomorrow.  If I seem to forget, please remind me tomorrow by adding a comment, and I'll post two questions to reward you guys with the points.  In the meantime I'll leave the question open, so maybe I'll get some more input on the matter.
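
On the duplicate scan itself, one possible approach without the commercial tool is to compute a phonetic key per record, sort by that key, and review the records that end up under the same key. The sketch below is only an illustration of that idea. PhoneticKey and CompareOrdinal are helper names I made up, the surnames are invented sample data standing in for a column read from the database, and it assumes a Delphi version whose TStringList has CustomSort. PhoneticKey follows the same keying idea as TSoundex.Soundex (first letter, then digit codes with consecutive repeats dropped), except that it starts at the first actual letter.

```pascal
program SoundAlikeGroups;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

{ Hypothetical helper: first letter + digit codes, repeats collapsed }
function PhoneticKey(const Name: string): string;
const
  { Digit for each letter A..Z; '0' marks letters that are ignored }
  Codes: string = '01230120022455012623010202';
var
  S: string;
  I: Integer;
  D, Last: Char;
begin
  Result := '';
  S := UpperCase(Name);
  Last := '0';
  for I := 1 to Length(S) do
  begin
    if not (S[I] in ['A'..'Z']) then
      Continue;
    if Result = '' then
      Result := S[I] { first letter is kept unchanged }
    else
    begin
      D := Codes[Ord(S[I]) - Ord('A') + 1];
      if (D <> '0') and (D <> Last) then
        Result := Result + D;
      if D <> '0' then
        Last := D; { ignored letters do not separate repeats }
    end;
  end;
end;

{ Plain ordinal comparison, so entries with equal keys stay adjacent }
function CompareOrdinal(List: TStringList; Index1, Index2: Integer): Integer;
begin
  Result := CompareStr(List[Index1], List[Index2]);
end;

var
  Names, Keyed: TStringList;
  I, P: Integer;
  Key, PrevKey: string;
begin
  Names := TStringList.Create;
  Keyed := TStringList.Create;
  try
    { Invented sample data; in practice this comes from the table }
    Names.Add('Smith');
    Names.Add('Lesage');
    Names.Add('Smyth');
    Names.Add('Schmidt');
    Names.Add('Lessage');

    for I := 0 to Names.Count - 1 do
      Keyed.Add(PhoneticKey(Names[I]) + '=' + Names[I]);
    Keyed.CustomSort(CompareOrdinal);

    { Print each key once, followed by the names that share it }
    PrevKey := '';
    for I := 0 to Keyed.Count - 1 do
    begin
      P := Pos('=', Keyed[I]);
      Key := Copy(Keyed[I], 1, P - 1);
      if Key <> PrevKey then
        WriteLn(Key, ':');
      WriteLn('  ', Copy(Keyed[I], P + 1, MaxInt));
      PrevKey := Key;
    end;
  finally
    Keyed.Free;
    Names.Free;
  end;
end.
```

With this sample data Smith and Smyth share one key and Lesage and Lessage share another, while Schmidt gets a key of its own. Any group with more than one name is only a candidate for review, since a matching key does not prove the records are duplicates.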
 
LVL 1

Expert Comment

by:JoeBooth
ID: 2711359
Consider the offer of the Metaphone algorithm.  Soundex is limited and simple (it was originally developed for a paper system in the early 1900s).  Metaphone was developed more recently and will probably do a better job with your matches.

Good luck...
 
LVL 4

Expert Comment

by:jeurk
ID: 2729931
Hello Stefaan,
Could you evaluate the answer now?
Or are things not working?
CU