Solved

Soundex function in Delphi

Posted on 2000-04-13
Last Modified: 2010-04-04
Hi,

A long time ago I saw a sample somewhere showing how to convert text into Soundex codes.  Now that I need such a function (for Delphi), I can't find that article anymore.

Does anybody know of such a function, with which I could find, e.g., surnames that sound similar (e.g. Worthwald and Worthwood)?

Also, a long time ago I had a tool that I used to scan my database for duplicate or similar records.  It was a commercial product; I downloaded a demo of it, but I can't find it anymore, nor the link to the website it came from.

If anybody can help me find that tool again, I'm willing to award 100 points for it.  I need it because I have to convert an Access DB to SQL Server this week, and the old DB contains many duplicate and near-duplicate records.  With that tool I could scan my tables and get a list of the similar or matching records.

Please help me.


Stefaan
Question by:Stefaan
 
LVL 4

Accepted Solution

by:
jeurk earned 100 total points
ID: 2710990
Is this your sample?
======================
{ ****************************************************************** }
{                                                                    }
{   Delphi component TSoundex                                        }
{                                                                    }
{   Copyright © 1995 by Indigo Software                              }
{                                                                    }
{ ****************************************************************** }

(*---------------------------------------------------------------------|
Description:
The Soundex component uses the Soundex algorithm to determine if two
words sound similar.  Useful in database applications where the
operator may not know the exact spelling of a search string, for
example a last name.

Properties:

FirstWord/SecondWord: String
      The FirstWord and SecondWord properties define the two words that
      are to be compared.  The SoundAlike and SoundAlikePlus properties
      will state whether the words sound similar, depending on which
      method you choose.

SoundexValue: String
      The SoundexValue property is a string, the first letter of the word
      followed by a series of digits, that encodes the sound of the word
      specified in the FirstWord property.

      This value can be stored in a hidden field of a database for
      future searches.  When the operator searches for a given string
      (for example, a last name), it can be converted to a SoundexValue,
      and compared to the values in the hidden field, thereby returning
      all records which match the sound of the search string.

SoundAlike: Boolean
      The SoundAlike property states whether the words defined by
      FirstWord and SecondWord sound similar according to the Soundex
      algorithm.

SoundexPlusValue: String
      The SoundexPlusValue property is a string consisting of a series
      of numbers that depicts the unique sound of the word specified in
      the FirstWord property.

      This value can be stored in a hidden field of a database for future
      searches.  When the operator searches for a given string
      (for example, a last name), it can be converted to a SoundexPlusValue,
      and compared to the values in the hidden field, thereby returning all
      records which match the sound of the search string.

      In the Soundex algorithm the first letter is kept as-is, so words that
      begin with different letters never match, even if they sound alike.
      Therefore, the words phish and fish, or sell and cell, would return
      different SoundexValues.  Because of this, a new algorithm, SoundexPlus,
      was developed.  It encodes the first letter as a digit like every other
      letter, so both pairs in the above examples match.

SoundAlikePlus: Boolean
      The SoundAlikePlus property states whether the words defined by
      FirstWord and SecondWord sound similar according to the SoundexPlus
      algorithm.

Methods:

Soundex(CheckWord:string):string;
      The Soundex method is a function which returns the SoundexValue
      for the CheckWord.

SoundexPlus(CheckWord:string):string;
      The SoundexPlus method is a function which returns the
      SoundexPlusValue for the CheckWord.
|---------------------------------------------------------------------*)
unit Soundex;

interface

{$IFDEF WIN32}
uses Messages, Windows, SysUtils, Classes, Controls,
     Forms, Menus, Graphics;
{$ELSE}
uses WinTypes, WinProcs, Messages, SysUtils, Classes, Controls,
     Forms, Menus, Graphics;
{$ENDIF}


type
  TSoundex = class(TComponent)
    private
      { Private fields of TSoundex }
        { Storage for property FirstWord }
        FFirstWord : String;
        { Storage for property SecondWord }
        FSecondWord : String;
        { Storage for property SoundexValue }
        FSoundexValue : String;
        { Storage for property SoundAlike }
        FSoundAlike : Boolean;
        { Storage for property SoundexPlusValue }
        FSoundexPlusValue : String;
        { Storage for property SoundAlikePlus }
        FSoundAlikePlus : Boolean;

      { Private methods of TSoundex }
        { Method to set variable and property values and create objects }
        procedure AutoInitialize;
        { Method to free any objects created by AutoInitialize }
        procedure AutoDestroy;
        { Read method for property SoundexValue }
        function GetSoundexValue : String;
        { Write method for property SoundexValue }
        procedure SetSoundexValue(Value : String);
        { Read method for property SoundAlike }
        function GetSoundAlike : Boolean;
        { Write method for property SoundAlike }
        procedure SetSoundAlike(Value : Boolean);
        { Read method for property SoundexPlusValue }
        function GetSoundexPlusValue : String;
        { Write method for property SoundexPlusValue }
        procedure SetSoundexPlusValue(Value : String);
        { Read method for property SoundAlikePlus }
        function GetSoundAlikePlus : Boolean;
        { Write method for property SoundAlikePlus }
        procedure SetSoundAlikePlus(Value : Boolean);

    protected
      { Protected fields of TSoundex }

      { Protected methods of TSoundex }

    public
      { Public fields of TSoundex }

      { Public methods of TSoundex }
        constructor Create(AOwner: TComponent); override;
        destructor Destroy; override;
        function Soundex(OriginalWord:string):string;
        function SoundexPlus(OriginalWord:string):string;

    published
      { Published properties of the component }
        property FirstWord : String read FFirstWord write FFirstWord;
        property SecondWord : String read FSecondWord write FSecondWord;
        property SoundexValue : String
             read GetSoundexValue write SetSoundexValue;
        property SoundAlike : Boolean
             read GetSoundAlike write SetSoundAlike
             default false;
        property SoundexPlusValue : String
             read GetSoundexPlusValue write SetSoundexPlusValue;
        property SoundAlikePlus : Boolean
             read GetSoundAlikePlus write SetSoundAlikePlus;

  end;

procedure Register;

implementation

procedure Register;
begin
     { Register TSoundex with Indigo Widgets as its
       default page on the Delphi component palette }
     RegisterComponents('Indigo Widgets', [TSoundex]);
end;

{ Method to set variable and property values and create objects }
procedure TSoundex.AutoInitialize;
begin
     FSoundAlike := false;
end; { of AutoInitialize }

{ Method to free any objects created by AutoInitialize }
procedure TSoundex.AutoDestroy;
begin
     { No objects from AutoInitialize to free }
end; { of AutoDestroy }

{ Read method for property SoundexValue }
function TSoundex.GetSoundexValue : String;
begin
   fsoundexvalue:=soundex(firstword);
   getsoundexvalue:=fsoundexvalue;
end;

{ Write method for property SoundexValue }
procedure TSoundex.SetSoundexValue(Value : String);
begin
     { The value is recomputed from FirstWord on every read,
       so an assigned Value is ignored }
end;

{ Read method for property SoundAlike }
function TSoundex.GetSoundAlike : Boolean;
begin
  { Compare the Soundex codes of both words and cache the result }
  FSoundAlike := (Soundex(FirstWord) = Soundex(SecondWord));
  GetSoundAlike := FSoundAlike;
end;

{ Write method for property SoundAlike }
procedure TSoundex.SetSoundAlike(Value : Boolean);
begin
     { The value is recomputed from FirstWord and SecondWord on every
       read, so an assigned Value is ignored }
end;

{ Read method for property SoundexPlusValue }
function TSoundex.GetSoundexPlusValue : String;
begin
     fsoundexplusvalue:=soundexplus(firstword);
     GetSoundexPlusValue := FSoundexPlusValue
end;

{ Write method for property SoundexPlusValue }
procedure TSoundex.SetSoundexPlusValue(Value : String);
begin
     { The value is recomputed from FirstWord on every read,
       so an assigned Value is ignored }
end;

{ Read method for property SoundAlikePlus }
function TSoundex.GetSoundAlikePlus : Boolean;
begin
  { Compare the SoundexPlus codes of both words and cache the result }
  FSoundAlikePlus := (SoundexPlus(FirstWord) = SoundexPlus(SecondWord));
  GetSoundAlikePlus := FSoundAlikePlus;
end;

{ Write method for property SoundAlikePlus }
procedure TSoundex.SetSoundAlikePlus(Value : Boolean);
begin
     { The value is recomputed from FirstWord and SecondWord on every
       read, so an assigned Value is ignored }
end;

constructor TSoundex.Create(AOwner: TComponent);
begin
     inherited Create(AOwner);
     AutoInitialize;
end;

destructor TSoundex.Destroy;
begin
     AutoDestroy;
     inherited Destroy;
end;

function TSoundex.Soundex(OriginalWord:string):string;
var
  Tempstring1,Tempstring2:string;
  Count:integer;
begin
  Tempstring1:='';
  Tempstring2:='';
  OriginalWord:=Uppercase(OriginalWord); {Make original word uppercase}
  if OriginalWord='' then
  begin
    Soundex:=''; {An empty word has no first letter to index}
    Exit;
  end;
  Appendstr(Tempstring1,OriginalWord[1]); {Use the first letter of the word}
  for Count:=2 to length(OriginalWord) do
      {Assign a numeric value to each letter, except the first}
      case OriginalWord[Count] of
        'B','F','P','V':
          Appendstr(Tempstring1,'1');
        'C','G','J','K','Q','S','X','Z':
          Appendstr(Tempstring1,'2');
        'D','T':
          Appendstr(Tempstring1,'3');
        'L':
          Appendstr(Tempstring1,'4');
        'M','N':
          Appendstr(Tempstring1,'5');
        'R':
          Appendstr(Tempstring1,'6');
        {All other letters, punctuation and numbers are ignored}
      end;

  Appendstr(Tempstring2,OriginalWord[1]);

  {Go through the result, and remove any consecutive numeric values
   that are duplicates}
  for Count:=2 to length(Tempstring1) do
    if Tempstring1[Count-1]<>Tempstring1[Count] then
        Appendstr(Tempstring2,Tempstring1[Count]);

  Soundex:=Tempstring2; {This is the soundex value}

end;

function TSoundex.SoundexPlus(OriginalWord:string):string;
var
  Tempstring1,Tempstring2:string;
  Count:integer;
begin
  Tempstring1:='';
  Tempstring2:='';
  OriginalWord:=Uppercase(OriginalWord); {Make original word uppercase}

  for Count:=1 to length(OriginalWord) do
      {Assign a numeric value to each letter}
      case OriginalWord[Count] of
        'B','F','P','V':
          Appendstr(Tempstring1,'1');
        'C','G','J','K','Q','S','X','Z':
          Appendstr(Tempstring1,'2');
        'D','T':
          Appendstr(Tempstring1,'3');
        'L':
          Appendstr(Tempstring1,'4');
        'M','N':
          Appendstr(Tempstring1,'5');
        'R':
          Appendstr(Tempstring1,'6');
        {All other letters, punctuation and numbers are ignored}
      end;

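  {Go through the result, and remove any consecutive numeric values
   that are duplicates.  The first digit is always kept; testing Count=1
   first avoids reading Tempstring1[0], which is not a valid character
   index for long strings}
  for Count:=1 to length(Tempstring1) do
    if (Count=1) or (Tempstring1[Count-1]<>Tempstring1[Count]) then
        Appendstr(Tempstring2,Tempstring1[Count]);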

  Soundexplus:=Tempstring2; {This is the soundexplus value}

end;



end.
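
If a plain function is enough and the component is more than you need, here is a minimal standalone sketch of the classic four-character Soundex (first letter plus three digits, padded with zeros). Note that this is not the component's exact algorithm: the component neither pads nor truncates, and it collapses equal codes even when vowels sit between them. The names ClassicSoundex and SoundexDigit are made up for this sketch, and it assumes plain A-Z names that start with a letter.

```pascal
program SoundexSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

{ Returns the Soundex digit for an uppercase letter, or '0' for
  characters that carry no code (vowels, H, W, Y and non-letters). }
function SoundexDigit(C: Char): Char;
begin
  case C of
    'B', 'F', 'P', 'V': Result := '1';
    'C', 'G', 'J', 'K', 'Q', 'S', 'X', 'Z': Result := '2';
    'D', 'T': Result := '3';
    'L': Result := '4';
    'M', 'N': Result := '5';
    'R': Result := '6';
  else
    Result := '0';
  end;
end;

{ Classic Soundex: first letter kept, then up to three digits,
  padded with '0' to a fixed length of four characters. }
function ClassicSoundex(const AName: string): string;
var
  S: string;
  I: Integer;
  Code, Last: Char;
begin
  Result := '';
  S := UpperCase(Trim(AName));
  if S = '' then
    Exit;
  Result := S[1];
  Last := SoundexDigit(S[1]); { the first letter's code also blocks repeats }
  for I := 2 to Length(S) do
  begin
    Code := SoundexDigit(S[I]);
    if (Code <> '0') and (Code <> Last) then
      Result := Result + Code;
    { H and W do not break a run of equal codes; any other character does }
    if (S[I] <> 'H') and (S[I] <> 'W') then
      Last := Code;
    if Length(Result) = 4 then
      Break;
  end;
  while Length(Result) < 4 do
    Result := Result + '0';
end;

begin
  Writeln(ClassicSoundex('Robert'));    { R163 }
  Writeln(ClassicSoundex('Rupert'));    { R163 }
  Writeln(ClassicSoundex('Worthwald')); { W634 }
  Writeln(ClassicSoundex('Worthwood')); { W633 }
end.
```

Robert and Rupert get the same code, so they count as sounding alike. Worthwald and Worthwood come out as W634 and W633, so classic Soundex treats that pair as different; comparing only the first three characters, or using a different phonetic hash, would be needed to catch it.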
 
LVL 27

Expert Comment

by:kretzschmar
ID: 2711000
 
LVL 4

Expert Comment

by:jeurk
ID: 2711009
I also have a
Metaphone Phonetic Hash Algorithm

    Version 2 BETA

This is an algorithm expressed in Object Pascal that computes the Metaphone
phonetic hash. It's similar to Soundex, but a little more specific.

I can send it to you if you want.
Just give me your e-mail address.

 
LVL 4

Expert Comment

by:jeurk
ID: 2711012
Hello Meikl,
I'm lucky this time...
What I pasted is what is in your zip file ;)
except for the .dcr...
Regards...
 
LVL 3

Author Comment

by:Stefaan
ID: 2711027
Hi Jeurk,

You can send it to Stefaan_Lesage@peopleware.be

I'll leave the question open until tomorrow, because I need as much input as possible, and I still need to find the tool I'm looking for.

You and Meikl will both receive some points from me by tomorrow.  If I forget, please remind me by adding a comment, and I'll post two questions to reward you both with the points.  In the meantime the question stays open, so maybe I'll get some more input on the matter.
 
LVL 1

Expert Comment

by:JoeBooth
ID: 2711359
Consider the offer of the Metaphone algorithm.  Soundex is simple and limited (it was originally developed for a paper filing system in the early 1900s).  Metaphone was developed more recently and will probably do a better job on your matches.
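
Whichever hash you pick, the duplicate scan itself can be sketched the same way: compute a key per record, sort by key, and report neighbours that share one. Below is a minimal sketch, assuming a key in the style of the component posted earlier in this thread (first letter kept, digits collapsed, no padding). The names PhoneticKey, KeyPart, NamePart and the Surnames array are made up for illustration; the array stands in for rows read from your table.

```pascal
program DuplicateCandidates;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

{ Key in the style of the posted component: first letter kept,
  later letters mapped to digits, equal neighbouring digits collapsed. }
function PhoneticKey(const AName: string): string;
const
  { Digit for each letter A..Z; '0' means the letter is ignored }
  Codes: string = '01230120022455012623010202';
var
  S: string;
  I: Integer;
  D: Char;
begin
  Result := '';
  S := UpperCase(Trim(AName));
  if S = '' then
    Exit;
  Result := S[1];
  for I := 2 to Length(S) do
    if (S[I] >= 'A') and (S[I] <= 'Z') then
    begin
      D := Codes[Ord(S[I]) - Ord('A') + 1];
      if (D <> '0') and (Result[Length(Result)] <> D) then
        Result := Result + D;
    end;
end;

{ Entries are stored as 'key=name'; these helpers split them again }
function KeyPart(const Entry: string): string;
begin
  Result := Copy(Entry, 1, Pos('=', Entry) - 1);
end;

function NamePart(const Entry: string): string;
begin
  Result := Copy(Entry, Pos('=', Entry) + 1, MaxInt);
end;

const
  { Stand-in data; in practice these would come from the database }
  Surnames: array[0..4] of string =
    ('Robert', 'Jones', 'Smith', 'Rupert', 'Smyth');
var
  Keyed: TStringList;
  I: Integer;
begin
  Keyed := TStringList.Create;
  try
    for I := Low(Surnames) to High(Surnames) do
      Keyed.Add(PhoneticKey(Surnames[I]) + '=' + Surnames[I]);
    Keyed.Sort; { entries with equal keys end up next to each other }
    for I := 1 to Keyed.Count - 1 do
      if KeyPart(Keyed[I]) = KeyPart(Keyed[I - 1]) then
        Writeln('Possible duplicates: ', NamePart(Keyed[I - 1]),
          ' / ', NamePart(Keyed[I]));
  finally
    Keyed.Free;
  end;
end.
```

With the sample data this pairs Robert with Rupert (both R163) and Smith with Smyth (both S53). Swapping in a Metaphone function for PhoneticKey should leave the rest of the scan unchanged.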

Good luck...
 
LVL 4

Expert Comment

by:jeurk
ID: 2729931
Hello Stefaan,
Could you evaluate the answer now?
Or are things not working?
CU