How do I match two wave files?

I would like to know how to match a wave file using Delphi Professional 3.0. I have saved a wave file in a Delphi database and would like to match it against an input voice, which is also saved as a wave file. Both recordings contain the same single word, "saya".
The input voice is recorded with a microphone.
This technique will be used for voice recognition (verifying the spoken word).

Thank you.

keithcsl

Hi Mee Lan

The crudest technique is to perform a correlation test:

compare the bytes (samples) from the voice input and correlate them with the wave file in the database.

A high correlation indicates a stronger match, and vice versa.

Keith
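
As a rough illustration of the correlation test described above, here is a minimal sketch that is not taken from any poster's code. It assumes both recordings have already been decoded into equal-length, time-aligned arrays of samples; the function name `Correlation` and the sample values are hypothetical.

```pascal
program CorrelationDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

{ Normalised (Pearson) correlation of two equal-length sample blocks.
  Returns a value from -1 to 1, or 0 if the inputs cannot be compared. }
function Correlation(const A, B: array of Double): Double;
var
  I, N: Integer;
  MeanA, MeanB, SumAB, SumAA, SumBB, DA, DB: Double;
begin
  Result := 0.0;
  N := High(A) + 1;
  if (N = 0) or (High(B) + 1 <> N) then Exit;

  { Remove the average level so a DC offset does not affect the result }
  MeanA := 0.0;
  MeanB := 0.0;
  for I := 0 to N - 1 do
  begin
    MeanA := MeanA + A[I];
    MeanB := MeanB + B[I];
  end;
  MeanA := MeanA / N;
  MeanB := MeanB / N;

  SumAB := 0.0;
  SumAA := 0.0;
  SumBB := 0.0;
  for I := 0 to N - 1 do
  begin
    DA := A[I] - MeanA;
    DB := B[I] - MeanB;
    SumAB := SumAB + DA * DB;
    SumAA := SumAA + DA * DA;
    SumBB := SumBB + DB * DB;
  end;

  { Dividing by both energies makes the score independent of volume }
  if (SumAA > 0.0) and (SumBB > 0.0) then
    Result := SumAB / Sqrt(SumAA * SumBB);
end;

const
  { Hypothetical sample data, not real recordings }
  Stored: array[0..5] of Double = (0, 3, 5, 3, 0, -3);
  Louder: array[0..5] of Double = (0, 6, 10, 6, 0, -6);    { same shape, louder }
  Flipped: array[0..5] of Double = (0, -3, -5, -3, 0, 3);  { inverted shape }
begin
  Writeln('same shape, louder: ', Correlation(Stored, Louder):0:3);
  Writeln('inverted shape:     ', Correlation(Stored, Flipped):0:3);
end.
```

A score near 1 suggests a match. Note that this only works if the two recordings are aligned sample for sample, which two separate utterances of a word rarely are.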

Mee_Lan

ASKER

Please be more specific about the technique ("The crudest technique is to perform a correlation test..."), and include source code if possible.

Thank you.

keithcsl

Sorry, Mee Lan.

I have not personally tried it, but if you look up "correlation" under digital signal processing, that may help.

I do know that the gist of it is to perform a direct data comparison between the incoming signal (voice on the phone) and your known signal (the word "saya").

good luck
keith

Fatman121898

Hi Mee Lan,

Simple voice recognition systems use correlation between the input signal and a pattern signal.
However, the correlation is not computed over the signals themselves (time domain). It is computed over the signal spectra (frequency domain).
The matter is very complicated and requires a good knowledge of digital signal processing.

Computing the spectra is the first step. There are many algorithms that use the FFT (Fast Fourier Transform). In some cases, to save time, it is better to store the pattern spectra in the database instead of the patterns themselves.

The next step is to find the correlation between the pattern spectra and the input data spectra.

The third step is to decide whether the correlation is close enough or not.

All of this assumes that the input signal and the pattern are stationary (meaning their parameters are constant over a given time interval); for example, they contain a sound like 'a-a-a-a'. In the real case, when a word or phrase is recorded and analyzed, it should be divided into small parts, each of which can be said to have a stationary spectrum.

In your case it is better (I think) to store in your database patterns of the sounds 'ssss', 'aaaa', 'yyyy' (or their spectra).

Then, on getting the input signal, you should compute its spectrum over short time slices (e.g. 100 ms). You should discover that there are four parts of the input signal which have almost constant spectra.

Then you should perform a comparison (I mean correlation) between the spectra of the stored patterns and these four parts in order to identify them.

I know that this explanation is not a big help, but it gives you a direction. I've worked on voice and image recognition problems for the last 10 years and, believe me, things are not simple.

Wish you success.
Jo.
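
The three steps above (compute spectra, correlate them, decide) might be sketched roughly as follows. This is an illustration under stated assumptions, not Jo's code: it uses a direct DFT instead of an FFT to stay short (same values, but slower), it uses cosine similarity as the correlation measure, and the frame length, test tones and all names are hypothetical.

```pascal
program SpectrumMatchDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  FrameLen = 64;            { samples per time slice (illustrative value) }
  Bins = FrameLen div 2;    { frequency bins below the Nyquist frequency }

type
  TFrame = array[0..FrameLen - 1] of Double;
  TSpectrum = array[0..Bins - 1] of Double;

{ Test signal: a sine completing Cycles periods per frame, shifted by Phase. }
procedure FillTone(var X: TFrame; Cycles: Integer; Phase: Double);
var
  N: Integer;
begin
  for N := 0 to FrameLen - 1 do
    X[N] := Sin(2.0 * Pi * Cycles * N / FrameLen + Phase);
end;

{ Step 1: magnitude spectrum by a direct DFT, O(N^2). }
procedure MagnitudeSpectrum(const X: TFrame; var S: TSpectrum);
var
  K, N: Integer;
  Re, Im, Angle: Double;
begin
  for K := 0 to Bins - 1 do
  begin
    Re := 0.0;
    Im := 0.0;
    for N := 0 to FrameLen - 1 do
    begin
      Angle := 2.0 * Pi * K * N / FrameLen;
      Re := Re + X[N] * Cos(Angle);
      Im := Im - X[N] * Sin(Angle);
    end;
    S[K] := Sqrt(Re * Re + Im * Im);
  end;
end;

{ Step 2: cosine similarity of two equal-length arrays (0 if not comparable). }
function Similarity(const A, B: array of Double): Double;
var
  I: Integer;
  AB, AA, BB: Double;
begin
  Result := 0.0;
  if High(A) <> High(B) then Exit;
  AB := 0.0;
  AA := 0.0;
  BB := 0.0;
  for I := 0 to High(A) do
  begin
    AB := AB + A[I] * B[I];
    AA := AA + A[I] * A[I];
    BB := BB + B[I] * B[I];
  end;
  if (AA > 0.0) and (BB > 0.0) then
    Result := AB / Sqrt(AA * BB);
end;

var
  Stored, Shifted, Other: TFrame;
  SStored, SShifted, SOther: TSpectrum;
begin
  FillTone(Stored, 4, 0.0);       { stands in for the stored pattern }
  FillTone(Shifted, 4, Pi / 2);   { same tone, starting a quarter period later }
  FillTone(Other, 9, 0.0);        { a different tone }

  MagnitudeSpectrum(Stored, SStored);
  MagnitudeSpectrum(Shifted, SShifted);
  MagnitudeSpectrum(Other, SOther);

  { Step 3 would compare these scores with a threshold chosen by experiment }
  Writeln('time domain, same tone shifted: ', Similarity(Stored, Shifted):0:3);
  Writeln('spectra, same tone shifted:     ', Similarity(SStored, SShifted):0:3);
  Writeln('spectra, different tone:        ', Similarity(SStored, SOther):0:3);
end.
```

The first score should come out near 0 and the second near 1, even though both compare the same two signals. That is one reason to correlate spectra rather than raw samples: the magnitude spectrum does not change when the sound simply starts a little earlier or later.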

Mee_Lan

ASKER

Hi Jo,

Can you please explain further what the four parts of the input signal with almost constant spectra are? Are they amplitude, frequency, resonance and noise? Please advise. May I also know which programming language you use in your voice recognition work? Thank you.

Mee Lan

Mee_Lan

ASKER

Adjusted points to 120

Mee_Lan

ASKER

Edited text of question.

Mee_Lan

ASKER

Adjusted points to 220

Fatman121898

Hey, hey, Mee Lan,
This is not about the points; I simply was not notified of your comments (I don't know why).

By "four parts" I meant four parts of the signal's SPECTRUM over time. There are four because, as far as I can see, your case is simple: the word 'saya' consists of 4 sounds, each relatively prolonged, which means each will have a constant spectrum.
The ADC (Analog-to-Digital Converter) part of your sound card converts the audio signal to numbers and stores them in an array. These numbers represent the electrical AMPLITUDE of the microphone signal, measured at very short time intervals (milli- or microseconds, depending on the sound card mode). As a result you have a digitized signal in the TIME domain, recorded in an array. From it you compute the spectrum in time slices using an FFT and a sliding window function (nothing to do with Bill Gates's Windows ;-). The spectrum itself represents the signal in the FREQUENCY domain.
Following this spectrum over a longer time period (say, over the whole signal period), you should see that it is more or less constant at certain moments. The parts where the spectrum is constant correspond to long sounds in the pronounced word (such as 'aaa' or 'eee'). When the spectrum changes quickly, this corresponds to sounds like 'T' or 'K', etc.
The language I commonly use is Pascal (BP7 & Delphi) + Assembler.

I don't know how to illustrate the result of spectral transforms here using ASCII symbols only. If you use a serious sound processing program (like Cool Edit Pro or Sound Forge), you can record some sound and then view its spectrum.
Oh, yes, if you use the famous WinAmp for listening to music, you can see the spectrum of the music in real time. Its signal display can show the signal being played in the time domain (like an oscilloscope) or in the frequency domain (like a spectrum analyzer).

It's time for me to go to sleep.
Sorry for the long post, English is not my mother tongue *-).

Jo.
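
The "time slices with a sliding window function" idea can be sketched as below. This is only an illustration under assumptions: the sample rate, the 100 ms slice length, the half-slice hop, the Hann window and the synthetic test tone are all choices made for the example, and the FFT call itself is left out.

```pascal
program SlidingWindowDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  SampleRate = 11025;                          { samples per second (assumed) }
  SliceMs = 100;                               { slice length in milliseconds }
  FrameLen = SampleRate * SliceMs div 1000;    { samples in one slice }
  Hop = FrameLen div 2;                        { slices overlap by half }
  SignalLen = SampleRate;                      { one second of signal }
  ToneHz = 440;                                { synthetic test tone }

var
  Signal: array[0..SignalLen - 1] of Double;   { digitized signal, time domain }
  Frame: array[0..FrameLen - 1] of Double;     { one windowed slice }
  Start, I, Count: Integer;
begin
  { Stand-in for the samples delivered by the sound card }
  for I := 0 to SignalLen - 1 do
    Signal[I] := Sin(2.0 * Pi * ToneHz * I / SampleRate);

  Count := 0;
  Start := 0;
  while Start + FrameLen <= SignalLen do
  begin
    { Hann window: fades the slice in and out to reduce spectral leakage }
    for I := 0 to FrameLen - 1 do
      Frame[I] := Signal[Start + I] *
        0.5 * (1.0 - Cos(2.0 * Pi * I / (FrameLen - 1)));

    { Frame would now be passed to an FFT routine to get this slice's spectrum }
    Inc(Count);
    Inc(Start, Hop);   { slide the window forward }
  end;

  Writeln(Count, ' slices of ', FrameLen, ' samples each');
end.
```

Comparing the spectra of consecutive slices then shows where the sound is steady (long vowels) and where it changes quickly.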

Mee_Lan

ASKER

Dear Jo,

I'm so glad to receive feedback from you again.

Yes, I now understand the sound spectrum quite clearly. The main problem I'm facing now is: how do I input my voice from a microphone?

In my system interface there's a button "Input Speech". How do I control the microphone when I click on the "Input Speech" button? How can I get and store my input voice when the button is clicked?

Thank you.

keithcsl

Mee Lan

To get voice from the microphone, you should read up on Windows' low-level multimedia APIs, i.e. the waveInXxx and waveOutXxx routines.

With those, you can specify the source of the voice, be it your sound card, modem, etc.

Keith
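
The low-level waveInXxx routines need a fair amount of buffer handling. As a shorter alternative, and not the approach named above, the sketch below uses the higher-level MCI command-string interface (`mciSendString` from the `MMSystem` unit) to record from the default input device into a new wave file. The alias `rec` and the helper `Mci` are hypothetical names; the output path is only an example.

```pascal
program RecordWavDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Windows, MMSystem;

{ Send one MCI command string and report any error text. }
procedure Mci(const Cmd: string);
var
  Err: MCIERROR;
  Msg: array[0..255] of Char;
begin
  Err := mciSendString(PChar(Cmd), nil, 0, 0);
  if Err <> 0 then
  begin
    mciGetErrorString(Err, Msg, High(Msg) + 1);
    Writeln('MCI error for "', Cmd, '": ', Msg);
  end;
end;

begin
  { "open new" creates an empty recording instead of opening an existing file }
  Mci('open new type waveaudio alias rec');
  Mci('record rec');               { returns at once; recording continues }
  Writeln('Recording... press Enter to stop.');
  Readln;
  Mci('stop rec');
  Mci('save rec c:\saya.wav');     { write the recording to a new file }
  Mci('close rec');
end.
```

In a form-based program, the open and record commands would go in the "Input Speech" button handler and the stop, save and close commands in a second handler.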

Mee_Lan

ASKER

Can anyone help me create a new *.wav file in Delphi 3? I have managed to record my voice, but it was saved in a temp file.
I tried to create a new file using the code below, but to no avail.

if (button=btRecord) then begin
  MediaPlayer1.Open;
  MediaPlayer1.FileName := 'c:\saya.wav';
  MediaPlayer1.StartPos := 0;
  MediaPlayer1.StartRecording;
end
else if (button=btStop) then begin
  MediaPlayer1.EndPos := 0;
  MediaPlayer1.Stop;
  MediaPlayer1.Save;
end;

Thank you so much.

Fatman121898

Hi Mee Lan,
There is a good sound recorder (with source) on Alex Simonetti's page (he is a Delphi expert):
http://www.bhnet.com.br/~simonet/
See the How-to projects.
It is actually a Delphi 2 project, but there is no reason it should not work with later Delphi versions.

Jo.

Mee_Lan

ASKER

Adjusted points to 270

Mee_Lan

ASKER

dear jo,

>'There is a good sound recorder (with source) on Alex Simonetti's page (he is a Delphi expert):
>http://www.bhnet.com.br/~simonet/'

I can't find the solution on the above web page.

Thank you.

Fatman121898

>'i can't find the solution'
What do you mean?

Mee_Lan

ASKER

dear jo,

I have browsed the website you recommended, but I can't find the solution to my problem, which is saving the recorded sound to a new wav file instead of appending to the existing file.

Thank you.

Fatman121898

Hi Mee Lan,

Did you find the WaveRec project?
It can do this: record your voice into a .WAV file. I've tried it with Delphi 3.
If you could not get the file, I can send it by e-mail.

Jo.

Mee_Lan

ASKER

Dear Jo,

No, I can't find the WaveRec project.
Yes, please send it by e-mail: Mlleong_@hotmail.com

Thank you so much.

ASKER CERTIFIED SOLUTION

Fatman121898

Mee_Lan

ASKER

Dear Jo,
Thank you so much for your help. It has been great receiving such prompt feedback from you.

Really appreciate the help.

Bye.... :-))

Fatman121898

See you again :-)