fatihbarut
asked on
Which plot type is best for speech recognition or voice recognition?
Hi guys,
Could you tell me which plot type in MATLAB is the best one for speech recognition?
ASKER
They are both great sources for anyone who has the time to read them and, more importantly, an engineering background :)
On the other hand, I understood exactly what you explained, thank you.
And finally, I am just trying to recognize at most 200 words. I don't need to segment the words into phonemes. I just need to differentiate them.
You could probably implement The Clapper in software without doing an FFT.
The next level up in audio processing would be something like decoding Touch Tone Dialing. And I think you would need or want an FFT there.
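As a rough illustration of the Touch Tone case, here is a minimal sketch that uses only base MATLAB. The row and column frequencies and the keypad layout are the standard DTMF ones; the sample rate, the tone length, and the synthetic test tone are assumptions made for the example, not anything from the asker's setup.

```matlab
% Sketch: detect a single DTMF (Touch Tone) key with one FFT.
% Assumed for illustration: 8 kHz sample rate, 100 ms synthetic tone.
fs = 8000;  n = 800;                       % 800 samples -> 10 Hz per FFT bin
t  = (0:n-1)'/fs;
rowF = [697 770 852 941];                  % DTMF low-group (row) frequencies, Hz
colF = [1209 1336 1477 1633];              % DTMF high-group (column) frequencies, Hz
keys = ['123A'; '456B'; '789C'; '*0#D'];   % standard keypad layout

x = sin(2*pi*770*t) + sin(2*pi*1336*t);    % test input: the tone pair for key 5

X = abs(fft(x));                           % magnitude spectrum
binOf = @(f) round(f*n/fs) + 1;            % nearest (1-based) FFT bin for a frequency
[~, r] = max(X(binOf(rowF)));              % strongest of the four row tones
[~, c] = max(X(binOf(colF)));              % strongest of the four column tones
key = keys(r, c);
fprintf('Detected key: %s\n', key);
```

Only eight frequencies matter here, so practical decoders often evaluate just those bins (for example with the Goertzel algorithm) instead of a full FFT. A real decoder would also need a threshold to reject silence and speech, which this sketch leaves out.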
Even if you are just trying to respond to short, simple commands: "STOP" "TURN LEFT" and "FIRE", you will probably need the FFT and you may still want to use phonemes.
If you can prerecord and process your target vocabulary, you might get by with correlating
the input spectrogram against the target words. This would work best for a single speaker.
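A minimal sketch of that correlation idea, in base MATLAB only. Everything in it is an assumption for illustration: two-tone test signals stand in for recorded words, all "words" have the same length and are already time-aligned, and the helper names (twoTone, specMag, ncc) are made up for this example.

```matlab
% Sketch: pick the best-matching template by correlating magnitude spectrograms.
% Synthetic tone sequences stand in for prerecorded words (assumption).
fs = 8000;  n = 4000;                      % assumed: 8 kHz, 0.5 s per "word"
t  = (0:n-1)'/fs;
twoTone = @(f1, f2) [sin(2*pi*f1*t(1:n/2)); sin(2*pi*f2*t(n/2+1:end))];

names     = {'STOP', 'TURN LEFT', 'FIRE'};
templates = {twoTone(400, 1200), twoTone(700, 700), twoTone(1500, 300)};

% "Heard" signal: a quieter STOP plus an interfering 2 kHz tone.
heard = 0.7*templates{1} + 0.1*sin(2*pi*2000*t);

% Short-time magnitude spectrum: Hann-windowed frames, one FFT per column.
frameLen = 256;  hop = 128;
nFrames  = floor((n - frameLen)/hop) + 1;
win = 0.5 - 0.5*cos(2*pi*(0:frameLen-1)'/frameLen);
idx = bsxfun(@plus, (1:frameLen)', (0:nFrames-1)*hop);   % sample indices per frame
keepHalf = @(M) M(1:frameLen/2 + 1, :);                  % non-negative frequencies
specMag  = @(x) keepHalf(abs(fft(bsxfun(@times, x(idx), win))));

% Normalized correlation between two equally sized spectrograms.
center = @(A) A(:) - mean(A(:));
ncc    = @(A, B) (center(A)' * center(B)) / (norm(center(A)) * norm(center(B)));

H = specMag(heard);
scores = cellfun(@(w) ncc(H, specMag(w)), templates);
[bestScore, best] = max(scores);
fprintf('Best match: %s (score %.2f)\n', names{best}, bestScore);
```

With real recordings the utterances will differ in length and timing, so you would at least need to trim silence and align the input with each template (dynamic time warping is a common choice) before a comparison like this means much.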
ASKER
I used the code below:
function wavimportandplot(filename)
% Plot every channel of a WAV file against time and save the plot as a PNG.
[data,fs] = audioread(filename);   % use wavread on releases before R2012b
figure('visible','off')
stem(repmat((1:size(data,1))'/fs,1,size(data,2)),data,'marker','none')
xlim([1 size(data,1)]/fs)
xlabel('Time, sec')
print('-dpng','-r300',strrep(filename,'.wav',''))
close gcf
However, the bar, stem, and area plot types didn't satisfy me.
I just need a much more useful one.
How are you generating the input waveforms?
What is the sample rate and resolution?
What sort of processing are you trying to implement here:
stem(repmat((1:size(data,1))'/fs,1,size(data,2)),data,'marker','none')
As near as I can tell, the stem() function just plots the amplitude of each sample with a little circle on top.
ASKER
Another great article.
However, I just want to see my wave files (spoken words) in a format that even a human can tell apart by eye.
ASKER
I still need a clear answer.
Thanks for all the previous ones.
ASKER
Thanks, I am trying to contact Mr. Zue.
The question is resolved and relevant links are provided. I was the only responder.
I recommend a split between the following two posts:
d-glitch https:#a38002614 <== Accept
The author hoped to do speech recognition without any signal processing. My critique was (and still is) correct.
d-glitch https:#a37999009
This is a still-active link to an app note on Isolated Word Recognition (the stated goal) in MATLAB (the author's preferred language).
Search for "speech recognition", "speech spectrogram", or "Victor Zue".
Here are a few examples:
http://www.bcs.rochester.edu/courses/crsinf/561/ARCHIVES/S06/0426/Zue.pdf
http://sipl.technion.ac.il/~rafi/spectrogram%20segmentation.pdf
The basic process is to break the speech into short time segments.
Then you do an FFT on each segment to get frequency information.
And plot the frequency content versus time (this is a spectrogram). There are examples in the second paper.
Then you have to break the spectrogram into phonemes. This is really the guts of the process. It means that you have to know what the spectrogram of each phoneme looks like.
And finally convert the phonemes into words.
If you want to understand the words, that is another level entirely.
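The first three steps above (short segments, an FFT per segment, frequency content versus time) can be sketched in base MATLAB roughly as follows. This is only an illustration under stated assumptions: the input is a synthetic test tone rather than recorded speech, and the sample rate, frame length, overlap, and Hann window are choices made for the example. For a real recording you would load x and fs from the WAV file instead.

```matlab
% Sketch: spectrogram from short-time FFTs, no toolboxes required.
% Assumed for illustration: 8 kHz sample rate, 32 ms frames, 50% overlap.
fs = 8000;                                 % sample rate in Hz
t  = (0:fs-1)'/fs;                         % one second of time stamps
x  = [sin(2*pi*500*t(1:fs/2)); sin(2*pi*1500*t(fs/2+1:end))];  % 500 Hz, then 1500 Hz

frameLen = 256;                            % samples per segment
hop      = 128;                            % step between segment starts
win      = 0.5 - 0.5*cos(2*pi*(0:frameLen-1)'/frameLen);       % Hann window

nFrames = floor((length(x) - frameLen)/hop) + 1;
S = zeros(frameLen/2 + 1, nFrames);        % rows: frequency bins, columns: frames
for k = 1:nFrames
    seg = x((k-1)*hop + (1:frameLen)') .* win;   % step 1: one short time segment
    X   = fft(seg);                              % step 2: FFT of that segment
    S(:,k) = abs(X(1:frameLen/2 + 1));           % keep the non-negative frequencies
end

f      = (0:frameLen/2)' * fs/frameLen;          % frequency axis, Hz
tFrame = ((0:nFrames-1)*hop + frameLen/2)/fs;    % frame centre times, sec

% Step 3: frequency content versus time, shown in dB.
imagesc(tFrame, f, 20*log10(S + eps));
axis xy
xlabel('Time, sec')
ylabel('Frequency, Hz')
colorbar
```

If the Signal Processing Toolbox is available, its spectrogram function does the same job in one call. Shorter frames give better time resolution and worse frequency resolution, so the frame length is worth experimenting with on real speech.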