Stromhau

asked on

Spoken language identification

Hi,

In designing a spoken language identification system, one often makes use of phonemes and hidden Markov models. How is this done? I have multilingual speech samples labeled with phonemes.
Do I train the HMM on the phonemes? I would be grateful for an explanation.

Tommy,
d-glitch

Certainly an active research area.  You should be able to find material online:

For example:  http://www.cs.brown.edu/research/ai/dynamics/tutorial/Documents/HiddenMarkovModels.html

Are you just trying to identify the language [English, German, Mandarin] or do speech recognition as well?
Stromhau

ASKER

Yes, I have been studying some articles, but it seems that how they use phonemes is so well known that they don't explain the actual process. I don't know the process, but I know that they use labeled speech files and use the phoneme data to train the model (HMM). I think maybe they extract all the phonemes and extract some features like MFCCs and formants for training. Then they apply a phoneme recognizer to the input samples and compare the result to the phonemes in the model.

To clarify:

In another project, I classified emotional speech. There I took the speech samples and extracted features like pitch, formants, etc.
For each sample (angry, neutral, bored) I got a vector containing the features. Then I used an SVM to classify the different emotions.
I don't understand how to use phonemes in an LID system.
And yes, I won't be using speech recognition here. Some language identification systems do use speech recognition to classify the language, but that's not my intention.

It could be that if I had understood how speech recognition uses phonemes, I would have understood this problem.

Tommy,
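
For readers following along, here is a minimal sketch of one common reading of the process guessed at above: a phoneme recognizer (often HMM-based) turns each utterance into a phoneme sequence, one statistical model of phoneme order is trained per language from the labeled data, and an unknown utterance is assigned to the language whose model scores its sequence highest. This general idea is often called phone recognition followed by language modeling. Everything in the block is an assumption made for illustration: the toy phoneme labels, the language names lang_a and lang_b, the helper function names, and the choice of add-one smoothed bigram models with equal language priors. It is not taken from any reply in this thread, and the recognizer itself is not shown.

```python
import math
from collections import Counter

START = "<s>"  # hypothetical start-of-utterance marker


def train_bigram_model(sequences, vocab):
    """Return a function giving add-one smoothed log P(next | prev)."""
    pair_counts = Counter()
    prev_counts = Counter()
    for seq in sequences:
        padded = [START] + list(seq)
        for prev, nxt in zip(padded, padded[1:]):
            pair_counts[(prev, nxt)] += 1
            prev_counts[prev] += 1
    vocab_size = len(vocab)

    def log_prob(prev, nxt):
        # Add-one smoothing keeps unseen phoneme pairs from scoring zero.
        return math.log((pair_counts[(prev, nxt)] + 1)
                        / (prev_counts[prev] + vocab_size))

    return log_prob


def score(log_prob, seq):
    """Log-likelihood of a phoneme sequence under one language's model."""
    padded = [START] + list(seq)
    return sum(log_prob(p, n) for p, n in zip(padded, padded[1:]))


def identify_language(models, seq):
    """Pick the language whose model gives the sequence the highest score."""
    return max(models, key=lambda lang: score(models[lang], seq))


# Toy labeled data (made up): phoneme sequences per language.
training = {
    "lang_a": [["k", "a", "t", "a"], ["t", "a", "k", "a"], ["k", "a", "k", "a"]],
    "lang_b": [["s", "t", "r", "i"], ["s", "t", "i", "r"], ["t", "r", "i", "s"]],
}
vocab = sorted({p for seqs in training.values() for seq in seqs for p in seq})
models = {lang: train_bigram_model(seqs, vocab) for lang, seqs in training.items()}

# In a real system this sequence would come from the phoneme recognizer.
print(identify_language(models, ["t", "a", "k", "a"]))
```

The difference from the SVM setup described above is that the classifier does not see one fixed-length feature vector per sample. It sees a variable-length phoneme sequence, and the per-language models capture which phoneme combinations are typical of each language.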
Several billion dollars have been spent trying to get a usable speech recognition system.   Not much luck so far.

What gives you the confidence to assume that if you can't do it in 30 years and billions of dollars and lots of directed effort, that there's a good chance a computer will just "learn" how to do it?

You will probably have a better chance of telling "Sayonara" from "Mamma mia", but maybe not a whole lot better.



|Several billion dollars have been spent trying to get a usable speech recognition system.   Not much luck so far.

|What gives you the confidence to assume that if you can't do it in 30 years and billions of dollars and lots of directed effort, that there's a good chance a computer will just "learn" how to do it?

|You will probably have a better chance of telling "Sayonara" from "Mamma mia", but maybe not a whole lot better.

You must be kidding me!
Didn't you read my post before answering? I am not going to build a speech recognition system, but a language identification system. It can be made without speech recognition.
What you wrote isn't what I asked for. If you can't answer the question, please do not comment.

Tommy,
ASKER CERTIFIED SOLUTION
neopolitan

This solution is only available to members.
I'd like to add another simple reference for people who are just beginning in this area and would like to get some basic knowledge.  Look at
http://www.eecg.toronto.edu/~aamodt/ece341/speech-recognition/