fatihbarut
asked on
Wav to Text program using SAPI 5 and Delphi
Hi,
I want to convert WAV files into text using SAPI 5.1. It should be easy, because instead of sound captured from a microphone I will use sound recorded elsewhere.
Microsoft says this is possible with SAPI 5.1; however, I couldn't find out how to do it after completing these steps:
- Installing SAPI 5.1
- Installing the SAPI components into Delphi
Briefly, I need answers to these 2 questions:
- Which SAPI components should I use for this?
- Which methods should I use, and how?
Thank you
ASKER CERTIFIED SOLUTION
This solution is only available to members.
Hi, did you import the type library as I described? If yes, then check that your unit lists these units in its uses clause:
uses
Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms, Dialogs,
ActiveX, OleServer, ExtCtrls;
ASKER
By the way, I realized something: I am using Windows 7, so my SAPI version is 5.4. What should I do now?
Uninstall it somehow and reinstall 5.1?
ASKER
Pardon, after adding ActiveX, OleServer, ExtCtrls to the uses clause the problem was solved. However, when I call the "Start" procedure nothing happens. How can I write the recognized stream into a text file or into a memo?
SOLUTION
This solution is only available to members.
ASKER
The language I use is English; I would be happy to learn its language code.
Also, I got an "Undeclared identifier" error for 'Grammar'.
Thanks again
ASKER
Sorry, my fault.
I added Grammar: ISpeechRecoGrammar; to the private section.
However, this time when I execute the Start procedure I get an "OLE error 80045001" message.
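A note on reading the error number: the value in an "OLE error" message is a COM HRESULT, and it can be split into its severity, facility and code fields. The console program below is a minimal sketch of that split for the value reported above; the program name and variable names are illustrative only. As far as I know, facility 4 is FACILITY_ITF, which SAPI uses for its own errors with codes starting at $5000, and the SAPI header sperror.h lists $80045001 as SPERR_UNINITIALIZED; please check this against your SDK headers rather than taking it as certain.

program DecodeSapiError;
{$APPTYPE CONSOLE}
uses
  SysUtils;

const
  // Value taken from the "OLE error 80045001" message in this thread.
  ReportedCode: Cardinal = $80045001;

var
  Failed: Boolean;
  Facility, Code: Cardinal;
begin
  Failed   := (ReportedCode and $80000000) <> 0; // bit 31: severity (1 = failure)
  Facility := (ReportedCode shr 16) and $1FFF;   // bits 16..28: facility
  Code     := ReportedCode and $FFFF;            // bits 0..15: facility-specific code
  Writeln('Failed:   ', Failed);
  Writeln('Facility: ', Facility);               // 4 for this value
  Writeln('Code:     $', IntToHex(Code, 4));     // $5001 for this value
end.

The decoded fields only identify which component raised the error; they do not say which object in the form was not initialized.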
ASKER
This is where I currently stand.
With the help of twinsoft's answers and the 2 articles linked below, I have produced working code, which I have added below:
http://edn.embarcadero.com/article/29583
and
http://www.delphi3000.com/articles/article_2629.asp
However, the results are awful. I need to improve sensitivity and accuracy.
For example
The real speech is:
"Welcome to the turbo power happy voice example program, press any key on your phones touch pad to start program"
The conversion is:
"Welcome to the cattle polish up where it can't be politically ample program but any key on your own cut that to start a program"
Any further help will be much appreciated.
unit AltYazarP;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, SpeechLib_TLB, OleServer, StdCtrls, ActiveX, ExtCtrls;

type
  TForm1 = class(TForm)
    SR: TSpInProcRecoContext;
    FileStream: TSpFileStream;
    Button1: TButton;
    Hipotezler: TMemo;   // "hypotheses": interim guesses from the engine
    OpenDialog1: TOpenDialog;
    Label1: TLabel;
    Label2: TLabel;
    Taninanlar: TMemo;   // "recognized": final recognition results
    Button2: TButton;
    procedure SRFalseRecognition(ASender: TObject; StreamNumber: Integer;
      StreamPosition: OleVariant; const Result: ISpeechRecoResult);
    procedure SREndStream(ASender: TObject; StreamNumber: Integer;
      StreamPosition: OleVariant; StreamReleased: WordBool);
    procedure Button1Click(Sender: TObject);
    procedure SRRecognition(ASender: TObject; StreamNumber: Integer;
      StreamPosition: OleVariant; RecognitionType: TOleEnum;
      const Result: ISpeechRecoResult);
    procedure FormCreate(Sender: TObject);
    procedure SRHypothesis(ASender: TObject; StreamNumber: Integer;
      StreamPosition: OleVariant; const Result: ISpeechRecoResult);
    procedure Button2Click(Sender: TObject);
  private
    { Private declarations }
    Grammar: ISpeechRecoGrammar;
    procedure Start;
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure TForm1.FormCreate(Sender: TObject);
begin
  // Create a grammar on the in-process context and switch on dictation.
  Grammar := SR.CreateGrammar(0);
  Grammar.DictationSetState(SGDSActive);
end;

procedure TForm1.Start;
begin
  // Let the user pick a WAV file and feed it to the recognizer
  // in place of the microphone.
  if OpenDialog1.Execute then
  begin
    FileStream.Open(OpenDialog1.FileName, SPFM_OPEN_READONLY, False);
    SR.Recognizer.AudioInputStream := FileStream.DefaultInterface;
  end;
end;

procedure TForm1.SRFalseRecognition(ASender: TObject;
  StreamNumber: Integer; StreamPosition: OleVariant;
  const Result: ISpeechRecoResult);
begin
  // ShowMessage('Cannot recognize');
end;

procedure TForm1.SREndStream(ASender: TObject; StreamNumber: Integer;
  StreamPosition: OleVariant; StreamReleased: WordBool);
begin
  // The whole file has been processed; release it.
  FileStream.Close;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  Start;
end;

procedure TForm1.SRRecognition(ASender: TObject; StreamNumber: Integer;
  StreamPosition: OleVariant; RecognitionType: TOleEnum;
  const Result: ISpeechRecoResult);
begin
  // Append the final text of the recognized phrase to the results memo.
  Taninanlar.Text := Taninanlar.Text + Result.PhraseInfo.GetText(0, -1, True);
end;

procedure TForm1.SRHypothesis(ASender: TObject; StreamNumber: Integer;
  StreamPosition: OleVariant; const Result: ISpeechRecoResult);
begin
  // Append each interim hypothesis to the second memo.
  Hipotezler.Text := Hipotezler.Text + Result.PhraseInfo.GetText(0, -1, False);
end;

procedure TForm1.Button2Click(Sender: TObject);
begin
  // erased garbage
end;

end.
Hi, how well words are recognized depends on the engine itself. The only thing you can do is use grammar rules to help the engine perform better, but this is not easy, as you have to create something like a vocabulary...
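To make the "vocabulary" idea concrete, here is a rough sketch of a command-style grammar built from a fixed word list, instead of free dictation. It assumes the imported SpeechLib_TLB unit exposes the SAPI automation interfaces under their usual names (ISpeechRecoContext, ISpeechGrammarRule, AddWordTransition and so on); the unit name SapiGrammarSketch, the function CreateWordListGrammar and the rule name 'ExpectedWords' are made up for illustration. It has not been tested against the form posted above, and it only helps when the set of expected words is small and known in advance.

unit SapiGrammarSketch;

interface

uses
  SpeechLib_TLB; // unit generated by importing the SAPI type library

// Hypothetical helper: builds and activates a one-rule grammar
// that accepts exactly the words passed in.
function CreateWordListGrammar(const Context: ISpeechRecoContext;
  const Words: array of WideString): ISpeechRecoGrammar;

implementation

function CreateWordListGrammar(const Context: ISpeechRecoContext;
  const Words: array of WideString): ISpeechRecoGrammar;
var
  Rule: ISpeechGrammarRule;
  PropValue: OleVariant;
  i: Integer;
begin
  // Use a grammar id different from the dictation grammar (0 in the form).
  Result := Context.CreateGrammar(1);
  // A top-level, dynamic rule can be filled in at run time.
  Rule := Result.Rules.Add('ExpectedWords', SRATopLevel or SRADynamic, 1);
  PropValue := '';
  // One transition per word, from the rule's initial state to its end (nil).
  for i := Low(Words) to High(Words) do
    Rule.InitialState.AddWordTransition(nil, Words[i], ' ', SGLexical,
      '', 0, PropValue, 1.0);
  Result.Rules.Commit;
  Result.CmdSetRuleState('ExpectedWords', SGDSActive);
end;

end.

A possible call from the form would be CmdGrammar := CreateWordListGrammar(SR.DefaultInterface, ['one', 'two', 'three']); where CmdGrammar is a field you add next to Grammar so that the interface stays alive. With such a grammar active in place of dictation, the engine only has to choose between the listed words, which is usually where the accuracy gain comes from.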
ASKER
Thanks again. However, I didn't completely understand what you said. How can I change the grammar rules? Is there more to do besides setting the parameters?
If there are parameters for grammar rules, I'd be happy to hear about them.
(For twinsoft) I have 4 more questions similar to this one, but comparatively easy ones:
Export M.S Speech Engine 5.1 (SAPI 5.1) and/or SAPI 4 library to another computer to save my recorded words
https://www.experts-exchange.com/questions/25100009/Export-M-S-Speech-Engine-5-1-SAPI-5-1-and-or-SAPI-4-library-to-another-computer-to-save-my-recorded-words.html
Convert .wav (sound) to Graphic (jpg etc.) using delphi: Wav to binary -> Interperate binary-> Number to points-> Join points and draw
https://www.experts-exchange.com/questions/25100383/Convert-wav-sound-to-Graphic-jpg-etc-using-delphi-Wav-to-binary-Interperate-binary-Number-to-points-Join-points-and-draw.html
Speech Recognation program using Delphi and Matlab
https://www.experts-exchange.com/questions/25100036/Speech-Recognation-program-using-Delphi-and-Matlab.html
Simple Speech Recognation program, which recognise numbers between 1-100 using Delphi (in turkish language)
https://www.experts-exchange.com/questions/25100689/Simple-Speech-Recognation-program-which-recognise-numbers-between-1-100-using-Delphi-in-turkish-language.html?fromWizard=true
ASKER
Thanks, it was good to work with you...
ASKER
Secondly, I am sorry, but I have a problem with my Delphi 7:
it says [Error] Unit1.pas(14): Undeclared identifier: 'TOleEnum' for the procedure below...
procedure TForm1.OnSRRecognition(Sen