Text to Speech for Windows CE using .NET Compact Framework OR eVB

Is it possible to create a simple text-to-speech application for Windows CE using the .NET Compact Framework or Embedded Visual Basic? I have SAPI 5.1 installed on my computer, and I am using it successfully in my VB.NET and VB projects to build text-to-speech applications for the PC. However, I couldn't find a compact DLL that I could reference from within my .NET Compact Framework project. Am I missing something, or is there a special version of the Speech SDK for Windows CE? I have searched the net and MSDN, but all I found was a reference to 'SAPI 5.0 for Windows CE .NET', with no information on how to obtain the SAPI 5.0 SDK or how to create Pocket PC applications with it. It would be a great help if anyone could provide sample code in VB.NET or Embedded Visual Basic (eVB).

Thanks in advance,

apss (Author) commented:
Thanks for the help, but I have already visited the links you provided. The first link leads to a discontinued project, whereas the second and third links provide source code for Win32 text-to-speech applications. I need source code for a 'Windows CE' text-to-speech application.


Compact Framework
i did this because the story for speech on devices is pitiful. PPC 2003 and SP 2003 devices follow the SAPI Lite interface. it is the MS standard interface for both Speech Recognition and Text To Speech (a lighter version of what SAPI is on the desktop). sadly, retail devices are not shipped with SR engines or TTS voices. even crappier, there are SAPI implementations that come with Platform Builder. so if you have a device that you can re-image, then you can image it to do SR and TTS. of course the people with retail devices are SOL. there are 3rd party implementations for doing speech on devices, but they are rag-tag at best. one vendor might offer SR only, while another only has TTS. trying to find one that offers both AND follows the SAPI interface is a chore. when looking around, i had a hell of a time finding a vendor that offered a free evaluation as well. so the state of the industry for speech on devices is crap ...

... so i finally got sick of bitching and just decided to port the above to the Compact Framework (CF). it actually works better than i expected [there is a video of it below]. from the time text is entered to the time that it 1st starts playing takes about a second. since it takes a while for spoken text to be played, you could queue up longer passages and do the processing in chunks over time. the real performance drag is loading the databases. on my HP4355, the 3.5 meg lexicon database took 2 minutes to load. the 7.5 meg 8 bit voice database takes 8 minutes. that is with CFv1 ... i'm not sure which service pack i have at the moment? luckily, the databases only have to be loaded once, then it can be used to speak large passages of text (or write to a file). would love to try this out on one of those Dell 600mhz beasts along with CFv2. other means could also be taken to improve performance. Flite used the technique of compiling the databases into the actual codebase ... that would work too
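The idea of queuing up a longer passage and processing it in chunks can be sketched roughly as follows. This is only an illustration in Python, not the article's Compact Framework code: the function name `split_into_chunks` and the `max_chars` limit are assumptions made for the example. It splits text at sentence boundaries and packs sentences into a queue of short chunks, so a synthesizer could start speaking the first chunk while later ones are still waiting.

```python
import re
from collections import deque


def split_into_chunks(text, max_chars=80):
    """Split text into a queue of chunks made of whole sentences.

    max_chars is a hypothetical tuning knob: smaller chunks mean the
    first audio is ready sooner, at the cost of more synthesis calls.
    A single sentence longer than max_chars is kept whole.
    """
    # Break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = deque()
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # Start a new chunk when adding this sentence would exceed the limit.
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    queue = split_into_chunks(
        "Hello world. This is a test! Is it working? Yes.", max_chars=30
    )
    while queue:
        # A real program would hand each chunk to the synthesizer here.
        print(queue.popleft())
```

A consumer would pop chunks from the left of the queue and synthesize them one at a time, which keeps the delay before the first audio roughly constant regardless of passage length.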

to reduce the footprint and increase performance, you could also switch to using phonemes instead of diphones ... which i tried. wrote a program to chop up the WAV format of the diphones and then reassemble those parts into the 41 phonemes (40 phonemes + pause). instead of 7.5 megs for the voice database, it reduces down to ~200KB of raw WAV files. with the WAV files you don't have to load the database, so that saves 8 minutes. concatenation is also faster since the LPC algorithm, which was using floating point arithmetic, does not have to be performed. using phonemes instead of diphones will definitely reduce quality ... but it is still recognizable, albeit more speak-and-spell'ish. it is more robust because it does not have the problem of missing rare diphones like the FreeTTS voice database does. now the only performance problem is the 2 minutes it takes to load the diphone dictionary. attempted to get past this by dumping the 120K words into a SqlCE database. this gets rid of the 2 minute load time, but it takes a couple seconds to read from the database with that many entries. might be worth trying this with SqlMobile, to see what its performance is
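Plain concatenation of per-phoneme WAV recordings, with no LPC smoothing, might look something like the sketch below. It is written in Python with the standard `wave` module purely for illustration and is not the article's code. The phoneme labels ("HH", "AY", "pau"), the helper `make_wav`, and the tiny synthetic recordings are all assumptions standing in for the real phoneme files.

```python
import io
import wave


def make_wav(frames, rate=8000):
    """Build a tiny mono 8-bit WAV in memory.

    Placeholder for a real phoneme recording loaded from disk.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(1)
        writer.setframerate(rate)
        writer.writeframes(frames)
    return buf.getvalue()


def concatenate_phonemes(phonemes, recordings):
    """Join the recordings for a phoneme sequence into one WAV (as bytes).

    recordings maps a phoneme label to WAV file bytes. All recordings
    must share the same channel count, sample width and sample rate.
    """
    if not phonemes:
        raise ValueError("need at least one phoneme")
    out = io.BytesIO()
    expected = None
    with wave.open(out, "wb") as writer:
        for name in phonemes:
            with wave.open(io.BytesIO(recordings[name]), "rb") as reader:
                fmt = (
                    reader.getnchannels(),
                    reader.getsampwidth(),
                    reader.getframerate(),
                )
                if expected is None:
                    # The first recording decides the output format.
                    expected = fmt
                    writer.setnchannels(fmt[0])
                    writer.setsampwidth(fmt[1])
                    writer.setframerate(fmt[2])
                elif fmt != expected:
                    raise ValueError(f"format mismatch in phoneme {name!r}")
                # Append the raw samples back to back.
                writer.writeframes(reader.readframes(reader.getnframes()))
    return out.getvalue()


if __name__ == "__main__":
    recordings = {
        "HH": make_wav(b"\x80" * 10),
        "AY": make_wav(b"\x90" * 20),
        "pau": make_wav(b"\x80" * 5),
    }
    speech = concatenate_phonemes(["HH", "AY", "pau"], recordings)
    with wave.open(io.BytesIO(speech), "rb") as result:
        print(result.getnframes(), "frames")
```

Because the samples are simply appended, there are audible seams between phonemes, which is consistent with the more robotic sound described above.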

speech w/phonemes
1 + 3 = 4
casey chesnut
hello world
subliminal message

so this showed how to create a dead simple speech synthesis program. was able to get it to run on my Pocket PC with decent performance. in each scenario, it ran much better than i expected (considering performance and quality). the speech is monotonous and robotic sounding, but it is definitely recognizable as english speech. these sort of apps make perfect sense in mobile devices. envision walking around with a bluetooth Pocket PC in your pocket. it could connect to a bluetooth GPS and then report the direction you should travel to a bluetooth headset you are wearing. that scenario also makes sense in a car. multi-language scenarios involve typing text in your language and having it spoken in an alternate language. you could also do that in a learning scenario to learn another language. also, if you've used VoiceCommand on your mobile device, wouldn't you like to provide similar functionality in your own apps? or games! how about being able to create diphones from your own voice, and then auto generate podcasts from your textual blog posts that would sound basically the same as your own speaking voice (somebody tell Scoble i said podcast). etc. etc. for more ideas, Richard Sprague recently posted about : Cool demos for the next SAPI. the next SAPI ... why am i just now hearing about this? would love to get an alpha ...

[video, 1.1 megs]



apss (Author) commented:
That was a nice article. It provides a lot of information on creating your own text-to-speech engine. But right now I am out of time and need a ready-made text-to-speech library that I can use in my Pocket PC project. Had the writer provided some source code, I would have modified it for use in my project. I do not have enough time to create one from scratch. Thanks for the help.

I use a C++ application built on the libraries of a commercial TTS engine.
I haven't tested SAPI 5.0. I know that Platform Builder lets you add those libraries to an embedded device, but I don't know how to add them to a PDA or another mobile device.

The C++ application sits waiting for another application to send it a sentence to speak. I use sockets to communicate with it.

I don't think this helps you much, and I don't know anybody who uses those libraries, though that is only my opinion.
I really don't know those libraries myself.

If possible, I would suggest using a commercial TTS such as Loquendo or ScanSoft. These are libraries meant to be used from C++, but you could use sockets to communicate with a TTS application built on them.
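The socket arrangement described above could be sketched like this. It is a minimal Python illustration of the pattern only, not the C++ application or any vendor's API: `speak` is a hypothetical stand-in for the real call into a TTS library, and the one-sentence-per-line protocol is an assumption. The demo uses `socket.socketpair()` so it runs in a single process; two separate applications would use the same functions over a TCP connection instead.

```python
import socket
import threading


def speak(sentence):
    """Hypothetical stand-in for the call into a real TTS engine."""
    return f"[spoken] {sentence}"


def handle_client(conn, spoken):
    """TTS side: wait for newline-terminated sentences and speak each one."""
    with conn, conn.makefile("r", encoding="utf-8") as lines:
        for line in lines:  # blocks until a sentence arrives or the peer closes
            sentence = line.strip()
            if sentence:
                spoken.append(speak(sentence))


def send_sentences(sock, sentences):
    """Client side: send one sentence per line."""
    for sentence in sentences:
        sock.sendall((sentence + "\n").encode("utf-8"))


def demo():
    tts_end, client_end = socket.socketpair()
    spoken = []
    worker = threading.Thread(target=handle_client, args=(tts_end, spoken))
    worker.start()
    with client_end:
        send_sentences(client_end, ["hello world", "turn left at the next street"])
    # Closing the client end signals end-of-input to the TTS side.
    worker.join()
    return spoken


if __name__ == "__main__":
    for line in demo():
        print(line)
```

The appeal of this design is that the calling application, for example one written for the .NET Compact Framework, only needs to open a socket and write text; it never has to link against the TTS library itself.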
apss (Author) commented:
Yes, there are even free, open source TTS libraries such as "Flite", which have a small footprint and were intended to work on embedded devices. The problem is that it is available as Linux binaries or as VC++ code, which needs to be modified and recompiled as a .NET Compact Framework DLL to work on Windows CE .NET, and I do not know a thing about VC++. It is available at http://www.speech.cs.cmu.edu/flite/packed/flite-1.2/flite-1.2-release.tar.gz . Can you please check it out for me?
I've been inspecting the code, and it's very complicated.

All the libraries are built using makefiles for Linux, so the first thing you have to do is rewrite the makefiles with the equivalent Windows commands.
Since Flite is a set of libraries for Linux, I THINK you would also have to modify the low-level functions to make them compatible with Windows, and that could be hard work (I'm not sure about this; change the makefiles first).
I know which main functions to call to start using the TTS engine with phrases or files, but I can't help you more than that.

In the main /flite-1.2-release directory there is a file named makefile. If you open it with a text editor, you will see the commands that it runs. These are Linux commands (ls, ln, tar, sed, ...), and you must change them first.

After that you can see whether the program is compatible with Windows.

As you can see, it's hard work.
apss (Author) commented:
Yeah, hard luck. Thanks a lot anyway. So, does that mean I have no option other than using commercial libraries?
No, I'm only saying that commercial TTS engines are easier to use and, I think, have better support (datasheets, ...).
I think this TTS can't be used by anybody in .NET CF (Windows). I don't know of any free TTS for Windows CE .NET, but if you find one, it could be a possible solution.

Good luck.