
Need a quick web UI for a great speech recognition API

curiouswebster used Ask the Experts™

I sampled an AWS back-end API (using their demo) and found the quality to be excellent. I have also heard that Google and Watson have great APIs.

But my friend, who can no longer type on a keyboard, cannot find a way to access any of these great APIs.

Can you tell me the names of these services?

Do you know of any consumer-focused front ends that would provide access to these awesome APIs?

If I decide to throw together a quick front-end, which API is easiest to develop with?

Noah, Hardware Tester and Debugger
Hi there! :)

In terms of actually developing against these APIs, I can't say for certain that any one of them is easier than the others. You may refer to the following article for some solid research and good examples.

Reference: https://codeburst.io/html5-speech-recognition-api-670846a50e92
curiouswebster, Software Engineer


Hi, I suppose I also need to enable voice navigation of my website. Otherwise, someone who cannot communicate any other way would have no means of reaching the part of my app that listens for the message the user would like to have translated.

I will leave this question open for a little while to see if anyone has any experience or advice.
Noah, Hardware Tester and Debugger
Yes, you will need to enable voice navigation of the website. This would come in the form of voice recognition code and add-ins. As for recording speech for translation, one solution that could be integrated is the Microsoft Translator Speech API.
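
To make the voice navigation idea a bit more concrete, here is a minimal sketch in JavaScript. It assumes the browser exposes the Web Speech API (`SpeechRecognition`, or `webkitSpeechRecognition` in Chrome); the command phrases and page paths in `COMMANDS` are hypothetical placeholders, not anything from your site. The matching logic is kept separate from the browser wiring so it can be tried without a microphone.

```javascript
// Hypothetical command table: spoken phrase -> page path. Adjust to your own site.
const COMMANDS = {
  "go home": "/",
  "open translator": "/translate",
  "open help": "/help",
};

// Normalize a transcript: lowercase, strip punctuation, collapse whitespace.
function normalize(transcript) {
  return transcript
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Return the path of the first command phrase contained in the transcript, or null.
function matchCommand(transcript, commands = COMMANDS) {
  const spoken = normalize(transcript);
  for (const [phrase, path] of Object.entries(commands)) {
    if (spoken.includes(phrase)) return path;
  }
  return null;
}

// Browser-only wiring; skipped wherever the Web Speech API is unavailable.
if (typeof window !== "undefined") {
  const Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  if (Recognition) {
    const recognition = new Recognition();
    recognition.continuous = true; // keep listening after each phrase
    recognition.lang = "en-US";    // assumed language
    recognition.onresult = (event) => {
      // Use the top alternative of the most recent result.
      const latest = event.results[event.results.length - 1];
      const path = matchCommand(latest[0].transcript);
      if (path) window.location.assign(path);
    };
    recognition.start();
  }
}
```

Substring matching like this is deliberately forgiving ("please open translator" still works), but it would need tightening if two command phrases overlap.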
Developer & EE Moderator
Fellow 2018
Most Valuable Expert 2013
I am confused by what you need.

Are you looking for the front-end UI (HTML, CSS and JavaScript) to make something look good? This will not have anything to do with the back end or talking to the API. It just takes in the input and displays the output.

Or are you asking which API is easiest to work with?  This will have nothing to do with the front end.

Most of the APIs work in a similar fashion: you set up your credentials in an admin panel, then make an API call to get an authorization token. All subsequent calls must pass that token until it expires.
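
As a rough illustration of that token flow, here is a small JavaScript sketch that caches a token and only asks for a new one after the old one has expired. Everything named here is an assumption for illustration: `fetchToken` stands in for whatever call your chosen provider uses to issue a token, the `{ token, expiresInSeconds }` shape is made up, and the `Bearer` header scheme is common but not universal, so check the provider's docs.

```javascript
// Caches a short-lived auth token and refreshes it once it has expired.
// fetchToken is a caller-supplied async function (hypothetical) that must
// resolve to { token: string, expiresInSeconds: number }.
// now is injectable so the expiry logic can be exercised without waiting.
function createTokenProvider(fetchToken, now = () => Date.now()) {
  let cached = null; // { token, expiresAt } with expiresAt in epoch milliseconds

  return async function getToken() {
    if (cached === null || now() >= cached.expiresAt) {
      const { token, expiresInSeconds } = await fetchToken();
      cached = { token, expiresAt: now() + expiresInSeconds * 1000 };
    }
    return cached.token;
  };
}

// Builds the headers to send with each subsequent API call.
// Assumes the API expects a Bearer token in the Authorization header.
async function authHeaders(getToken) {
  return { Authorization: `Bearer ${await getToken()}` };
}
```

Each call to the speech API would then pass `await authHeaders(getToken)` along with the request, and the token request itself only happens when the cached one has run out.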

You can use any language you feel comfortable with. Some APIs allow client-side JavaScript. Just be aware that client-side JS means anybody can see the data you are passing.

Some APIs have language-specific SDKs. Google, for instance, has about 7 (https://cloud.google.com/speech-to-text/docs/quickstart-client-libraries). For PHP, as an example, the quickstart is:
# $client, $config and $audio are created earlier in the quickstart (not shown here)

# Detects speech in the audio file
$response = $client->recognize($config, $audio);

# Print most likely transcription
foreach ($response->getResults() as $result) {
    $alternatives = $result->getAlternatives();
    $mostLikely = $alternatives[0];
    $transcript = $mostLikely->getTranscript();
    printf('Transcript: %s' . PHP_EOL, $transcript);
}

All the HTTP POST requests are being made by cURL in the background via the SDK.

AWS also has SDKs: https://aws.amazon.com/transcribe/resources/?nc=sn&loc=4

With all that said, why not just use Dragon?  https://www.nuance.com/dragon.html

One thing I do not like about Google as far as developing goes: they tend to change their minds and drop APIs at will, or at least change how an API works, meaning you have to keep up with it. Speech recognition is pretty important and widely used in mobile, so I would think it has a better chance of staying alive. But I have been burned before with Google.

The best thing to do is give both a try and see how they work for you. Sometimes diving in like that makes for an easy decision on which way to go.
curiouswebster, Software Engineer


Noah, Hardware Tester and Debugger

You're welcome! Glad I was of help :)