How does Amazon Echo (Alexa) know what is said to it?

Hello and Good Afternoon Everyone,

            I am wondering how Amazon Echo (Alexa) understands human speech and is able to respond with answers.

            Thanks

            George
GMartin Asked:

William Fulks, Systems Analyst & Webmaster, Commented:
The technology for computers to understand voice commands has been around for YEARS. Basically, the device has a microphone built into it, and when it hears a voice, it processes that info based on frequency patterns to identify words, phrases, numbers, etc. In short, it's a glorified speech-to-text converter that runs the text through a search engine.
N8iveIT Commented:
William is correct; an explanation and an update on the types of models are at:
1. Basic Explanation - https://www.cnet.com/how-to/amazon-echo-alexa-everything-you-need-to-know/
2. The Latest - https://www.cnet.com/how-to/amazon-echo-alexa-everything-you-need-to-know/

Since the "horsepower" of computing increases as the costs decrease (i.e. Moore's Law), more difficult / resource intensive tasks like voice commands / robotics and the like get processed much quicker and some are now instantaneous ... when they used to be either impossible, very slow or only possible on very expensive computers.
William Fulks, Systems Analyst & Webmaster, Commented:
It works the same way your smartphone can respond to voice commands or even identify music. If you think of the physical representation of sounds, it is done in waves with a complex set of measurements that differentiate tones and such, so that the number 4 and the letter X will look different when shown as a waveform. With this info you can create a database saying "word" matches whatever criteria, and then you're just making a search.

This is why people with strong accents will sometimes have issues with voice commands: the device will hear certain words as they are actually spoken and not take the accent into account. Some programs, like Dragon NaturallySpeaking, will learn from you the more you use them, effectively building a custom database based on your own voice and manner of speech. I know some disabled folks who use this for writing, etc.
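To make the "database plus search" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the three-number feature vectors are made up (real recognizers use far richer measurements and statistical models), and `recognize` and `adapt` are hypothetical helper names. It only shows nearest-match lookup and how learning from a speaker can shift a stored pattern.

```python
import math

# Hypothetical "database": each word maps to a made-up feature vector that
# stands in for measurements of how the word sounds.
TEMPLATES = {
    "four": [0.9, 0.1, 0.3],
    "x": [0.2, 0.8, 0.5],
    "stop": [0.5, 0.5, 0.9],
}

def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recognize(features, templates):
    """Return the stored word whose pattern is closest to the input."""
    return min(templates, key=lambda word: distance(features, templates[word]))

def adapt(templates, word, features, rate=0.5):
    """Nudge the stored pattern for `word` toward this speaker's sample."""
    old = templates[word]
    templates[word] = [o + rate * (f - o) for o, f in zip(old, features)]

# A speaker-specific copy, so the shared TEMPLATES stay untouched.
my_templates = {word: list(vec) for word, vec in TEMPLATES.items()}

# Made-up sample of an accented speaker saying "four".
accented_four = [0.5, 0.5, 0.45]
before = recognize(accented_four, my_templates)
adapt(my_templates, "four", accented_four)
after = recognize(accented_four, my_templates)
```

With these made-up numbers, the accented sample is first matched to the wrong word, and after one `adapt` call it matches "four", which is roughly the effect of a program building a custom database for your voice.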
serialband Commented:
They also have a larger database of phrases now to help with more natural speech patterns.  Early versions of Dragon NaturallySpeaking (back in the late 90s) had you... talk... one... word... at... a... time... and... required... a... break... between... each... word....
If you didn't leave pauses, it would mess up.  These days, our computers hold much more data and can process it faster as well.
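A rough Python sketch of why those pauses mattered, assuming the software simply cut the audio into words wherever it saw a long enough stretch of silence. The function name, the loudness numbers, and the thresholds are all made up for illustration; this is not how any particular product was implemented.

```python
def split_on_pauses(energies, silence_threshold=0.1, min_pause_frames=3):
    """Split per-frame loudness values into (start, end) word spans.

    `end` is exclusive. A gap only counts as a word boundary when it lasts
    at least `min_pause_frames` quiet frames in a row.
    """
    segments = []
    start = None      # index where the current word began, if any
    quiet_run = 0     # consecutive quiet frames seen inside a word
    for i, energy in enumerate(energies):
        if energy > silence_threshold:
            if start is None:
                start = i
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            if quiet_run >= min_pause_frames:
                # Long pause: close the word at its last loud frame.
                segments.append((start, i - quiet_run + 1))
                start = None
                quiet_run = 0
    if start is not None:
        # Audio ended mid-word (or during a short trailing pause).
        segments.append((start, len(energies) - quiet_run))
    return segments

# Two words separated by a clear pause, then two words run together
# with only a one-frame gap between them.
frames = [0, 0, 0.5, 0.6, 0, 0, 0, 0.7, 0.8, 0.9, 0, 0.4, 0]
spans = split_on_pauses(frames)
```

Here the clear pause produces a clean split, while the last two bursts of sound are treated as a single "word" because the gap between them is too short, which is the kind of mess-up you got when you didn't leave breaks.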
GMartin (Author) Commented:
Hello and Good Afternoon Everyone,

         Thank you so very much for the enlightening feedback given in reply to my question.  I have to admit that I thoroughly enjoyed reading each person's shared thoughts and certainly did learn a great deal from this participation.

          George
Owen Rubin, Consultant, Commented:
I hate to do this, but sorry, William, that answer is not correct for an Echo device.

An Amazon Echo only listens for its name, which it can usually recognize by simple pattern matching. Until it hears its key word, it throws away all other sounds.
When it hears its name, it records the voice that comes after that until a reasonable pause is heard, and streams that voice clip up to the Amazon servers.
Then the Amazon servers use very fast voice recognition and translation software to do a voice to "word" conversion, creating a string of words they heard. But that is not done on the Echo.
The string of words is then sent to a parser (in Amazon's cloud) to determine a best match to what you asked for.
Then, the Amazon servers send back a series of instructions and voice response info to do what you asked.
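The steps above can be sketched as a toy Python simulation. To be clear, this is only an illustration of the division of labor, not Amazon's code: sounds are represented as already-separated strings, `None` stands in for a pause, and every name here (`capture_request`, `cloud_recognize`, `cloud_parse`, `INTENTS`) is hypothetical.

```python
WAKE_WORD = "alexa"
PAUSE = None  # stand-in for "a reasonable pause" in the audio

# Made-up table of requests the pretend cloud parser knows about.
INTENTS = {
    "weather": "Here is today's forecast.",
    "music": "Playing music.",
}

def capture_request(sounds, wake_word=WAKE_WORD):
    """Device side: discard everything until the wake word is heard,
    then collect what follows until a pause."""
    clip = []
    listening = False
    for sound in sounds:
        if not listening:
            listening = (sound == wake_word)  # all other sounds are thrown away
        elif sound is PAUSE:
            break
        else:
            clip.append(sound)
    return clip

def cloud_recognize(clip):
    """Cloud side (stand-in): turn the voice clip into a string of words."""
    return " ".join(clip)

def cloud_parse(text):
    """Cloud side (stand-in): find the best match for what was asked."""
    words = text.split()
    for keyword, reply in INTENTS.items():
        if keyword in words:
            return {"intent": keyword, "speech": reply}
    return {"intent": None, "speech": "Sorry, I don't know that one."}

def handle(sounds):
    """Full round trip: device captures, cloud recognizes and parses."""
    clip = capture_request(sounds)
    if not clip:
        return None  # wake word never heard, nothing leaves the device
    return cloud_parse(cloud_recognize(clip))

result = handle(["tv", "noise", "alexa", "play", "music", PAUSE, "more", "chatter"])
```

Note that only `capture_request` runs on the device in this sketch; the two `cloud_` functions represent the work done on the servers.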

The actual voice to word conversion does not take place inside the Echo. This allows the processor to be lower speed and power, and allows the full power of high end servers to do the conversion.

Siri and Google do the same thing. The conversion of voice to words is typically not done on the phones. That is why an internet connection is required to use those services. It is also why Amazon Alexa will tell you it cannot understand you when the internet is down.
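A small Python sketch of that last point, under the assumption that the device can do nothing useful with a voice clip on its own. The function names and the wording of the fallback message are made up; the only real piece is Python's built-in `ConnectionError`.

```python
# Made-up wording; the real device's message may differ.
FALLBACK = "Sorry, I can't understand you right now."

def online_cloud(clip):
    """Stand-in for a reachable server that converts the clip to words."""
    return "Heard: " + " ".join(clip)

def offline_cloud(clip):
    """Stand-in for the same call when the network is down."""
    raise ConnectionError("network is down")

def ask(clip, send_to_cloud):
    """Device side: hand the clip to the cloud; without it, give up politely."""
    try:
        return send_to_cloud(clip)
    except ConnectionError:
        # No local speech recognition to fall back on.
        return FALLBACK

with_network = ask(["play", "music"], online_cloud)
without_network = ask(["play", "music"], offline_cloud)
```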
William Fulks, Systems Analyst & Webmaster, Commented:
Owen, his question was how it understands human speech, not where the processing takes place. You are correct that it is a voice-activated interface for a cloud-based application. Same for phones and the like. My answer isn't incorrect, though.
Owen Rubin, Consultant, Commented:
OK, I guess we read that differently. On a basic level, I agree, your answer explains speech to word conversion. But he asked how an Amazon Echo works, and to be clear, the Echo itself does not do the conversion.
serialband Commented:
To the layperson, it doesn't matter.  The modern computer eventually will evolve and we'll call cloud based systems a computer system.
Owen Rubin, Consultant, Commented:
On the other hand, I have been nitpicked to death at times here when not putting in full details and trying to give a simple answer. Sorry, but if you are going to answer a question here, why not be as accurate as possible?  There is a big difference NOW between a device doing all the work, and the device sending the work to the cloud. In this particular case, the answer does not explain why it stops working and understanding when the network is lost. Since points were already awarded I was simply trying to add more accurate details. Sorry if some of you are offended at trying to be more accurate with an answer.
IT-Expert Commented:
I know this question has been marked as SOLVED (so I'm not adding further comment just to try and gain points or anything), but:

@serialband - "The modern computer eventually will evolve and we'll call cloud based systems a computer system."

That's not the modern computer evolving really, and I thought we already DID call 'cloud' systems a computer system.... because that's what they are!
Or have I mis-interpreted the post?
N8iveIT Commented:
@IT-Expert - I think what he's saying is, in the past, we have referred to the computer as the actual hardware, etcetera which is physically in our house. It did all computing without having to "go for assistance outside the room." Then we started offloading specific processes to math coprocessors, video cards, etcetera but they all still resided at the same address, on the same piece of motherboard with no outside assistance. When one says "computer system" today, most non-technical people still only think of the physical box at the physical address as the "system."

In the future, with Alexa being a good example, the systems at my physical address will be "smart enough" to get started and then offload the remaining computing process(es) to a more powerful system at another physical address via "the cloud" (which is nothing more than a consumerized name for remarketing the Internet) then bring the answers back to my physical address. As this becomes "more publicly understood", saying "computer system" will simply mean "the stuff that gets me answers / results regardless of location."

Sorry for the long answer but that's what I heard when he posted that statement ... an evolution of understanding for the non-technical people using the "plug and play devices" like Alexa; plug it in, put it on Wifi and get me my stuff ...