Solved

Detect language of a string

Posted on 2013-05-20
Last Modified: 2013-05-20
Hello,

Is it possible to detect the language of a string?

For example:

"Hello" -> English
"über" -> German
"καλημέρα" -> Greek

I understand that words from another language that use only English letters (e.g. "Guten Tag") cannot be identified this way. I am asking only about strings that contain characters specific to a language, like the examples above.

Thank you very much!
Question by:infodigger
2 Comments
 

Assisted Solution

by:ienaxxx
ienaxxx earned 250 total points
ID: 39180973
AFAIK there should be something for this in the Google Translate API.
Not sure if you were looking for something like this...
HTH

Accepted Solution

by:Ray Paseur
Ray Paseur earned 250 total points
ID: 39181018
Each language has a "signature" that can be detected from its vocabulary. However, detection is not 100% accurate, and the risk of error goes way up on short strings because the orthography evident in a short string is rarely unique. The Google API occasionally suggests that my computer code is written in Dutch, etc. I think letter-only detection from a single word would be nearly useless except for a very few languages. For example, the U-umlaut (diaeresis) appears in the Hungarian, Karelian, Turkish, Uyghur Latin script, Estonian, Azeri, Turkmen, Crimean Tatar and Tatar Latin alphabets, as well as in German.

See http://en.wikipedia.org/wiki/Language_identification
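
To make the ambiguity concrete, here is a minimal Python sketch of character-only detection. It is an illustration under assumptions, not a real language detector: the function names `scripts_in` and `guess_languages` are made up for this example, the script is inferred from the first word of each character's Unicode name (a rough heuristic, not the official Unicode Script property), and the candidate list for "ü" simply repeats the languages named above. It shows that a string in Greek script can be narrowed to one likely language, while a string containing "ü" cannot.

```python
import unicodedata

# Languages named in the answer above as using "ü".
# Illustrative only; the list is not exhaustive.
U_DIAERESIS_LANGUAGES = [
    "German", "Hungarian", "Karelian", "Turkish", "Uyghur (Latin script)",
    "Estonian", "Azeri", "Turkmen", "Crimean Tatar", "Tatar (Latin script)",
]


def scripts_in(text):
    """Return the set of script names used by the letters in text.

    Heuristic (an assumption of this sketch): the first word of a
    character's Unicode name is treated as its script, e.g.
    "GREEK SMALL LETTER ALPHA" -> "GREEK".
    """
    scripts = set()
    for ch in unicodedata.normalize("NFC", text):
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


def guess_languages(text):
    """Return candidate languages based on characters alone.

    An empty list means the characters give no evidence at all.
    """
    text = unicodedata.normalize("NFC", text)
    if scripts_in(text) == {"GREEK"}:
        # Greek script is used almost exclusively for Greek.
        return ["Greek"]
    if "ü" in text.lower():
        # One letter, many alphabets: the guess stays ambiguous.
        return list(U_DIAERESIS_LANGUAGES)
    # Plain Latin letters: could be English, German, Dutch, ...
    return []


if __name__ == "__main__":
    for sample in ["Hello", "über", "καλημέρα"]:
        print(sample, "->", guess_languages(sample) or "unknown")
```

For anything longer than a word or two, a statistical approach such as the ones described in the Wikipedia article will do much better than a character lookup like this.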