esak2000

Program to determine language of text

Is there a program that can be used to determine the language of a text programmatically?
I would like to be able to pass some text as a parameter and get back the language of the text, similar to the way that Google's language detect works on the web.

I sent an email to Likasoft for their Polyglot 3000 software, but didn't get a response.

I'm using MS Visual FoxPro as my programming language.

Pavel Celba

So, why don't you use http://www.google.com/uds/samples/language/detect.html from FoxPro? It should be feasible to start a browser or a browser OLE control, pass the text in, and read the result...

Then you may continue on many links here: http://tnlessone.wordpress.com/2007/05/13/how-to-detect-which-language-a-text-is-written-in-or-when-science-meets-human/ 
esak2000

ASKER

"So, why don't you use http://www.google.com/uds/samples/language/detect.html from FoxPro? It should be feasible to start browser or browser OLE control, propagate the text, and read the result..."

I tried that (and using Google's tool is my preference), but the HTML result is not viewable in IE, so FoxPro can't 'see' the result. Can you view it?

I'll take a look at your links

Thank you!
Hi CaptainCyril,

I tried using the API, but the html results of the page show me:

 var container = document.getElementById("detection");
  container.innerHTML = text + " is: <b>" + language + "</b>";

So I can see the detected language on the rendered web page, but the page source only shows the variable name 'language', not its value. Do you know how I can read the value of the language variable from the HTML of the response?

Thank you,

Daniel
ASKER CERTIFIED SOLUTION
Cyril Joudieh

This solution is only available to members.
I tried that, but it doesn't work. Is there some way to force the variable to a file?
You can also fill the variable in a hidden variable or textbox and then read the value of the textbox from VFP.
"You can also fill the variable in a hidden variable or textbox and then read the value of the textbox from VFP."

Can you write the basic code for that?
Are you able to change the JavaScript?

document.getElementById('txtLanguage').value = language
I found the HTML in the innerText, as you suggested earlier!
The problem was that the page reported 'complete' before the JavaScript had finished running, so the result was not there yet when my code read it. I adjusted the code to wait, and now it works.
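
For readers who hit the same timing issue, here is a minimal sketch of one way to wait, assuming the page is driven through the InternetExplorer.Application automation object and that the script writes its result into the element with id "detection" (as in the snippet quoted above). The 10-second timeout and the variable names are illustrative assumptions, not Daniel's actual code.

```foxpro
* Sketch only: wait for the document, then poll until the script has filled the element.
LOCAL loIE, loNode, lcResult, lnStart
loIE = CREATEOBJECT("InternetExplorer.Application")
loIE.Navigate("http://www.google.com/uds/samples/language/detect.html")

* ReadyState 4 means the document itself is loaded; scripts may still be running.
DO WHILE loIE.Busy OR loIE.ReadyState <> 4
   DOEVENTS FORCE
ENDDO

* Poll the result element until it has text, giving up after 10 seconds (assumed limit).
lcResult = ""
lnStart = SECONDS()
DO WHILE EMPTY(lcResult) AND SECONDS() - lnStart < 10
   loNode = loIE.Document.getElementById("detection")
   IF NOT ISNULL(loNode)
      lcResult = NVL(loNode.innerText, "")
   ENDIF
   DOEVENTS FORCE
ENDDO

loIE.Quit()
? lcResult
```

Polling for the content itself, rather than relying on ReadyState alone, avoids reading the page before the asynchronous script callback has run.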

Thanks so much for your help!

Daniel
You are welcome.
It is rather inelegant to drive IE and JavaScript and extract the result from an HTML page, since there is a version of the Google API for non-web/non-JavaScript users: http://code.google.com/intl/en/apis/ajaxlanguage/documentation/reference.html#_intro_fonje

You can call it, for example, in the following way. The text must be UTF-8 encoded, e.g. with tcUTF = STRCONV("the text string", 9), and then URL-encoded.

If you want to stay with the JavaScript version, also take a look at http://west-wind.com/weblog/posts/493536.aspx
It shows in general how to call JavaScript from VFP and even get a result back. You could use this to read the value of the language JavaScript variable more directly, instead of first writing it to the HTML document and then extracting it from there.

It won't matter much for performance, since the main bottleneck is the web request itself, but it may come in handy.

Bye, Olaf.
o = CREATEOBJECT("Microsoft.XMLHTTP")
o.open("GET","http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&q=This%20is%20a%20test")
o.send()
? o.responseText


Something was missing: you also need to wait until o.readyState = 4 before you can access o.responseText.

Bye, Olaf.

o = CREATEOBJECT("Microsoft.XMLHTTP")
o.open("GET","http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&q=This%20is%20a%20test")
o.send()
do while o.readystate<>4
 doevents force
enddo
? o.responseText
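
The response comes back as JSON text, so one more step is needed to get just the language code. The following is a minimal sketch using STREXTRACT; the sample response layout and the helper name ExtractLanguage are assumptions for illustration and are not taken from the thread, so check them against what o.responseText actually contains.

```foxpro
* Sample response text; the field layout is an assumption, not confirmed by the thread.
LOCAL lcJson
lcJson = '{"responseData": {"language":"en","isReliable":false,"confidence":0.11},' + ;
         ' "responseDetails": null, "responseStatus": 200}'
? ExtractLanguage(lcJson)

* In the request code above, one could check the HTTP status before parsing:
*   IF o.status = 200
*      ? ExtractLanguage(o.responseText)
*   ENDIF

FUNCTION ExtractLanguage(tcJson)
   * STREXTRACT returns the text between the two delimiters, or "" if they are not found.
   * The last argument (1) makes the delimiter search case-insensitive.
   RETURN STREXTRACT(tcJson, '"language":"', '"', 1, 1)
ENDFUNC
```

This is plain string matching, not a JSON parser: it fails if the service puts whitespace between the key and the value, so a proper JSON library is the safer choice for anything beyond a quick test.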


Great example Olaf!
Thank you for the posting! I will try it.
It worked, and it saves a lot of time. Thanks for taking the time!
Daniel
Thanks, pcelba and Daniel,

especially because this could come in handy one day.

As a bonus, here is a simple UrlEncode function and an example of using it with some Spanish text.

Bye, Olaf.


lcText = "Hablamos Español"
lcText = UrlEncode(Strconv(lcText,9))

o = CREATEOBJECT("Microsoft.XMLHTTP")
o.open("GET","http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&q="+lcText)
o.send()
do while o.readystate<>4
 doevents force
enddo
? o.responseText

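* Characters that are passed through unencoded
#Define ccUrlChars '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ$-_.~'

* Percent-encodes every byte of tcData that is not listed in ccUrlChars.
* Pass in text that is already UTF-8 encoded, e.g. via Strconv(lcText,9).
Function UrlEncode(tcData)
   Local lcUrlEncoded, lnPos, lcChar
   lcUrlEncoded = ''

   For lnPos=1 To Len(tcData)
      lcChar = Substr(tcData,lnPos,1)
      If lcChar $ ccUrlChars
         lcUrlEncoded = lcUrlEncoded + lcChar
      Else
         * Transform(n,'@0') yields hex such as 0x000000E9; the last two digits are the byte value
         lcUrlEncoded = lcUrlEncoded + '%'+Upper(Right(Transform(Asc(lcChar),'@0'),2))
      Endif
   Endfor

   Return lcUrlEncoded
EndFunc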
