Google Translate v2 API returning non-UTF-8 characters in PHP

I am making a PHP cURL call that sends English text to be translated into French. The French accented characters come back with improper encoding.

Here is the code:

$url = 'https://www.googleapis.com/language/translate/v2?key='.$api_key.'&q='.rawurlencode($text);
$url .= '&target='.$target;
if($source) $url .= '&source='.$source;
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);                 
curl_close($ch);

If I send the text of: Be sure that your students love and know you

I get back:
Assurez-vous que vos élèves aiment et vous savez

If I manually go out to: http://translate.google.com I get this.
Assurez-vous que vos élèves aiment et vous savez

I know it's an issue with the UTF-8 encoding, but I'm not sure how to get it to display correctly.
Paul Konstanski (Project Specialist) Asked:
 
Ray Paseur Commented:
There is no really simple "shortcut" to an answer with problems like character encoding. You will have to make tradeoffs that affect the quality of your data. The guidance in this article may help you make better choices.
http://iconoun.com/articles/collisions/

1). The DB is only one link in the chain.  You want consistency across all parts of the application.  The article shows exactly how to get this right in the DB, in the code, in the browser, in the transport layers.  It all has to be right, or else you'll get anomalous results.

2). All Western European languages may have issues with UTF-8 because the commonplace encodings like ISO-8859-1 and CP-1252 often use characters that are in the UTF-8 dead zone.  The article explains where these collisions may occur, why they happen, and what to do about it.  After you have read the article, if you still have questions, please post back with any specifics and I'll be glad to help.
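
To make the collision in point 2 concrete, here is a minimal sketch at the byte level. It assumes the mbstring extension is loaded, and it writes the character as explicit bytes so the result does not depend on how the script file itself is saved. The variable names are illustrative only.

```php
<?php
// "é" as raw bytes in each encoding.
$utf8   = "\xC3\xA9"; // UTF-8 uses two bytes
$latin1 = "\xE9";     // ISO-8859-1 and CP-1252 use one byte

echo bin2hex($utf8), "\n";   // c3a9
echo bin2hex($latin1), "\n"; // e9

// A lone 0xE9 byte is not a valid UTF-8 sequence, which is where
// Western European text tends to break.
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)

// Converting with an explicitly named source encoding gives the UTF-8 form.
$converted = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
var_dump($converted === $utf8); // bool(true)
```

The conversion is only safe when the source encoding really is what you say it is, which is why every link in the chain has to agree.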
 
F P Commented:
More specifically:

$url = 'https://www.googleapis.com/language/translate/v2?key='.$api_key.'&q='.rawurlencode($text);
$url .= '&target='.$target;
if($source) $url .= '&source='.$source;
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);                 
curl_close($ch);

$response = utf8_encode($response);

echo $response;

 
Paul Konstanski (Project Specialist, Author) Commented:
Sorry, that doesn't work. When I try that, this happens:

Without utf8_encode($response):
Assurez-vous que vos élèves aiment et vous savez

With utf8_encode($response):
Assurez-vous que vos élèves aiment et vous savez
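
One possible explanation for the second result, assuming the API response was already UTF-8: utf8_encode() treats its input as ISO-8859-1, so running it on UTF-8 text encodes every byte a second time. The sketch below reproduces that with mb_convert_encoding(), which performs the same ISO-8859-1 to UTF-8 conversion (utf8_encode() itself is deprecated as of PHP 8.2). The helper to_utf8_once() is a hypothetical name, and the mbstring extension is assumed.

```php
<?php
$utf8 = "\xC3\xA9l\xC3\xA8ves"; // "élèves" as UTF-8 bytes

// Same conversion utf8_encode() performs: input is assumed to be ISO-8859-1.
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";   // c3a96cc3a8766573
echo bin2hex($double), "\n"; // c383c2a96cc383c2a8766573

// Hypothetical guard: convert only when the input is not already valid UTF-8.
function to_utf8_once(string $s): string
{
    return mb_check_encoding($s, 'UTF-8')
        ? $s
        : mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

var_dump(to_utf8_once($utf8) === $utf8); // bool(true), left untouched
```

The guard is a heuristic, not a detector: some ISO-8859-1 byte sequences also happen to be valid UTF-8, so knowing the real source encoding is still the better fix.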
 
F P Commented:
$apiKey = '<paste your API key here>';
$url = 'https://www.googleapis.com/language/translate/v2/languages?key=' . $apiKey;

$handle = curl_init($url);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true); // Save the result into a variable instead of printing it
$response = curl_exec($handle);
curl_close($handle);

print_r(json_decode($response, true));

Try that set of code, which I found here:
http://www.sitepoint.com/using-google-translate-api-php/
 
Paul Konstanski (Project Specialist, Author) Commented:
I don't see any real difference in the code you provided. That article is one of the ones from which I modeled my original code. I use a few name differences (e.g. "ch" instead of "handle") - Mine stands for "curl handle"... but other than that, all is the same.
 
Ray Paseur Commented:
You need to be consistent across all elements of the data: client input, database storage, PHP scripts, and HTML output all have to be right to get a good result. I cannot give you a link to my article on E-E because even E-E has trouble with character-set encoding, but if you want to read a bit, you can learn about the issues here on my web site.
http://iconoun.com/articles/collisions/
 
F P Commented:
json_decode($response, true)

There's the difference.
 
Ray Paseur Commented:
If you're getting a JSON response, it is UTF-8, by definition.  The article linked above explains it.  You have a character encoding collision.  Probably (but this is just a guess) your data is UTF-8, but your browser assumes, or has been told, it is getting a western character set.

JSON: http://www.json.org/
PHP: https://php.net/manual/en/function.json-decode.php
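
A minimal sketch of that guess, under stated assumptions: the hard-coded $response stands in for what curl_exec() would return, its shape follows the Translate v2 format (data, translations, translatedText), and the page is assumed to be HTML. Decoding the JSON yields a UTF-8 string; the header then tells the browser to read it as UTF-8.

```php
<?php
// Hypothetical sample shaped like a Translate v2 response.
// In the real script this string would come from curl_exec().
$response = '{"data":{"translations":[{"translatedText":'
          . '"Assurez-vous que vos \u00e9l\u00e8ves aiment et vous savez"}]}}';

$decoded    = json_decode($response, true);
$translated = $decoded['data']['translations'][0]['translatedText'] ?? null;

if ($translated === null) {
    throw new RuntimeException('Unexpected response: ' . json_last_error_msg());
}

// Declare the charset before any output so the browser does not fall back
// to a Western single-byte character set.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

echo $translated, "\n";
```

If the page is a full HTML document, a matching <meta charset="utf-8"> in the head keeps the declaration consistent when the file is saved or served without headers.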
 
Paul Konstanski (Project Specialist, Author) Commented:
So far it looks like only French is having the UTF8 problem at the point I'm inserting into the DB.

If I run "mysqli_set_charset($conn, 'utf8');" before I do the insert or update, it appears to work okay. So two follow-up questions:

1). Is it okay to run that wherever needed to make sure the translation gets into the DB properly?
2). Do you know of other languages in addition to French that may have this issue?
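
For reference, a hedged sketch of how the charset call might sit next to an insert. The function name save_translation(), the table translations, and its columns lang and body are all hypothetical placeholders. It uses utf8mb4 rather than utf8 on the assumption that the MySQL server is 5.5.3 or later; MySQL's utf8 charset stores at most three bytes per character. In mysqli_set_charset() the connection is the first argument and the charset name the second. The same encoding concern applies to any text containing non-ASCII characters, not only French.

```php
<?php
// Hypothetical helper: table and column names are placeholders.
function save_translation(mysqli $conn, string $lang, string $text): bool
{
    // Tell MySQL that this connection sends and receives UTF-8.
    // Setting it once right after connecting is enough; it is repeated
    // here only to keep the sketch in one place.
    if (!$conn->set_charset('utf8mb4')) {
        return false;
    }

    // A prepared statement passes the text through unchanged.
    $stmt = $conn->prepare('INSERT INTO translations (lang, body) VALUES (?, ?)');
    if ($stmt === false) {
        return false;
    }

    $stmt->bind_param('ss', $lang, $text);
    $ok = $stmt->execute();
    $stmt->close();

    return $ok;
}
```

The table and its columns would also need a UTF-8 character set for the stored bytes to round-trip correctly.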
 
Paul Konstanski (Project Specialist, Author) Commented:
The article did help give insight into how deep you have to go to get it right.

I still have to do some work to get my databases all working right, but it is pretty slick how it works. You can check it out here: https://essentials24.org.
 
Ray Paseur Commented:
I like it!  Slick design, easy to navigate -- really great piece of work!
Question has a verified solution.
