What is the "best practice" for choosing character encoding with PHP/MySQL applications?

eurocoptersea
Dear All:

May I ask what the "best practice" character encoding is if my application is open to users worldwide?

How can I make sure that the system and database capture and store the input correctly, whatever character set the user types in?


Thank you.

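It is always best to use the utf8 charset with the utf8_general_ci collation, since your application will be used worldwide.
Note that MySQL's utf8 stores at most three bytes per character, so it cannot hold four-byte characters such as emoji; on MySQL 5.5.3 or later, utf8mb4 (for example with utf8mb4_unicode_ci) covers the full Unicode range.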

Author

Commented:
I set my PHP default charset to UTF-8 and the database table collation to 'utf8_unicode_ci'.

Is this good enough, or do I need to use a PHP function to convert all POST data to UTF-8?

Thank you,
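
When a page is served as UTF-8, browsers normally submit form data as UTF-8 too, so converting every POST value is usually unnecessary. Checking the input is often safer than converting it blindly. A minimal sketch, assuming the mbstring extension is available; all_valid_utf8 is a hypothetical helper name, not part of PHP:

```php
<?php
// Hypothetical helper: returns true only if every string value in $input
// (including values inside nested arrays) is well-formed UTF-8.
function all_valid_utf8(array $input): bool
{
    foreach ($input as $value) {
        if (is_array($value)) {
            if (!all_valid_utf8($value)) {
                return false;
            }
        } elseif (is_string($value) && !mb_check_encoding($value, 'UTF-8')) {
            return false;
        }
    }
    return true;
}

// Possible use at the top of a form handler:
// if (!all_valid_utf8($_POST)) { http_response_code(400); exit; }
```

Rejecting malformed input keeps badly encoded bytes out of the database instead of guessing what encoding they were meant to be.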

You also need to set this on your pages:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
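
The same charset can also be declared in the HTTP response header from PHP, which avoids editing the markup of every page if all pages already include one shared file. A minimal sketch, assuming such a shared include exists; send_utf8_header is a hypothetical helper name:

```php
<?php
// Hypothetical helper meant to live in one include file shared by all pages.
// It sends the charset in the HTTP header and returns the header line it used.
function send_utf8_header(): string
{
    $header = 'Content-Type: text/html; charset=UTF-8';
    // header() only works before any output has been sent.
    if (!headers_sent()) {
        header($header);
    }
    return $header;
}

send_utf8_header();
```

Setting default_charset = "UTF-8" in php.ini, or AddDefaultCharset UTF-8 in the Apache configuration, are server-wide alternatives; which one fits depends on how the applications are deployed.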

Author

Commented:
Thanks. Is there a quick way to implement this, such as an Apache web server setting?

We have about ten applications, so adding it to every page would take a lot of time and resources.

thank you.
You are supposed to store the strings in the database tables already encoded as UTF-8, so that they display properly.
You can also set it in my.cnf:

[client]
default-character-set=utf8

[mysql]
default-character-set=utf8



http://dev.mysql.com/doc/refman/5.6/en/charset-configuration.html
ref. http://dev.mysql.com/doc/refman/5.0/en/charset-applications.html

Author

Commented:
Currently the collation of my database is "latin1_swedish_ci". If I change it to "utf8_unicode_ci", will it cause any problems?
I am not sure whether the change applies to existing table data. When I tried it, it did not work in one of my projects. You could check that.

Author

Commented:
Yes, logudotcom, it will not convert the data already stored in the table, but I am worried about whether any information will be lost if I change it directly.
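
Changing only the default charset of a database or table affects new columns, not stored data. MySQL's ALTER TABLE ... CONVERT TO CHARACTER SET does re-encode the existing text columns. A minimal sketch that only builds the statements so they can be reviewed before running; build_convert_statements and the table names are hypothetical, and it assumes the stored data really is latin1 (UTF-8 bytes saved over a latin1 connection would be mangled). Take a backup and try it on a copy first.

```php
<?php
// Hypothetical helper: builds one conversion statement per table.
// Table names are assumed to come from a trusted list, not from user input.
function build_convert_statements(
    array $tables,
    string $charset = 'utf8',
    string $collation = 'utf8_unicode_ci'
): array {
    $statements = [];
    foreach ($tables as $table) {
        // Backticks quote the identifier; embedded backticks are doubled.
        $quoted = '`' . str_replace('`', '``', $table) . '`';
        $statements[] = "ALTER TABLE $quoted CONVERT TO CHARACTER SET $charset COLLATE $collation";
    }
    return $statements;
}

// Print the statements for review (example table names are placeholders).
foreach (build_convert_statements(['customers', 'orders']) as $sql) {
    echo $sql, ";\n";
}
```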

Commented:
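You have to use UTF-8 at every step of the process:

- At coding time, set your IDE to save files encoded as UTF-8.
- For browsers, the meta charset of your HTML files has to be UTF-8 (data sent to the server will then be UTF-8 encoded).
- Create the database schema and tables using UTF-8 for the charset and collation.
- And finally (the thing people often forget), set the database connection to UTF-8. You can do it in your scripts when initializing the database connection, as the very first query:

SET NAMES UTF8

or

SET CHARSET UTF8

Note that SET NAMES sets the client, connection, and results character sets together, while SET CHARSET leaves the connection character set at the database default, so SET NAMES is the safer choice.

And you're done.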
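
The connection step above can be sketched with PDO, where the charset is requested in the DSN. This is an illustration under assumptions: build_mysql_dsn and connect_utf8 are hypothetical helper names, and the host, database name, and credentials are placeholders.

```php
<?php
// Hypothetical helper: builds a PDO DSN that requests a UTF-8 connection.
function build_mysql_dsn(string $host, string $dbname, string $charset = 'utf8'): string
{
    return "mysql:host=$host;dbname=$dbname;charset=$charset";
}

// Hypothetical helper: opens the connection; all arguments are placeholders.
function connect_utf8(string $host, string $dbname, string $user, string $password): PDO
{
    $pdo = new PDO(build_mysql_dsn($host, $dbname), $user, $password, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
    // Redundant with the DSN charset on current PHP versions; shown only to
    // mirror the "very first query" step described above.
    $pdo->exec('SET NAMES utf8');
    return $pdo;
}

// Possible use:
// $pdo = connect_utf8('localhost', 'shop', 'app_user', 'secret');
```

With the mysqli extension, calling set_charset('utf8') on the connection object plays the same role.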

Commented:
No points for this, but the background information is explained here:
http://www.joelonsoftware.com/articles/Unicode.html
