eurocoptersea asked:

What is the "best practice" for choosing character encoding with PHP/MySQL applications?

Dear all,

What is the best-practice character encoding for an application that is open to users worldwide?

How can I make sure that the application and the database capture and store input correctly in every character set that users may enter?


Thank you.
ASKER CERTIFIED SOLUTION
Loganathan Natarajan
This solution is only available to members.
eurocoptersea

ASKER

I have set the PHP default charset to UTF-8 and the database table collation to 'utf8_unicode_ci'.

Is this good enough, or do I need to use a PHP function to convert all POST data to UTF-8?

Thank you,
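
On the POST question, one common approach is to validate incoming data rather than convert it blindly: if the pages are served as UTF-8, browsers normally submit UTF-8, so a check usually suffices. The sketch below is only an illustration; the helper names `isValidUtf8` and `findInvalidUtf8Fields` are hypothetical and not from this thread, and it assumes the form fields arrive as plain strings.

```php
<?php
// Hypothetical helper (not from the thread): true when $value is well-formed UTF-8.
function isValidUtf8(string $value): bool
{
    // With the /u modifier PCRE treats the subject as UTF-8,
    // so the match fails on invalid byte sequences.
    return preg_match('//u', $value) === 1;
}

// Hypothetical helper: returns the names of submitted string fields
// whose bytes are not valid UTF-8. Non-string values are skipped.
function findInvalidUtf8Fields(array $post): array
{
    $invalid = [];
    foreach ($post as $name => $value) {
        if (is_string($value) && !isValidUtf8($value)) {
            $invalid[] = $name;
        }
    }
    return $invalid;
}
```

A page could call `findInvalidUtf8Fields($_POST)` before saving anything and reject or log the request when the result is not empty.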
Thanks. Is there a quick way to implement this, for example an Apache web server setting?

We have about ten applications, so adding it to every page would take a lot of time and resources.

Thank you.
You are supposed to store the encoded strings in the database tables so that they display properly.
Currently the collation of my database is "latin1_swedish_ci". If I change it to "utf8_unicode_ci", will it cause any problems?
I am not sure whether the change applies to existing table data. When I tried it, it did not work in one of my projects. You could check that.
Yes, logudotcom, it will not convert the data already stored in the table, but I am just worried about whether any information will be lost if I change it directly.
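
For the existing-data concern, MySQL has an `ALTER TABLE ... CONVERT TO CHARACTER SET` statement that converts the character columns of a table together with the data stored in them, whereas changing only the default affects new columns. The sketch below just builds that statement; the helper name `buildConvertTableSql` and the table name in the usage note are hypothetical. It assumes the latin1 columns really contain latin1 data. If UTF-8 bytes were already stored in latin1 columns, a direct conversion would probably double-encode them, so take a backup and try it on a copy first.

```php
<?php
// Hypothetical helper (not from the thread): builds the MySQL statement that
// converts a table's character columns, and the data in them, to a new charset.
function buildConvertTableSql(
    string $table,
    string $charset = 'utf8',
    string $collation = 'utf8_unicode_ci'
): string {
    // Identifiers cannot be bound as query parameters, so only accept
    // letters, digits and underscores to keep the statement safe.
    foreach ([$table, $charset, $collation] as $identifier) {
        if (preg_match('/^[A-Za-z0-9_]+$/', $identifier) !== 1) {
            throw new InvalidArgumentException("Unsafe identifier: $identifier");
        }
    }

    return sprintf(
        'ALTER TABLE `%s` CONVERT TO CHARACTER SET %s COLLATE %s',
        $table,
        $charset,
        $collation
    );
}
```

Calling `buildConvertTableSql('customers')` returns a statement that can be reviewed and then run once per table, for example from a one-off migration script.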
You have to use UTF-8 throughout the whole process:

- At coding time, set your IDE to save files as UTF-8.
- For browsers, the meta charset of the HTML files has to be UTF-8 (data sent to the server will then be UTF-8 encoded).
- Create the database schema and tables with UTF-8 as the character set and a UTF-8 collation.
- And finally (the thing people often forget), set the database connection to UTF-8. You can do it in your scripts when initializing the database connection, as the very first query:

SET NAMES UTF8

or

SET CHARSET UTF8

And you're done.
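
In PHP, the connection step above can also be requested when connecting instead of as a separate query. The following is a minimal sketch, not the poster's code: `buildMysqlDsn` is a hypothetical helper, and the host, database name and credentials in the comments are placeholders. It keeps the thread's `utf8`. Note that MySQL's `utf8` stores at most three bytes per character, so `utf8mb4` may be the better choice if four-byte characters such as emoji have to be stored.

```php
<?php
// Hypothetical helper (not from the thread): builds a PDO MySQL DSN
// that asks for a given connection character set.
function buildMysqlDsn(string $host, string $database, string $charset = 'utf8'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $database, $charset);
}

// Usage sketch with PDO (all values are placeholders):
// $pdo = new PDO(buildMysqlDsn('localhost', 'app_db'), 'app_user', 'secret');
//
// With mysqli, the charset is set on the connection object instead:
// $mysqli = new mysqli('localhost', 'app_user', 'secret', 'app_db');
// $mysqli->set_charset('utf8');
```

Putting this in the one shared include that each application uses to connect avoids touching every page.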
No points for this, but the background information is explained here:
http://www.joelonsoftware.com/articles/Unicode.html