HtmlEncode and Curly Quotes, from MySQL to Ajax to Textarea, back to MySQL

Posted on 2012-04-03
Medium Priority
Last Modified: 2012-12-09
I need help on properly ENCODING the following:

1 - grab a record in MySQL with French characters and curly quotes
2 - pass it via ajax to a textarea
3 - view all foreign characters normally inside textarea
4 - edit text and send it back for update via ajax to MySQL

Can you provide a simple example of how to grab this text, edit it, and update it with proper encoding?

Je m’apelle François, J’ai “tois enfants”
Gérard et à “wow” c’est bon àâçéèêëïîôùù

This may be simple to a seasoned programmer, but it's been kicking my you know what...

I tried htmlentities() before sending it via ajax, but that didn't work. Help!
Question by:dimsouple

Accepted Solution

designatedinitializer earned 2000 total points
ID: 37804118
The fundamental thing to keep in mind is to use UTF-8 encoding. Use UTF-8 in your database and in your PHP files.

In your ajax, use some serialization function.
I assume you are using jQuery (if you're not, then you should be).
If so, here's an example of an ajax request:

		$.ajax({
			type	: "POST",
			cache	: false,
			url		: "../participar.php",
			data	: $("#prizeForm").serializeArray(),
			success	: function(data) {
				// handle the server's response here
			}
		});
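
As a rough illustration of why UTF-8 on both ends matters here: jQuery's serializers percent-encode field values with JavaScript's built-in encodeURIComponent(), which always emits the UTF-8 bytes of each character. The sketch below uses only that built-in. The helper `encodeField` and the field name `comment` are made up for the example.

```javascript
// Sample value in the spirit of the question: an accented letter plus curly quotes.
const value = 'François “wow”';

// encodeURIComponent percent-encodes the UTF-8 bytes of every non-ASCII character.
function encodeField(name, val) {
  return encodeURIComponent(name) + '=' + encodeURIComponent(val);
}

const body = encodeField('comment', value);
console.log(body);
// ç becomes %C3%A7 (two bytes); the curly quotes become %E2%80%9C and %E2%80%9D (three bytes each).

// Decoding the value as UTF-8 restores the original text.
console.log(decodeURIComponent(body.split('=')[1]) === value);
```

If the PHP side and the database connection also treat these bytes as UTF-8, the text should survive the round trip unchanged.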

And here is a snippet of the PHP code that receives the ajax request.
You simply treat it as POST:
if (array_key_exists('action', $_POST)) {
	switch ($_POST['action']) {
		case "alterar":
			// This is an AJAX request from the main window
			$user = $session->getVar("user");
			if (!is_a($user, "user")) {
				// Logout. Message: "Your session has ended. Please log in again."
				die("A sua sessão terminou. Por favor faça login novamente.");
			}
			$name     = trim($_POST['altNome']);
			$password = trim($_POST['altPass']);
			// ... rest of the handler (not shown)
			break;
	}
}

Then be careful to use mysql_real_escape_string() on all user input, before inserting into the database.

Assisted Solution

designatedinitializer earned 2000 total points
ID: 37804126
IMPORTANT: in your text editor, be sure to change your PHP files' encoding to "UTF-8, with no BOM"!
(Simply put, the BOM, or byte order mark, is a few invisible bytes at the very start of the file. PHP sends them as output before anything else, which can mess with your response headers and spawn mysterious errors.)
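
For reference, the UTF-8 BOM is the three bytes EF BB BF. Here is a minimal sketch of how one might check a file's bytes for it and strip it. The helper names `hasUtf8Bom` and `stripUtf8Bom` are invented for this example, and the "file" is just an in-memory byte array.

```javascript
// The UTF-8 byte order mark is the byte sequence EF BB BF.
const UTF8_BOM = [0xEF, 0xBB, 0xBF];

// True when the byte array starts with the UTF-8 BOM.
function hasUtf8Bom(bytes) {
  return UTF8_BOM.every((b, i) => bytes[i] === b);
}

// Returns the bytes without a leading BOM (unchanged if there is none).
function stripUtf8Bom(bytes) {
  return hasUtf8Bom(bytes) ? bytes.slice(3) : bytes;
}

// Simulate a PHP source file saved with and without a BOM.
const phpSource = new TextEncoder().encode('<?php echo "ok";');
const withBom = Uint8Array.from([...UTF8_BOM, ...phpSource]);

console.log(hasUtf8Bom(withBom));   // the BOM-prefixed copy is detected
console.log(hasUtf8Bom(phpSource)); // the clean copy is not
console.log(stripUtf8Bom(withBom).length === phpSource.length);
```

In practice, the editor's "UTF-8 without BOM" save option does the same job.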

Expert Comment

ID: 37804215
... and if you already have records in another encoding (latin-1/ISO 8859-1), you should consider this data as corrupted

Expert Comment

by:Ray Paseur
ID: 37810390
You do not need Unicode for Western European characters.  ISO-8859-1 works perfectly.  The central issue with this or any other encoding problem is getting consistency across the platforms.  This article explains some of it.

See http://www.laprbass.com/RAY_temp_dimsouple.php
<?php // RAY_temp_dimsouple.php

$html = <<<HTML
<!DOCTYPE html>
<html dir="ltr" lang="en-US">
<meta charset="iso-8859-1" />
<title>Accented Characters in ISO-8859-1</title>
Je m’apelle François, J’ai "tois enfants"
Gérard et à "wow" c’est bon àâçéèêëïîôùù
</html>
HTML;

echo $html;

Expert Comment

ID: 37810497
@Ray: Of course ISO-8859-1 encodes French diacritics and such, but there are strong reasons for ditching it in favor of UTF-8 (as Joel does in the article you posted...)
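
One concrete reason, sketched under the assumption that the text contains typographic quotes like the sample in the question: ISO-8859-1 only covers code points U+0000 to U+00FF, and the curly quotes (U+2018 to U+201D) fall outside that range. Browsers commonly treat the iso-8859-1 label as Windows-1252, which does include them, so a page may still appear to work. The helper name `outsideLatin1` is made up for this example.

```javascript
// Returns the characters of `text` that have no ISO-8859-1 representation,
// i.e. whose code point is above U+00FF.
function outsideLatin1(text) {
  return [...text].filter((ch) => ch.codePointAt(0) > 0xFF);
}

// First line of the sample text from the question.
const sample = 'Je m’apelle François, J’ai “tois enfants”';
console.log(outsideLatin1(sample).join(' '));
// Only the curly apostrophes and curly double quotes are reported; ç is fine.

// The accented letters on their own all fit in ISO-8859-1.
console.log(outsideLatin1('àâçéèêëïîôùù').length);
```

UTF-8 can represent every character in both strings, which sidesteps the problem.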

Expert Comment

by:Ray Paseur
ID: 37810607
The one reason I would be careful about ditching any single-byte "ANSI" encoding goes to the need for consistency across all the levels of the platform.  This means the database, the file system, things that were stored in cookies, client keyboard input, JavaScript, values created inside scripts, HTML, etc.  Any of these things may come with the legacy assumption that all characters are single-byte.  That assumption may lead to encoding collisions, and in my experience those collisions are very hard to explain, since the conversion to UTF-8 may be difficult for financial managers to understand.  A common response goes something like, "You did what?  It was working before.  Why did you eff with it?"

Expert Comment

ID: 37810629
I do agree with you on this: if it is working, there's no need to fix it.
However, if you are starting something from scratch, always go with Unicode.

Author Comment

ID: 37816468
Thank you all so much. The part about the data being corrupted is no lie. Because I failed to specify the charset in the old pages, the form inputs were coming in in many different encodings.

Now I've changed everything to UTF-8 and, unfortunately, some of the data is in other encodings.

I've found out that this does the trick on the corrupted data


Assisted Solution

designatedinitializer earned 2000 total points
ID: 37817122
People like me are eagerly (not that much, but anyway...) awaiting the next major release of PHP, which supposedly is going to have native support for Unicode, down to variable names and other language tokens.
Meanwhile, we use UTF-8 and we are careful to save our files as UTF-8 with no BOM.

Other useful features are the utf8_encode and utf8_decode PHP functions (which convert between ISO-8859-1 and UTF-8), and in MySQL, the ability to specify the character encoding down to the SELECT level: you can have different SELECT statements specify different character encodings.
One other thing to keep in mind is that the character encoding is not the collation (some people tend to confuse these two).
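
To make the conversion functions mentioned above less abstract: PHP's utf8_encode() converts ISO-8859-1 to UTF-8 and utf8_decode() goes the other way. The sketch below imitates the underlying byte mapping in JavaScript and shows the classic garbling you get when UTF-8 bytes are read as ISO-8859-1, then reverses it. The helpers `latin1Decode` and `latin1Encode` are written just for this example. This is an illustration of the mechanism, not the fix the original poster used.

```javascript
// ISO-8859-1 maps each byte 0x00-0xFF to the code point with the same value.
function latin1Decode(bytes) {
  return Array.from(bytes, (b) => String.fromCharCode(b)).join('');
}

// Inverse mapping; throws for characters that ISO-8859-1 cannot represent.
function latin1Encode(text) {
  return Uint8Array.from(text, (ch) => {
    const code = ch.codePointAt(0);
    if (code > 0xFF) throw new RangeError('not representable in ISO-8859-1: ' + ch);
    return code;
  });
}

const original = 'François';
const utf8Bytes = new TextEncoder().encode(original);

// Reading UTF-8 bytes as ISO-8859-1 turns ç (bytes C3 A7) into the two characters Ã and §.
const garbled = latin1Decode(utf8Bytes);
console.log(garbled); // FranÃ§ois

// Undo it: turn the garbled characters back into bytes, then decode them as UTF-8.
const repaired = new TextDecoder('utf-8').decode(latin1Encode(garbled));
console.log(repaired === original); // true
```

A repair like this only works on rows that really were garbled this way, so it is worth checking a few records by hand before converting a whole table.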

Expert Comment

ID: 37833084
