PHP data encoding question

On the browser side, I am passing this string for the 'notes' field:

This is a single quote: '

On the server side, my code to receive the 'notes' field is:

    if (isset($_POST['notes'])) {
        $notes = trim(filter_input(INPUT_POST, 'notes', FILTER_SANITIZE_FULL_SPECIAL_CHARS));
    }

The variable $notes ends up getting the following value:

This is a single quote: &#039;

It seems like PHP has so many functions for this purpose that it is getting confusing. What should I do so that I can capture the user's input EXACTLY as it was typed, without having to worry about this kind of result?

Dave Baldwin (Fixer of Problems) commented:
I don't believe there is any way that you can not worry about these things.  Note this is not peculiar to PHP.  These are issues with all browsers and programming languages.  I use this site to check on issues with character sets and 'htmlentities'.

The &#039; should render as a single quote in all browsers because it is the character reference for a single quote.  The reason it is encoded is that a single quote could be a delimiter for many functions.
Ray Paseur commented:
Try this: Use var_dump($_POST); to print the information.  Then look at it with your browser "view source."  You will see the original data.  What you do with this data is up to you and your application's exact needs.

And I agree with Dave.  How you handle external input is among the most important jobs a programmer has!  By definition it is tainted and dangerous, yet it may be your only path of communication with the outside world.  We spend a lot of time on things like that because it's critical to get it right.

More information on how to work with quotes in PHP is available in this article.
elepil (author) commented:
My question was very specific, and I was asking about how to make the single quote appear correctly when it reaches the server.

Instead of providing me links to reference manuals and voluminous pages of 'haystack' information, can any of you provide an exact code sample how I can get a simple single quote across to the server side correctly?
Ray Paseur commented:
Sure.  Just type the quote into the form.  I'll post a code example in a second.
Ray Paseur commented:
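<?php // temp_elepil.php

// Show the request data exactly as PHP received it
var_dump($_POST);

// Heredoc holding a minimal form that posts back to this same script
$form = <<<EOD
<form method="post">
<input name="thing" />
<input type="submit" value="POST TO THE SERVER" />
</form>
EOD;

echo $form;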

Outputs something like this:
array(1) { ["thing"]=> string(1) "'" }

The quote marks are hard to read unless you look very closely.  The output of var_dump() includes quote marks around the data string.  The data string is a single quote.  So it may look like 5 single quotes, but it's really this (spacing added for clarity):
array(1) { ["thing"]=> string(1) " ' " }

Ray Paseur commented:
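By way of explanation, please refer back to the original question.  We have this:

    $notes = trim(filter_input(INPUT_POST, 'notes', FILTER_SANITIZE_FULL_SPECIAL_CHARS));

The PHP manual pages for filter_input() and the sanitize filters describe what this code does.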

By using the FILTER_SANITIZE_FULL_SPECIAL_CHARS you are telling PHP to change your data.  This is equivalent to calling htmlspecialchars() with ENT_QUOTES set. Encoding quotes can be disabled by setting FILTER_FLAG_NO_ENCODE_QUOTES.
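
A minimal sketch of that difference, for anyone who wants to see it side by side. It uses filter_var() on a hard-coded string instead of filter_input(), because filter_input() only reads real request data; the variable names ($raw, $filtered, $noQuotes, $escaped) are made up for illustration.

```php
<?php
// Stand-in for the posted 'notes' value (an assumption for this sketch).
$raw = "This is a single quote: '";

// The filter from the question: quotes are turned into character references.
$filtered = filter_var($raw, FILTER_SANITIZE_FULL_SPECIAL_CHARS);

// Same filter, with quote encoding switched off by the flag.
$noQuotes = filter_var($raw, FILTER_SANITIZE_FULL_SPECIAL_CHARS, FILTER_FLAG_NO_ENCODE_QUOTES);

// The htmlspecialchars() call the manual describes as equivalent.
$escaped = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

echo $filtered, PHP_EOL; // This is a single quote: &#039;
echo $noQuotes, PHP_EOL; // This is a single quote: '
echo $escaped, PHP_EOL;  // This is a single quote: &#039;
```

For this plain ASCII input the first and third results match, which is why the question's $notes ended up holding &#039; instead of the apostrophe.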
elepil (author) commented:
Ray, you just demonstrated a var_dump() and how to interpret the results. But how do I receive this input with a single quote on the server to make it look right?
Ray Paseur commented:
PHP var_dump() is how we look at data to see what it contains.  The var_dump() in this example proves that the single quote was received on the server.  Try the link I posted above and put in O'Brien.  You will see something like this in the output:

array(1) { ["thing"]=> string(7) "O'Brien" }

There is the quote, right in the middle of O'Brien.

"But how do I receive this input with a single quote on the server to make it look right?"
Follow the example I posted in the code snippet above.  The quote or apostrophe has no special meaning in the value attribute of a POST input control.  It comes through to your server unalloyed.  The reason you're seeing the entity number 039 instead of the character apostrophe is because you're not looking at the data in the POST array.  Instead, you're looking at the data after your script told PHP to change the data by calling filter_input()!
elepil (author) commented:
Ray, you are right that the single quote is arriving as a single quote in the POST array.

My objective is to receive the input just as the user typed it, regardless of whatever character he typed in. Apparently filter_input is doing more than I want it to. So how can I be sure that I get the user's input exactly the way he typed it?
Ray Paseur commented:
To avoid the unwanted effects of filter_input(), don't use filter_input().  It's changing the raw data (you want) to filtered and translated data (you don't want), according to the man page references we posted earlier.
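
A small sketch of reading the value with no filter at all. The $_POST assignment only simulates a request so the snippet can run from the command line; in a real request PHP fills $_POST itself. The is_string() check is an extra guard, not something from the original code.

```php
<?php
// Simulated request data (assumption: stands in for a real form post).
$_POST = ['notes' => "This is a single quote: '"];

// Read the value straight from the request array, with no filter applied.
// is_string() guards against array input such as notes[]=x.
$notes = (isset($_POST['notes']) && is_string($_POST['notes']))
    ? $_POST['notes']
    : '';

// The apostrophe is still an apostrophe.
var_dump($notes);
```

Wrapping the read in trim(), as the question did, would strip leading and trailing whitespace, so leave it out if "exactly as typed" has to include that whitespace.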
elepil (author) commented:
Okay, so I know I shouldn't use filter_input(), and htmlspecialchars() has problems, too, albeit in a different way.

So how would YOU handle it if your objective was to receive the user's input in the exact likeness by which he typed it in?
Ray Paseur commented:
I wouldn't go so far as to say you shouldn't use filter_input() or any other PHP function.  You just need to know what these functions do to your data, and then you can know whether it is appropriate to use them.  

To get the exact input, write your code exactly the way I've shown in the code snippets here.  The exact information the client typed in will be presented to your PHP script in the request variables array ($_GET or $_POST).  There are some obsolete settings like Magic Quotes that can mess this up, but that risk has been known for years, and the feature was removed in PHP 5.4, so nobody should be using it any more.

Quotes have special meaning to many things in computer programming, and PHP + MySQL have special functions for dealing with quotes.  That is why I posted the link to the article about quotes.  Because once you have gotten the "exact likeness" you may still need to use the data that contains the quotes, and there are some things you need to understand about using data with embedded quotes.  It's a common question with a lot of intricacies in the answer, and it's too long to spell it all out every time someone asks, so I tried to cover the important points in the article.  I hope you find it useful.
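
One common way to act on that, sketched here under the assumption that the value ends up back in an HTML page: keep the raw string as received and escape it only at the point of output. The variable names are illustrative.

```php
<?php
// The value as received and stored, unchanged.
$raw = "O'Brien";

// Escape only when the value is placed into HTML, e.g. a value attribute.
$forHtml = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
echo '<input name="thing" value="' . $forHtml . '" />', PHP_EOL;

// Decoding the escaped form gives back the original string,
// so nothing is lost by escaping late.
$roundTrip = htmlspecialchars_decode($forHtml, ENT_QUOTES);
var_dump($roundTrip === $raw); // bool(true)
```

Other destinations, such as an SQL query, need their own handling (typically prepared statements), which this sketch does not cover.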

Best of luck with your project, ~Ray
elepil (author) commented:
Ray, the only reason I used filter_input() (or htmlspecialchars()) was to emulate what I used to do with uuencode and uudecode. I thought filter_input() and htmlspecialchars() were the PHP counterpart to the uuencode and uudecode I've been accustomed to.

But from what I have seen with my experimentation, would it be correct for me to say that PHP already automatically does a uuencode on the client side and a uudecode on the server side without my having to use filter_input() or htmlspecialchars()?
Dave Baldwin (Fixer of Problems) commented:
Forms are always 'urlencoded' on the client side by the browsers.  PHP automatically 'urldecodes' the data on the server.  But it's not that simple because some characters have special meanings.  And you can't send binary values without encoding them in some fashion so that normally limits the characters you can send to the ASCII subset and then only the printing characters.  Note that any character code above 127 involves character sets.  Latin and UTF-8 view those differently.
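
The round trip can be approximated from the command line. This is only a sketch: http_build_query() stands in for the browser's form encoding and parse_str() for the decoding PHP does when it fills $_POST; the field name and value are made up.

```php
<?php
// Hypothetical form field containing characters that need encoding.
$fields = ['thing' => "O'Brien & sons"];

// Roughly what a browser sends as an application/x-www-form-urlencoded body.
$body = http_build_query($fields);
echo $body, PHP_EOL; // thing=O%27Brien+%26+sons

// Roughly what PHP does before the script sees the data.
parse_str($body, $decoded);
var_dump($decoded['thing'] === $fields['thing']); // bool(true)
```

The apostrophe travels as %27 and comes back as a plain apostrophe, with no call to filter_input() or htmlspecialchars() involved.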

Dave Baldwin (Fixer of Problems) commented:
More info here and here which shows the differences between Windows-1252 and UTF-8.
Ray Paseur commented:
I'm not familiar with uuencode, but to Dave's point about characters above #127, this article tries to cover the waterfront.  Sorry there are not simple solutions for all of this stuff, but the world is making it up as we go - and we're far from perfect in our early attempts to create software standards!
elepil (author) commented:
Dave's answer was what really resonated with me when he said:

"Forms are always 'urlencoded' on the client side by the browsers.  PHP automatically 'urldecodes' the data on the server."

Fortunately, I'm only concerned with the English language and its character set.  Thank you both for your help.
Dave Baldwin (Fixer of Problems) commented:
You're welcome.  I'm glad also to only have to deal with ASCII English myself.
Ray Paseur commented:
"Fortunately, I'm only concerned with the English language and its character set."
Welcome to 1989.  Sorry, but that world does not exist anymore, now that we laid the undersea cable.

Best of luck with your project, ~Ray