• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 595
  • Last Modified:

Carriage returns and apostrophes input into database - how to?

Built a form that submits info to a database that is retrieved later and displayed on the page. MySQL seems to ignore carriage returns or line breaks I make when submitting the forms and single/double quotes are returned as unencoded, so everything is one huge block of text in the database and messy looking when it is retrieved for display.

How to correct this? I'm using htmlspecialchars for security purposes, but I'm pretty sure this doesn't address formatting.

What SQL or PHP command do I use to address proper formatting re: paragraphs and apostrophes?

6 Solutions
Dave BaldwinFixer of ProblemsCommented:
Use mysql_real_escape_string http://us2.php.net/manual/en/function.mysql-real-escape-string.php or mysqli_real_escape_string http://us2.php.net/manual/en/mysqli.real-escape-string.php to encode special characters when you put them in the database.  They are supposed to come back un-escaped though.

Browsers don't recognize returns or line breaks when they display your text.  There is a PHP function nl2br() http://us2.php.net/manual/en/function.nl2br.php that converts the \r and \n into <br> tags.  But there isn't any PHP 'paragraph' command that will convert your text into <p>...</p> blocks.
Actually there are a couple of things to keep in mind here.

1 the browser doesn't show line brakes unless you display in <pre> tags.

2 htmlspecialcharacters changes all html characters and doesnt give security in the db. For security use mysql_real_escape_string

3 you can use nl2br() to translate linebrakes to html br code.

4 check the source of the web page and you will probably see the linebrakes.
Julian HansenCommented:
This will do a conversion to <p>

Assume you retrieve your text into a variable like so
$textbock = htmlspecialchars($row['formdata']);

// The inner str_replace converts any \r\n pairs into \n
// The second replaces the \n with </p><p> closing and opening paragraph tags
// Only one of these will actually be necessary but I include both for completeness

$textblock = str_replace("\n", "</p><p>", str_replace("\r\n","\n",$textblock));

// We now output with the beginning and end <p> tags to complete the block

echo "<p>" . $textblock . "</p>";

Open in new window

What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

Why so complex.

$text = nl2br($originaltext);

No htmlspecialchars is necessary as its only used for MySQL security and there we should use mysql_real_escape_string
Julian HansenCommented:
"What SQL or PHP command do I use to address proper formatting re: paragraphs and apostrophes?"

nl2br replaces \n with <br/>

If you want <p> then the above code will do that.
Ray PaseurCommented:
Filter Input Escape Output

To secure your external inputs, you want to follow the standard security mantra: Accept Only Known Good Values.  This is the "filter input" side of FIEO.  Only you can decide what is a Known Good Value, but this is important.  And it is not the same as Exclude Known Bad Values -- exclusion rules leave the hackers in control of the agenda.

The "escape output" side is the right place to use htmlentities() or htmlspecialchars().  The reason you want to use that function before sending any output to the browser is pretty simple, but also pretty important.  In the unlikely event that your data base somehow gets polluted with evil JavaScript, you protect your client community by escaping the output.  The JavaScript will be displayed by the browser, but not executed.

I do not recommend munging the data with htmlspecialchars() before storing it in the data base.  You should store an exact copy of the client input, not an artificially modified version.  One reason for this is that MySQL will silently truncate strings that are longer than the table length definition.  You reduce that risk if you store only what the client sent you.

As a tangentially interesting exercise, you might want to read this article.

Here is the summary of how to employ the FIEO technologies...

In the action script, apply any filtering rules to the external data, then apply the mysql_real_escape_string() function.  Use the resulting string in your INSERT or UPDATE query.  This string will contain the appropriate escape characters, which will be stripped by MySQL.  The result will be a true (and filtered) copy of the external data.

In the display script, SELECT the columns from your data base and first apply htmlentities() to make the data safe for client display.  Next apply nl2br().  For what will become obvious reasons, you do not want to reverse the order of these functions ;-)  You may also find that wordwrap() is useful for formatting the output.

Online man pages applicable here:
http://php.net/manual/en/function.mysql-real-escape-string.php (See also MySQLi)

HTH, ~Ray
ShootFromtheHipAuthor Commented:
Wow, all great answers from everybody. Very thorough. Thank you.

Only problem is - which answer to take? XD

I'm sure a combination of the above information would be most prudent. Perhaps a combo of htmlentities, mysqli_real_escape_string and others would work. I like figuring things out on my own, yet the proper (most secure) way to go about handling data when inserted into the database via forms is still a bit shaky to me.

Ray, thanks for your comprehensive and brilliant answers as always. I'm not nearly as advanced as you and my questions may seem a bit silly, but here it is anyway:

On FIEO - I'm not sure what the difference would be between input and output as far as htmlspecialchars and entities goes. Surely, we are applying htmlspecialchars and its cousin to data that is being input to a field, no? Or is it transforming the security risks AFTER the data has been output to the browser (or maybe a little of both?).

And just so I get this straight, is this accurate:

htmlspecialchars = transforms risky HTML characters into their safer counterparts so no malicious code (JavaScript) can be output to the browser.

mysqli_real_escape_string = escapes certain characters like ', " and / to prevent bad characters from polluting the DB - SQL injection attacks and such.

That's my very basic (mis)understanding of what they do. Correct me if I'm mistaken.

JulianH - after reading your comment on nl2br, I looked at this page's code. Looks like EE may be using nl2br to display our messages like you're saying. Very strange to me that we'd have to resort to two <br> tags to mimic a <p>. I would think there would be some sort of PHP function that creates <p> tags all by themselves. Maybe I'll create one, but I digress...

My last question - should htmlspecialchars and mysqli_real_escape_string be used in tandem with one another? E.g.

$noviceProgrammer = mysqli_real_escape_string(htmlspecialchars('stopAttackingMyDB'));

Thanks for your detailed and quick responses. I appreciate them.
Julian HansenCommented:
On <p> tags an nl2br - essentially these both do a string search and replace. I prefer the <p> tags as it allows me to properly style paragraphs and spacing with CSS

Regarding cleaning data - there are two schools of thought

1. Clean the data and convert potentially unsafe code before putting it in the database
2. Clean it before displaying it

There are arguments both ways

However, certain cleansing is required for db insertion - hence the mysql_real_escape_string - which ensures that all potentially DB critical chars are converted to their safe counter parts.

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Tackle projects and never again get stuck behind a technical roadblock.
Join Now