Filtering User Input

Hi Experts,

I have a textarea that the user needs to provide comments for a varchar(4000) MySQL field.

Using PHP, how can I filter(or better - escape) malicious or even naive input like a semicolon in a sentence.

For my particular scenario I cannot use any binding (at least I don't think I can), because I am updating up to 50 records with the same comment.

Actually, I know I can bind within my loop, but that would mean hitting the database up to 50 times
APD TorontoSoftware DeveloperAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

If you are executing something like:
UPDATE Shoes SET discount =? WHERE `color`=?

and you bind $discount='10%', $color='red', it doesn't matter if there are 100 pairs of red shoes or just 1 -- they will all be updated with a 10% discount.

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Marco GasiFreelancerCommented:
You can build an array of forbidden values and use it in str_replace. Suppose you want get rid of ';', '<', '>':

$comment = $_POST['comment'];
$forbidden_values = array(';', '<', '>');
$result = str_replace($forbidden_values, '', $comment);

Open in new window

You might be interested in learning about filter_input and sanitizing filters
Ray PaseurCommented:
Further to Hielo's comment, the SQL standard applies the query to every row of the database, except as restricted by the WHERE and LIMIT clause.  If you run a query that says DELETE FROM myTable, Poof!  All your data is gone.  This applies to any query that mutates rows, and also applies to SELECT queries without WHERE and LIMIT clauses -- all rows are inspected.  That can create a hidden performance bug, much like writing SELECT * instead of selecting the columns you need.

There is nothing wrong with a semicolon in user input, but there might be something wrong with the application design, if I am understanding this question correctly.  Here is what I think I might be getting from the question.

1. A user visits the web site and puts up to 4,000 characters of data into a textarea
2. When the form is submitted the server takes the textarea and puts its contents into more than one row of the database

If that's the case, the application violates the DRY principle.  A better design might be to put the contents of the textarea into a database table, get the key of the inserted row, and put this key into the related rows of the database.  That's using the relational part of a relational database correctly!

Marco's links for filter and sanitize are important to understand.  There is a little more to it.  The way this set of principles is often written is "filter input, escape output."  A general design goes something like this:

1. Upon receiving input of any sort, make absolutely certain that it contains exactly what your programming expects.  For example, if it's an integer, test to see that it is an integer, or explicitly cast it to integer values.  If it's expected to be a positive integer, make sure it is >0.  Stuff like that.  In the case of text from a keyboard, you might use the PHP filters or some kind of regular expression to nullify the non-keyboard characters.

2. After you have sanitized your data, escape it with MySQLI_Real_Escape_String() or similar.  Prepare the query and Bind the variable to the query if you're using PDO.  Store it in the database as little changed as possible, subject to your rules of sanitization.  There are code examples in this article.

3. When you have run a SELECT query and recovered the data from the database and you're ready to prepare it for output (either directly with echo or in a template with HEREDOC notation), always run all textual data through htmlspecialchars() or similar.  This last step is important because you do not want to inadvertently broadcast evil JavaScript from your web site.
APD TorontoSoftware DeveloperAuthor Commented:
This actually needs to settle up to 50 invoices that the user identifies via checkbox.  

I forgot about using WHERE IN, which will make it possible for me to use binding.
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today

From novice to tech pro — start learning today.