Solved

strip_tags vs. htmlentities

Posted on 2010-11-29
Last Modified: 2013-12-12
I am having trouble understanding why I would need strip_tags if I already use htmlentities.

Doesn't htmlentities render tags such as script, php, html etc. harmless?

What additional benefit would strip_tags provide?

Thanks.

Question by:kadin

9 Comments

Accepted Solution

by:lexlythius
lexlythius earned 125 total points
They serve different purposes.

htmlentities encodes XML/HTML metacharacters such as <, > and & so that they can be safely included inside, say, a <TEXTAREA></TEXTAREA> element.

strip_tags is better used when you want to store text that will be rendered:
in a plain-text context, where HTML tags would clutter the output with garbage, or
in an HTML context, where you want to keep the stored text from being rendered as HTML, typically to stop end users from posting HTML and scripts on a web page

Anyway, keep in mind that strip_tags can be easily fooled.
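
To make the difference concrete, here is a minimal sketch. The sample strings are made up for illustration and are not from the thread. It runs the same input through both functions, then shows one well-known way strip_tags falls short: when you pass an allow-list, attributes on the allowed tags are left untouched.

```php
<?php
// Hypothetical user input, for illustration only.
$input = '<b>Hello</b> <script>alert(1)</script>';

// htmlentities keeps everything but turns markup characters into entities,
// so the browser shows the tags as literal text.
$encoded = htmlentities($input, ENT_QUOTES, 'UTF-8');

// strip_tags deletes the tags themselves; the text between them stays.
$stripped = strip_tags($input);

// With an allow-list, attributes on allowed tags survive, including
// event handlers such as onclick.
$link = '<a href="#" onclick="alert(1)">click</a>';
$allowed = strip_tags($link, '<a>');

echo $encoded, "\n", $stripped, "\n", $allowed, "\n";
```

Note that the script body (alert(1)) is still present after strip_tags; it is harmless only because the surrounding tag is gone.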

Expert Comment

by:lexlythius
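Forgot to state that strip_tags deletes the HTML tags themselves, including everything inside the angle brackets (tag name and attributes). The text between an opening tag and its closing tag is kept.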

Expert Comment

by:Sudaraka Wijesinghe
strip_tags only removes the HTML tags. Even after you remove the tags, you might end up with text containing symbols that could mess up the HTML code, like < or > in a sentence. Also, if you have Unicode characters or something outside the standard printable ASCII range, you will need to use htmlentities.
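
A small sketch of that point, using a made-up sentence that contains no tags at all. strip_tags has nothing to remove, so the bare symbols pass straight through, while htmlentities encodes them. htmlspecialchars is included for comparison; as far as I know it is generally sufficient when the page is served as UTF-8 with a matching charset declaration, so treat the need for htmlentities on non-ASCII text as dependent on your encoding setup.

```php
<?php
// Made-up sample text with bare symbols and one non-ASCII character (é).
$text = "Tom & Jerry: 3 > 2, caf\u{00E9}";

// No tags to remove, so the & and > are still there afterwards.
$stripped = strip_tags($text);

// htmlentities converts every character that has a named entity.
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

// htmlspecialchars converts only & < > " and ' and leaves é alone.
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

echo $stripped, "\n", $entities, "\n", $special, "\n";
```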

Author Comment

by:kadin
Thanks for your response. I am still a little foggy on how I should go about this.

I am receiving user input such as a paragraph in a textarea, inputting it into a database and displaying it back on a web page.

If strip_tags can be easily fooled, maybe I should forget about that function.

Does htmlentities stop javascript, php or any kind of xss?

I am using pdo and prepared statements.
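
For the flow described here (textarea, database, web page), one common arrangement is sketched below: store the text exactly as received through a prepared statement, and encode it only when writing it into HTML. The table name comments, the column body and both helper functions are assumptions made up for this example, and the demo uses an in-memory SQLite database purely so it can run on its own.

```php
<?php
// Hypothetical helper: store the raw text. The prepared statement takes
// care of SQL injection; it does nothing about HTML.
function saveComment(PDO $pdo, string $body): void
{
    $stmt = $pdo->prepare('INSERT INTO comments (body) VALUES (:body)');
    $stmt->execute([':body' => $body]);
}

// Hypothetical helper: encode at output time for an HTML body context,
// then turn newlines into <br /> so paragraphs keep their line breaks.
function renderComment(string $body): string
{
    return nl2br(htmlentities($body, ENT_QUOTES, 'UTF-8'));
}

// Demo, run only when the SQLite PDO driver is installed.
if (extension_loaded('pdo_sqlite')) {
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)');

    saveComment($pdo, "Nice post!\n<script>alert(1)</script>");

    foreach ($pdo->query('SELECT body FROM comments') as $row) {
        echo '<p>', renderComment($row['body']), "</p>\n";
    }
}
```

Encoding on output rather than on input keeps the stored text reusable in other contexts; PHP code in the input is never executed in this flow because the text is only stored and echoed, not evaluated.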

Expert Comment

by:Sudaraka Wijesinghe
If you are getting HTML from the user and displaying it on the page, it's best to strip out any JavaScript tags using a regular expression. Something like /<script[^>]*>(.*?)<\/script>/is maybe? (not tested)
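
A runnable take on that kind of pattern might look like the sketch below. The function name stripScriptTags is made up for this example, and the pattern is one possible variant, not a vetted filter. The second call shows why a script-tag regex alone is a weak defence: a payload that uses an event handler instead of a script tag passes through unchanged.

```php
<?php
// Hypothetical helper: remove <script>...</script> blocks with a regex.
// The "s" flag lets "." match newlines; ".*?" stops at the first closing tag.
function stripScriptTags(string $html): string
{
    // preg_replace returns null on a regex error; fall back to an empty string.
    return preg_replace('~<script\b[^>]*>.*?</script\s*>~is', '', $html) ?? '';
}

echo stripScriptTags('Hi<script type="text/javascript">alert(1)</script>!'), "\n";

// No <script> tag here, so the filter leaves the payload alone.
echo stripScriptTags('<img src="x" onerror="alert(1)">'), "\n";
```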

If you want to display the content with the formatting user entered, you should not use htmlentities.

Author Comment

by:kadin
I hope I am not causing confusion.

I am just trying to receive text from the user, not html. If the user inputs html such as <script>, I thought htmlentities would change < and > to entities, thus rendering a script tag useless.

If that is so, then I would not need a regex like the one you described above, correct?

Assisted Solution

by:Sudaraka Wijesinghe
Sudaraka Wijesinghe earned 125 total points
Yes, htmlentities will make any script tags display as text, so any code within them will not execute.

But it is safer to just filter out any JavaScript code that the user might enter. For example, let's say one day you decided to transfer that content using AJAX or some similar method. Then, if the correct encoding was not used, there is a chance that the JavaScript code might execute.
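
As an illustration of "the correct encoding" in the AJAX case, the sketch below encodes the stored text for a JSON response instead of for HTML. The function name commentToJson and the body key are assumptions made up for this example; JSON_THROW_ON_ERROR needs PHP 7.3 or later.

```php
<?php
// Hypothetical helper: encode a comment for a JSON (AJAX) response.
// The JSON_HEX_* flags emit <, >, &, ' and " as \uXXXX escapes, so the
// payload contains no literal markup characters.
function commentToJson(string $body): string
{
    return json_encode(
        ['body' => $body],
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
    );
}

$original = '<script>alert("hi")</script>';
$json = commentToJson($original);
echo $json, "\n";

// Decoding gives back the original text unchanged.
$decoded = json_decode($json, true)['body'];
var_dump($decoded === $original);
```

The browser-side code still has to insert the decoded value as text (for example through textContent) rather than as HTML, otherwise the same escaping question comes back on the client.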

Author Comment

by:kadin
Thanks so much for your help. I think I am starting to understand this a little better.

Expert Comment

by:Sudaraka Wijesinghe
Glad to help. Thanks for the points.