Solved

Strip unsafe tags from Html in javascript

Posted on 2016-09-06
Last Modified: 2016-10-06
Hi,
There is a Java library called Jsoup that I can use to remove unsafe tags from HTML, such as <script></script>.
I want to do the same thing, but I need to strip the unsafe tags in JavaScript.
A Jsoup implementation doesn't seem to exist for JavaScript.

Is there any library, or any other way, to strip unsafe tags from HTML using JavaScript?

E.g. if I have
# rohit
<script>alert(10)</script>
I want to get
# rohit

The use case is this:
I am writing a Markdown editor. The user enters Markdown in a textarea, then switches to Markdown mode, and I show the corresponding HTML in another pane.
This all happens on the client side.
In my case, the user can type something like # rohit, and when they switch to the other tab I convert it to HTML using a library called marked. Any unsafe HTML tags in the input, such as <script>alert(10)</script>, then get executed.
marked does have a sanitize option, but it just replaces < and > with &lt; and &gt;.
That does prevent the script tag from executing. The problem is that if I type something like <b> rohit </b> in the raw Markdown, the converted HTML should show it in bold, but after sanitization the tags are displayed literally, which is wrong.

Thanks
Question by:Rohit Bajaj
5 Comments
 
LVL 30

Accepted Solution

by:
Alexandre Simões earned 250 total points
ID: 41786850
Hi mate,
in my opinion, the sanitization is working properly and you should keep it.

Your example with <b> rohit </b> is actually behaving properly, because it is not Markdown.
In Markdown, to make text bold you write **rohit**

Remember that Markdown is an abstraction language that can be used to generate formats other than HTML. This means that <b> rohit </b> will appear exactly like this if you use a Markdown to PDF converter, for instance.

Bottom line: make users use the Markdown syntax when they are in the Markdown editor, and you won't have any problems.

Cheers,
Alex
 

Author Comment

by:Rohit Bajaj
ID: 41787265
Hi,
But GitHub Gist, which uses GitHub Flavored Markdown, does exactly that. If you type in an unsafe tag, it strips it, but it still shows text wrapped in <b> rohit </b> in bold. I want to make my application in line with GitHub Flavored Markdown...
 
LVL 1

Assisted Solution

by:tr0gd0r
tr0gd0r earned 250 total points
ID: 41788589
I suggest using a JavaScript library that converts Markdown to HTML, such as marked, or alternatively a full HTML parser.

Academically speaking, you can use the browser's native ability to parse HTML by doing something like this:

// create a div in memory
var div = document.createElement('div');
// set the html
div.innerHTML = '<script>alert("pwnd")</script><img src="http://hacker/virus.png" onclick="alert(\'pwnd onclick\')" onerror="alert(\'pwned onerror\')">';
// log the html: the <script> is still in the markup, it just was not executed
console.log(div.innerHTML); // <script>alert("pwnd")</script><img src="http://hacker/virus.png" onclick="alert('pwnd onclick')" onerror="alert('pwned onerror')">


A <script> inserted through innerHTML is not executed, although the element itself stays in the markup; it is ignored rather than stripped out.

HOWEVER, the browser will make requests to things like image sources and run events like onload and onerror. Plus if you append the in-memory node to the DOM without cleaning it first, clicking the img would run any onclick code.
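
One way around the problem above is to parse the string with DOMParser instead of assigning it to a live element's innerHTML, since scripts in a DOMParser document are not executed, and then walk the tree and drop everything that is not on an allowlist. This is only a rough sketch to show the idea: ALLOWED_TAGS, cleanTree and sanitizeWithDomParser are names made up for this example, the tag list is deliberately tiny, and it strips every attribute (so links lose their href). It is not a replacement for a maintained sanitizer library.

```javascript
// Illustrative allowlist (hypothetical); a real one would be longer and carefully reviewed.
// tagName is upper case for HTML elements in an HTML document.
const ALLOWED_TAGS = new Set([
  'B', 'I', 'EM', 'STRONG', 'P', 'H1', 'H2', 'H3',
  'UL', 'OL', 'LI', 'CODE', 'PRE', 'BLOCKQUOTE'
]);

// Remove every element that is not allowlisted, and every attribute
// (onclick, onerror, style, ...) from the elements that are kept.
function cleanTree(root) {
  // Copy the child list first because the tree is mutated while walking.
  for (const el of Array.from(root.children)) {
    if (!ALLOWED_TAGS.has(el.tagName)) {
      el.remove(); // drops <script>, <img>, <iframe>, ... and their contents
      continue;
    }
    for (const attr of Array.from(el.attributes)) {
      el.removeAttribute(attr.name);
    }
    cleanTree(el); // recurse into the kept element
  }
}

// Browser-only entry point: parse into a separate document, clean, serialize.
function sanitizeWithDomParser(html) {
  const doc = new DOMParser().parseFromString(html, 'text/html');
  cleanTree(doc.body);
  return doc.body.innerHTML;
}
```

In a browser, sanitizeWithDomParser('<b>rohit</b><script>alert(10)</script>') should give back just the <b> element. Removing all attributes is the blunt but simple choice here; allowing some of them (href, src) means you also have to check their values for things like javascript: URLs.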
 
LVL 30

Expert Comment

by:Alexandre Simões
ID: 41788734
In that case, use an HTML sanitizer instead and disable the Marked sanitizer.
It won't touch anything other than the unsafe tags.
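
A rough sketch of that arrangement, with the converter and the sanitizer passed in as plain functions so it does not depend on any particular library. createPreviewRenderer and renderPreview are names invented for this example, and the thread does not name a specific sanitizer, so the DOMPurify mention below is only a suggestion.

```javascript
// Build a preview renderer from two caller-supplied functions.
// markdownToHtml: converts Markdown to HTML with its own escaping turned off,
//                 so inline HTML such as <b> survives the conversion.
// sanitizeHtml:   strips unsafe tags and attributes from an HTML string.
function createPreviewRenderer(markdownToHtml, sanitizeHtml) {
  return function renderPreview(markdownSource) {
    // Step 1: Markdown -> raw (still untrusted) HTML.
    const rawHtml = markdownToHtml(markdownSource);
    // Step 2: untrusted HTML -> HTML that is safe to put in the preview pane.
    return sanitizeHtml(rawHtml);
  };
}
```

In the browser this could be wired up as createPreviewRenderer((md) => marked.parse(md), (html) => DOMPurify.sanitize(html)), assuming both libraries are loaded on the page. The important part is the order: sanitize the generated HTML, not the Markdown source.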

Another thing you should consider is to "re-sanitize" on the server side.
You shouldn't trust the client, ever. It's OK to sanitize client-side to be more user friendly, but before saving you should check the input again with your server-side code.

Cheers
 
LVL 1

Expert Comment

by:tr0gd0r
ID: 41788745
@Alexandre Exactly. Ideally you would use the JavaScript-based sanitization only for the preview and send just the Markdown to the server. The server would then parse and clean the Markdown and not store HTML at all.
