Rohit Bajaj

Strip unsafe tags from HTML in JavaScript

Hi,
There is a Java library called Jsoup that I can use to remove unsafe tags from HTML, such as <script></script>.
I want to do the same thing, but I need to strip the unsafe tags in JavaScript.
A Jsoup implementation doesn't seem to exist for JavaScript.

Is there any library, or any other way, to strip unsafe tags from HTML using JavaScript?

For example, if I have:
# rohit
<script>alert(10)</script>
I want to get:
# rohit

The use case is this:
I am writing a Markdown editor. The user enters Markdown in a textarea, then switches to Markdown mode, and I show the corresponding HTML in another pane.
This all happens on the client side.
In my case, the user can type something like # rohit, and when they switch to the other tab I convert it to HTML using a library called marked. Any unsafe HTML tags that are present, such as <script>alert(10)</script>, then execute.
marked does have a sanitize option, but it just replaces < and > with &lt; and &gt;, and so on.
That does prevent the script tag from executing. The issue is that if I type something like <b> rohit </b> in the raw Markdown, the converted HTML shows it in bold, but with sanitize enabled the tags are displayed as literal text, which is wrong.

Thanks
ASKER CERTIFIED SOLUTION
Alexandre Simões
Rohit Bajaj (ASKER)

Hi,
But GitHub Gist, which uses GitHub Flavored Markdown, does this. If you type in any unsafe tag, it strips it, yet it still shows text like <b> rohit </b> in bold. I want my application to behave in line with GitHub Flavored Markdown.
SOLUTION
In that case, run the generated HTML through a dedicated HTML sanitizer instead, and disable the marked sanitize option.
It won't touch anything other than the unsafe tags.
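
To make that concrete, here is a minimal sketch of the "convert first, sanitize second" idea. The function name renderPreview and its two callback parameters are made up for illustration; DOMPurify is only one example of an allowlist-based sanitizer and is not something named earlier in this thread.

```javascript
// Sketch only: convert Markdown to HTML, then sanitize the resulting HTML.
// `toHtml` and `sanitize` are passed in so any converter/sanitizer pair can be used.
function renderPreview(markdown, toHtml, sanitize) {
  const rawHtml = toHtml(markdown); // may still contain <script> and other unsafe tags
  return sanitize(rawHtml);         // only the sanitized string should reach the page
}

// Possible browser usage, ASSUMING marked and DOMPurify are both loaded on the page:
//   preview.innerHTML = renderPreview(
//     textarea.value,
//     (md) => marked.parse(md),
//     (html) => DOMPurify.sanitize(html)
//   );
```

With a sanitizer of that kind, harmless formatting tags such as <b> should survive while <script> is dropped, which is the behaviour asked for above. Check the sanitizer's default allowlist before relying on it.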

Another thing you should consider is re-sanitizing on the server side.
You should never trust the client. It's OK to sanitize on the client side to be more user-friendly, but before saving you should check the input with your server-side code.

Cheers
tr0gd0r

@Alexandre Exactly. Hopefully you would use the JavaScript-based sanitization only for the preview and send just the Markdown to the server. The server would then parse and clean the Markdown and not store HTML at all.
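
A rough sketch of that flow, assuming a Node.js-style server. The in-memory Map, the function names saveDocument and renderDocument, and the injected toHtml/sanitize callbacks are all hypothetical stand-ins for whatever storage and libraries the real server uses.

```javascript
// Hypothetical in-memory store standing in for a real database.
const documents = new Map();

// Server side: persist only the Markdown the client sent, never client-produced HTML.
function saveDocument(id, markdown) {
  if (typeof markdown !== "string") {
    throw new TypeError("markdown must be a string");
  }
  documents.set(id, markdown);
}

// Server side: convert and sanitize when the document is read back,
// using the server's own converter and sanitizer rather than trusting the client.
function renderDocument(id, toHtml, sanitize) {
  const markdown = documents.get(id);
  if (markdown === undefined) {
    return null; // unknown document id
  }
  return sanitize(toHtml(markdown));
}
```

Because only Markdown is stored, a tampered client cannot smuggle pre-rendered HTML past the server, and the sanitizing rules can be tightened later without rewriting stored data.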