HTML - Escape / Sanitizer

About

A sanitizer is a program that:

  • accepts only a subset of HTML elements and removes the rest
  • and/or transforms them into text (escaping)

Sanitizing prevents script injection. It should run on the server side (i.e. not on the client) to validate or transform all inputs.
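
The escaping variant can be sketched with Python's standard library. This is a minimal illustration, not a complete defense: the function name escape_html is an assumption made for this example, and html.escape only covers text placed in HTML content or in quoted attribute values.

```python
import html


def escape_html(untrusted: str) -> str:
    # html.escape replaces &, < and >; with quote=True it also
    # replaces " and ', so the result is safe inside a quoted attribute.
    return html.escape(untrusted, quote=True)


if __name__ == "__main__":
    # The markup is rendered as visible text instead of being interpreted.
    print(escape_html("<img src=x onerror=alert(1)//>"))
    # prints: &lt;img src=x onerror=alert(1)//&gt;
```

Escaping keeps every character of the input visible to the reader, which suits fields that should never contain markup (names, comments shown as plain text).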

Example of sanitizing

Description: Delete the onerror handler and quote the src value
From: <img src=x onerror=alert(1)//>
To:   <img src="x">

Description: Delete the onload handler and make the SVG XHTML-conformant
From: <svg><g/onload=alert(2)//<p>
To:   <svg><g></g></svg>

Description: Delete the iframe
From: <p>abc<iframe//src=jAva&Tab;script:alert(3)>def</p>
To:   <p>abc</p>

Description: Delete the script (here carried in an xlink:href attribute)
From: <math><mi//xlink:href="data:x,<script>alert(4)</script>">
To:   <math><mi></mi></math>

Description: Make the HTML conformant
From: <TABLE><tr><td>HELLO</tr></TABL>
      <UL><li><A HREF=//google.com>click</UL>
To:   <table><tbody><tr><td>HELLO</td></tr></tbody></table>
      <ul><li><a href="//google.com">click</a></li></ul>
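
The attribute-stripping behavior shown in these examples can be approximated with an allowlist built on Python's standard html.parser module. This is a rough sketch under stated assumptions, not the sanitizer that produced the outputs above: the names ALLOWED, VOID, URL_ATTRS, is_safe_url, AllowlistSanitizer and sanitize are hypothetical, the allowlist is deliberately tiny, and the sketch does not repair malformed markup (it will not add missing closing tags or a tbody).

```python
import re
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for this sketch: tag name -> permitted attribute names.
ALLOWED = {"p": set(), "ul": set(), "li": set(), "a": {"href"}, "img": {"src"}}
VOID = {"img"}              # elements that never get an end tag
URL_ATTRS = {"href", "src"}


def is_safe_url(value: str) -> bool:
    # Drop spaces and control characters first, since browsers ignore
    # tabs and newlines inside a URL ("java\tscript:" still runs).
    cleaned = "".join(ch for ch in value if ch > " ").lower()
    head = re.split(r"[/?#]", cleaned, maxsplit=1)[0]
    if ":" not in head:
        return True         # no scheme: relative URL
    return head.split(":", 1)[0] in {"http", "https"}


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return          # drop the tag; its text is still escaped in handle_data
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue    # drops onerror, onload, style, ...
            if name in URL_ATTRS and not is_safe_url(value):
                continue    # drops javascript: and other non-http(s) schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED and tag not in VOID:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text nodes are always re-escaped before being written out.
        self.out.append(escape(data, quote=False))


def sanitize(untrusted: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(untrusted)
    parser.close()          # flush any buffered text
    return "".join(parser.out)
```

With this sketch, sanitize('<img src=x onerror=alert(1)//>') returns <img src="x">, matching the first example. Text inside dropped elements is kept as escaped text rather than deleted, so the iframe example would not give the same result as above. For production use, a maintained sanitizer library is a safer choice than a hand-written parser.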

Usage

Library

