<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: bleach</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/bleach.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2010-10-25T13:32:00+00:00</updated><author><name>Simon Willison</name></author><entry><title>Bleach, HTML sanitizer and auto-linker</title><link href="https://simonwillison.net/2010/Oct/25/bleach/#atom-tag" rel="alternate"/><published>2010-10-25T13:32:00+00:00</published><updated>2010-10-25T13:32:00+00:00</updated><id>https://simonwillison.net/2010/Oct/25/bleach/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://coffeeonthekeyboard.com/bleach-html-sanitizer-and-auto-linker-for-django-344/"&gt;Bleach, HTML sanitizer and auto-linker&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
HTML sanitisation is notoriously difficult to do correctly, but Bleach (a Python library) looks like an excellent effort. It uses the html5lib parsing library to cope with potentially malformed HTML, filters against a whitelist rather than a blacklist, and has a neat DOM-aware feature for auto-linking URLs (so it won’t try to auto-link a URL that is already wrapped in a link). It was written by the Mozilla team for addons.mozilla.org and support.mozilla.org, so it should be production-ready.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/recovered"&gt;recovered&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/bleach"&gt;bleach&lt;/a&gt;&lt;/p&gt;



</summary><category term="django"/><category term="python"/><category term="security"/><category term="recovered"/><category term="bleach"/></entry></feed>