HTML::StripScripts and markdown incompatibilities

Michel Fortin michel.fortin at michelf.com
Tue Aug 24 09:34:02 EDT 2010


On 2010-08-24 at 8:49, Louis-David Mitterrand wrote:


> On Tue, Aug 24, 2010 at 08:41:05AM -0400, Michel Fortin wrote:
>> On 2010-08-24 at 8:27, Louis-David Mitterrand wrote:
>>
>>> I'm using Perl's HTML::StripScripts to clean out unwanted/broken HTML
>>> from forum posts on my web site but it also removes <http://example.com>
>>> or <user at example.com> Markdown constructs.
>>>
>>> Any idea how to make these two live together in harmony?
>>
>> Are you calling StripScripts before or after Markdown? You should
>> always filter tags after converting to HTML, as it seems StripScripts
>> was designed to filter HTML, not Markdown-formatted text.
>>
>> Long explanation:
>> <http://michelf.com/weblog/2010/markdown-and-xss/>
>
> Actually I save the forum posts to the DB in non-converted Markdown and
> filtered of any unwanted HTML.
>
> Should I save the raw unfiltered post to DB and then (1) expand Markdown
> and (2) filter with StripScripts only when _displaying_ the post? That
> would entail keeping some potentially "unclean" posts in the DB and
> having to StripScripts them repeatedly.


The only important thing for correctness of the output is to apply the Markdown filter before StripScripts. The rest is just optimization.
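
To illustrate why the order matters, here is a minimal sketch in Python. It does not use the real Markdown or HTML::StripScripts code: `toy_markdown` and `toy_strip` are hypothetical stand-ins that only handle `<http://...>` autolinks and a small tag whitelist, which is just enough to show the effect described above.

```python
import re
from html import escape

def toy_markdown(text):
    """Hypothetical stand-in for Markdown: converts only <http://...> autolinks."""
    def link(match):
        url = escape(match.group(1), quote=True)
        return '<a href="%s">%s</a>' % (url, url)
    return re.sub(r'<(https?://[^\s<>]+)>', link, text)

# Assumed whitelist, chosen for illustration only.
ALLOWED_TAGS = {"a", "p", "em", "strong"}

def toy_strip(html):
    """Hypothetical stand-in for an HTML filter: drops tags not on the whitelist."""
    def keep_or_drop(match):
        name = match.group(1).lower()
        return match.group(0) if name in ALLOWED_TAGS else ""
    return re.sub(r'</?([A-Za-z][A-Za-z0-9:]*)[^>]*>', keep_or_drop, html)

def filter_then_markdown(source):
    # Wrong order: the filter sees <http://example.com> as an unknown tag.
    return toy_markdown(toy_strip(source))

def markdown_then_filter(source):
    # Right order: the autolink is already an <a> element when the filter runs.
    return toy_strip(toy_markdown(source))
```

With the input `See <http://example.com> for details.`, `filter_then_markdown` loses the link entirely, while `markdown_then_filter` keeps it as an `<a>` element and still removes tags that are not on the whitelist.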

For performance reasons it might be a good idea to save the (Markdown+StripScripts)-processed text in the DB, but if you allow users to edit their posts once published, it'd be more convenient for them to start from the original unprocessed Markdown source. So you might want to save either one, or both.
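
One way to keep both versions could look like the following Python sketch. The table layout, the function names, and the `render` placeholder are all assumptions made for illustration; on a real site `render` would run Markdown first and the HTML filter second.

```python
import sqlite3
from html import escape

def render(source):
    # Placeholder only: stands in for "Markdown, then HTML filter".
    return "<p>%s</p>" % escape(source)

def open_db():
    # In-memory database with one column per version of the post.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE posts ("
        "id INTEGER PRIMARY KEY, source TEXT NOT NULL, rendered TEXT NOT NULL)"
    )
    return db

def save_post(db, source):
    # Render once at save time so displaying the post needs no processing.
    cur = db.execute(
        "INSERT INTO posts (source, rendered) VALUES (?, ?)",
        (source, render(source)),
    )
    return cur.lastrowid

def edit_post(db, post_id, new_source):
    # Editing starts from the raw source; the rendered copy is regenerated.
    db.execute(
        "UPDATE posts SET source = ?, rendered = ? WHERE id = ?",
        (new_source, render(new_source), post_id),
    )

def get_source(db, post_id):
    return db.execute(
        "SELECT source FROM posts WHERE id = ?", (post_id,)
    ).fetchone()[0]

def get_rendered(db, post_id):
    return db.execute(
        "SELECT rendered FROM posts WHERE id = ?", (post_id,)
    ).fetchone()[0]
```

The display path reads only `rendered`, and the edit form reads only `source`, so the unfiltered text is never sent to a browser as HTML.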

--
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/





More information about the Markdown-Discuss mailing list