Adding a "Safe" option?

John Gruber gruber at fedora.net
Fri Apr 30 18:01:01 EDT 2004


Jason Clark <jason at jclark.org> wrote on 04/27/04 at 8:03p:

> At first, I thought so too.  It would be easier to scrub the Markdown 
> *output*, but I have an ulterior motive which I neglected to mention... 
> my site is xhtml1.1 strict, and I do not want a comment to be able to 
> make a page invalid.

I think this is a good idea. I thought about it way back before I
released the first beta -- that Markdown would be pretty good for
comments, too, except that it ought to strip raw HTML rather than
allow it.

I'm not sure "safe" is the right adjective for this.


> Markdown is guaranteed to output valid xhtml, but 
> users aren't.

Ideally, yes, but in fact there are ways to trick Markdown.pl into
generating invalid HTML. Simon Willison points out one of them here:

  <http://simon.incutio.com/archive/2004/04/13/myriadOfMarkupSystems>

These should be considered bugs in Markdown.pl, though.

Bob's xMarkdown, on the other hand, is designed to generate *only*
valid HTML.

* * *

Off the top of my head, I was thinking you could do this by
preprocessing Markdown input, according to the following rules:

+   if a line is indented, it's a code block, so leave tags alone

+   if a line is not indented, turn each < into `&lt;`

However, that won't work for inline `code spans`. And, once you take
inline code spans into account, you can't think about doing it
line-by-line, because you might have a code span that starts on one
line and ends on another.
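
To make the problem concrete, here is a rough sketch of the two
line-by-line rules in Python (not part of Markdown.pl; the function
name is made up, and "indented" is assumed to mean four leading spaces
or a tab):

```python
def escape_line_by_line(text):
    """Naive preprocessing: escape '<' on every non-indented line."""
    out = []
    for line in text.split("\n"):
        # Assumption: an indented line starts with four spaces or a tab.
        if line.startswith("    ") or line.startswith("\t"):
            out.append(line)  # code block line: leave tags alone
        else:
            out.append(line.replace("<", "&lt;"))
    return "\n".join(out)
```

Given the line "use `<em>` for emphasis", this produces
"use `&lt;em>` for emphasis". The < inside the code span has been
escaped before Markdown ever sees it, and Markdown will then encode
the & again, so the reader ends up looking at a literal "&lt;".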

But I think you could work around that in the following way:

1.  Process the file in "paragraph" mode, grabbing multiple
    consecutive lines at once.

2.  If the first line of the paragraph is indented, skip it, because
    it's a code block.

3.  Turn every < / > into &lt; / &gt;, except for ones which occur
    between `...` delimiters.
    
There's a bit of hand-waving in my step #3, but I don't have time to
write this myself. :^)
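
For what it's worth, the three steps might be sketched roughly like
this in Python. This is only an illustration under some assumptions,
not tested against Markdown.pl: paragraphs are taken to be separated by
one or more empty lines, "indented" means four spaces or a tab, and a
code span is a run of backticks closed by a run of the same length.
All the names here are made up.

```python
import re

# Assumption: a code span opens with a run of backticks and closes with
# a run of the same length; DOTALL lets it continue across line breaks.
CODE_SPAN = re.compile(r"(`+)(.+?)(?<!`)\1(?!`)", re.DOTALL)


def escape_angles(s):
    return s.replace("<", "&lt;").replace(">", "&gt;")


def escape_paragraph(par):
    """Step 3: escape < and > everywhere except inside code spans."""
    out = []
    pos = 0
    for m in CODE_SPAN.finditer(par):
        out.append(escape_angles(par[pos:m.start()]))  # text before span
        out.append(m.group(0))                         # span, untouched
        pos = m.end()
    out.append(escape_angles(par[pos:]))               # trailing text
    return "".join(out)


def strip_raw_html(text):
    # Step 1: "paragraph" mode. The capture group keeps the blank-line
    # separators so the text can be reassembled unchanged.
    pieces = re.split(r"(\n{2,})", text)
    result = []
    for piece in pieces:
        # Step 2: a paragraph whose first line is indented is a code
        # block, so skip it.
        if piece.startswith("    ") or piece.startswith("\t"):
            result.append(piece)
        else:
            result.append(escape_paragraph(piece))
    return "".join(result)
```

Taking step 3 literally also escapes the > that starts a blockquote and
the brackets around an automatic link, so a real version would need to
exempt those as well.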

-J.G.


More information about the Markdown-discuss mailing list