Optional features (was: Markdown Extra Specification (First Draft))
Yuri Takhteyev
qaramazov at gmail.com
Sat May 24 15:34:04 EDT 2008
> It seems to me that filtering is a red herring in your case. If
> you want to allow users to enter literal tags, you will have this
> problem whether you filter the ultimate output or not.
If I want to allow them, then yes, but this is not the case I was
considering. Suppose I do _not_ want to allow them to enter HTML
tags. This is easy to implement as an option in a Markdown converter.
However, if the converter doesn't do that, then I have a much harder
task: the user's tags are now mixed with Markdown's tags, and I have to
figure out how to sort them out. There _is_ a difference between the
<em> inserted by Markdown and the <em> inserted by the user. I know
Markdown's em will be balanced. I am not sure that the user's will
be. At this point the only way to be sure that the HTML is valid is
to parse it.
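
To illustrate what "parse it" could involve, here is a minimal sketch of
a balance check built on Python's standard html.parser module. The names
BalanceChecker, is_balanced and VOID_TAGS are made up for this example,
and the check only looks at tag nesting; it is not a full XHTML
validator.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag (a partial list, for illustration).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class BalanceChecker(HTMLParser):
    """Tracks open tags on a stack and flags any mismatched close."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.balanced = True

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br /> open and close at once.
        pass

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        # A close tag must match the most recently opened tag.
        if not self.open_tags or self.open_tags.pop() != tag:
            self.balanced = False


def is_balanced(html):
    checker = BalanceChecker()
    checker.feed(html)
    checker.close()
    # Anything still on the stack was opened but never closed.
    return checker.balanced and not checker.open_tags
```

With this sketch, is_balanced("<p>Some <em>emphasis</em></p>") is true,
while a paragraph containing a stray user-supplied <em> that is never
closed is reported as unbalanced.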
> If your XHTML parser has a streaming input mode, you can couple
> your Markdown converter directly to the XHTML parser and feed the
> HTML output to it as you go. If the XHTML parser throws a well-
> formedness error, you can then relate it to the vicinity of the
> last Markdown chunk you converted to HTML and passed into the
> XHTML parser.
I am not quite sure what you mean, but Markdown documents can't always
be processed on a chunk-by-chunk basis. Consider:

    Here is a [link][id].

    ... 100KB of text...

    [id]: http://example.com/ "Optional Title Here"

This document cannot be processed correctly unless it is considered
all at once.
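
A rough sketch of why the whole document is needed: reference
definitions have to be collected in one pass before any reference-style
link can be resolved in a second pass. The function name
resolve_reference_links and both regular expressions are assumptions
made for this example, and they cover only the simplest form of the
syntax, not everything a real Markdown converter accepts.

```python
import re
from html import escape

# [id]: url "optional title"   (simplified; one definition per line)
REF_DEF = re.compile(
    r'^\[([^\]]+)\]:[ \t]*(\S+)(?:[ \t]+"([^"]*)")?[ \t]*$', re.MULTILINE
)
# [link text][id]   (an empty id falls back to the link text)
REF_LINK = re.compile(r'\[([^\]]+)\]\[([^\]]*)\]')


def resolve_reference_links(text):
    # Pass 1: collect every definition, wherever it appears in the document.
    refs = {
        m.group(1).lower(): (m.group(2), m.group(3))
        for m in REF_DEF.finditer(text)
    }
    body = REF_DEF.sub("", text)

    # Pass 2: rewrite the links whose id is now known.
    def replace(m):
        label = m.group(1)
        ref_id = (m.group(2) or label).lower()
        if ref_id not in refs:
            return m.group(0)  # unknown id: leave the text untouched
        url, title = refs[ref_id]
        title_attr = f' title="{escape(title)}"' if title else ""
        return f'<a href="{escape(url)}"{title_attr}>{label}</a>'

    return REF_LINK.sub(replace, body)
```

Given only the first line of the example above, the function has no
definition for "id" and returns the text unchanged; given the whole
document, the link is resolved. A chunk-by-chunk converter would be in
the first situation for the entire 100KB.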
> If you don't want to couple the Markdown converter with an XHTML
> parser that closely, it's still possible to do this, but the
> Markdown converter will have to be able to accept streaming input
> itself and will need to generate output sufficiently frequently
> that you can track the correlation of input and output with a
> useful amount of precision.
Sure, if you want to drop support for references, footnotes, etc. But
it's much simpler to implement a "safe mode" that escapes or validates
all HTML submitted by the user.
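
As a very rough sketch of what such a "safe mode" might look like when
applied to the user's input before conversion (the function name
safe_mode, its mode argument and the TAG_LIKE pattern are all
hypothetical, not taken from any existing Markdown implementation):

```python
import re

# Anything that looks like an opening or closing tag (deliberately crude).
TAG_LIKE = re.compile(r"</?[A-Za-z][^<>]*>")


def safe_mode(markdown_text, mode="escape"):
    if mode == "escape":
        # Neutralize every "<" so no user-supplied tag survives.
        # ">" is left alone so that blockquotes still work.
        return markdown_text.replace("<", "&lt;")
    if mode == "reject":
        # Refuse the input outright if it contains anything tag-like.
        if TAG_LIKE.search(markdown_text):
            raise ValueError("raw HTML is not allowed")
        return markdown_text
    raise ValueError(f"unknown mode: {mode}")
```

Doing this as a preprocessing step has known gaps: it also affects
autolinks written as <http://example.com/>, and a "<" inside a code span
may end up double-escaped by the converter. That is part of why the
option is easier to get right inside the converter itself.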
- yuri
--
http://sputnik.freewisdom.org/