Markdown Extra Spec: Parsing Section
Michel Fortin
michel.fortin at michelf.com
Sun May 11 22:26:33 EDT 2008
On 2008-05-11 at 20:55, Jacob Rus wrote:
> You should write it in something closer to a BNF-like format. The
> current version is about 10x more verbose than necessary, and it
> makes reading the spec considerably more difficult.
The reason I'm doing it like this is that I doubt everything will be
expressible in a BNF format. Using plain English descriptions lets me
avoid fitting things to a specific grammar and just write what I feel
is most natural and easiest to understand.
Shopping for a more formal and less verbose grammar, if we need one,
will be much easier once we know what we need, once we can compare
existing grammars against a checklist of what is necessary to
implement the given parsing algorithm.
If you remember the timetable I've given, you'll see that I've booked
about half a year for polishing things out. This includes rephrasing
sentences, refactoring the syntax, and reformatting the spec to make
it easier to understand. This *could* include switching to a new
grammar format if it makes things more intuitive and readable.
> Also, you're still going to have quite a few sticky edge cases with
> your current parsing model. What happens when we have a `<>`-
> delimited URL inside a blockquote? For instance:
>
> > what about this <http://
> > google.com/> case?
Well, currently newlines aren't allowed inside automatic links in
Markdown.pl, PHP Markdown and some others. Implementations that do see
an automatic link there treat it as a link to "http:// google.com/"
(notice the space) or "http://" (notice what's missing).
<http://babelmark.bobtfish.net/?markdown=%0D%0A%3E+what+about+this+%3Chttp%3A%2F%2F%0D%0A%3E+google.com%2F%3E+case%3F&normalize=on&src=1&dest=2>
Anyway, with the three-pass parsing model I'm currently defining, it's
pretty trivial to handle correctly: the block element pass extracts
the text of the blockquote, leaving this to be parsed by the span
element pass:
what about this <http://
google.com/> case?
The span element pass would then see an autolink and just ignore any
newline it finds in the URL.
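To illustrate the two passes described above, here is a minimal sketch in Python. It is not taken from the spec or from any existing implementation: the function names `extract_blockquote_text` and `render_autolinks`, the regular expressions, and the restriction to http, https and ftp URLs are all assumptions made for the example, and HTML escaping is left out.

```python
import re

# Hypothetical helper: a much-simplified "block element pass" that only
# knows how to strip one level of blockquote markers from each line.
def extract_blockquote_text(lines):
    return "\n".join(re.sub(r"^ {0,3}> ?", "", line) for line in lines)

# Assumed autolink pattern: <scheme://...> with no spaces, tabs, or
# angle brackets inside, but newlines are tolerated.
AUTOLINK = re.compile(r"<((?:https?|ftp)://[^<> \t]+)>")

# Hypothetical helper: a much-simplified "span element pass" that turns
# autolinks into anchors, ignoring any newline found inside the URL.
def render_autolinks(text):
    def to_anchor(match):
        url = match.group(1).replace("\r", "").replace("\n", "")
        return '<a href="{0}">{0}</a>'.format(url)
    return AUTOLINK.sub(to_anchor, text)

quoted = [
    "> what about this <http://",
    "> google.com/> case?",
]
span_input = extract_blockquote_text(quoted)
html = render_autolinks(span_input)
print(html)
```

Because the blockquote markers are already gone by the time the span pass runs, the autolink rule never has to know that the line break came from a quoted block.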
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/