evolving the spec (was: forking Markdown.pl?)

Michel Fortin michel.fortin at michelf.com
Tue Mar 4 23:02:05 EST 2008

On 2008-03-04 at 0:49, Allan Odgaard wrote:

> On 3 Mar 2008, at 13:30, Michel Fortin wrote:


>> [...]

>>> 1. A regexp that makes the parser enter the context the rule
>>>    represents (e.g. block quote, list, raw, etc.).
>>> 2. A list of which rules are allowed in the context of this rule.
>>> 3. A regexp for leaving the context of this rule.
>>> 4. A regexp which is pushed onto a stack when entering the context
>>>    of this rule, and popped again when leaving this rule.
>>>
>>> The fourth item here is really the interesting part, because it is
>>> what made Markdown nesting work (99% of the time) despite this being
>>> 100% rule-driven.
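
The four parts listed above could be modeled roughly as follows. This is only a sketch in Python, not Allan's implementation: the names `Rule`, `enter`, `allowed`, `leave`, `continuation`, `enter_context` and `leave_context` are all hypothetical, and how the stacked regexps are applied to each line is not shown here.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Rule:
    name: str
    enter: re.Pattern                                 # 1. enters the rule's context
    allowed: List[str] = field(default_factory=list)  # 2. rules allowed inside
    leave: Optional[re.Pattern] = None                # 3. leaves the context
    continuation: Optional[re.Pattern] = None         # 4. pushed on entry, popped on exit


def enter_context(stack: list, rule: Rule) -> None:
    """Push the rule's fourth regexp when its context is entered."""
    stack.append(rule.continuation)


def leave_context(stack: list):
    """Pop the regexp that was pushed when the context was entered."""
    return stack.pop()
```

Nesting then amounts to the stack holding one regexp per open context, for example two entries while inside a block quote within a block quote.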


>> I'm not sure what the regular expression in 4 does, besides being
>> pushed and popped from the stack.


> Yeah, I accidentally sent the letter w/o noticing I forgot to
> explain the fourth rule.
>
> [big explanation]

So you're basically using a line-by-line approach. I was thinking
about that as a possibility for parsing blocks, but I don't think I'll
do that because I need backtracking to be able to rewind beyond the
current line. Or can you do it?

I'm particularly curious about how you can handle headers of this form:


> Now take the rule for block quote:
>
> BQ[1] = /\g {,3}> {,3}/       # We start it for lines with > allowing
>                               # up to 3 spaces before/after.
>
> BQ[2] = [ BQ, RAW, PAR, … ]   # Basically all block elements
>                               # can go inside block quote.
>
> BQ[3] = /\g( *$|«hr»)/        # We leave block quote at empty lines or
>                               # horizontal rulers¹. The actual pattern
>                               # for «hr» is something like:
>                               # [ ]{,3}(?<M>[-*_])([ ]{,2}\k<M>){2,}[ \t]*+$
>
> BQ[4] = /\g( {,3}> ?)?/       # While in BQ eat leading quote
>                               # characters.


> ¹ I am actually not sure if this is “the spec” or just a bug. But
> placing a horizontal ruler just below a block quoted paragraph does
> not give the expected “lazy mode” and places the <hr> inside the
> block quote, instead it leaves the block quote.
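
To make the quoted block quote rule concrete, here is a rough Python translation of BQ[1], BQ[3] and BQ[4] with a small line-by-line walk. It is a sketch under several assumptions, not the quoted implementation: `\g` is dropped because `match()` already anchors at the start of the line, `(?<M>…)` and `\k<M>` become `(?P<M>…)` and `(?P=M)`, the possessive `*+` becomes a plain `*`, the leave pattern is assumed to be tested on the raw line before the BQ[4] prefix is eaten, and the function name `quoted_lines` is hypothetical.

```python
import re

# «hr» pattern from the quoted rule, rewritten for Python's re module.
HR = r"[ ]{0,3}(?P<M>[-*_])(?:[ ]{0,2}(?P=M)){2,}[ \t]*$"

BQ_ENTER = re.compile(r" {0,3}> {0,3}")        # BQ[1]: start of a block quote
BQ_LEAVE = re.compile(r"(?: *$|" + HR + r")")  # BQ[3]: empty line or hr
BQ_EAT = re.compile(r"(?: {0,3}> ?)?")         # BQ[4]: optional quote prefix


def quoted_lines(lines):
    """Return the content of the first block quote, prefixes removed."""
    inside = False
    out = []
    for line in lines:
        if not inside:
            m = BQ_ENTER.match(line)
            if m:
                inside = True
                out.append(line[m.end():])
            continue
        # Assumption: the leave test sees the raw line, prefix included.
        if BQ_LEAVE.match(line):
            break
        # The prefix is optional, so unprefixed ("lazy") lines stay inside.
        out.append(line[BQ_EAT.match(line).end():])
    return out
```

Under these assumptions, `quoted_lines(["> test", "lazy", "***", "after"])` returns `["test", "lazy"]`: the unprefixed rule ends the quote instead of being treated as a lazy continuation, which is the behavior the footnote describes. A rule carrying the quote prefix, as in `["> test", "> ***", "> test"]`, stays inside and yields `["test", "***", "test"]`.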

I'm not sure what the problem with horizontal rules in blockquotes is.
I've tried many variations of:

    > test
    > ***
    > test

and couldn't make it end the blockquote prematurely. If it did, I'd
say it'd be a bug because I see no way the user would expect the
horizontal rule to break the blockquote and no reason for the parser
to do so either.

> [...]
>
> Okay, enough writing — I hope the above gives a better understanding
> of how the rules are used.

Indeed, it was quite insightful. Thank you.

Michel Fortin
michel.fortin at michelf.com
