evolving the spec (was: forking Markdown.pl?)
Michel Fortin
michel.fortin at michelf.com
Tue Mar 4 23:02:05 EST 2008
On 2008-03-04 at 0:49, Allan Odgaard wrote:
> On 3 Mar 2008, at 13:30, Michel Fortin wrote:
>
>> [...]
>>> 1. A regexp that makes the parser enter the context the rule
>>> represents (e.g. block quote, list, raw, etc.).
>>>
>>> 2. A list of which rules are allowed in the context of this rule.
>>>
>>> 3. A regexp for leaving the context of this rule.
>>>
>>> 4. A regexp which is pushed onto a stack when entering the context
>>> of this rule, and popped again when leaving this rule.
>>>
>>> The fourth item here is really the interesting part, because it is
>>> what made Markdown nesting work (99% of the time) despite this being
>>> 100% rule-driven.
>>
>> I'm not sure what the regular expression in 4 does, besides being
>> pushed and popped from the stack.
>
> Yeah, I accidentally sent the letter w/o noticing I forgot to
> explain the fourth rule.
>
> [big explanation]
So you're basically using a line by line approach. I was thinking
about that as a possibility for parsing blocks, but I don't think I'll
do that because I need backtracking to be able to rewind beyond the
current line. Or can you do it?
I'm particularly curious about how you can handle headers of this form:
    Header
    ======
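
To make the question concrete, here is a minimal sketch in Python of one way a line-based parser might cope with this header form: peek one line ahead before committing to a block type. It is an illustration only, not taken from Markdown.pl or from the parser described in the quoted message; the names `SETEXT_UNDERLINE` and `parse_blocks` are hypothetical.

```python
import re

# Hypothetical pattern: a line made only of '=' or only of '-' characters.
SETEXT_UNDERLINE = re.compile(r'^(=+|-+)[ \t]*$')

def parse_blocks(lines):
    """Walk the input line by line, peeking one line ahead for an underline."""
    blocks = []
    i = 0
    while i < len(lines):
        line = lines[i]
        nxt = lines[i + 1] if i + 1 < len(lines) else None
        m = SETEXT_UNDERLINE.match(nxt) if nxt is not None else None
        if line.strip() and m:
            # '=' underline gives a level-1 header, '-' a level-2 header.
            level = 1 if m.group(1)[0] == '=' else 2
            blocks.append(('h%d' % level, line.strip()))
            i += 2  # consume the text line and its underline together
        elif line.strip():
            blocks.append(('text', line.strip()))
            i += 1
        else:
            i += 1  # skip blank lines
    return blocks

print(parse_blocks(['Header', '======', '', 'body']))
```

This only needs one line of lookahead, so it does not answer whether rewinding further than that is possible in a rule-driven design.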
> Now take the rule for block quote:
>
> BQ[1] = /\g {,3}> {,3}/ # We start it for lines with > allowing
> # up to 3 spaces before/after.
>
> BQ[2] = [ BQ, RAW, PAR, … ] # Basically all block elements
> # can go inside block quote.
>
> BQ[3] = /\g( *$|«hr»)/ # We leave block quote at empty lines or
> # horizontal rulers¹. The actual pattern for
> # «hr» is something like:
> # [ ]{,3}(?<M>[-*_])([ ]{,2}\k<M>){2,}[ \t]*+$
>
> BQ[4] = /\g( {,3}> ?)?/ # While in BQ eat leading quote characters.
>
> ¹ I am actually not sure if this is “the spec” or just a bug. But
> placing a horizontal ruler just below a block quoted paragraph does
> not give the expected “lazy mode” and places the <hr> inside the
> block quote, instead it leaves the block quote.
I'm not sure what the problem is with horizontal rules in blockquotes.
I've tried many variations of:
    > test
    >
    > ***
    >
    > test
and couldn't make it end the blockquote prematurely. If it did, I'd
say it'd be a bug because I see no way the user would expect the
horizontal rule to break the blockquote and no reason for the parser
to do so either.
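
For reference, the quoted BQ[3] rule can be tried out in Python. This is a rough sketch under stated assumptions: the «hr» pattern is translated by hand from the quoted syntax (`{,3}` written as `{0,3}`, `\k<M>` as `(?P=M)`, the possessive `*+` relaxed to `*`), the leave pattern is assumed to be tested against the raw line before any quote prefix is eaten, and `leaves_blockquote` is a hypothetical helper name. The actual order of operations is in the elided explanation and may differ.

```python
import re

# Hand translation of the quoted «hr» pattern (see assumptions above).
HR = r'[ ]{0,3}(?P<M>[-*_])(?:[ ]{0,2}(?P=M)){2,}[ \t]*$'

# BQ[3]: leave the block quote at an empty line or a horizontal rule.
# re.match anchors at the start of the string, standing in for \g here.
BQ_LEAVE = re.compile(r'(?: *$|' + HR + r')')

def leaves_blockquote(raw_line):
    """Hypothetical helper: True if BQ[3] matches the raw, unstripped line."""
    return BQ_LEAVE.match(raw_line) is not None

# Quoted lines from the example above, then an unquoted rule and a blank line.
for line in ['> test', '>', '> ***', '***', '']:
    print(repr(line), leaves_blockquote(line))
```

Under these assumptions the quoted `> ***` never matches BQ[3], because the line starts with `>`, while an unquoted `***` directly below a quoted paragraph does. That may be the case the footnote has in mind rather than the variations tried above.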
> [...]
>
> Okay, enough writing — I hope the above gives a better understanding
> of how the rules are used.
Indeed, it was quite insightful. Thank you.
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/