Bug: INS/DEL in block context (1.04b)
John Gruber
gruber at fedora.net
Thu Apr 29 19:48:30 EDT 2004
Jay Allen <markdown at openwire.com> wrote on 04/28/04 at 5:14p:
> 2) Hack the Markdown source code on line 289 (v1.04b) from:
>
> my $block_tag_re =
> qr/p|div|h[1-6]|blockquote|pre|table|dl|ol|ul|script/;
>
> to
> my $block_tag_re =
> qr/p|div|h[1-6]|blockquote|pre|table|dl|ol|ul|script|ins|del/;
>
> #2 may also have some unintended consequences since it seems that it
> will always treat an ins or del as a block-level element.
I think I see a way around this. Right now, Markdown performs two
searches for block-level HTML tags. The first tries to be a little
smart, and looks for nested instances of the outermost tag. Thus
it'll match the following as a single block:
<div>
<div>
blah
</div>
</div>
The second search is very naive, and just matches from `<tag>` to
`</tag>` for any block-level tag. Thus, if the above nested div
hadn't already been escaped by the first match, the second match
would incorrectly match from the first `<div>` to the first
`</div>`, which would leave the final `</div>` just hanging there.
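To make the difference concrete, here is a minimal sketch of the two kinds of search run against the nested div above. It is not the code in Markdown.pl: the naive pattern is a simplified stand-in, and `match_balanced` is a hypothetical helper that counts tag depth, which is just one way of being "a little smart" about nesting. It also assumes the text starts with the opening tag.

```perl
use strict;
use warnings;

my $text = "<div>\n<div>\nblah\n</div>\n</div>\n";

# Naive search: from the opening tag to the first closing tag of the
# same name, with no notion of nesting.
my ($naive) = $text =~ m{(<(div)\b.*?</\2>)}s;

# Nesting-aware search (hypothetical helper, not from Markdown.pl):
# count opening and closing tags and stop when the depth is back at zero.
# Assumes the text begins with the opening tag.
sub match_balanced {
    my ($html, $tag) = @_;
    my $depth = 0;
    while ($html =~ m{<(/?)\Q$tag\E\b[^>]*>}g) {
        $depth += $1 ? -1 : 1;    # a leading slash marks a closing tag
        return substr($html, 0, pos($html)) if $depth == 0;
    }
    return;    # unbalanced input: nothing to report
}

my $balanced = match_balanced($text, 'div');

print "naive:\n$naive\n\n";
print "balanced:\n$balanced\n";
```

The naive result stops at the inner `</div>`, so the outer closing tag is left outside the match; the depth-counting version returns both divs as one block.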
Right now, both of these matches use the same pattern to identify
block level tags -- the `$block_tag_re` variable mentioned above.
I'm pretty sure that we can Do The Right Thing most of the time with
`<ins>` and `<del>` if we use two different patterns for the two matches.
For the first match, we'll use:
qr/p|div|h[1-6]|blockquote|pre|table|dl|ol|ul|script|ins|del/;
For the second match, we'll use:
qr/p|div|h[1-6]|blockquote|pre|table|dl|ol|ul|script/;
This does not solve the problem completely. But, it solves it about
as well as Markdown deals with other inline HTML blocks. The problem
is simply that Markdown.pl is a very simplistic HTML parser. Someday
I'd like to fix this, but not now.
With what I've just implemented, Markdown will turn the following input:
<ins>
Blocky.
</ins>
<ins>Spanny.</ins>
into:
<ins>
Blocky.
</ins>
<p><ins>Spanny.</ins></p>
I.e., if you put the ins (or del) tags on lines by themselves,
starting at the left margin, then Markdown.pl will treat them as
block-level tags. Otherwise, they're treated as span-level tags.
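A rough sketch of the two-pattern idea, for anyone who wants to poke at it. The variable and function names here are hypothetical, and the shape of each pass is an assumption rather than the actual Markdown.pl source: the first pass is assumed to require the opening tag at the left margin and the closing tag at the start of a later line, and the second pass is the naive `<tag>` to `</tag>` match. Only the two tag lists come from the discussion above.

```perl
use strict;
use warnings;

# Hypothetical names; the tag lists are the two discussed above.
my $first_pass_tags  = qr/p|div|h[1-6]|blockquote|pre|table|dl|ol|ul|script|ins|del/;
my $second_pass_tags = qr/p|div|h[1-6]|blockquote|pre|table|dl|ol|ul|script/;

# Return the raw HTML blocks that the two passes would set aside.
sub find_html_blocks {
    my ($text) = @_;
    my @blocks;

    # First pass (assumed shape): opening tag at the left margin, closing
    # tag at the start of a later line with only whitespace after it.
    $text =~ s{
        ^ ( < ($first_pass_tags) \b   # opening tag name
            (?: .* \n )*?             # whole lines, as few as possible
            </\2> [ \t]*              # closing tag at the start of a line
        ) (?= \n | \z )
    }{ push @blocks, $1; "" }gemx;

    # Second pass (assumed shape): naive match from an opening tag to the
    # first closing tag of the same name; ins and del are not in this list.
    $text =~ s{
        ^ ( < ($second_pass_tags) \b
            .*?
            </\2> )
    }{ push @blocks, $1; "" }gemsx;

    return @blocks;
}

my $input  = "<ins>\nBlocky.\n</ins>\n\n<ins>Spanny.</ins>\n";
my @blocks = find_html_blocks($input);

print scalar(@blocks), " block(s) set aside:\n";
print "$_\n" for @blocks;
```

Under these assumptions only the first `<ins>` is captured as a block. The one-line `<ins>Spanny.</ins>` fails the first pass because its closing tag is not at the start of a line, and the second pass never looks for ins or del, so it is left for span-level processing.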
* * *
This unintelligence of Markdown.pl's HTML parser is something I'd
like to document. The spec for Markdown, the formatting language,
should specify that a truly compliant implementation will Do The
Right Thing in more circumstances than Markdown.pl, my
implementation, currently handles.
* * *
I've been really, really busy with other things so far this month,
and I've gotten out of the swing with Markdown. I'm going to package
up my current development build as a beta, but I'm only going to
publish it here on this list. Stand by.
-J.G.