when rational discussion was still a possibility

Michel Fortin michel.fortin at michelf.ca
Sat Sep 6 23:39:12 EDT 2014


On Sep 6, 2014, at 21:38, John MacFarlane <jgm at berkeley.edu> wrote:

> Michel, I also wanted to comment on these failing test cases,
> to explain why they fail.  The current spec is a work in progress,
> and certainly still up for comment and revision.  Your comments would
> be most welcome!

Some comments below.

>> https://github.com/michelf/mdtest/blob/master/Markdown.mdtest/Hard-wrapped%20paragraphs%20with%20list-like%20lines.text
> 
> CommonMark currently allows a list to interrupt a paragraph (as Markdown
> 1.0.0 and earlier did, but not later versions).  I am not certain about
> this choice, but as I see it, these are the tradeoffs.
> 
> CON:  There is a danger that a hard-wrapped numeral at the end of
> a sentence will be misinterpreted as a list item.  (Of course, this
> could be avoided with a backslash escape, but it might escape the
> author's notice.)
> 
> PRO 1:  It is natural and common to write things like:
> 
>   Grocery list:
>   1.  eggs
>   2.  milk
>   3.  juice
> 
> People write things like this in web forms all the time.
> 
> PRO 2:  Allowing lists to interrupt a paragraph allows us to keep
> a very nice property, which is that a block of text, when converted
> into a list by prepending '1.' and indenting, will have the same
> meaning inside the list as it had without it.
> 
> So, in CommonMark,
> 
>   Groceries:
>   - eggs
>   - milk
> 
> is a (paragraph followed by a list) by itself, and also a
> (paragraph followed by a list) inside a list item:
> 
>   - Groceries:
>     - eggs
>     - milk
> 
> We lose this (to my mind very natural and desirable) property if
> we don't allow a list to interrupt a paragraph.
> 
> I think that the PROs outweigh the CON here.

John Gruber thought otherwise ten years ago. He decided that at the root of the document a list would need a blank line before it, but that a sublist inside a list item would not require one. Also, lists with sublists like your last example won't create a `<p>` in the outer list item, so it's not really the same thing.

I did propose this rule back in 2004, and it would solve the cases above. Here are a few archived posts:
http://six.pairlist.net/pipermail/markdown-discuss/2004-March/000232.html
http://six.pairlist.net/pipermail/markdown-discuss/2004-March/000241.html
http://six.pairlist.net/pipermail/markdown-discuss/2004-March/000318.html
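
To make the tradeoff concrete, here is a rough Python sketch that flags the lines of a hard-wrapped paragraph that a parser could take for list items if lists are allowed to interrupt paragraphs. The pattern `LIST_START` and the function `interrupting_lines` are hypothetical simplifications for illustration, not code from stmd, Markdown.pl, or PHP Markdown.

```python
import re

# Hypothetical, simplified list-marker pattern (an assumption, not any
# parser's real rule): up to three spaces, then a bullet (-, + or *) or
# digits followed by a period, then whitespace and some content.
LIST_START = re.compile(r"^ {0,3}(?:[-+*]|\d+\.)[ \t]+\S")


def interrupting_lines(paragraph: str) -> list[int]:
    """Return 1-based numbers of non-first lines that look like list items."""
    lines = paragraph.split("\n")
    return [
        number
        for number, line in enumerate(lines[1:], start=2)
        if LIST_START.match(line)
    ]


# The CON: a numeral that lands at the start of a hard-wrapped line.
wrapped = "The meeting was moved to room\n7. Everyone should bring notes."
# PRO 1: a list written directly under its introductory line.
groceries = "Grocery list:\n1.  eggs\n2.  milk\n3.  juice"

print(interrupting_lines(wrapped))    # [2]
print(interrupting_lines(groceries))  # [2, 3, 4]
```

Both inputs look the same to a line-based check like this one, which is why the choice comes down to requiring a blank line or not.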


>> https://github.com/michelf/mdtest/blob/master/Markdown.mdtest/Literal%20quotes%20in%20titles.text
> 
> See here http://jgm.github.io/stmd/spec.html#link-title
> and particularly the paragraph:
> 
> "(Note: Markdown.pl did allow double quotes inside a double-quoted title, and
> its test suite included a test demonstrating this. But it is hard to see a good
> rationale for the extra complexity this brings, since there are already many
> ways--backslash escaping, entities, or using a different quote type for the
> enclosing title--to write titles containing double quotes. Markdown.pl’s
> handling of titles has a number of other strange features. For example, it
> allows single-quoted titles in inline links, but not reference links. And, in
> reference links but not inline links, it allows a title to begin with " and end
> with ). Markdown.pl 1.0.1 even allows titles with no closing quotation mark,
> though 1.0.2b8 does not. It seems preferable to adopt a simple, rational rule
> that works the same way in inline links and link reference definitions.)"

There sure is room for more consistency with various quote styles and for disallowing nonsensical combinations of `"` and `)`. But take note:

1. stmd is the only implementation not supporting unescaped quotes. http://johnmacfarlane.net/babelmark2/?normalize=1&text=Foo+%5Bbar%5D(%2Furl%2F+%22Title+with+%22quotes%22+inside%22).

2. neither Markdown.pl, PHP Markdown, nor many other parsers let you escape a double quote (or a single quote), so the obvious solution is unfortunately non-portable and you'll have to recommend using `&quot;`. http://johnmacfarlane.net/babelmark2/?normalize=1&text=%5C%22quotes%5C%22


>> Failing tests from PHP Markdown test suite:
>> https://github.com/michelf/mdtest/blob/master/PHP%20Markdown.mdtest/Backslash%20escapes.text
> 
> As far as I can see, stmd's output is semantically equivalent HTML; it's
> just a matter of whether '>' is escaped as '&gt;'.

If I look on Babelmark 2, that's the case. For some reason, the copy of stmd I compiled locally and ran through MDTest returned the whole thing as a single paragraph:

	<p>Tricky combinaisons: backslash with \-- two dashes backslash with \> greater than \[test](not a link) \*no emphasis*</p>

The problem seems to be that this file for some reason uses classic Mac OS line endings (CR), but the browser corrects that automatically when pasting it in Babelmark 2 so you get a correct result there.
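
A minimal Python sketch of the kind of line-ending normalization that would avoid this, assuming the input is run through it before parsing. The function name `normalize_newlines` is made up for this example and is not part of stmd or MDTest.

```python
def normalize_newlines(text: str) -> str:
    """Convert CRLF and bare CR line endings to LF."""
    # Replace CRLF first so it does not turn into two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


# Classic Mac OS input: lines separated by CR only.
cr_only = "backslash with \\-- two dashes\rbackslash with \\> greater than\r"

lines = normalize_newlines(cr_only).split("\n")
print(len(lines))  # 3 (two lines of text plus the empty string after the last LF)
```

Without this step, a parser that only splits on LF sees the whole file as one line, which matches the single paragraph above.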


>> https://github.com/michelf/mdtest/blob/master/PHP%20Markdown.mdtest/Email%20auto%20links.text
> 
> For email addresses we used the "non-normative regex" from the HTML5 spec,
> which seemed a nonarbitrary and practical thing to use:
> http://jgm.github.io/stmd/spec.html#email-autolink
> It seems not to allow the international example or the crazier ones
> (with strange symbols and quotes).  Probably this should be fixed in our
> spec.

Here is my regex if you're interested: https://github.com/michelf/php-markdown/blob/lib/Michelf/Markdown.php#L1321
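
For comparison, a small Python sketch of what the HTML5 "non-normative regex" accepts and rejects. The pattern below is transcribed from memory of the HTML5 spec and should be checked against it; the sample addresses are invented for this example and are not taken from the test file.

```python
import re

# HTML5 "valid e-mail address" pattern, transcribed from memory
# (an assumption: verify against the spec before relying on it).
HTML5_EMAIL = re.compile(
    r"^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$"
)


def is_html5_email(address: str) -> bool:
    """True if the whole string matches the ASCII-only HTML5 pattern."""
    return HTML5_EMAIL.fullmatch(address) is not None


print(is_html5_email("user@example.com"))         # True
print(is_html5_email("üñîçøðé@example.com"))      # False: non-ASCII local part
print(is_html5_email('"abc@def"@example.com'))    # False: quoted local part
```

The pattern is ASCII-only and has no notion of quoted local parts, which is consistent with the international and quoted examples failing.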


>> https://github.com/michelf/mdtest/blob/master/PHP%20Markdown.mdtest/Ins%20%26%20del.text
> 
> The spec does not include ins or del among the list of HTML block tags.
> I can't recall where we got this list, and it now seems a mistake.
> Adding these to the list would still yield different output from PHP
> Markdown, because of differences in treatment of HTML blocks,
> but more reasonable output.

PHP Markdown treats them as hybrid block/inline depending on the context. If alone on its line, <ins> or <del> is a block-level tag, otherwise it's a span-level tag. This is because they can be both in HTML.
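
A loose Python sketch of that distinction, assuming "alone on its line" means the line holds nothing but a single opening or closing tag and optional whitespace. This is an illustration of the idea only; the pattern and the name `is_block_level_tag_line` are hypothetical and are not PHP Markdown's actual code.

```python
import re

# Assumption: a line counts as block-level when it contains only one
# <ins>/<del> opening or closing tag, optionally with attributes.
ALONE_ON_LINE = re.compile(r"^[ \t]*</?(?:ins|del)\b[^>]*>[ \t]*$")


def is_block_level_tag_line(line: str) -> bool:
    """True if the line is nothing but a single ins/del tag."""
    return ALONE_ON_LINE.match(line) is not None


print(is_block_level_tag_line("<ins>"))                      # True
print(is_block_level_tag_line("</del>"))                     # True
print(is_block_level_tag_line("Some <del>old</del> text."))  # False
```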


>> https://github.com/michelf/mdtest/blob/master/PHP%20Markdown.mdtest/Code%20block%20in%20a%20list%20item.text
> 
> Here I just need to refer you to the extensive discussion in the spec
> of the motivation for the list rules we chose.
> http://jgm.github.io/stmd/spec.html#motivation
> This was one of the hardest things to work out in a (to me) satisfactory
> way.  NO choices here will be perfectly backwards compatible with every
> implementation, since they go in so many directions.  But I'm pretty confident
> that the choices we've made are better than any of the alternatives I've
> considered.  I would be interested to hear your feedback on this!

Are you sure we're talking about the same thing? The issue is that in the middle item your code block content has a two-space indent, while it is indented by four spaces (minus one for the list marker) in the original. I'd have expected a four-character indent to produce a code block with no leading space in its content. One thing that might be confusing here is that the input mixes tabs and spaces, and tabs often get rendered as an 8-space indent. Is this really what stmd should be outputting?

http://johnmacfarlane.net/babelmark2/?normalize=1&text=*%09List+Item%3A%0A%0A%09%09code+block%0A%0A%09%09with+a+blank+line%0A%0A%09within+a+list+item.%0A%0A*+++++++code+block%0A%0A++++++++as+first+element+of+a+list+item%0A%0A*%09List+Item%3A%0A%0A%09%09code+block+with+whitespace+on+preceding+line
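
The arithmetic behind that expectation, sketched in Python for the tab-indented items, assuming tabs advance to 4-column tab stops. The helper `leading_columns` and the constant `CODE_INDENT` are names invented for this example; this is not how stmd computes indentation.

```python
CODE_INDENT = 4  # columns of indentation that mark a code block


def leading_columns(line: str, tabsize: int = 4) -> int:
    """Width of the leading whitespace after expanding tabs to tab stops."""
    expanded = line.expandtabs(tabsize)
    return len(expanded) - len(expanded.lstrip(" "))


item_line = "*\tList Item:"
code_line = "\t\tcode block"

# The item's content starts where the marker plus tab ends.
content_column = len("*\t".expandtabs(4))
code_column = leading_columns(code_line)

# Spaces left over inside the code block after removing the list item's
# indent and the code block's own indent.
leftover = code_column - content_column - CODE_INDENT

print(content_column, code_column, leftover)  # 4 8 0
```

With 8-column tab stops the same lines give 8, 16 and 4, so the tab width a parser assumes changes how much indentation ends up inside the code block.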


-- 
Michel Fortin
michel.fortin at michelf.ca
http://michelf.ca


