doesn't that make you wonder?

Fletcher T. Penney fletcher at fletcherpenney.net
Thu Oct 20 13:22:18 EDT 2011



On Oct 20, 2011, at 12:52 PM, Tao Klerks wrote:


> OK, I interpret this (besides the quite-reasonable "why should I care?" vibe), combined with previous references to the PEG grammar(s) implicit in MultiMarkdown and peg-markdown's implementations, as "a formal specification already exists".


True. "A" formal spec exists, but there needs to be agreement on what "THE" formal spec is.

I personally believe that peg-markdown is a fantastic start, and could easily become the standard after being vetted by various developers and users around the troublesome edge cases that are out there.

Others may disagree with me, but I see little reason to reinvent the wheel when there is a really good wheel sitting in front of us that might just need some tweaking.


> That sounds fine - I'd love to use peg-(multi)markdown in my programs, but I see two practical problems for that right now:

> · A C library, while portable to any processor architecture, is not portable to any development environment, and certainly not future-proof. I, personally, would like to be able to rely on the exact same syntax not only across the OSs that Fletcher mentions, but also in C# code (that can be sandboxed safely - no P/Invoke calls to unsafe code) and also in browsers. There are numerous other environments (Java, etc.) where a similar "safe code" requirement precludes the use of a C library (at least in browser and other high-sandboxing environments).


I suspect that no piece of code is going to be usable across all languages. If you want to write a Markdown parser in language X, you will likely have to rewrite it no matter how the standard is defined. But the PEG defines the syntax, and you can use that definition in whatever language you choose to write your parsing algorithm in.

A library in ANSI C would seem to me to be as cross-compatible as we're going to get. It's not going to meet everyone's needs, but nothing will. We don't need a universal Markdown library. We need a definition of a standard. Said definition can be used in Pascal or Fortran for all I care. ;)


That said, I would argue that an ANSI C program with no external requirements (peg-markdown itself requires glib2, but this can be removed as Daniel and I did with peg-multimarkdown) is the most cross-platform code available. If there is a standard, it should be able to be used on as many machine architectures as possible with as few requirements as possible. It seems that a portable chunk of C code fits this need.

For example, pandoc, IIRC, is written in Haskell. Haskell may be a great language, but I've never used it and had trouble trying to get it all installed just so I could try things out. Perl is a pain in the neck for the average person to get running on Windows. I think the canonical definition should be available in something really basic, and C seems like the best approach to me.


> · The PEG grammar(s) may be formal reproducible reusable specifications, but my understanding is that they are not so meaningful without a definition of what they map to. If my brief reading on this is correct, a PEG grammar allows you to define behaviours to be executed upon encountering certain source structures; those behaviours implement a conversion to HTML, or latex, or whatever, but are implicit in the Program, not the Specification/Grammar.


Yes - you have to have a definition of how you output the HTML. peg-markdown also does this through example, not through a formal specification. I'm not an expert on formal grammars, but I wonder if there is a single formal specification that will define the syntax and the output in a single document? Or will there need to be a different approach? In either case, this is also where the test suite will come into play.
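As a rough illustration of that split between "what was matched" and "what gets emitted", here is a minimal Python sketch. It assumes the parser has already produced a small tree; the `Node` type, the node kinds, and both renderer functions are hypothetical names made up for this example and are not taken from peg-markdown or MultiMarkdown.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Node:
    """One matched source structure (hypothetical shape, for illustration)."""
    kind: str            # "para", "emph", or "text"
    text: str = ""
    children: tuple = ()


def to_html(node):
    # Output rule set 1: the same tree rendered as HTML.
    inner = escape(node.text) + "".join(to_html(c) for c in node.children)
    if node.kind == "para":
        return "<p>" + inner + "</p>"
    if node.kind == "emph":
        return "<em>" + inner + "</em>"
    return inner


def to_latex(node):
    # Output rule set 2: the same tree rendered as LaTeX.
    # (No LaTeX escaping here; a real renderer would need it.)
    inner = node.text + "".join(to_latex(c) for c in node.children)
    if node.kind == "para":
        return inner + "\n\n"
    if node.kind == "emph":
        return "\\emph{" + inner + "}"
    return inner


# A hand-built tree standing in for what a parser might produce for "a *b*".
tree = Node("para", children=(Node("text", "a "), Node("emph", "b")))
```

The grammar only decides what `tree` looks like. Whether an emphasis node becomes `<em>` or `\emph{}` lives entirely in the renderers, and that is the part a standard would still have to pin down separately.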


> So if we take the current PEG-based implementations as a starting point, what would it take to produce a specification that formally establishes not only the source structures to be matched, but also the behaviour/conversions to be implemented against them? Is this just a question of detailed documentation and an open test-suite?


I think so (but others might have a better solution).

Any standard is going to require:

1) a definition of the Markdown syntax
2) a definition of the HTML output from said syntax
3) a test-suite that can be easily used for verifying various implementations

peg-markdown provides all of these (using Gruber's Perl test suite as a starting point - it is incomplete with regard to edge cases). I'm not saying it's perfect. Just that it's really good, and probably as good as or better than almost anything out there, and it could be easily refined where necessary.
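To give a feel for how item 3 might be used against different implementations, here is a minimal Python sketch of a conformance check. Everything in it is an assumption for illustration: the case format, the `run_suite` helper, and the two toy converters are invented here and are not how Gruber's or peg-markdown's test suites are actually laid out.

```python
from html import escape


def run_suite(convert, cases):
    """Return the names of cases where convert(source) differs from the expected HTML."""
    failures = []
    for name, source, expected in cases:
        # Ignore leading/trailing whitespace so trivial newline differences don't fail.
        if convert(source).strip() != expected.strip():
            failures.append(name)
    return failures


# Hypothetical cases: (name, Markdown source, expected HTML).
CASES = [
    ("plain paragraph", "hello", "<p>hello</p>"),
    ("ampersand is escaped", "a & b", "<p>a &amp; b</p>"),
]


def toy_convert(source):
    # Stand-in "implementation" that handles only single-line paragraphs.
    return "<p>" + escape(source) + "</p>"


def sloppy_convert(source):
    # Stand-in "implementation" that forgets to escape '&'.
    return "<p>" + source + "</p>"
```

With these definitions, `run_suite(toy_convert, CASES)` reports no failures, while `run_suite(sloppy_convert, CASES)` flags the ampersand case. Any implementation, in any language, could be checked the same way as long as it can be fed the same inputs.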




> Another applicable question is: are PEG grammars easily usable in other environments? A brief search suggests so (js: http://stackoverflow.com/questions/79584/are-there-any-parsing-expression-grammar-peg-libraries-for-javascript-or-php, C#: http://www.codeproject.com/KB/recipes/grammar_support_1.aspx not GPL-compatible, sadly…); does anyone know how feasible it is to take the existing PEG grammar(s) and reuse them in other languages?


The PEG grammar as a syntax definition itself should be fairly reusable, but the problem is that part of your specific implementation gets tied up in the code. I had to modify the MMD PEG when I dropped it into MMD Composer, for example. But the actual language definitions were the same (e.g. "A header starts with 1 through 6 '#' characters, followed by text and optionally ending the line with '#' characters").
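The header rule quoted above can be restated in another language without carrying any of the original C along. The Python sketch below uses a regular expression as a stand-in for the PEG rule, so it is not the peg-markdown grammar itself; the `HEADER` pattern and `parse_header` function are hypothetical names, and edge-case behavior (e.g. seven or more '#' characters) is not meant to match any particular implementation.

```python
import re

# 1 through 6 '#' characters, then the header text,
# optionally closed by trailing '#' characters.
HEADER = re.compile(r"^(#{1,6})[ \t]*(.*?)[ \t]*#*[ \t]*$")


def parse_header(line):
    """Return (level, text) if the line is a '#'-style header, else None."""
    m = HEADER.match(line)
    if m is None or not m.group(2):
        # Not a header, or hashes with no header text.
        return None
    return len(m.group(1)), m.group(2)
```

For instance, `parse_header("## Title ##")` gives `(2, "Title")` and `parse_header("plain text")` gives `None`. What to do with that `(level, text)` pair is still up to the surrounding program.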

It's not like you can magically take the PEG, drop it in a Perl script, and now your program can take plain text in and output HTML.

By using the PEG from peg-markdown, you get *both* a formal syntax definition, *and* a cross-platform implementation in one go. *AND* the work has already been done. *AND* I've proven that it can be fairly easily extended to include new syntax features, and different output formats. *AND* Ali Rantakari showed that it could be used to build a fast syntax highlighter. So, it's flexible and unambiguous.

None of this would require that anyone else rewrite their Markdown implementation from scratch, or in a different language. It would provide a benchmark implementation, as well as a "rulebook" that would allow other developers to show that their implementation is compliant.


F-

--
Fletcher T. Penney
fletcher at fletcherpenney.net
