Punchline: variants and processor (text/markdown)

Sean Leonard dev+ietf at seantek.com
Tue Jul 15 03:20:27 EDT 2014


Thank you all for the informative feedback and comments.

Let me get to the punchline. Now having a much better understanding of 
the extraordinary diversity of Markdown expressions that are out there, 
I think that the "flavor" parameter does not make sense. Instead let me 
introduce proposal rev 2, which includes two optional parameters: 
variants and processor. (This is very similar to Carl Jacobsen's 
proposal--the main difference being that I am adding more formality.)

[This is not specification text, but something like it might appear in 
draft -01. For the sake of this post, I am avoiding explicit discussion 
of syntax.]

Parameters are defined in RFC 6838 as "companion data", that is, data 
that assists with the meaning or interpretation. Parameters can be 
"advisory" (derived from the content--thus allowing a consumer to avoid 
parsing the content), "tangential" (informational but not affecting the 
interpretation of the content), or "material" (has a material effect on 
how the content is interpreted). In the case of Markdown, the processor 
and variants parameters are material in that they reflect the author's 
intent on how best to interpret the content. If both are absent, the 
author expresses no opinion on how to interpret the content; a recipient 
can use any Markdown workflow, whether one of the recipient's own 
choosing or one inferred from the broader context (e.g., a build script 
for a group of Markdown files).

***

processor: The processor parameter identifies a specific Markdown 
implementation and the arguments to be fed to the processor. The 
processor parameter has three sub-parameters:
   1. Processor name. This is the common-sense, unambiguous name of the 
processor. For example, John Gruber's implementation would be called 
"Markdown.pl"; pandoc would be called "pandoc".
   2. Version (optional). If specified, this is the version of the 
processor tool. For example, the Markdown.pl processor could have 
version 1.0.1 or 1.0.2b8.
   3. Processor-specific arguments (optional). If specified, these 
arguments would be used with the processor. Each processor gets to 
define the meaning of its arguments; processors that are not 
command-line based (e.g., a C library) shall define a mapping between 
the argument strings and programmatic parameters to be used when 
invoking the processor.
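
As a rough illustration (not specification text), the three sub-parameters could be modeled on the consumer side as below. The class name ProcessorParam, its field names, and the to_argv helper are assumptions made for this sketch; no wire syntax is implied.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ProcessorParam:
    """Hypothetical in-memory form of the processor parameter."""
    name: str                        # required, e.g. "Markdown.pl" or "pandoc"
    version: Optional[str] = None    # optional, e.g. "1.0.1"
    arguments: Tuple[str, ...] = ()  # optional; meaning is defined by the processor

    def __post_init__(self) -> None:
        # The processor name is the only mandatory sub-parameter.
        if not self.name:
            raise ValueError("processor name is required")

    def to_argv(self) -> List[str]:
        """Assumed mapping for a command-line based processor."""
        return [self.name, *self.arguments]


gruber = ProcessorParam("Markdown.pl", version="1.0.1")
pandoc = ProcessorParam("pandoc", arguments=("--from=markdown_strict",))
```

A library-based processor would replace to_argv with its own mapping from argument strings to programmatic parameters.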

IANA would create a sub-registry of processors. Each registry entry must 
contain the processor name (identifier), the full name of the tool (if 
it differs from the processor name), the authors or maintainers, and any 
URL or other address at which to locate the processor tool and 
documentation. Optionally, versions and processor-specific arguments can 
be documented in the registry entry.

***

variants [could also be called rulesets or rules]: The variants 
parameter identifies sets of rules ("rulesets") that formally specify 
how to turn Markdown control characters into markup. The variants 
parameter is an ordered list of rulesets. A ruleset is an identifier of 
a set of rules. When multiple rulesets are included in the variants 
parameter, they are stacked on top of each other. A rule that directly 
contradicts a prior rule (mentioned earlier in the list) gets overruled. 
The definition of a ruleset can include not only specific rules, but 
also other rulesets. Therefore, there can be a ruleset whose primary 
purpose is to group together several rulesets.
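
To show how nested rulesets might be flattened into one ordered list, here is a minimal sketch. The registry contents, the rule labels, and the "my_bundle" identifier are hypothetical; only "fenced_code_blocks" and "line_blocks" are names used in this post. The sketch deliberately leaves out the handling of contradictory rules, since detecting a contradiction depends on how rules end up being formally described.

```python
from typing import Dict, List, NamedTuple, Tuple


class Ruleset(NamedTuple):
    includes: Tuple[str, ...]  # identifiers of rulesets pulled in first
    rules: Tuple[str, ...]     # this ruleset's own rules, as opaque labels


# Hypothetical registry contents, for illustration only.
REGISTRY: Dict[str, Ruleset] = {
    "fenced_code_blocks": Ruleset((), ("fence-with-tildes",)),
    "line_blocks": Ruleset((), ("pipe-starts-line-block",)),
    "my_bundle": Ruleset(("fenced_code_blocks", "line_blocks"), ()),
}


def expand(variants: List[str], registry: Dict[str, Ruleset]) -> List[str]:
    """Flatten an ordered variants list into an ordered list of rules."""
    ordered: List[str] = []

    def visit(ruleset_id: str, path: Tuple[str, ...]) -> None:
        if ruleset_id in path:
            raise ValueError(f"ruleset {ruleset_id!r} includes itself")
        entry = registry[ruleset_id]  # KeyError for unregistered identifiers
        for included in entry.includes:
            visit(included, path + (ruleset_id,))
        for rule in entry.rules:
            if rule not in ordered:  # keep the first position of each rule
                ordered.append(rule)

    for ruleset_id in variants:
        visit(ruleset_id, ())
    return ordered
```

With this registry, expand(["my_bundle"], REGISTRY) yields the rules of both grouped rulesets in the order the bundle lists them.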

There is a semantic difference between an absent variants parameter and 
an empty variants parameter (variants=""). An absent variants parameter 
means that the author has not expressed a preference or intent for how 
to interpret particular Markdown control sequences. An empty variants 
parameter means that the author intends for the Markdown rules of John 
Gruber's syntax <http://daringfireball.net/projects/markdown/syntax> (as 
of the publication of this document) to apply. Gruber's syntax (also 
called the "baseline") leaves many cases ambiguous, contradictory, or 
unsatisfactory. These shortcomings are inherent to Markdown's evolution 
and therefore MUST stay as-is. As a result, two different Markdown 
processors can claim to conform to the baseline and still produce wildly 
different output.
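
The absent-versus-empty distinction maps naturally onto "no value" versus "empty string" in code. The sketch below assumes a comma-separated list purely for illustration (this post defines no syntax), and the function names are hypothetical.

```python
from typing import List, Optional


def interpret_variants(variants: Optional[str]) -> Optional[List[str]]:
    """Return None when the parameter is absent, else the ordered rulesets.

    Assumption: rulesets are comma-separated; the proposal defines no syntax.
    An empty list means the author asked for the baseline only.
    """
    if variants is None:  # parameter absent: no stated intent
        return None
    return [v.strip() for v in variants.split(",") if v.strip()]


def describe(variants: Optional[str]) -> str:
    parsed = interpret_variants(variants)
    if parsed is None:
        return "no stated intent; recipient chooses"
    if not parsed:
        return "baseline (Gruber's syntax) only"
    return "rulesets in order: " + ", ".join(parsed)
```

The point of the design is that a consumer must not collapse None and "" into one "falsy" case, since they carry different author intent.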

Examples of variants: the extensions included in pandoc such as 
"line_blocks", "fenced_code_blocks", and "strict".

IANA would create a sub-registry of rulesets for the variants parameter. 
Each registry entry must include the ruleset identifier, a formal 
description of the rules, and identification of included rulesets. 
Optionally, the entry may describe processors (including versions and 
arguments) that are known to implement the ruleset.

Each ruleset identifier shall uniquely identify its set of rules. I.e., 
if "fenced_code_blocks" is registered, "guarded_code_blocks" cannot be 
registered if its effective rules are the same as those of 
"fenced_code_blocks".
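
The uniqueness requirement could be checked mechanically along the following lines. This is only a sketch: it assumes the effective rules can be reduced to a set of opaque labels, whereas comparing real formal descriptions would likely need human review. The class name and rule labels are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable


class RulesetRegistry:
    """Hypothetical registry that refuses duplicate rule sets."""

    def __init__(self) -> None:
        self._by_id: Dict[str, FrozenSet[str]] = {}

    def register(self, ruleset_id: str, effective_rules: Iterable[str]) -> None:
        rules = frozenset(effective_rules)
        if ruleset_id in self._by_id:
            raise ValueError(f"{ruleset_id!r} is already registered")
        # Reject a new identifier whose effective rules match an existing entry.
        for existing_id, existing_rules in self._by_id.items():
            if existing_rules == rules:
                raise ValueError(f"{ruleset_id!r} duplicates {existing_id!r}")
        self._by_id[ruleset_id] = rules

    def __contains__(self, ruleset_id: str) -> bool:
        return ruleset_id in self._by_id


registry = RulesetRegistry()
registry.register("fenced_code_blocks", {"fence-with-tildes"})
```

Attempting registry.register("guarded_code_blocks", {"fence-with-tildes"}) would then raise ValueError, matching the example above.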

***

When both variants and processor are present, processor takes 
precedence. I.e., the processor choice is considered the best expression 
of the author's intent.
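
The precedence rule, combined with the "absent means no opinion" rule, amounts to a short decision function. The function name and the returned labels are assumptions for this sketch; parameter values are treated as opaque strings.

```python
from typing import Optional, Tuple


def select_workflow(
    processor: Optional[str], variants: Optional[str]
) -> Tuple[str, Optional[str]]:
    """Pick which parameter drives interpretation (None means absent)."""
    if processor is not None:
        # processor takes precedence, even when variants is also present
        return ("processor", processor)
    if variants is not None:
        # includes variants="", which requests the baseline
        return ("variants", variants)
    # neither present: the author expressed no opinion
    return ("recipient-choice", None)
```

For example, select_workflow("pandoc", "line_blocks") selects the processor, while select_workflow(None, "") still selects variants because an empty value is a stated intent.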

Comments welcome.

-Sean



More information about the Markdown-Discuss mailing list