Punchline: variants and processor (text/markdown)
Sean Leonard
dev+ietf at seantek.com
Tue Jul 15 03:20:27 EDT 2014
Thank you all for the informative feedback and comments.
Let me get to the punchline. Now having a much better understanding of
the extraordinary diversity of Markdown expressions that are out there,
I think that the "flavor" parameter does not make sense. Instead let me
introduce proposal rev 2, which includes two optional parameters:
variants and processor. (This is very similar to Carl Jacobsen's
proposal; the main difference is that I am adding more formality.)
[This is not specification text, but something like it might appear in
draft -01. For the sake of this post, I am avoiding explicit discussion
of syntax.]
Parameters are defined in RFC 6838 as "companion data", that is, data
that assists with the meaning or interpretation. Parameters can be
"advisory" (derived from the content--thus allowing a consumer to avoid
parsing the content), "tangential" (informational but not affecting the
interpretation of the content), or "material" (has a material effect on
how the content is interpreted). In the case of Markdown, the processor
and variants parameters are material in that they reflect the author's
intent on how best to interpret the content. If they are absent, the
author expresses no opinion on how to interpret the content; a recipient
can use any Markdown workflow, whether one of the recipient's own
choosing or one inferred from the broader context (e.g., a build script
for a group of Markdown files).
***
processor: The processor parameter identifies a specific Markdown
implementation and the arguments to be fed to the processor. The
processor parameter has three sub-parameters:
1. Processor name. This is the common-sense, unambiguous name of the
processor. For example, John Gruber's implementation would be called
"Markdown.pl"; pandoc would be called "pandoc".
2. Version (optional). If specified, this is the version of the
processor tool. For example, the Markdown.pl processor could have
version 1.0.1 or 1.0.2b8.
3. Processor-specific arguments (optional). If specified, these
arguments would be used with the processor. Each processor gets to
define the meaning of its arguments; processors that are not
command-line based (e.g., a C library) shall define a mapping between
the argument strings and programmatic parameters to be used when
invoking the processor.
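
As a rough illustration only (this is not proposed syntax), the three
sub-parameters could be modeled in memory as shown below. The class name
ProcessorParam and its field names are hypothetical; only the processor
names and the version string come from the text above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ProcessorParam:
    """Hypothetical in-memory form of the processor parameter."""

    name: str                         # required, e.g. "Markdown.pl" or "pandoc"
    version: Optional[str] = None     # optional, e.g. "1.0.1"
    arguments: Tuple[str, ...] = ()   # optional; meaning is processor-defined

    def __post_init__(self) -> None:
        # The processor name is the only mandatory sub-parameter.
        if not self.name:
            raise ValueError("processor name is required")


# Name plus version, no arguments.
gruber = ProcessorParam("Markdown.pl", version="1.0.1")

# Name plus one processor-specific argument, no version.
pandoc = ProcessorParam("pandoc", arguments=("--from=markdown_strict",))
```

The arguments are kept as opaque strings because each processor gets to
define their meaning.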
IANA would create a sub-registry of processors. Each registry entry must
contain the processor name (identifier), the full name of the tool (if
it differs from the processor name), the authors or maintainers, and any
URL or other address at which to locate the processor tool and
documentation. Optionally, versions and processor-specific arguments can
be documented in the registry entry.
***
variants [could also be called rulesets or rules]: The variants
parameter identifies sets of rules ("rulesets") that formally specify
how to turn Markdown control characters into markup. The variants
parameter is an ordered list of rulesets. A ruleset is an identifier of
a set of rules. When multiple rulesets are included in the variants
parameter, they are stacked on top of each other. A rule that directly
contradicts a prior rule (mentioned earlier in the list) gets overruled.
The definition of a ruleset can include not only specific rules, but
also other rulesets. Therefore, there can be a ruleset whose primary
purpose is to group together several rulesets.
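
The stacking behavior can be sketched as follows. This is a minimal
sketch under stated assumptions: each rule is modeled as a key/value
pair, "contradicts" is taken to mean "same key, different value", and an
included ruleset is applied before the including ruleset's own rules.
The registry contents, rule names, and the function names expand and
stack are all invented for illustration; only "fenced_code_blocks" is a
name taken from this post.

```python
from typing import Dict, FrozenSet, List

# Hypothetical registry: each ruleset lists included rulesets and its own rules.
REGISTRY = {
    "base_lists": {"includes": [], "rules": {"list_marker": "asterisk"}},
    "alt_lists": {"includes": [], "rules": {"list_marker": "plus"}},
    "fenced_code_blocks": {"includes": [], "rules": {"code_fence": "on"}},
    # A ruleset whose only purpose is to group other rulesets.
    "bundle": {"includes": ["base_lists", "fenced_code_blocks"], "rules": {}},
}


def expand(name: str, registry: dict,
           seen: FrozenSet[str] = frozenset()) -> Dict[str, str]:
    """Flatten one ruleset, including the rulesets it pulls in."""
    if name in seen:
        raise ValueError(f"ruleset {name!r} includes itself")
    entry = registry[name]
    effective: Dict[str, str] = {}
    for included in entry["includes"]:
        effective.update(expand(included, registry, seen | {name}))
    # Assumption: a ruleset's own rules override those it includes.
    effective.update(entry["rules"])
    return effective


def stack(variants: List[str], registry: dict) -> Dict[str, str]:
    """Apply rulesets in order; later rules overrule earlier ones."""
    effective: Dict[str, str] = {}
    for name in variants:
        effective.update(expand(name, registry))
    return effective
```

With this model, stack(["bundle", "alt_lists"], REGISTRY) ends up with
the list rule from "alt_lists", while reversing the order lets the rule
inside "bundle" win instead.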
There is a semantic difference between an absent variants parameter and
an empty variants parameter (variants=""). An absent variants parameter
means that the author has not expressed a preference or intent for how
to interpret particular Markdown control sequences. An empty variants
parameter means that the author intends for the Markdown rules of John
Gruber's syntax <http://daringfireball.net/projects/markdown/syntax> (as
of the publication of this document) to apply. Gruber's syntax (also
called the "baseline") leaves many cases ambiguous, contradictory, or
unsatisfactory. These quirks are inherent to Markdown's evolution and
therefore MUST stay as-is. As a result, two different Markdown
processors can claim to conform to the baseline and still produce wildly
different output.
Examples of variants: the extensions included in pandoc such as
"line_blocks", "fenced_code_blocks", and "strict".
IANA would create a sub-registry of rulesets for the variants parameter.
Each registry entry must include the ruleset identifier, a formal
description of the rules, and identification of included rulesets.
Optionally, the entry may describe processors (including versions and
arguments) that are known to implement the ruleset.
Each ruleset identifier shall uniquely identify that set of rules. That
is, if "fenced_code_blocks" is registered, "guarded_code_blocks" cannot
be registered if its effective rules are the same as those of
"fenced_code_blocks".
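
The uniqueness requirement could be checked at registration time roughly
as below. This is only a sketch: it assumes the effective rules of a
ruleset can be compared for equality (here, as a plain dictionary), which
a real registry based on formal prose descriptions may not allow. The
class RulesetRegistry and the rule key "code_fence" are hypothetical.

```python
from typing import Dict


class RulesetRegistry:
    """Hypothetical registry refusing two identifiers for the same rules."""

    def __init__(self) -> None:
        self._entries: Dict[str, Dict[str, str]] = {}

    def register(self, identifier: str, effective_rules: Dict[str, str]) -> None:
        if identifier in self._entries:
            raise ValueError(f"{identifier!r} is already registered")
        for existing, rules in self._entries.items():
            # Same effective rules under a new name: reject the duplicate.
            if rules == effective_rules:
                raise ValueError(
                    f"{identifier!r} duplicates the rules of {existing!r}"
                )
        self._entries[identifier] = dict(effective_rules)

    def __contains__(self, identifier: str) -> bool:
        return identifier in self._entries


registry = RulesetRegistry()
registry.register("fenced_code_blocks", {"code_fence": "on"})
```

A later call such as registry.register("guarded_code_blocks",
{"code_fence": "on"}) would raise ValueError in this sketch, because the
effective rules match an existing entry.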
***
When both variants and processor are present, processor takes
precedence. That is, the processor choice is considered the best
expression of the author's intent.
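
Putting the precedence rule together with the absent/empty distinction
for variants, a recipient's decision could be sketched like this. The
function name choose_interpretation and the returned labels are
hypothetical; None stands for an absent parameter and an empty list for
variants="".

```python
from typing import List, Optional


def choose_interpretation(processor: Optional[str],
                          variants: Optional[List[str]]) -> str:
    """Pick how to interpret the content from the two optional parameters."""
    if processor is not None:
        # processor takes precedence, even when variants is also present.
        return "processor:" + processor
    if variants is None:
        # Absent: the author expressed no preference.
        return "recipient-choice"
    if not variants:
        # Empty: the author intends Gruber's baseline syntax.
        return "baseline"
    # Non-empty: stack the listed rulesets in order.
    return "rulesets:" + ",".join(variants)
```

For example, choose_interpretation("pandoc", ["fenced_code_blocks"])
returns "processor:pandoc", while choose_interpretation(None, [])
returns "baseline".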
Comments welcome.
-Sean