Punchline: variants and processor (text/markdown)

Tue Jul 15 09:26:14 EDT 2014

On 7/15/2014 5:59 AM, Michel Fortin wrote:
> On 15 Jul 2014, at 3:20, Sean Leonard <dev+ietf at seantek.com> wrote:
>
>> IANA would create a sub-registry of processors. Each registry entry must contain the processor name (identifier), the full name of the tool (if it differs from the processor name), the authors or maintainers, and any URL or other address at which to locate the processor tool and documentation. Optionally, versions and processor-specific arguments can be documented in the registry entry.
> ...
>
>> IANA would create a sub-registry of rulesets for the variants parameter. Each registry entry must include the ruleset identifier, a formal description of the rules, and identification of included rulesets. Optionally the entry may describe processors (including versions and arguments) that are known to implement the ruleset.
>>
>> Each ruleset identifier shall uniquely identify that set of rules. I.e., if "fenced_code_blocks" is registered, "guarded_code_blocks" cannot be registered if the effective rules in "guarded_code_blocks" are the same as "fenced_code_blocks".
> But how does a document get annotated with the attributes in the first place? Who chooses the processor and variant attributes of a document and based on what? And where is it stored? Do you have any specific example of how that could work in any given setup?

I am working on all of that.

The author chooses the processor and variant attributes; or, the 
author's editing software does this for the author. For example, a 
tool like MarkdownPad can save this metadata in the "right place". I 
put that in quotes because I know it is an open issue. One thing that is 
clear (from the metadata sub-thread) is that it cannot be stored inside a 
generic Markdown file in a broadly compatible way--I am thinking of 
something adjacent to the file.

If it is in a version control system like Subversion, or a CMS, then it 
could be stored in the properties/attributes. If it is in an e-mail (in 
particular, an e-mail generated by a CMS, see below), then it can be 
stored in the usual MIME way.
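
As a rough illustration of the MIME case, the parameters could ride on the 
Content-Type header. This is only a sketch using Python's standard email 
package: it assumes the parameter names "processor" and "variants" from this 
thread, and the value "example-processor" is a made-up placeholder, not a 
registered identifier.

```python
from email.message import EmailMessage

# Assumption: "processor" and "variants" are the parameter names under
# discussion; "example-processor" is a hypothetical placeholder value.
msg = EmailMessage()
msg["Subject"] = "Exported page"
msg.set_content(
    "Some *Markdown* text.\n",
    subtype="markdown",
    params={"processor": "example-processor", "variants": "fenced_code_blocks"},
)

# A receiving system reads the parameters back from the Content-Type header.
print(msg.get_content_type())
print(msg.get_param("processor"))
print(msg.get_param("variants"))
```

The point is that nothing has to be added to the Markdown text itself; the 
metadata travels next to it.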

I am trying not to invent another metadata format, so I am still looking 
at the existing options out there.

>
> My impression is that all this is going to do is define some metadata flags that no one will use.
>
> What is the goal here? Is the goal to have most Markdown documents on the internet be annotated in this way so some browser software can pick automatically a sort-of compatible implementation for a given document? Or is it a way to have inside a given system (a CMS for instance) a way to annotate which Markdown implementation to use internally to parse a specific document?

Definitely the latter--for a system like a CMS to store the Markdown 
content with metadata, so that it can parse a specific document in a 
specific way. Perhaps more important than storage, it is meant for 
interchange--like when you export content from one CMS to another CMS. 
Presumably, most CMSes will use a single parser for their public-facing 
implementation. In that case the parameters are implied. But when you 
export data from that CMS (and import it into another CMS), it would be 
very useful to record which Markdown features were used, so that the new 
CMS can interpret the data the way the original users intended it to be 
understood.

For example, take fenced code blocks. Your old CMS supported fenced code 
blocks; the new one does not (for security reasons or because it's not 
germane to the purpose of the CMS). Or maybe your old CMS supported 
Fancy Tables Type #1 and the new one supports Fancy Tables Type #2. 
Well, when you import your data into the new CMS, the new CMS can see 
that its preferred Markdown processor is going to mangle the content, so 
as part of the import process, it invokes the Markdown processor named in 
the metadata, converting the fenced code blocks to HTML (or the Fancy Tables 
Type #1 to HTML). Then the content is going to look as the users 
intended, but you don't have to maintain two contradictory 
implementations in the new CMS. The Markdown processor for the imported 
data can be invoked "offline" (i.e., as part of the bulk-import 
process). This also alleviates security concerns since the import 
process can be operated in another VM.
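
The import decision described above might look roughly like the following. 
This is a sketch under assumptions, not a real CMS API: import_document, 
the legacy_processors mapping, and stand_in_processor are all hypothetical 
names, and the stand-in only wraps escaped text in a <pre> element instead 
of doing real Markdown conversion.

```python
import html
from typing import Callable, Dict, Set, Tuple

# Hypothetical: a processor is any function that turns Markdown into HTML.
Processor = Callable[[str], str]


def import_document(
    text: str,
    processor: str,
    variants: Set[str],
    supported_variants: Set[str],
    legacy_processors: Dict[str, Processor],
) -> Tuple[str, str]:
    """Return (media_type, content) to store in the new CMS."""
    unsupported = variants - supported_variants
    if not unsupported:
        # The new CMS understands every feature used: keep the Markdown.
        return ("text/markdown", text)
    if processor not in legacy_processors:
        raise LookupError(f"no legacy processor available for {processor!r}")
    # Otherwise convert once, offline, with the processor named in the metadata.
    return ("text/html", legacy_processors[processor](text))


def stand_in_processor(markdown: str) -> str:
    """Placeholder for the old CMS's processor (not a real converter)."""
    return "<pre><code>" + html.escape(markdown) + "</code></pre>"


kept = import_document("plain text", "old-cms", {"basic"}, {"basic"}, {})
converted = import_document(
    "a<b",
    "old-cms",
    {"fenced_code_blocks"},
    set(),
    {"old-cms": stand_in_processor},
)
print(kept)
print(converted)
```

Because the legacy processor is only called from the import path, it can 
live in a separate process or VM and never needs to be part of the new 
CMS's public-facing code.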

Sean