Metadata syntax (was Universal syntax for Markdown)

John MacFarlane jgm at berkeley.edu
Tue Sep 20 11:30:26 EDT 2011


+++ Tao Klerks [Sep 20 11 10:34 ]:

> On Tue, Sep 20, 2011 at 9:56 AM, John MacFarlane <jgm at berkeley.edu>
> wrote:
>
> > I think that the abstract is a fine case. Although one *could* handle
> > it the way you suggest, by having the metadata specify a section
> > of the document to use as the abstract, I don't see the advantage of
> > that. It is natural to distinguish between the body text, which is
> > *always* part of the produced document, whether a fragment or a
> > standalone document is being produced, and regardless of the format
> > or template used, and the metadata, which sometimes appear in the
> > produced document, depending on one's purposes, and which appear
> > differently in different formats. Once you make this distinction,
> > the abstract clearly falls on the side of the metadata.
>
> In that case, you're talking about metadata in the more general sense -
> like link definitions, footnotes, and other constructs that are
> currently treated as a special case in markdown. I'm all for having a
> special syntax for defining the abstract, as long as the author doesn't
> have to worry about any escaping conventions and can just write it like
> he/she would any other regular markdown content.


Yes, absolutely. There are two ways to approach this while keeping
'abstract' a metadata field:

(1) There could be a special syntax for designating metadata fields
as markdown (or alternatively markdown could be the default, and there
could be a special syntax for designating them plain strings).
I showed in my original post how lunamark implements this:

abstract = m[[
  Here's the abstract. You can put anything you want
  here, including blank lines. No special escaping is
  needed. It can be flush left, but I've left a small
  indent because it looks nice.

  * item 1
  * item 2
]]

The 'm' indicates that the content is markdown. If you left it
out, you'd have a plain string.
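
To make the distinction concrete, here is a minimal sketch of how an `m` marker could behave if the metadata block is evaluated as ordinary Lua. The definition of `m` and the table it returns are assumptions made for illustration, not lunamark's actual implementation; the `date` field is likewise just an example of a plain string.

```lua
-- Hypothetical helper: tag a string as markdown by wrapping it in a table.
-- This is an assumption for illustration, not lunamark's implementation.
function m(text)
  return { markdown = true, text = text }
end

-- A plain string: the bare long-bracket literal, no 'm' prefix.
date = [[20 September 2011]]

-- A markdown field: Lua allows calling a function on a string literal
-- without parentheses, so m[[...]] is just m("...").
abstract = m[[
Here's the abstract, with a list:

* item 1
* item 2
]]

print(type(date))          -- string
print(abstract.markdown)   -- true
```

A consumer of the metadata could then check whether a field is a table with `markdown` set and, if so, run its `text` through the markdown parser; anything else would be used verbatim.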

(2) It could just be conventional that certain fields ('abstract',
'title', etc.) are interpreted as markdown.


> > Other cases:
> > * bibliographic data for the document itself, which you might want
> >   to print in some presentations but not others
> > * revision history
> > * tags
> > * bibliography entries used in the document
> > * settings for things like default stylesheets
>
> Point taken, most of these are good cases for supporting structured
> content, but not formattable/markdown content, right?


Right in most cases, but one might want a free-form revision history
that is just markdown, and bibliographic entries might include
abstracts etc.


> > Currently you need to specify the bibliography database on the
> > command line as well (it can be bibtex, endnote, or any number
> > of other formats). Ideally, though, the document itself should
> > specify where its bibliographical entries are coming from.
> > This could just be a file path, but if you want the document to
> > be truly portable, it would be nice to be able to include the
> > structured bibliography entries themselves in metadata at the end
> > of the document. This could be done easily with a data description
> > language as powerful as lua/yaml/json.
>
> Absolutely - but the (possibly unattainable) ideal would be a situation
> where tools and experts can specify complex structured metadata, and
> regular joe can change his title, author, and other basic/simple values
> and lists, specifying values that contain apostrophes, commas and other
> natural punctuation, without blowing anything up in the process. As soon
> as he needs to specify/modify something that contains structure (or
> even something multi-line?) it seems fair that he should have to use a
> tool or do some research on the standard (esp. as most if not all of
> the structured-data use cases relate to tools already).
>
> My concern with a pure-lua/yaml/json metadata format is that it
> requires specialized knowledge (not related to the existing markdown
> standards/experience) on the part of the user for even the most trivial
> changes to the simplest fields - *especially* if structured/markdown
> content such as the abstract is placed in a metadata field!


I understand the concern. YAML is particularly bad this way, because you
get used to not quoting or escaping things, but then your document blows up
when you have a colon in the field. I think lua is a nice compromise--more
regular and predictable, but you don't have to quote the fields as in json,
and you have a really nice multiline string syntax that eliminates the need
for escaping.[^1] But my lua-based proposal is compatible with also having a
simpler way of specifying title, author, and date -- e.g. pandoc's, or Michael
Thompson's proposal involving centering, or MMD's (though I think the Hamlet
problem is serious).
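
As a rough illustration of the quoting point, the fragment below shows simple fields written as Lua assignments. The field values are invented for this example. Field names are bare identifiers rather than quoted keys as in JSON, and a colon, apostrophe, or comma inside a quoted string has no special meaning.

```lua
-- Illustrative metadata only; the values are invented for this example.
title  = "Metadata syntax: a proposal"   -- a colon needs no escaping
author = "O'Brien, Pat"                  -- apostrophes and commas are fine
tags   = { "markdown", "metadata" }      -- a simple list of strings
```

The one rule a casual editor has to remember is that string values sit between quotes (or long brackets), which is the regularity being traded against YAML's unquoted style.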

[^1]: What if your abstract contains `]]`, you might ask?
Well, then you just need to use another delimiter for the multiline
string, such as `[=[` and `]=]`.
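
A short sketch of the alternative delimiter, using a plain string for simplicity (the sentence inside it is made up for the example):

```lua
-- A level-1 long bracket: opens with [=[ and closes only at ]=],
-- so a literal ]] inside the text does not end the string.
abstract = [=[
In Lua, a long string normally ends at ]] but this one keeps going.
]=]

print(abstract:find("]]", 1, true) ~= nil)   -- true
```

More equals signs can be added on both sides (`[==[` and `]==]`, and so on) if the text happens to contain `]=]` as well.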


