Tightening the rules for literal `[` and `]` chars in link ids

Jacob Rus jrus at hcs.harvard.edu
Tue Sep 26 16:42:20 EDT 2006


John Gruber wrote:

> Jacob Rus <jrus at hcs.harvard.edu> wrote on 9/25/06 at 4:42 PM:
>
>> It has always seemed to me that links should just be
>> "[A-Za-z][A-Za-z0-9._-]+", and maybe 2-3 other characters, but as few as
>> possible. There's no reason for a reference name that no one will ever
>> see to have a creative title with lots of punctuation.
>
> I disagree. To date, Markdown's liberal policy on the allowable
> characters for link reference IDs hasn't generated any complaints.
> Today's stuff about disallowing `[` and `]` is for good reason.
> Why would you want to disallow more characters than that?
>
> Spaces, for certain, need to be included. And don't forget
> non-ASCII alphabet characters, (é, ü, etc.)
>
> Nor do I see any problem with obscure punctuation in a link ref ID.


Sorry, I didn't mean to suggest that accented characters and other
Unicode letters shouldn't be allowed. But what about quotation marks,
various types of spaces, commas and periods, backticks, ampersands, em
and en dashes, etc.?

These are just identifiers. They don't show up at all in the final
document. So there's no reason for people to put fancy characters in
them. Doing so only makes life harder for readers, writers, and the
computer.

In your example, you could have easily called the link "cstar" or
something, and then you wouldn't need to have a special method of
entering such an obscure character. Anyone else who wants to add
another link to that source in your document would need to use
copy/paste to do it.

This permissiveness doesn't increase usability or friendliness. What it
does do is make life more confusing, because there are more problems to
figure out w.r.t. parsing and ensuring proper nesting, for humans and
machines alike. There's no reason to have the ability to put stars and
backticks and parentheses in reference names, as they will never be
shown. So all we accomplish by letting these through is the creation of
obscure bugs in the parser, and decreased readability.

I don't really mind being extremely liberal though in what gets through.
It doesn't bother me. If you really think it adds value to be able to
type whatever garbage as a reference name, that's fine by me. It should
be possible to allow "`[^\]\r\n]`" as our limit on names, if we want to.
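
To make the difference between the two rules concrete, here is a rough
Python sketch that checks a few made-up reference names against both.
The pattern names, the `classify` helper, and the sample names are all
hypothetical, and the trailing `+` on the liberal pattern is my own
assumption (the class above only describes a single character). This is
not how Markdown.pl actually matches link ids.

```python
import re

# Strict rule from the quoted message: a letter, followed by one or more
# letters, digits, '.', '_' or '-'.
STRICT_ID = re.compile(r"[A-Za-z][A-Za-z0-9._-]+")

# Liberal rule: anything except ']' and line breaks.
# Assumption: the '+' quantifier is added here for illustration only.
LIBERAL_ID = re.compile(r"[^\]\r\n]+")


def classify(ref_id):
    """Report which of the two illustrative rules accept ref_id."""
    return {
        "strict": STRICT_ID.fullmatch(ref_id) is not None,
        "liberal": LIBERAL_ID.fullmatch(ref_id) is not None,
    }


# Hypothetical sample names, not taken from any real document.
for candidate in ["cstar", "my link", "é", "a]b"]:
    print(repr(candidate), classify(candidate))
```

Under these assumptions, a name like "cstar" passes both rules, names
with spaces or accented letters pass only the liberal one, and anything
containing `]` fails both.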

-Jacob


