Escaping "<"

A. Pagaltzis pagaltzis at gmx.de
Sat Apr 23 13:18:11 EDT 2005


Sorry to reply so late to this thread, but for the record, here’s
a point worth being aware of:

* Jelks Cabaniss <jelks at jelks.nu> [2005-03-17 17:35]:
> But in parsed character data, you don't have to write
> 
> 	<p>The "meaning" of life is ...
> 
> as
> 
> 	<p>The &quot;meaning&quot; of life is ...
> 
> (although you *could* if you wanted, and a number of older --
> and even a few current -- HTML editors misguidedly seem to
> think that they *must*).

So far this is correct.

> Same goes for `>`.

This, however, is not, because:

    <p>&gt;[[ Nope, not well-formed. ]]></p>

The literal sequence `]]>` cannot appear in the character data of
a well-formed document. It is only legal as the closing delimiter
of a `<![CDATA[ ]]>` section or similar constructs.
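
To see the rule in action, here is a minimal sketch that feeds a few
fragments to an XML parser. It assumes Python's bundled
`xml.etree.ElementTree` (backed by expat); the helper name
`is_well_formed` is made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(doc: str) -> bool:
    """Return True if the parser accepts doc (hypothetical helper)."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# A lone ">" in text, a literal "]]>" in text, the escaped form,
# and a genuine CDATA section.
for fragment in (
    "<p>a > b</p>",
    "<p>]]></p>",
    "<p>]]&gt;</p>",
    "<p><![CDATA[x]]></p>",
):
    print(fragment, is_well_formed(fragment))
```

Only the second fragment, with the bare `]]>` in text, should be
rejected.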

You are almost never going to encounter this edge case, of
course, but this is the reason why most XML serializers
emit &gt; for a literal closing angle bracket. It’s simply less
work for them to guarantee correctness that way than adding a
special case would be.
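
As one example of that blanket approach, the `escape` function in
Python's standard `xml.sax.saxutils` module rewrites every `>` along
with `&` and `<`, and leaves quotes alone by default. This is only an
illustration of the strategy, not a claim about any other serializer.

```python
from xml.sax.saxutils import escape

# Every ">" is escaped, so "]]>" can never survive into the output.
print(escape("x ]]> y"))

# Quotes in character data are left untouched by default.
print(escape('The "meaning" of 1 > 0'))
```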

------

Markdown currently encodes all `>` that appear within code
sections, but not those that appear in text. This should probably
be fixed at some point; the urgency is low, but it is one of the
steps needed to guarantee valid XHTML output. It should be as
simple as addressing the special case with just a
`s/]]>/]]&gt;/g`.
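
For anyone who would rather see that substitution outside of Perl,
here is a rough Python equivalent. The function name
`escape_cdata_end` is hypothetical, and the sketch assumes it is
applied only to plain text runs, not to code spans or raw HTML; it is
not taken from Markdown's actual source.

```python
def escape_cdata_end(text: str) -> str:
    """Escape only the ">" that closes a literal "]]>" sequence.

    Mirrors the substitution s/]]>/]]&gt;/g; every other ">" is
    left exactly as written.
    """
    return text.replace("]]>", "]]&gt;")


print(escape_cdata_end("a ]]> b > c"))
```

Unlike escaping every `>`, this touches only the one sequence that
breaks well-formedness.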

(I do not advocate adding CDATA section awareness to Markdown,
the language. I won’t go into details, but supporting such a
rarely-used convenience-only feature is not worth the required
complexity.)

Regards,
-- 
Aristotle


More information about the Markdown-Discuss mailing list