HTML entities in URLs and urlencoding

Waylan Limberg waylan at
Mon Mar 31 21:45:13 EDT 2008

We recently received the following bug report for python-markdown:

> The "&" are escaped in URLs.


> An example:

> [Link](


> Should output:

> <a href="">Link</a>


> Currently outputs:

> <a href=";param2=value1">Link</a>


> So the "&" must not be escaped!

A fix is easy, but it occurred to me that perhaps links should be
urlencoded -- at least some chars should be. Specifically the "unsafe"
chars listed in RFC 1738 [1]. The "reserved" chars probably should be too
when not used in their approved manner (e.g., a colon should only be
allowed after the scheme (http://) or in the location
(usr:pass@host:port) and should be encoded anywhere else). Of course,
that involves extra work. So I went to check what other
implementations do [2] and discovered that every one escapes with html
entities. Is there something I'm missing, or is this a bug? As far as I
can tell, the "&amp;" breaks the query string.
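
To make the two kinds of escaping concrete, here is a minimal Python
sketch. It is not python-markdown's actual code. The URL is a
hypothetical stand-in (the one from the report is not preserved here),
and the set of characters left unencoded is an assumption chosen for
illustration. It percent-encodes the unsafe characters, HTML-escapes the
result for the href attribute, and then reads the attribute back with
the standard library's HTML parser to see what URL a consumer of the
HTML would get.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import quote

# Hypothetical URL with one "unsafe" char (a space) and an "&" separator.
url = "http://example.com/?param1=value 1&param2=value2"

# Percent-encode everything except unreserved chars and the chars listed
# in `safe`. Which reserved chars to leave alone is an assumption here;
# "%" is kept so existing escapes are not encoded twice.
encoded = quote(url, safe=":/?&=@#+;,$%")

# HTML-escape the encoded URL for use inside a double-quoted attribute.
href = escape(encoded, quote=True)
html = '<a href="%s">Link</a>' % href


class HrefCollector(HTMLParser):
    """Collect href values as an HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href")


parser = HrefCollector()
parser.feed(html)
parser.close()

print(encoded)          # percent-encoded URL
print(html)             # markup with "&amp;" in the attribute
print(parser.hrefs[0])  # URL as seen after parsing the attribute
```

With Python's parser, at least, the "&amp;" in the attribute is decoded
back to "&" before the URL is handed over, so whether the query string
breaks depends on whether the consumer decodes entities in attribute
values.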

Waylan Limberg
waylan at

More information about the Markdown-Discuss mailing list