Javascript in URLs (was: Markdown doesn't always generate XHTML)

Michel Fortin michel.fortin at michelf.com
Sat Mar 15 08:17:48 EDT 2008


On 2008-03-15 at 0:39, Waylan Limberg wrote:


> On Fri, Mar 14, 2008 at 11:22 PM, Michel Fortin
>
>> PHP Markdown also has a no-markup mode which would filter script tags
>> and any other HTML tags. But this doesn't prevent anyone from
>> inserting their own script on the page. Do you know you can inject a
>> script in a URL? Guess what this does:
>>
>> [link](javascript:alert%28'Hello%20world!'%29)
>
> This is a good point, and something I hadn't thought about myself. I
> would think that markdown should *not* allow that regardless of any
> safe/no-markup/whatever-you-call-it mode. If someone legitimately
> wants javascript in their links/images/etc then they should be writing
> raw html. What do you think?


Well, if you want your "safe" mode to be really safe, then sure, you
should not allow `javascript:` URIs.

But in general I believe Markdown should work with any URI. Markdown
is a means of writing web documents of all kinds, not only content from
external untrusted sources, and there are many legitimate reasons one
would want to write a `javascript:` URI.

Why would you want a "non-safe" Markdown to disallow such URIs in its
link syntax if we're going to be able to add them using HTML tags
anyway?



> Of course, then how do we do that? Some possibilities I came up with
> without much thought:
>
> 1. Truncate a url at "javascript:"
> 2. Completely remove the entire url (perhaps replace with blank
> string or "#")
> 3. Leave the markup for the entire link as plain text (in other words -
> it's not considered a match)
> 4. Do some kind of escaping (not sure what at this point) and leave it
> in the url


Whatever you do, you first have to detect script URIs, all of them;
this is no trivial matter. Most of these will run a script in IE or
some other browser (based on the [XSS cheat sheet][1]):

[link](vbscript:msgbox%28%22Hello%20world!%22%29)
[link](livescript:alert%28'Hello%20world!'%29)
[link](mocha:[code])
[link](jAvAsCrIpT:alert%28'Hello%20world!'%29)
[link](ja vas cr ipt:alert%28'Hello%20world!'%29)
[link](ja vas cr ipt:alert%28'Hello%20world!'%29)
[link](ja vas cr ipt:alert%28'Hello%20world!'%29)
[link](ja%09 %0Avas cr
ipt:alert%28'Hello
%20world!'%29)
[link](ja%20vas%20cr%20ipt:alert%28'Hello%20world!'%29)
[link](live%20script:alert%28'Hello%20world!'%29)

I can't claim this is an exhaustive list, nor that they're all going
to work, but it should give an idea of the problem at hand.
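
To make the problem concrete, here is a rough Python sketch of a
deliberately naive blacklist check. It is only an illustration, not
something any Markdown implementation does: `naive_blacklist` and the
`samples` list are made up for this example, and the sample URLs are
taken from the list above.

```python
def naive_blacklist(url: str) -> bool:
    """Return True when the URL is flagged as a script URI.

    Deliberately naive: it only compares the start of the raw string
    against two known prefixes.
    """
    return url.startswith(("javascript:", "vbscript:"))


# A few of the variants listed above.
samples = [
    "javascript:alert%28'Hello%20world!'%29",
    "jAvAsCrIpT:alert%28'Hello%20world!'%29",
    "ja%20vas%20cr%20ipt:alert%28'Hello%20world!'%29",
    "livescript:alert%28'Hello%20world!'%29",
]

# Every URL the check fails to flag is a potential hole.
missed = [url for url in samples if not naive_blacklist(url)]

for url in missed:
    print("not detected:", url)
```

Only the first sample is caught; the mixed-case, percent-encoded and
`livescript:` variants all get through.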

I think blacklisting known dangerous schemes is always going to leave
holes. A better approach is to have a white list of known "safe" URI
schemes and disallow any scheme not in that list. But that would be
utterly restrictive for any "non-safe" Markdown.

Security filters already exist to do that (like kses); I'd say it's
much simpler *and* safer to use such a specialized filter on
Markdown's output than trying to come up with our own integrated within
Markdown.
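
For what it's worth, the white list idea boils down to something like
the Python sketch below. This is only an illustration under stated
assumptions, not how kses or PHP Markdown work: `SAFE_SCHEMES` and
`has_safe_scheme` are hypothetical names, the set of allowed schemes is
my own guess, and the normalization step is certainly not exhaustive. An
established filter remains the better choice in practice.

```python
import html
import re
from urllib.parse import unquote

# Hypothetical allow list; which schemes belong here is an assumption.
SAFE_SCHEMES = {"http", "https", "ftp", "mailto"}


def has_safe_scheme(url: str) -> bool:
    """Return True when the URL is relative or uses an allowed scheme."""
    # Undo HTML entities and percent-escapes so hidden characters show up.
    decoded = unquote(html.unescape(url))
    # Drop whitespace and control characters that could split up a scheme.
    cleaned = re.sub(r"[\x00-\x20\x7f]", "", decoded)
    # A scheme is the text before the first ":" when no "/", "?" or "#"
    # comes before that colon.
    match = re.match(r"([^:/?#]+):", cleaned)
    if match is None:
        return True  # no scheme found: treat it as a relative URL
    return match.group(1).lower() in SAFE_SCHEMES


for url in ("http://michelf.com/",
            "jAvAsCrIpT:alert%28'Hello%20world!'%29",
            "ja%20vas%20cr%20ipt:alert%28'Hello%20world!'%29"):
    print(url, "->", "allowed" if has_safe_scheme(url) else "rejected")
```

The point of the design is that an unknown scheme is rejected by
default, so a new trick has to get past the normalization rather than
merely avoid a list of known bad names.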

[1]: http://ha.ckers.org/xss.html


Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



