Javascript in URLs (was: Markdown doesn't always generate XHTML)
Michel Fortin
michel.fortin at michelf.com
Sat Mar 15 08:17:48 EDT 2008
On 2008-03-15 at 0:39, Waylan Limberg wrote:
> On Fri, Mar 14, 2008 at 11:22 PM, Michel Fortin wrote:
>
>> PHP Markdown also has a no-markup mode which would filter script tags
>> and any other HTML tags. But this doesn't prevent anyone from
>> inserting their own script on the page. Do you know you can inject a
>> script in a URL? Guess what this does:
>>
>> [link](javascript:alert%28'Hello%20world!'%29)
>
> This is a good point, and something I hadn't thought about myself. I
> would think that markdown should *not* allow that regardless of any
> safe/no-markup/whatever-you-call-it mode. If someone legitimately
> wants javascript in their links/images/etc then they should be writing
> raw html. What do you think?
Well, if you want your "safe" mode to be really safe, then sure, you
should not allow `javascript:` URIs.
But in general I believe Markdown should work with any URI. Markdown
is a means of writing web documents of all kinds, not only content from
external untrusted sources, and there are many legitimate reasons one
would want to write a `javascript:` URI.
Why would you want a "non-safe" Markdown to disallow such URIs in its
link syntax if we're going to be able to add them using HTML tags
anyway?
> Of course, then how do we do that? Some possibilities I came up with
> without much thought:
>
> 1. Truncate a url at "javascript:"
> 2. Completely remove the entire url (perhaps replace with blank
> string or "#")
> 3. Leave the markup for the entire link as plain text (in other words -
> it's not considered a match)
> 4. Do some kind of escaping (not sure what at this point) and leave it
> in the url
Whatever you do, you first have to detect script URIs, all of them;
this is no trivial matter. Most of these will run a script in IE or
some other browser (based on the [XSS cheat sheet][1]):
[link](vbscript:msgbox%28%22Hello%20world!%22%29)
[link](livescript:alert%28'Hello%20world!'%29)
[link](mocha:[code])
[link](jAvAsCrIpT:alert%28'Hello%20world!'%29)
[link](ja vas cr ipt:alert%28'Hello%20world!'%29)
[link](ja vas cr ipt:alert%28'Hello%20world!'%29)
[link](ja vas cr ipt:alert%28'Hello%20world!'%29)
[link](ja%09 %0Avas cr
ipt:alert%28'Hello
%20world!'%29)
[link](ja%20vas%20cr%20ipt:alert%28'Hello%20world!'%29)
[link](live%20script:alert%28'Hello%20world!'%29)
I can't claim this is an exhaustive list, nor that they're all going
to work, but it should give an idea of the problem at hand.
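To give a rough idea of what detection involves, here is a minimal Python sketch (not taken from any Markdown implementation). It assumes a lenient browser ignores percent-encoded or literal whitespace and letter case inside the scheme, as the examples above suggest. The names `SCRIPT_SCHEMES`, `normalized_scheme` and `looks_like_script_uri` are made up for illustration, and the scheme list is certainly incomplete. It also ignores other encodings, such as HTML character references.

```python
import re
from urllib.parse import unquote

# Hypothetical blacklist, for illustration only; it is not exhaustive.
SCRIPT_SCHEMES = {"javascript", "vbscript", "livescript", "mocha"}

def normalized_scheme(url):
    """Return the scheme roughly as a lenient browser might read it, or None."""
    # Undo percent-encoding first, so that %20, %09 and %0A become real characters.
    decoded = unquote(url)
    # Drop spaces and ASCII control characters (tabs, newlines, etc.).
    cleaned = re.sub(r"[\x00-\x20\x7f]", "", decoded)
    # A scheme is a letter followed by letters, digits, "+", "." or "-", then ":".
    match = re.match(r"([A-Za-z][A-Za-z0-9+.\-]*):", cleaned)
    return match.group(1).lower() if match else None

def looks_like_script_uri(url):
    """True if the normalized scheme is in the (incomplete) blacklist."""
    return normalized_scheme(url) in SCRIPT_SCHEMES
```

With this, `looks_like_script_uri("ja%20vas%20cr%20ipt:alert%28'Hello%20world!'%29")` is true while `looks_like_script_uri("http://example.com/")` is false. Every new obfuscation trick or new scheme still requires another fix to the function or the list.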
I think blacklisting known dangerous schemes is always going to leave
holes. A better approach is to have a white list of known "safe" URI
schemes and disallow any scheme not in that list. But that would be
utterly restrictive for any "non-safe" Markdown.
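As a rough sketch of the white list idea, in Python and under assumptions of my own: the set `SAFE_SCHEMES` and the function `is_allowed_url` are hypothetical names, and which schemes count as "safe" is a policy choice for whoever deploys the filter. Anything with an unrecognized scheme is rejected by default, and URLs without a scheme are treated as relative and allowed.

```python
import re
from urllib.parse import unquote

# Hypothetical white list; the right contents depend on the site's policy.
SAFE_SCHEMES = {"http", "https", "ftp", "mailto"}

def is_allowed_url(url):
    """Allow relative URLs and URLs whose scheme is in the white list."""
    # Decode percent-escapes, then drop spaces and ASCII control characters,
    # so that obfuscated schemes are compared in their plain form.
    cleaned = re.sub(r"[\x00-\x20\x7f]", "", unquote(url))
    # Whatever precedes the first ":" counts as a scheme, unless a "/", "?"
    # or "#" comes first (in which case the URL is relative).
    match = re.match(r"([^/?#:]*):", cleaned)
    if match is None:
        return True
    return match.group(1).lower() in SAFE_SCHEMES
```

Here `is_allowed_url("mocha:[code]")` and `is_allowed_url("ja%20vas%20cr%20ipt:alert%281%29")` are both false without either scheme being named anywhere, which is the advantage over a blacklist. The cost is that legitimate but unlisted schemes are refused too.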
Security filters already exist to do that (like kses); I'd say it's
much simpler *and* safer to use such a specialized filter on
Markdown's output than trying to come up with our own integrated within
Markdown.
[1]: http://ha.ckers.org/xss.html
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/