Inline styles mystery
Michel Fortin
michel.fortin at michelf.com
Thu Dec 9 20:48:37 EST 2004
On 9 Dec 2004, at 14:20, Lou Quillio wrote:
> I've stumbled onto some curious Markdown behavior today, while
> trying to figure why Drupal's MarkSmarty module is educating quotes
> around explicit-HTML attributes (it applies SmartyPants after
> Markdown, I'm using Michel's current php modules, and MarkSmarty
> applies them as-is). Anyhow ...
"Current" as in 1.0.1b7 I presume, not 1.0?
OK, I have absolutely no explanation for the "medium" thing or the
lost trailing slash, and nothing I tried replicated your
problem.
But I found a bug present since 1.0.1b2 that would lead to an incorrect
behaviour with this. In fact, it's a pretty big bug and I am glad to
find it *now*, before the official 1.0.1 release. Maybe somehow it is
related to your problem, I don't know.
The new regex in `_TokenizeHTML` was pretty bad... In fact it could
only match tags that:
1. Have no empty attribute like this one: `alt=""`.
2. Have whitespace after the name of the tag, unlike this: `<h1>`.
3. Have a tag name consisting of only one letter!... like
   `<a href="test">`.
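To make the three restrictions concrete, here is a small Python sketch. Both patterns are assumptions written for illustration: `BUGGY_TAG` is a hypothetical regex that merely reproduces the three symptoms listed above, not the actual expression from PHP Markdown, and `FIXED_TAG` is one possible more permissive pattern, not the one that shipped in any release.

```python
import re

# Hypothetical pattern (NOT the real PHP Markdown regex) that shows the
# three restrictions: one-letter tag name, mandatory whitespace after the
# name, and quoted attribute values that must be non-empty.
BUGGY_TAG = re.compile(r'<[a-zA-Z]\s+(?:[^>"]|"[^"]+")*>')

# One possible repair (also an assumption): multi-character names,
# optional attributes, and quoted values that may be empty.
FIXED_TAG = re.compile(r'</?[a-zA-Z][a-zA-Z0-9]*(?:\s(?:[^>"]|"[^"]*")*)?>')

SAMPLES = ['<a href="test">', '<a alt="">', '<h1>', '<img alt="" />']


def matches(pattern, tag):
    """Return True when the pattern matches the whole tag string."""
    return pattern.fullmatch(tag) is not None


for tag in SAMPLES:
    # Print each sample with the buggy and the fixed verdict side by side.
    print(tag, matches(BUGGY_TAG, tag), matches(FIXED_TAG, tag))
```

Of the four samples, only `<a href="test">` satisfies the hypothetical buggy pattern, which would explain how a test using that one tag could pass while almost every other tag failed.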
As you can see, it was so bad I wonder how it made it there. The only
explanation I have is that I tested it with tags like `<a href="test">`
and used an old version of the regex... because I knew the first
version of it I made had this problem too.
By the way, something interesting to note is that PHP Markdown and PHP
SmartyPants both share the same `_TokenizeHTML` function. By that I
mean that the function is defined only once when the files are parsed -- by
the first file included. So if there is a bug in Markdown's version of
the function, it can "propagate" to PHP SmartyPants too. This is
exactly what happens on the dingus. If I use the "both" filter I get
this "img" tag:
    <img src=”http://www.michelf.com/img/photo/michel-fortin-arbre.jpg”
    alt=”” width=”155” height=”155”
    style=” float:right; clear:right;
    border:none; padding:1em;” />
Markdown is included first by the dingus, so its version of
`_TokenizeHTML` is used by both, even if the version in PHP SmartyPants
does not have this bug. If I select PHP SmartyPants alone, the problem
does not occur.
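The mechanism can be sketched in Python: when the tokenizer fails to recognise a tag, the whole tag is handed to the quote educator as ordinary text. Everything here is an assumption made for illustration. `tokenize_html` and `educate_quotes` are hypothetical stand-ins rather than ports of the PHP functions, the two patterns are invented, and the choice of which tokenizer definition was loaded first is modelled simply by passing a pattern as an argument.

```python
import re

# Hypothetical patterns, not the real PHP Markdown / SmartyPants regexes.
BUGGY_TAG = re.compile(r'<[a-zA-Z]\s+(?:[^>"]|"[^"]+")*>')
FIXED_TAG = re.compile(r'</?[a-zA-Z][a-zA-Z0-9]*(?:\s(?:[^>"]|"[^"]*")*)?>')


def tokenize_html(text, tag_pattern):
    """Split text into ("tag", s) and ("text", s) tokens."""
    tokens, pos = [], 0
    for m in tag_pattern.finditer(text):
        if m.start() > pos:
            tokens.append(("text", text[pos:m.start()]))
        tokens.append(("tag", m.group(0)))
        pos = m.end()
    if pos < len(text):
        tokens.append(("text", text[pos:]))
    return tokens


def educate_quotes(text, tag_pattern):
    """Crude stand-in for a quote educator: curl quotes in text tokens only."""
    out = []
    for kind, value in tokenize_html(text, tag_pattern):
        # Tag tokens pass through untouched; text tokens get curly quotes.
        out.append(value.replace('"', "\u201d") if kind == "text" else value)
    return "".join(out)


source = '<img src="photo.jpg" alt="" width="155" />'
print(educate_quotes(source, BUGGY_TAG))  # tag not recognised, quotes curled
print(educate_quotes(source, FIXED_TAG))  # tag recognised, left untouched
```

With the buggy pattern the `img` tag is never emitted as a tag token, so its attribute quotes are curled just like the ones in the dingus output above.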
By the way, I plan to update PHP SmartyPants at the same time PHP
Markdown 1.0.1 is released, so that the change to `_TokenizeHTML` happens
in both.
In the meantime, expect 1.0.1b7.1 soon.
> The image tag is inline HTML. Shouldn't it be literally untouched
> by Markdown (hands-off my trailing slash)? And I was very surprised to
> see it reaching into my style attribute.
I am too. Are you sure it's really Markdown that does that? Does this
happen on the dingus? I can't replicate that.
> Also, shouldn't the
> final output be identical to the HTML Source output? Or is there
> something I'm not getting?
I don't see any difference between the HTML Source output and the actual
HTML source of the page. I'm not sure what I can do about this, but they
should be the same.
Michel Fortin
michel.fortin at michelf.com
http://www.michelf.com/