PHP 5 port of Markdown, plugin-based
John Gruber
gruber at fedora.net
Wed Aug 16 01:11:55 EDT 2006
Michel Fortin <michel.fortin at michelf.com> wrote on 8/15/06 at 5:40 PM:
> > I have taken some liberties with this implementation, such as using
> > delimited integer markers instead of MD5 hashes, and changing some
> > of the rule names (e.g., from "Anchors" to "Links" in one case, and
> > from "ItalicsAndBold" to "EmStrong" in another).
>
> I've always wondered why John chose these function names. And going
> away from hashes seems like a good idea too.
The MD5 hashing is (I thought) clearly just a very odd
implementation detail in Markdown.pl. It has nothing whatsoever to
do with Markdown itself.
And it turns out that Perl's `Digest::MD5` module has some serious
incompatibilities with Unicode text (the whole bytes-vs-characters
thing), which I never uncovered even though I pass UTF-8 input to
Markdown.pl every single day. The difference is that in all the
places where I use Markdown, my strings aren't explicitly encoded
as UTF-8 from Perl's perspective. Perl just treats my input as a
sequence of bytes, which MD5 hashes properly, and the Right Thing
just happens. However, if anyone uses Markdown in a script where
input is explicitly encoded as UTF-8, such that Perl is aware of
the string as UTF-8, then `Digest::MD5` will choke.
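
A rough illustration of the bytes-vs-characters distinction, sketched in Python rather than Perl. It is only an analogy: Python's `hashlib` is not `Digest::MD5`, and the sample string is made up. The point it shows is that an MD5 routine operates on bytes, so a string the language knows to be characters has to be encoded first.

```python
import hashlib

# A character string (hypothetical sample input), not a byte string.
text = "na\u00efve caf\u00e9"

# Encoding to UTF-8 bytes first gives MD5 something it can hash.
digest = hashlib.md5(text.encode("utf-8")).hexdigest()

# Handing over the character string directly is rejected by hashlib.
try:
    hashlib.md5(text)
    rejected = False
except TypeError:
    rejected = True
```

In this sketch the fix is the explicit `encode("utf-8")` call; input that is already a plain byte sequence never hits the problem.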
So: soon(ish), Markdown.pl will no longer do the hashing thing
internally, either. It's really quite silly if you think about it,
but at the time I was writing that code, it seemed easier than
keeping a counter.
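
For comparison, a minimal sketch of what "keeping a counter" could look like, again in Python and not taken from Markdown.pl or the PHP port. The `Stash` class, its method names, and the `\x1a` delimiter are all assumptions made for illustration; the idea is only that a delimited integer can stand in for protected text as well as a hash can, with no dependence on how the text is encoded.

```python
import re


class Stash:
    """Hypothetical helper: swaps protected text for counter-based tokens."""

    def __init__(self):
        self.items = []

    def hide(self, text):
        # Store the text and return a delimited integer marker for it.
        # The \x1a delimiter is an arbitrary choice, assumed absent from input.
        self.items.append(text)
        return "\x1a%d\x1a" % (len(self.items) - 1)

    def restore(self, text):
        # Replace every marker with the text stored under its index.
        return re.sub(
            r"\x1a(\d+)\x1a",
            lambda m: self.items[int(m.group(1))],
            text,
        )


stash = Stash()
token = stash.hide("<div>caf\u00e9</div>")
doc = "before " + token + " after"
restored = stash.restore(doc)
```

Because the marker is just the list index, the protected text is never hashed, so non-ASCII content passes through untouched.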
`_DoAnchors()` still seems sensible to me, in that `<a>` tags are
"anchor" tags.
`_DoItalicsAndBold()` matches how I think of *this* and **that**,
in my mind.
I know why Michel followed my use of MD5, internally -- by copying
as much of my algorithm as possible, he made it as easy as
possible to sync changes between our implementations.
-J.G.