[ANN] PHP Markdown 1.0.2b7

Michel Fortin michel.fortin at michelf.com
Sat Sep 16 17:23:56 EDT 2006

This is a new release of PHP Markdown, following Markdown.pl 1.0.2b7
from a few weeks ago. It fixes the same bugs, and some more; it also
introduces more radical backend changes. It can be downloaded here:


and you can test it on the PHP Markdown Dingus:


This version inaugurates the truly extensible version of PHP Markdown,
which should make it a lot easier to write extensions, like my own
PHP Markdown Extra. If you want to create your own extension, I
suggest taking a look at the code for the new PHP Markdown Extra,
which I'll release in a few minutes. The most interesting part is
probably the constructor for the MarkdownExtra_Parser class.

Another big change is the automatic hashing of all Markdown-generated
HTML content. Previous versions of PHP Markdown Extra were already
doing this, but only for block-level elements, and only to reduce the
number of calls to the expensive HTML block parser. This has been
ported to the more basic PHP Markdown, and in addition to hashing
block-level content it now also hashes span-level elements. This has
the benefit of preventing bad nesting of elements, so something like
this:

*Some **strange* emphasis**

will now give valid HTML:

*Some <strong>strange* emphasis</strong>

It should be noted however that fixing this introduced other nesting
problems -- like being able to put a [link [inside](#) a link](#).
These problems will be addressed in a future release.
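
As a rough illustration of the hashing idea described above, here is a
minimal sketch. The class name `HashingSketch`, its `transform` and
`unhash` methods, and the placeholder format are all hypothetical and
much simpler than what PHP Markdown really does; only the general
technique (replace generated HTML with an opaque key, restore it at the
end) comes from the description above.

```php
<?php
// Hypothetical sketch of span hashing; not PHP Markdown's actual code.
class HashingSketch {
    // Placeholder key => generated HTML.
    private $hashes = array();

    // Store generated HTML and return an opaque placeholder for it.
    function hashSpan($html) {
        $key = "\x1A" . count($this->hashes) . "\x1A";
        $this->hashes[$key] = $html;
        return $key;
    }

    // Swap placeholders back for their HTML. Stored HTML may itself
    // contain placeholders, so each restored piece is unhashed too.
    function unhash($text) {
        return preg_replace_callback('/\x1A\d+\x1A/', function ($m) {
            return isset($this->hashes[$m[0]])
                ? $this->unhash($this->hashes[$m[0]])
                : $m[0];
        }, $text);
    }

    function transform($text) {
        // Strong emphasis first: the whole generated element is hidden
        // behind a placeholder, including any stray "*" inside it.
        $text = preg_replace_callback('/\*\*(.+?)\*\*/s', function ($m) {
            return $this->hashSpan('<strong>' . $m[1] . '</strong>');
        }, $text);

        // Regular emphasis cannot match across a hashed <strong>,
        // because the asterisks inside it are no longer visible.
        $text = preg_replace_callback('/\*(.+?)\*/s', function ($m) {
            return $this->hashSpan('<em>' . $m[1] . '</em>');
        }, $text);

        return $this->unhash($text);
    }
}
```

With this sketch, `(new HashingSketch())->transform('*Some **strange* emphasis**')`
leaves the first asterisk alone and wraps the rest in `<strong>`. If the
`<strong>` element were inserted directly instead of hashed, the second
pass in this sketch would pair the leading `*` with the one after
"strange" and produce overlapping `<em>` and `<strong>` tags.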

Original improvements in PHP Markdown 1.0.2b7:

* Changed span and block gamut methods so that they loop over a
customizable list of methods. This makes subclassing the parser
a more interesting option for creating syntax extensions.

* Also added a "document" gamut loop which can be used to hook
methods (such as the one stripping link definitions).

* Changed all methods which were inserting HTML code so that they
now return a hashed representation of the code. New methods
`hashSpan` and `hashBlock` are used to hash span- and block-level
generated content, respectively. This has a couple of significant
effects:

1. It prevents invalid nesting of Markdown-generated elements, which
could occur with constructs like `*something [link*]`.
2. It prevents problems occurring with deeply nested lists in which
paragraphs were ill-formed.
3. It removes the need to call `hashHTMLBlocks` twice during the
block gamut.

Hashes are turned back into HTML prior to output.

* Made the block-level HTML parser smarter using a specially
crafted regular expression capable of handling nested tags.

* Solved backtick issues in tag attributes by rewriting the HTML
tokenizer to be aware of code spans. All these lines should work
correctly now:

<span attr='`ticks`'>bar</span>
<span attr='``double ticks``'>bar</span>
`<test a="` content of attribute `">`

* Changed the parsing of HTML comments to match simply from `<!--`
to `-->` instead of using the more complicated SGML-style rule with
paired `--`. This is how most browsers parse comments and how XML
defines them too.

* `<address>` has been added to the list of block-level elements
and is no longer being incorrectly wrapped within paragraph tags.
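
The gamut change at the top of this list can be pictured with a small
sketch. Everything named here (`GamutSketch_Parser`,
`GamutSketchExtra_Parser`, `runSpanGamut`, the priority numbers and the
`~~strike~~` syntax) is made up for illustration and should not be read
as the real parser's API; it only shows how looping over a customizable
list of methods lets a subclass add a step without overriding the loop.

```php
<?php
// Hypothetical sketch of a customizable gamut; names are illustrative
// and not taken from PHP Markdown's source.
class GamutSketch_Parser {
    // Method name => priority; lower numbers run first.
    public $span_gamut = array(
        'doStrong'   => 10,
        'doEmphasis' => 20,
    );

    function __construct() {
        // Sort once so the loop below runs methods in priority order.
        asort($this->span_gamut);
    }

    // Apply every registered span method to the text, in order.
    function runSpanGamut($text) {
        foreach ($this->span_gamut as $method => $priority) {
            $text = $this->$method($text);
        }
        return $text;
    }

    function doStrong($text) {
        return preg_replace('/\*\*(.+?)\*\*/s', '<strong>$1</strong>', $text);
    }

    function doEmphasis($text) {
        return preg_replace('/\*(.+?)\*/s', '<em>$1</em>', $text);
    }
}

// An extension only registers its method; the loop itself is untouched.
class GamutSketchExtra_Parser extends GamutSketch_Parser {
    function __construct() {
        // Register before the parent constructor sorts the list.
        $this->span_gamut['doStrike'] = 5;
        parent::__construct();
    }

    // Made-up syntax extension: ~~text~~ becomes <del>text</del>.
    function doStrike($text) {
        return preg_replace('/~~(.+?)~~/s', '<del>$1</del>', $text);
    }
}
```

Calling `runSpanGamut('~~old~~ **new** *text*')` on the base class leaves
the `~~old~~` part untouched, while the subclass also converts it,
simply because its constructor added one entry to the list.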

Improvements borrowed from Markdown.pl:

* Now only trim trailing newlines from code blocks, instead of
all trailing whitespace characters.

* Fixed bug where this:

[text](http://m.com "title" )

wasn't working as expected, because the parser wasn't allowing
for spaces before the closing paren.

* Filthy hack to support markdown='1' in div tags.

* _DoAutoLinks() now supports the 'dict://' URL scheme.

* PHP- and ASP-style processor instructions are now protected as
raw HTML blocks.

<? ... ?>
<% ... %>

* Experimental support for [this] as a synonym for [this][].

* Fix for escaped backticks still triggering code spans:

There are two raw backticks here: \` and here: \`, not a code span
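
To make the inline link fix in this list more concrete, here is a
simplified sketch of a pattern that tolerates whitespace before the
closing paren. The function name `sketchInlineLink` and the pattern are
assumptions for illustration; the real regular expression in the parser
handles many more cases (nested brackets, angle-bracketed URLs,
single-quoted titles).

```php
<?php
// Hypothetical, simplified inline-link matcher; not the parser's regex.
function sketchInlineLink($text) {
    $pattern = '~
        \[ ([^\]]+) \]     # link text in group 1
        \(
            [ \t]*
            (\S+?)         # URL in group 2
            [ \t]*
            (?: "(.*?)" )? # optional double-quoted title in group 3
            [ \t]*         # the fix: allow spaces before the closing paren
        \)
    ~x';

    return preg_replace_callback($pattern, function ($m) {
        $html = '<a href="' . htmlspecialchars($m[2]) . '"';
        // Group 3 is absent from $m when no title was given.
        if (isset($m[3]) && $m[3] !== '') {
            $html .= ' title="' . htmlspecialchars($m[3]) . '"';
        }
        return $html . '>' . $m[1] . '</a>';
    }, $text);
}
```

In this sketch, `[text](http://m.com "title" )` and
`[text](http://m.com "title")` produce the same link; removing the last
`[ \t]*` from the pattern makes the first form fall through unconverted.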

Michel Fortin
michel.fortin at michelf.com
