PHP Markdown 1.0.1b2
Michel Fortin
michel.fortin at michelf.com
Sat Nov 27 17:27:26 EST 2004
This is PHP Markdown 1.0.1 beta 2, which includes the same features as
[Perl] Markdown 1.0.1 beta 2, minus Perl/Blosxom-related fixes, plus
PHP-specific corrections and improvements.
PHP Markdown 1.0.1b2 can be found here:
<http://www.michelf.com/docs/projets/php-markdown-1.0.1b2.zip>
Two important things to know about this release:
1. It should be a lot faster, especially if you write your paragraphs
as long lines, without hard-wrapped line breaks.
The pattern for searching for Setext-style headers caused the regex
parser to backtrack a lot more than necessary.
2. There may be an issue when dealing with complex nesting of tags
inside attributes:
    <tag attr="<tag>"> -- this is supported
    <tag attr="<tag attr="text">"> -- this is supported too
    <tag attr="<tag attr="<tag>">"> -- this is not
Every one of these was correctly supported by older versions -- and
still is by John's Perl Markdown -- but the regex used to match the
last example above was triggering a segmentation fault with PHP 4.3.8.
So I replaced the complex expression with a simpler and faster one,
which does not support the last line in the example above. If
you ever need to write something like it, you can use single quotes:
    <tag attr='<tag attr="<tag>">'> -- this is supported
Of course, all these examples are invalid HTML since you should
always escape what is inside an attribute using `&lt;` and `&gt;`.
I don't think this will be a big issue. If it is to you, tell me.
On a side note, this allows correct parsing of a tag like this one:
    <a alt="4*2*8 < 100"></a>
(this isn't valid HTML either), which becomes this with
PHP Markdown 1.0 and current Perl Markdown:
    <p><a alt="4<em>2</em>8 < 100"></a></p>
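
The trade-off described in point 2 can be reproduced with a small sketch. It is written in Python rather than PHP, and the `TAG` pattern and `first_tag` helper are assumptions made for illustration, not the expression actually used in _TokenizeHTML. The idea is that a tag is a name followed by any mix of plain characters, double-quoted strings and single-quoted strings, with no attempt to track nesting inside the quotes.

```python
import re

# Hypothetical simplified tag pattern (an assumption, not the real
# _TokenizeHTML expression): a tag name, then any mix of plain
# characters, "double-quoted" strings and 'single-quoted' strings,
# then the closing '>'.
TAG = re.compile(r"""</?[\w:$]+\b(?:[^"'>]|"[^"]*"|'[^']*')*>""")


def first_tag(text):
    """Return the tag token matched at the start of text, or None."""
    m = TAG.match(text)
    return m.group(0) if m else None


samples = [
    '<tag attr="<tag>">',
    '<tag attr="<tag attr="text">">',
    '<tag attr="<tag attr="<tag>">">',
    "<tag attr='<tag attr=\"<tag>\">'>",
    '<a alt="4*2*8 < 100">',
]

for s in samples:
    token = first_tag(s)
    # A sample is handled correctly when the whole string is one token.
    status = "whole tag" if token == s else "cut short"
    print(f"{status}: {token}")
```

With this pattern only the third sample is cut short: the token ends at the innermost `>`, so the rest of the line would be treated as ordinary text. The last sample is matched as a single token, which is why the asterisks inside the attribute are no longer seen as emphasis.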
I would like to have some feedback on these two changes. Thanks.
* * *
Changes since 1.0:
+ Fixed annoying bug where nested lists would wind up with
spurious (and invalid) `<p>` tags.
+ Changed _StripLinkDefinitions() so that link definitions must
occur within three spaces of the left margin. Thus if you indent
a link definition by four spaces or a tab, it will now be a code
block.
+ You can now write empty links:
    [like this]()
and they'll be turned into anchor tags with empty href attributes.
This should have worked before, but didn't.
+ `***this***` and `___this___` are now turned into
    <strong><em>this</em></strong>
instead of
    <strong><em>this</strong></em>
which isn't valid.
+ Fixed problem for links defined with URLs that include parens, e.g.:
    [1]: http://sources.wikipedia.org/wiki/Middle_East_Policy_(Chomsky)
"Chomsky" was being erroneously treated as the URL's title.
+ Significantly improved the performance of setext-style header
matching in _DoHeader. This was making long lines very slow
to parse.
+ Changed a regular expression in _TokenizeHTML that could lead to
a segmentation fault with PHP 4.3.8 on Linux.
+ Replaced a call to `htmlentities` with `htmlspecialchars`. This
fixed a bug where multibyte characters present in the title
of a link reference could lead to invalid UTF-8 characters.
+ Fixed some notices that could show up if PHP's E_NOTICE
error-reporting flag was set.
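
The `***this***` item in the list above can be illustrated with a short sketch, again in Python rather than PHP. The three patterns and the `emphasize` helper are assumptions chosen to reproduce the behavior described, not code copied from PHP Markdown. Strong emphasis is handled first; the only difference between the two strong patterns is that the newer one lets the captured text absorb trailing `*` or `_` characters, so the leftover emphasis marker stays inside the `<strong>` element.

```python
import re

# Hypothetical patterns (assumptions for illustration only).
EM = re.compile(r"(\*|_)(?=\S)(.+?)(?<=\S)\1", re.DOTALL)
STRONG_OLD = re.compile(r"(\*\*|__)(?=\S)(.+?)(?<=\S)\1", re.DOTALL)
# Same as STRONG_OLD, but the captured text may end with extra '*' or '_'.
STRONG_NEW = re.compile(r"(\*\*|__)(?=\S)(.+?[*_]*)(?<=\S)\1", re.DOTALL)


def emphasize(text, strong):
    """Apply strong emphasis first, then plain emphasis."""
    text = strong.sub(r"<strong>\2</strong>", text)
    return EM.sub(r"<em>\2</em>", text)


print(emphasize("***this***", STRONG_OLD))  # tags close in the wrong order
print(emphasize("***this***", STRONG_NEW))  # tags nest properly
print(emphasize("___this___", STRONG_NEW))
```

With `STRONG_OLD` the first pass leaves a stray `*` outside the `<strong>` element, and the second pass then wraps `<em>` across the closing `</strong>`, which gives the invalid nesting quoted above.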
Michel Fortin
michel.fortin at michelf.com
http://www.michelf.com/