PHP Markdown 1.0.1b2

Michel Fortin michel.fortin at michelf.com
Sat Nov 27 17:27:26 EST 2004


This is PHP Markdown 1.0.1 beta 2, which includes the same features as 
[Perl] Markdown 1.0.1 beta 2, minus Perl/Blosxom-related fixes, plus 
PHP-specific corrections/improvements.

PHP Markdown 1.0.1b2 can be found here:
<http://www.michelf.com/docs/projets/php-markdown-1.0.1b2.zip>

Two important things to know about this release:

1.	It should be a lot faster, especially if you write your paragraphs
	without line breaks (that is, not hard-wrapped).

	The pattern for searching for Setext-style headers caused the regex
	parser to backtrack a lot more than necessary.

2.	There may be an issue when dealing with complex nesting of tags
	inside attributes:

		<tag attr="<tag>">                -- this is supported
		<tag attr="<tag attr="text">">    -- this is supported too
		<tag attr="<tag attr="<tag>">">   -- this is not

	Every one of these was correctly supported by older versions -- and
	still is by John's Perl Markdown -- but the regex used to match the
	last example above was triggering a segmentation fault with PHP 4.3.8.
	So I replaced the complex expression with a simpler and faster one,
	which does not support the last line in the example above. If
	you ever need to write something like it, you can use single quotes:

		<tag attr='<tag attr="<tag>">'>   -- this is supported

	Of course, all these examples are invalid HTML since you should
	always escape what is inside an attribute using `&lt;` and `&gt;`.

	I don't think this will be a big issue. If it is to you, tell me.
	On a side note, this allows correct parsing of a tag like this one:

		<a alt="4*2*8 < 100"></a>

	(this isn't valid HTML either), which becomes this with
	PHP Markdown 1.0 and the current Perl Markdown:

		<p><a alt="4<em>2</em>8 &lt; 100"></a></p>
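
As a rough illustration of the trade-off described in point 2, here is a
minimal sketch written in Python rather than PHP. It is not the expression
used by PHP Markdown; the names `TAG` and `tag_tokens` are made up for this
example. It assumes a tag matcher that treats quoted attribute values as
flat strings, without trying to follow tags nested inside them.

```python
import re

# Hypothetical tag pattern (an assumption, not PHP Markdown's own regex):
# a tag name followed by any mix of unquoted characters, double-quoted
# strings and single-quoted strings, up to the first unquoted '>'.
TAG = re.compile(r"""</?[\w:$]+\b(?:[^"'>]|"[^"]*"|'[^']*')*>""")

def tag_tokens(html):
    """Return the substrings of `html` that the pattern treats as tags."""
    return TAG.findall(html)

# The '<' inside the quoted attribute stays part of a single tag token.
print(tag_tokens('<a alt="4*2*8 < 100"></a>'))

# Single quotes around the outer attribute keep the whole tag together.
print(tag_tokens("<tag attr='<tag attr=\"<tag>\">'>"))

# Three levels of double quotes: the token ends too early.
print(tag_tokens('<tag attr="<tag attr="<tag>">">'))
```

With this sketch the first two inputs are recognized as complete tags, while
the third is cut off at the first unquoted `>`, which mirrors the supported
and unsupported cases listed above.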

I would like to have some feedback on these two changes. Thanks.

* * *

Changes since 1.0:

+	Fixed annoying bug where nested lists would wind up with
	spurious (and invalid) `<p>` tags.

+	Changed _StripLinkDefinitions() so that link definitions must
	occur within three spaces of the left margin. Thus if you indent
	a link definition by four spaces or a tab, it will now be a code
	block.

+	You can now write empty links:

		[like this]()
	
	and they'll be turned into anchor tags with empty href attributes.
	This should have worked before, but didn't.

+	`***this***` and `___this___` are now turned into

		<strong><em>this</em></strong>

	instead of
	
		<strong><em>this</strong></em>
	
	which isn't valid.

+	Fixed a problem with links defined with URLs that include parens, e.g.:

		[1]: http://sources.wikipedia.org/wiki/Middle_East_Policy_(Chomsky)

	"Chomsky" was being erroneously treated as the URL's title.

+	Significantly improved the performance of setext-style header
	matching in _DoHeader. Long lines used to be very slow
	to parse.

+	Changed a regular expression in _TokenizeHTML that could lead to
	a segmentation fault with PHP 4.3.8 on Linux.

+	Replaced a call to `htmlentities` with `htmlspecialchars`. This
	fixed a bug where multibyte characters present in the title
	of a link reference could lead to invalid UTF-8 characters.

+	Fixed some notices that could show up if PHP's E_NOTICE
	error-reporting flag was set.
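
The three-space rule for link definitions in the list above can be sketched
as follows. This is an illustration in Python, not the code of
_StripLinkDefinitions(); the pattern, the function name and the
example.com URLs are assumptions made for the example, and titles are
ignored entirely.

```python
import re

# Hypothetical pattern: at most three leading spaces, then "[id]: url".
# A definition indented by four spaces or a tab does not match, so it
# would be left alone (and later treated as a code block).
LINK_DEF = re.compile(r"^[ ]{0,3}\[([^\]\n]+)\]:[ \t]*(\S+)[ \t]*$", re.MULTILINE)

def find_link_definitions(text):
    """Return {id: url} for definitions within three spaces of the margin."""
    return {m.group(1).lower(): m.group(2) for m in LINK_DEF.finditer(text)}

sample = (
    "[1]: http://sources.wikipedia.org/wiki/Middle_East_Policy_(Chomsky)\n"
    "   [ok]: http://example.com/\n"
    "    [four]: http://example.com/four-spaces\n"
    "\t[tab]: http://example.com/tab\n"
)
print(find_link_definitions(sample))
```

Only the first two definitions are picked up. Because the URL is taken as a
run of non-space characters, the parenthesized "(Chomsky)" stays part of the
URL instead of being mistaken for a title.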


Michel Fortin
michel.fortin at michelf.com
http://www.michelf.com/


