Detab should be multi-byte aware?
Michel Fortin
michel.fortin at michelf.com
Mon Oct 9 18:52:18 EDT 2006
On 9 Oct 2006, at 17:02, Allan Odgaard wrote:
> As for #2, Markdown doesn’t know the encoding of the source
> document, so that would mean it can’t really be aware of things
> such as UTF-8 mb sequences, OTOH if it changes my pre-formatted
> text, I would like to have it do the right thing.
Currently, Markdown.pl and my own PHP implementation of Markdown both
support any superset of ASCII, and that includes UTF-8. UTF-8 multi-byte
sequences have the useful property of being composed entirely of
bytes above 127, outside the ASCII range. So while Markdown isn't
really "aware" of these multi-byte sequences (it doesn't treat each
one as a single character), it isn't changing them into anything
either.
From your description of the problem, I believe you're not using UTF-8.
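
To illustrate the point, here is a rough Python sketch of a
byte-oriented detab. It is not the code from Markdown.pl or PHP
Markdown; the name detab_bytes and the tab width of 4 are assumptions
made for the example. It shows that the bytes of a UTF-8 sequence pass
through untouched, though the padding is computed in bytes rather than
characters.

```python
def detab_bytes(line: bytes, tab_width: int = 4) -> bytes:
    """Expand tabs to spaces, counting every byte as one column.

    Hypothetical helper for illustration only: it knows nothing about
    encodings and only ever looks for the ASCII tab byte (0x09).
    """
    out = bytearray()
    for b in line:
        if b == 0x09:
            # Pad to the next tab stop, measured in bytes written so far.
            pad = tab_width - (len(out) % tab_width)
            out.extend(b" " * pad)
        else:
            # Any other byte, including bytes >= 0x80, is copied as-is.
            out.append(b)
    return bytes(out)


sample = "é\tx"
encoded = sample.encode("utf-8")

# Every byte of the multi-byte sequence for "é" is above 127,
# so none of them can be mistaken for a tab or any other ASCII byte.
assert all(b > 127 for b in "é".encode("utf-8"))

# The byte-oriented result is still valid UTF-8 ...
byte_result = detab_bytes(encoded).decode("utf-8")

# ... but "é" counted as two columns, so it gets one space fewer
# than a character-aware expansion would give it.
char_result = sample.expandtabs(4)
assert byte_result != char_result
```

So the text survives intact; only the alignment after a non-ASCII
character can be off by the extra bytes in the sequence.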
Michel Fortin
michel.fortin at michelf.com
http://www.michelf.com/