converting html with \xa9 to Markdown and using iconv?

John MacFarlane jgm at berkeley.edu
Thu Mar 22 19:00:14 EDT 2007


You could try html2markdown, which uses iconv, tidy, and pandoc.
It should have no trouble with these characters. It's included in the
pandoc distribution: http://sophos.berkeley.edu/macfarlane/pandoc/
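
If you want to see what the iconv step amounts to, here is a minimal
Python sketch of the re-encoding only. It assumes the documents are
ISO-8859-1 (Latin-1), which is a guess based on \xa9 being the copyright
sign; the function name latin1_to_utf8 and the sample bytes are made up
for illustration and are not part of html2markdown.

```python
def latin1_to_utf8(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Re-encode bytes as UTF-8, roughly what `iconv -f ISO-8859-1 -t UTF-8` does.

    source_encoding is an assumption; change it if the files use another charset.
    """
    return raw.decode(source_encoding).encode("utf-8")


# Hypothetical sample containing the two bytes mentioned in the question.
sample = b"Copyright \xa9 2007\xa0Example"
converted = latin1_to_utf8(sample)

# The single bytes \xa9 and \xa0 become two-byte UTF-8 sequences.
print(converted)
```

Once the bytes are valid UTF-8, the HTML-to-Markdown step no longer has to
guess at the stray high-bit characters.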

JM

+++ Jeremy C. Reed [Mar 22 07 15:52 ]:

> The HTML documents contain various characters like
>   \xa0
> © \xa9 (Copyright symbol)
> (and others).
>
> I tried using html2text.py but it didn't like these characters.
>
> Any ideas on how I can use iconv or another tool to convert documents like
> this so I can then convert to Markdown?
>
> I don't want to do this manually as I have 500+ documents.
>
> Jeremy C. Reed




