converting html with \xa9 to Markdown and using iconv?
John MacFarlane
jgm at berkeley.edu
Thu Mar 22 19:00:14 EDT 2007
You could try html2markdown, which uses iconv, tidy, and pandoc.
It should have no trouble with these characters. It's included in the
pandoc distribution: http://sophos.berkeley.edu/macfarlane/pandoc/
JM
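
For the iconv part of the question, here is a minimal sketch of the re-encoding step in Python. It assumes the source files are ISO-8859-1 (Latin-1), which is only a guess based on the \xa0 and \xa9 bytes mentioned in the original message. The helper names `latin1_to_utf8` and `convert_file` are hypothetical and are not part of html2markdown or pandoc. Under that assumption the conversion is roughly what `iconv -f ISO-8859-1 -t UTF-8` does.

```python
from pathlib import Path


def latin1_to_utf8(raw: bytes, source_encoding: str = "latin-1") -> bytes:
    """Decode bytes from the assumed source encoding and re-encode as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def convert_file(src: Path, dst: Path, source_encoding: str = "latin-1") -> None:
    """Read src as raw bytes, re-encode to UTF-8, and write the result to dst."""
    dst.write_bytes(latin1_to_utf8(src.read_bytes(), source_encoding))


if __name__ == "__main__":
    # \xa9 is the copyright sign and \xa0 a non-breaking space in Latin-1.
    sample = b"\xa9 2007\xa0Example"
    print(latin1_to_utf8(sample))
```

If the files turn out to be Windows-1252 instead, passing `source_encoding="cp1252"` may be the better fit. Once the files are UTF-8 they can be passed to an HTML-to-Markdown converter.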
+++ Jeremy C. Reed [Mar 22 07 15:52 ]:
> The html document has various characters like
> \xa0
> © \xa9 (Copyright symbol)
> (and others).
>
> I tried using html2text.py but it didn't like these characters.
>
> Any ideas on how I can use iconv or another tool to convert documents like
> this so I can then convert to Markdown?
>
> I don't want to do this manually as I have 500+ documents.
>
>
> Jeremy C. Reed