Markdown-Discuss Digest, Vol 26, Issue 22

Jon Noring jon at noring.name
Thu Dec 1 11:11:04 EST 2005


Yuri wrote:


> The problem is, italics is also used for things other than emphasis.
> E.g., a common use for italics is to mark foreign words, but it might
> be a bit of a _faux pas_ to put <em> around them. Same for book
> titles.


By convention, highlighted text is used to communicate all kinds of
text semantics. Unfortunately, in the print world, there's a limited
number of acceptable ways to highlight text: bold, italic,
strikethrough, underline, and different foreground or background ink
colors. Fortunately, readers are intelligent and can figure out why
certain text is highlighted. But computers are not intelligent enough
to do so (complicated algorithms can be used to try to figure out text
semantics, but won't be 100% accurate).

Complicating things further, conventions vary from country to
country, script to script, and even era to era, so the conventions
themselves are not universal.

For English documents, here's a partial list of different semantics
expressed by text highlighting:

linguistic emphasis
literary emphasis (not the same as the above!)
foreign phrase
word used as a word
title of book, manuscript, article, header, etc.
name of ship, plane, etc.

(And then there's weak and strong emphasis, often approximated by
italic and bold. I believe that in other languages and scripts, text
highlighting is used for still other kinds of semantics.)

Another issue complicating matters is accessibility, such as
text-to-speech for the visually impaired. Here, the highlighted text
has to be communicated by voice, where italics and bold have no
meaning of their own and can easily be misconstrued. One would like to
be able to semantically mark up inline text so text-to-speech engines
know the difference between linguistic emphasis and the name of a
ship, for example.
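
To make the point concrete, here is a rough sketch (in Python, chosen only for illustration) of how several of the categories listed above collapse into the same visual styling, while a semantic rendering keeps them apart. The category names and both helper functions are hypothetical, invented for this example; the choice of `<em>`, `<cite>` and `<i lang="...">` is one plausible HTML mapping, not something Markdown does.

```python
from html import escape

# Hypothetical category names based on the list above; not from any standard.
# Every one of them is conventionally printed in italics.
VISUAL_STYLE = {
    "linguistic-emphasis": "italic",
    "foreign-phrase": "italic",
    "word-as-word": "italic",
    "title": "italic",
    "ship-name": "italic",
}


def to_visual_only(text, category):
    """Render the way print does: the category is lost, only italics remain."""
    return f"<i>{escape(text)}</i>"


def to_semantic_html(text, category, lang=None):
    """Render so that the reason for the highlighting survives in the markup."""
    body = escape(text)
    if category == "linguistic-emphasis":
        return f"<em>{body}</em>"
    if category == "title":
        return f"<cite>{body}</cite>"
    if category == "foreign-phrase" and lang:
        return f'<i lang="{escape(lang)}">{body}</i>'
    # Fallback: carry the category as a class name so that a speech engine
    # (or a stylesheet) could still tell the cases apart.
    return f'<i class="{escape(category)}">{body}</i>'


if __name__ == "__main__":
    print(to_visual_only("Titanic", "ship-name"))
    print(to_semantic_html("Titanic", "ship-name"))
    print(to_semantic_html("faux pas", "foreign-phrase", lang="fr"))
```

The visual-only output is identical for every category, which is exactly why a program (or a voice) cannot recover the meaning from italics alone.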

With minimal markup techniques, it is difficult to attach semantics to
inline text describing what it is. Systems that try to handle most of
the semantic variations end up being complicated to apply, and they
interrupt the reading of the plain text. So compromise is necessary
(such as giving up full accessibility, limiting visual styling of
inline text, etc.). That is, one pretty much has to assume visual
reading of the text per the conventions of the audience, and let the
end user figure out the semantics.

Jon Noring


