SmartyPants and dashes
Waylan Limberg
waylan.limberg at icloud.com
Tue Oct 28 19:45:44 EDT 2014
While Calibre does in fact use Python-Markdown, they do not use our
SmartyPants extension, which is relatively new. Instead they use the
standalone SmartyPants Python library [1] (see source here [2]) which uses
the same default as the original Perl implementation and has been around for
a long time. The Python library's documentation restates [3] the comments in
the Perl implementation.
Interestingly, that behavior is labeled as "oldschool" **and** it is the
default. The old behavior was probably kept for backward compatibility, so
that existing documents would not break. Of course, new users can
override the defaults and start fresh with new documents using the behavior
that actually makes sense ("--" and "---" to en- and em-dash entities
respectively). That said, Calibre has been around for a long time and has a
pretty big user base, so I suppose it makes sense for them to stick with the
old behavior.
I think your confusion comes from the fact that the documentation doesn't
exactly make it clear that the "oldschool" behavior is the default. In fact,
the docs for the original Perl implementation don't even mention the
"oldschool" behavior, but the implementation clearly has support for it as
evidenced by the Dingus. In fact, I just noticed that the version available
for download [4] is 1.5.1 but the version used by the Dingus [5] is 1.6.0b6
(which is presumably newer). Why? Only J.G. can answer that. The good news
is that the implementation you care about (the one used by Calibre - which
you are using) is fully documented. So I would suggest using that
documentation for SmartyPants.
As an aside, when we reimplemented SmartyPants as a Python-Markdown
extension, we only implemented the newer behavior. We don't even offer the
option to revert to the "oldschool" behavior - unless you want to override
the relevant "Substitution Keys" manually (although we offer that
customization for another purpose - to support other languages which have
different rules than English).
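
To make the idea of overriding "Substitution Keys" concrete, here is a minimal standalone sketch. It is a model of the concept only, not the extension's actual code: the function name `convert_dashes` and the `DEFAULT_SUBSTITUTIONS` table are made up for illustration, and only the 'ndash' and 'mdash' key names come from the documentation table quoted below.

```python
# Illustrative model only; this is NOT Python-Markdown's real code.
# 'ndash' and 'mdash' are the documented Substitution Keys; everything
# else here (names, structure) is a hypothetical stand-in.
DEFAULT_SUBSTITUTIONS = {"ndash": "&ndash;", "mdash": "&mdash;"}


def convert_dashes(text, substitutions=None):
    """Replace '--' and '---' using a key -> entity table that can be overridden."""
    subs = dict(DEFAULT_SUBSTITUTIONS)
    subs.update(substitutions or {})
    # Handle the longer pattern first so '---' is not consumed as '--' plus '-'.
    text = text.replace("---", subs["mdash"])
    return text.replace("--", subs["ndash"])


# Swapping the two entities maps "--" to an em-dash and "---" to an en-dash.
swapped = {"ndash": "&mdash;", "mdash": "&ndash;"}

print(convert_dashes("a -- b --- c"))
print(convert_dashes("a -- b --- c", swapped))
```

With the default table the sample becomes "a &ndash; b &mdash; c"; with the swapped table it becomes "a &mdash; b &ndash; c". The same kind of table is what lets other languages substitute their own quote characters.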
I hope that helps clear things up a little.
Waylan Limberg
[1]: http://pythonhosted.org/smartypants/index.html
[2]: https://github.com/kovidgoyal/calibre/blob/master/src/calibre/ebooks/conversion/preprocess.py#L76
[3]: http://pythonhosted.org/smartypants/reference.html#smartypants.convert_dashes_oldschool_inverted
[4]: http://daringfireball.net/projects/smartypants/
[5]: http://daringfireball.net/projects/markdown/dingus
From: Markdown-Discuss [mailto:markdown-discuss-bounces at six.pairlist.net] On
Behalf Of Virgil Arrington
Sent: Tuesday, October 28, 2014 12:07 PM
To: markdown-discuss at six.pairlist.net
Subject: SmartyPants and dashes
Please bear with me as I am just a user and, by no means, a developer.
However, I've noticed some inconsistent behavior among different
implementations of SmartyPants when it comes to en-dashes and em-dashes.
1. Calibre 2.7
When I recently uploaded a Markdown source file to Calibre and selected its
"Smarten punctuation" feature, it converted a double-hyphen "--" into an
em-dash, and a triple hyphen "---" into an en-dash. This behavior was the
opposite of what I have come to expect using both LaTeX and ReText, my
default Markdown editor.
I reported the matter as a Calibre bug, but received a response saying that
the Calibre behavior was based on the official SmartyPants source code,
which states as follows:
"The string, with each instance of "--" translated to an em-dash HTML
entity, and each "---" translated to an en-dash HTML entity. Two reasons
why: First, unlike the en- and em-dash syntax supported by
EducateDashesOldSchool(), it's compatible with existing entries written
before SmartyPants 1.1, back when "--" was only used for em-dashes. Second,
em-dashes are more common than en-dashes, and so it sort of makes sense that
the shortcut should be shorter to type. (Thanks to Aaron Swartz for the
idea.)"
Confused by this, I tested two other methods of using SmartyPants, and
received inconsistent results.
2. Python Markdown v2.5.1
I use ReText as my Markdown editor. It uses the SmartyPants extension that
is provided with Python Markdown v2.5.1. It translates "--" as an en-dash and
"---" as an em-dash, the *opposite* of Calibre. Below is the translation
table found at https://pythonhosted.org/Markdown/extensions/smarty.html
ASCII symbol   Replacements   HTML Entities      Substitution Keys
'              ‘ ’            &lsquo; &rsquo;    'left-single-quote', 'right-single-quote'
"              “ ”            &ldquo; &rdquo;    'left-double-quote', 'right-double-quote'
<< >>          « »            &laquo; &raquo;    'left-angle-quote', 'right-angle-quote'
...            …              &hellip;           'ellipsis'
--             –              &ndash;            'ndash'
---            —              &mdash;            'mdash'
At the bottom of the Python/SmartyPants extension page is the following:
"SmartyPants extension is based on the original SmartyPants implementation
by John Gruber. Please read it's documentation
<http://daringfireball.net/projects/smartypants/> for details."
Based on this, I went to Gruber's page and got yet more inconsistency.
3. SmartyPants by John Gruber.
At Gruber's SmartyPants page,
(http://daringfireball.net/projects/smartypants/) the following is found:
"SmartyPants can perform the following transformations:
* Straight quotes ( " and ' ) into "curly" quote HTML entities
* Backticks-style quotes (``like this'') into "curly" quote HTML
entities
* Dashes ("--" and "---") into en- and em-dash entities
* Three consecutive dots ("...") into an ellipsis entity"
Based on the order of the dashes listed, it would appear as if Gruber is
suggesting that "--" would turn into an en-dash, and "---" into an em-dash
(consistent with ReText, but not with Calibre). But, if I use Gruber's
online Dingus translator
(http://daringfireball.net/projects/markdown/dingus), I get yet a third
variation of the conversion. Gruber's online translator converts "--" into
an em-dash (as does Calibre) but it turns "---" into an em-dash plus a
hyphen (no en-dash).
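
The three behaviors described above can be sketched in a few lines of Python. This is only an illustration using plain string replacement. The function names are made up for this example and are not part of Calibre, Python Markdown, or SmartyPants, and the real implementations also avoid touching code blocks and HTML tags, which this sketch does not attempt.

```python
# Hypothetical helper names; each mirrors one behavior reported in this message.

def dashes_calibre_style(text):
    """'--' -> em-dash, '---' -> en-dash (as observed in Calibre 2.7)."""
    text = text.replace("---", "&ndash;")  # longer pattern first
    return text.replace("--", "&mdash;")


def dashes_python_markdown_style(text):
    """'--' -> en-dash, '---' -> em-dash (as observed in Python Markdown 2.5.1)."""
    text = text.replace("---", "&mdash;")  # longer pattern first
    return text.replace("--", "&ndash;")


def dashes_dingus_style(text):
    """Only '--' is recognized, so '---' turns into an em-dash plus a hyphen."""
    return text.replace("--", "&mdash;")


sample = "pages 3--5 --- roughly"
for convert in (dashes_calibre_style,
                dashes_python_markdown_style,
                dashes_dingus_style):
    print(convert.__name__, "->", convert(sample))
```

Run on the sample line, the first function gives "pages 3&mdash;5 &ndash; roughly", the second gives "pages 3&ndash;5 &mdash; roughly", and the third gives "pages 3&mdash;5 &mdash;- roughly", which matches the em-dash plus hyphen seen in the Dingus.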
There appears to be either confusion or disagreement in the
Markdown/SmartyPants world as to how to create typographic dashes. Is there
any way that the developers can come together on this very small part of the
Markdown world?
Virgil