It often happens that you have text data in Unicode, but you need
to represent it in ASCII, for example when integrating with legacy
code that doesn't support Unicode, for ease of entry of non-Roman
names on a US keyboard, or when constructing ASCII machine identifiers
from human-readable Unicode strings that should still be somewhat
intelligible (a popular example of this is making a URL slug
from an article title).

Note that this module generally produces better results than simply
stripping accents from characters (which can be done in Python with
built-in functions). It is based on hand-tuned character mappings
that, for example, also contain ASCII approximations for symbols and
non-Latin alphabets.

This is a Python port of the Text::Unidecode Perl module by Sean M.
Burke.
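
For comparison, the plain accent stripping mentioned above can be
sketched with nothing but the standard library's unicodedata module.
This is only an illustration of that simpler approach, not of how this
module works internally; the helper names strip_accents and
to_ascii_naive are made up for this example and are not part of this
module.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after compatibility decomposition (NFKD)."""
    decomposed = unicodedata.normalize("NFKD", text)
    # combining() returns a non-zero class for accents and similar marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def to_ascii_naive(text: str) -> str:
    """Strip accents, then silently drop whatever is still not ASCII."""
    return strip_accents(text).encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    print(to_ascii_naive("kožušček"))    # kozuscek
    print(to_ascii_naive("Ærøskøbing"))  # rskbing
    print(repr(to_ascii_naive("北京")))   # ''
```

Characters that have no decomposition, such as "Æ" and "ø", and entire
non-Latin scripts simply vanish with this approach. That is the kind of
gap the hand-tuned character mappings are meant to fill.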