freebsd-ports/converters/py-webencodings/pkg-descr
Yuri Victorovich 79735e8a8b New port: converters/py-webencodings: Character encoding aliases for legacy web content
PR:		226316
Submitted by:	Marcin Cieślak <saper@saper.info>
Approved by:	tcberner (mentor, implicit)
2018-03-04 09:10:41 +00:00

17 lines
686 B
Plaintext

In order to be compatible with legacy web content when interpreting
something like Content-Type: text/html; charset=latin1, tools need
to use a particular set of aliases for encoding labels as well as
some overriding rules.
For example, US-ASCII and iso-8859-1 on the web are actually aliases for
windows-1252, and an UTF-8 or UTF-16 BOM takes precedence over any other
encoding declaration.
The Encoding standard defines all such details so that implementations
do not have to reverse-engineer each other.
This module has encoding labels and BOM detection, but the actual
implementation for encoders and decoders is Python's.
WWW: https://github.com/SimonSapin/python-webencodings