mirror of https://github.com/rkd77/elinks.git
synced 2024-12-04 14:46:47 -05:00
I18N bug 1112: Use strange_chars[] for UTF-8 output too
Make u2cp_() map code points U+0080 to U+009F via strange_chars[] even if the target codepage is UTF-8. This helps with buggy web pages that use &#x92; when they mean &#x2019;.

This change does not affect how ELinks decodes raw bytes 0x80 to 0x9F in HTML.

u2cp_() is used only via the u2cp and u2cp_no_nbsp macros. Possible side effects of this change at each use of these macros:

* get_translation_table(): Not affected because it does not call u2cp if the target codepage is UTF-8.
* get_entity_string(): Numeric character references are affected, as intended. Character entity references are not affected because entities[] does not define any entities in the U+0080...U+009F range.
* kbd_field(), term_send_ucs(), field_op(): Affected. It is no longer possible to enter code points U+0080...U+009F from the terminal. This should not be a problem in practice because those would be control characters anyway and should therefore be filtered by the slave process (which doesn't yet recognize them; bug 777).
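The mapping described above can be sketched in isolation. This is a minimal illustration, not the ELinks code: `c1_to_graphic[]` is a hypothetical stand-in for strange_chars[] filled with the Windows-1252 assignments for 0x80 to 0x9F (the real table in ELinks may differ), and `map_and_encode_utf8()` is an invented helper that remaps a code point in the U+0080...U+009F range before encoding it as UTF-8, which is the order of operations this commit establishes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for strange_chars[]: Windows-1252 meanings of
 * 0x80..0x9F.  A 0 entry means "no graphical replacement".  The table
 * actually used by ELinks may contain different values. */
static const uint32_t c1_to_graphic[32] = {
	0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
	0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
	0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
	0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178,
};

/* Remap U+0080..U+009F first, then encode the result as UTF-8.
 * Writes a NUL-terminated sequence into buf and returns its length,
 * or returns 0 if the code point has no mapping or is out of range. */
static size_t
map_and_encode_utf8(uint32_t u, char buf[5])
{
	size_t n = 0;

	assert(buf != NULL);

	if (u >= 0x80 && u < 0xA0) {
		u = c1_to_graphic[u - 0x80];
		if (!u) return 0;
	}
	if (u > 0x10FFFF) return 0;

	if (u < 0x80) {
		buf[n++] = (char) u;
	} else if (u < 0x800) {
		buf[n++] = (char) (0xC0 | (u >> 6));
		buf[n++] = (char) (0x80 | (u & 0x3F));
	} else if (u < 0x10000) {
		buf[n++] = (char) (0xE0 | (u >> 12));
		buf[n++] = (char) (0x80 | ((u >> 6) & 0x3F));
		buf[n++] = (char) (0x80 | (u & 0x3F));
	} else {
		buf[n++] = (char) (0xF0 | (u >> 18));
		buf[n++] = (char) (0x80 | ((u >> 12) & 0x3F));
		buf[n++] = (char) (0x80 | ((u >> 6) & 0x3F));
		buf[n++] = (char) (0x80 | (u & 0x3F));
	}
	buf[n] = '\0';
	return n;
}
```

With this sketch, a numeric character reference to code point 0x92 comes out as the three UTF-8 bytes of U+2019, while an unassigned slot such as 0x81 yields no output at all, mirroring the `return NULL` path in u2cp_().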
parent e3cb8d6a77
commit f735bfbe72
5
NEWS
@@ -40,6 +40,11 @@ Miscellaneous:
   ``Background and Notify'' via the download manager in some terminal,
   reassociate the download with that terminal. These changes do not
   apply to downloads to external handlers.
+* bug 1112: Map most numeric character references &#128; ... &#159;
+  to graphical characters also when the output charset is UTF-8.
+  (ELinks 0.12pre1 was the first release that supported UTF-8 as the
+  terminal charset, and ELinks 0.12pre5 was the first release that
+  supported UTF-8 as the dump charset.)
 * Really retry forever when connection.retries = 0.
 * enhancement: Session-specific options. Any options changed with
   toggle-* actions no longer affect other tabs or other terminals.
@@ -203,6 +203,11 @@ u2cp_(unicode_val_T u, int to, enum nbsp_mode nbsp_mode)
 
 	if (u < 128) return strings[u];
 
+	if (u < 0xa0) {
+		u = strange_chars[u - 0x80];
+		if (!u) return NULL;
+	}
+
 	to &= ~SYSTEM_CHARSET_FLAG;
 
 	if (is_cp_ptr_utf8(&codepages[to]))
@@ -216,13 +221,6 @@ u2cp_(unicode_val_T u, int to, enum nbsp_mode nbsp_mode)
 	}
 	if (u == UCS_SOFT_HYPHEN) return "";
 
-	if (u < 0xa0) {
-		unicode_val_T strange = strange_chars[u - 0x80];
-
-		if (!strange) return NULL;
-		return u2cp_(strange, to, nbsp_mode);
-	}
-
 	if (u < 0xFFFF)
 		for (j = 0; j < 0x80; j++)
 			if (codepages[to].highhalf[j] == u)