mirror of https://github.com/rkd77/elinks.git synced 2024-12-04 14:46:47 -05:00

I18N bug 1112: Use strange_chars[] for UTF-8 output too

Make u2cp_() map code points U+0080 to U+009F via strange_chars[] even
if the target codepage is UTF-8.  This helps with buggy web pages that
use a numeric character reference for U+0092 when they mean U+2019.  This change does not affect how
ELinks decodes raw bytes 0x80 to 0x9F in HTML.

u2cp_() is used only via the u2cp and u2cp_no_nbsp macros.
Possible side effects of this change at each use of these macros:

* get_translation_table(): Not affected because it does not call u2cp
  if the target codepage is UTF-8.
* get_entity_string(): Numeric character references are affected, as intended.
  Character entity references are not affected because entities[]
  does not define any entities in the U+0080...U+009F range.
* kbd_field(), term_send_ucs(), field_op(): Affected.  It is no longer
  possible to enter code points U+0080...U+009F from the terminal.
  This should not be a problem in practice because those would be
  control characters anyway and should therefore be filtered by the
  slave process (which doesn't yet recognize them; bug 777).
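
The mapping described above can be sketched in isolation. This is a minimal illustration, not the ELinks implementation: `remap_c1()`, `encode_utf8_sketch()` and `c1_aware_utf8()` are hypothetical names, and the table entries follow Windows-1252 for illustration only, so the real strange_chars[] may map some code points differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical excerpt of a remapping table for U+0080...U+009F.
 * Values follow Windows-1252 for illustration; the real
 * strange_chars[] may differ.  0 means "no replacement". */
static uint32_t remap_c1(uint32_t u)
{
	switch (u) {
	case 0x80: return 0x20AC; /* EURO SIGN */
	case 0x85: return 0x2026; /* HORIZONTAL ELLIPSIS */
	case 0x92: return 0x2019; /* RIGHT SINGLE QUOTATION MARK */
	case 0x99: return 0x2122; /* TRADE MARK SIGN */
	default:   return 0;
	}
}

/* Encode u (at most U+FFFF, not a surrogate) as UTF-8.
 * Returns the number of bytes written to buf. */
static size_t encode_utf8_sketch(uint32_t u, unsigned char buf[4])
{
	if (u < 0x80) {
		buf[0] = (unsigned char) u;
		return 1;
	}
	if (u < 0x800) {
		buf[0] = (unsigned char) (0xC0 | (u >> 6));
		buf[1] = (unsigned char) (0x80 | (u & 0x3F));
		return 2;
	}
	buf[0] = (unsigned char) (0xE0 | (u >> 12));
	buf[1] = (unsigned char) (0x80 | ((u >> 6) & 0x3F));
	buf[2] = (unsigned char) (0x80 | (u & 0x3F));
	return 3;
}

/* Remap U+0080...U+009F first, then encode.  Returns 0 when the
 * code point is in that range and has no replacement. */
static size_t c1_aware_utf8(uint32_t u, unsigned char buf[4])
{
	if (u >= 0x80 && u < 0xA0) {
		u = remap_c1(u);
		if (!u) return 0;
	}
	return encode_utf8_sketch(u, buf);
}
```

Under these assumptions, a reference to U+0092 yields the three bytes E2 80 99 (U+2019) rather than the two-byte encoding of the control character, and an unmapped code point such as U+0081 yields nothing.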
Kalle Olavi Niemitalo 2011-04-17 18:09:29 +03:00 committed by Kalle Olavi Niemitalo
parent e3cb8d6a77
commit f735bfbe72
2 changed files with 10 additions and 7 deletions

NEWS

@@ -40,6 +40,11 @@ Miscellaneous:
   ``Background and Notify'' via the download manager in some terminal,
   reassociate the download with that terminal. These changes do not
   apply to downloads to external handlers.
+* bug 1112: Map most numeric character references for U+0080 ... U+009F
+  to graphical characters also when the output charset is UTF-8.
+  (ELinks 0.12pre1 was the first release that supported UTF-8 as the
+  terminal charset, and ELinks 0.12pre5 was the first release that
+  supported UTF-8 as the dump charset.)
 * Really retry forever when connection.retries = 0.
 * enhancement: Session-specific options. Any options changed with
   toggle-* actions no longer affect other tabs or other terminals.


@@ -203,6 +203,11 @@ u2cp_(unicode_val_T u, int to, enum nbsp_mode nbsp_mode)
 	if (u < 128) return strings[u];
+	if (u < 0xa0) {
+		u = strange_chars[u - 0x80];
+		if (!u) return NULL;
+	}
 	to &= ~SYSTEM_CHARSET_FLAG;
 	if (is_cp_ptr_utf8(&codepages[to]))
@@ -216,13 +221,6 @@ u2cp_(unicode_val_T u, int to, enum nbsp_mode nbsp_mode)
 	}
 	if (u == UCS_SOFT_HYPHEN) return "";
-	if (u < 0xa0) {
-		unicode_val_T strange = strange_chars[u - 0x80];
-		if (!strange) return NULL;
-		return u2cp_(strange, to, nbsp_mode);
-	}
 	if (u < 0xFFFF)
 		for (j = 0; j < 0x80; j++)
 			if (codepages[to].highhalf[j] == u)
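
The effect of moving the strange_chars[] check ahead of the UTF-8 test can be modelled with a simplified sketch. It assumes that the UTF-8 branch encodes whatever code point it receives unchanged; `strange()`, `resolve_old()`, `resolve_new()` and `NO_MAPPING` are hypothetical names, and the single table entry is illustrative rather than taken from the ELinks table. Each function returns the code point that would then be converted to the target codepage.

```c
#include <stdint.h>

#define NO_MAPPING 0xFFFFFFFFu

/* Hypothetical stand-in for strange_chars[]: 0 means "no replacement".
 * Only one illustrative entry is filled in. */
static uint32_t strange(uint32_t u)
{
	return u == 0x92 ? 0x2019 : 0;
}

/* Old ordering: a UTF-8 target returns before the remapping step,
 * so U+0080...U+009F pass through unchanged. */
static uint32_t resolve_old(uint32_t u, int target_is_utf8)
{
	if (u < 0x80) return u;
	if (target_is_utf8) return u;
	if (u < 0xA0) {
		u = strange(u);
		if (!u) return NO_MAPPING;
	}
	return u;
}

/* New ordering: the remapping step runs first, so the result no
 * longer depends on whether the target is UTF-8. */
static uint32_t resolve_new(uint32_t u, int target_is_utf8)
{
	(void) target_is_utf8;
	if (u < 0x80) return u;
	if (u < 0xA0) {
		u = strange(u);
		if (!u) return NO_MAPPING;
	}
	return u;
}
```

In this model only the UTF-8 case changes: `resolve_old(0x92, 1)` keeps U+0092 while `resolve_new(0x92, 1)` gives U+2019, and both orderings agree for non-UTF-8 targets.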