The current rules are:
term.utf8
CONFIG_UTF8 UTF-8 I/O widget_data.cdata
----------- --------- ------------------
undefined disabled charset of the terminal
undefined enabled charset of the terminal
defined disabled charset of the terminal (*)
defined enabled always UTF-8
(*) kbd_field was incorrectly assuming UTF-8 in this case.
Explicitly compare the value that is returned by the widget handler
against EVENT_NOT_PROCESSED rather than relying on the fact that
EVENT_NOT_PROCESSED is equal to 1.
ffeedbdc5045a6a5db2bc75ecaab56bfe46c80ea
UCS_NO_CHAR here means the input was too short. Because the strings
generally come from the source code or from PO files, they should not
end in the middle of a character. However, the whole character may be
missing if the string is empty. So select_button_by_key() now checks
for that case separately.
UCS_NO_CHAR must not be passed to unicode_fold_label_case() because
the result is undefined.
The configure script no longer recognizes "CONFIG_UTF_8=yes" lines
in custom features.conf files. They will have to be changed to
"CONFIG_UTF8=yes". This incompatibility was deemed acceptable
because no released version of ELinks supports CONFIG_UTF_8.
The --enable-utf-8 option was not renamed.
Suggested by Miciah on #elinks.
What was renamed:
add_utf_8 => add_utf8
cp2utf_8 => cp2utf8
encode_utf_8 => encode_utf8
get_translation_table_to_utf_8 => get_translation_table_to_utf8
goto invalid_utf_8_start_byte => goto invalid_utf8_start_byte
goto utf_8 => goto utf8
goto utf_8_select => goto utf8_select
terminal_interlink.utf_8 => terminal_interlink.utf8
utf_8_to_unicode => utf8_to_unicode
What was not renamed:
terminal._template_.utf_8_io option, TERM_OPT_UTF_8_IO
Compatibility with existing elinks.conf files would require an alias.
--enable-utf-8
Because the name of the charset is UTF-8, --enable-utf-8 looks better
than --enable-utf8.
CONFIG_UTF_8
Will be renamed in a later commit.
Unicode/utf_8.cp, table_utf_8, aliases_utf_8
Will be renamed in a later commit.
This causes the documented-slow cp2u() to be called in a loop, which
fortunately doesn't have very many iterations. If this is too slow,
then cp2u() can be rewritten, or the hotkeys can be cached in struct
widget or struct widget_data.
Note that check_kbd_label_key() does not yet allow non-ASCII
characters when CONFIG_UTF_8 is defined. Before they are allowed,
menu.c should also be updated.
To reproduce before this patch:
- Run ELinks with an 80x25 terminal.
- Set document.browse.forms.confirm_submit = 1.
- Go to <http://bugzilla.elinks.cz/query.cgi>.
- Click the [ Search ] submit button.
- ELinks asks "Do you want to post form data to URL".
Each line of the URL begins at the horizontal center of the dialog,
and bleeds outside the right border of the dialog. Also, the
[ Yes ] and [ No ] buttons appear to float below the dialog.
Form fields and BFU text-input widgets then convert from UCS-4 to UTF-8.
If not all UTF-8 bytes fit, they don't insert anything. Thus it is no
longer possible to get invalid UTF-8 by hitting the length limit.
It is unclear to me which charset is supposed to be used for strings
in internal buffers. I made BFU insert UTF-8 whenever CONFIG_UTF_8,
but form fields use the charset of the terminal; that may have to be
changed.
As a side effect, this change should solve bug 782, because
term_send_ucs no longer encodes in UTF-8 if CONFIG_UTF_8 is defined.
I think the UTF-8 and codepage encoding calls I added are safe, too.
A similar bug may still surface somewhere else, but 782 could be
closed for now.
This change also lays the foundation for binding actions to non-ASCII
keys, but the keystroke name parser doesn't yet support that.
The CONFIG_UTF_8 mode does not currently support non-ASCII characters
in hot keys, either.
do_move_bookmark was only updating the selection in the bookmarks manager
window in which the Move button was pressed. Now all windows are updated.
This patch also prevents a crash when the first item that was displayed
in a box was the last child of a folder and was being moved (the comment
removed in this patch was incorrect in assuming that bm->box->next must
be valid because it neglected to account for non-root children).
This change required that I move the definition of struct
hierbox_dialog_list_item from src/bfu/hierbox.c to src/bfu/hierbox.h.
Thanks to Kalle Olavi Niemitalo for finding both the update problem
and the crash.
With regular comments in the definition of the structure itself,
and with xgettext:c-format comments in constants of that type,
if xgettext would otherwise guess wrong; so that translators
will know they'll have to double any percent signs they add.
I didn't regenerate PO files, though.
src/bfu/menu.c (scroll_menu): Let neither menu->selected nor pos
become -2.
src/bfu/menu.c (menu_mouse_handler): Call set_menu_selection directly
rather than via scroll_menu, as sel is already known to be selectable.
(Not required for fixing the bug.)
src/bfu/menu.c (menu_search_handler): Break infinite loops also if
menu->selected is -1 initially.
src/bfu/menu.c (menu_handler): Instead of tweaking menu->selected
directly, let scroll_menu do it.
This fixes two bugs:
1. Pressing F9 did not make the main menu visible, but then pressing
e.g. Right made it visible.
2. Pressing F9 and then Down displayed the first submenu (File) at the
wrong position on the screen.
src/bfu/README: This new file currently contains a diagram of how the
various struct types of src/bfu/ and src/terminal/ relate to each
other. More documentation may be added later, although if it is
specific to a particular structure, then it should probably go in the
corresponding header file so that people will remember to update it.
Note: there is ugly hack in ACT_EDIT_BACKSPACE where last byte of UTF-8
character is removed and then moving to left until complete UTF-8
character is found.
With UTF-8 support in terminal enabled it is possible to use double-width
UTF-8 strings as margins of buttons. Although they are displayed wrong when
UTF-8 support in terminal is disabled.