The previous version used only the low 8 bits of the key code.
This one arranges for the whole key to be rejected if it's not ASCII.
Perhaps the modifiers should be checked too, but I'm not changing that now.
In revision 1.15 of dns.c (as it was called way back then), pasky
backported a fix from Links 0.97pre2 to try gethostbyaddr before
trying gethostbyname for DNS lookups:
MacOS address resolution fix (Aldy Hernandez) (from 0.97pre2)
However, that fix introduced a bug, because it was calling gethostbyaddr
on all addresses, not just IP addresses. Mikulas fixed that bug in Links
0.98:
Do not call gethostbyaddr when name is not ip address (it should avoid
some useless nameserver queries)'
This fix was never backported to ELinks. Until today.
This commit is functionally the same as the fix in Links 0.98, plus it uses
inet_aton for great correctness!
This fixes a bug reported in #elinks by tnks, whereby lookups for
yubnub.org resulted in 121.117.98.110 == 0x7975626E == 'y', 'u', 'b', 'n'.
I believe that it also fixes bug 691 (which is already closed with a
workaround).
To reproduce before this patch:
- Run ELinks with an 80x25 terminal.
- Set document.browse.forms.confirm_submit = 1.
- Go to <http://bugzilla.elinks.cz/query.cgi>.
- Click the [ Search ] submit button.
- ELinks asks "Do you want to post form data to URL".
Each line of the URL begins at the horizontal center of the dialog,
and bleeds outside the right border of the dialog. Also, the
[ Yes ] and [ No ] buttons appear to float below the dialog.
Previously, ELinks used to silently discard the Alt modifier from
Alt-ö keystrokes when UTF-8 I/O was enabled. Now, separate actions
can be bound to ö and Alt-ö.
However, if CONFIG_UTF_8 is defined, then actions cannot be bound to
non-ASCII characters, regardless of modifiers. This is because the
code that handles names of keystrokes assumes a character can only be
a single byte. This commit does not change that.
Form fields and BFU text-input widgets then convert from UCS-4 to UTF-8.
If not all UTF-8 bytes fit, they don't insert anything. Thus it is no
longer possible to get invalid UTF-8 by hitting the length limit.
It is unclear to me which charset is supposed to be used for strings
in internal buffers. I made BFU insert UTF-8 whenever CONFIG_UTF_8,
but form fields use the charset of the terminal; that may have to be
changed.
As a side effect, this change should solve bug 782, because
term_send_ucs no longer encodes in UTF-8 if CONFIG_UTF_8 is defined.
I think the UTF-8 and codepage encoding calls I added are safe, too.
A similar bug may still surface somewhere else, but 782 could be
closed for now.
This change also lays the foundation for binding actions to non-ASCII
keys, but the keystroke name parser doesn't yet support that.
The CONFIG_UTF_8 mode does not currently support non-ASCII characters
in hot keys, either.
There is no need to check whether ev->ev == EVENT_KBD;
if decode_terminal_escape_sequence called
decode_terminal_mouse_escape_sequence, then the former neither modified
kbd.key nor passed &kbd to the latter, so kbd.key remains KBD_UNDEF.
If ev->ev was not checked, then it should not be trusted either.
So reinitialize the whole *ev if a keyboard event was indeed found.
For instance, if Ctrl-F1 were pressed and src/terminal/kbd.c supported it,
then toupper(KBD_F1) would be called, resulting in undefined behaviour.
src/terminal/kbd.c does not support such combinations yet, but it is
safest to fix the bug already.
$(AM_CFLAGS) is one of the variables set by Automake, which ELinks no
longer uses. $(CPPFLAGS) should be used whenever the C preprocessor
is run, according to the GNU Coding Standards. (My build environment
does have an important -I option there.)
... since the latter is for printing int64_T and we don't check for that and
we use PRId64 only for printing values having the off_t types.
Besides off_t has it's own ELinks specific defaults so it should be safer
to use an internal format string. If off_t is 8 bytes use "lld" else use
"ld".
Reported-by: Andy Tanenbaum <ast@cs.vu.nl>
Drop some code for superscript and subscript handling that was deleted
in commit 65016cdca4, then added back
with the UTF-8 merge in commit 2a6125e3d0,
and then disabled in commit 1b653b9765.
Also list the capnames with which the escape sequences could be
read from Terminfo, and the ECMA-48 interpretations of the bytes
(parenthesized if they seem unrelated to the keys). This is in
preparation for fixing bug 96.
decode_terminal_escape_sequence() used to handle both, but
there is now a separate decode_terminal_application_key()
for ESC O. I have not yet edited decode_terminal_escape_sequence();
there may be dead code in it.
If there is e.g. ESC [ in the input buffer, combine that to Alt-[.
Check the first character too; don't blindly assume it is ESC, as
it can be NUL as well. Note this means you can no longer activate
the main menu by pressing Ctrl-@ (or Ctrl-Space on some terminals).
Otherwise, the timeout could cause ELinks to resume reading from
the terminal device while another process is still using it.
This actually happened in a test.
On entry to some functions that could resume reading from the device,
assert that the terminal has not been blocked.
To reproduce the bug before this patch:
Enable CONFIG_UTF_8, UTF-8 I/O, and UTF-8 charset.
Go to www.google.com and type "abc" in the text input field.
Then press Left. The cursor jumps to "a" when it should go to "c".
to encode it using utf_8_to_unicode. If every unicode_val_T value
could be a result of that function then one must add out param
to the utf_8_to_unicode signaling 'true' UCS_NO_CHAR.