When ELinks runs in an X11 terminal emulator (e.g. xterm), or in GNU
Screen, it tries to update the title of the window to match the title
of the current document. To do this, ELinks sends an "OSC 1 ; Pt BEL"
sequence to the terminal. Unfortunately, xterm expects the Pt string
to be in the ISO-8859-1 charset, making it impossible to display e.g.
Cyrillic characters. In xterm patch #210 (2006-03-12) however, there
is a menu item and a resource that can make xterm take the Pt string
in UTF-8 instead, allowing characters from all around the world.
The downside is that ELinks apparently cannot ask xterm whether the
setting is on or off; so add a terminal._template_.latin1_title option
to ELinks and let the user edit that instead.
Complete list of changes:
- Add the terminal._template_.latin1_title option. But do not add
that to the terminal options window because it's already rather
crowded there.
- In set_window_title(), take a new codepage argument. Use it to
decode the title into Unicode characters, and remove only actual
control characters. For example, CP437 has graphical characters in
the 0x80...0x9F range, so don't remove those, even though ISO-8859-1
has control characters in the same range. Likewise, don't
misinterpret single bytes of UTF-8 characters as control characters.
- In set_window_title(), do not truncate the title to the width of the
window. The font is likely to be different and proportional anyway.
But do truncate before 1024 bytes, an xterm limit.
- In struct itrm, add a title_codepage member to remember which
charset the master said it was going to use in the terminal window
title. Initialize title_codepage in handle_trm(), update it in
dispatch_special() if the master sends the new request
TERM_FN_TITLE_CODEPAGE, and use it in most set_window_title() calls;
but not in the one that sets $TERM as the title, because that string
was not received from the master and should consist of ASCII
characters only.
- In set_terminal_title(), convert the caller-provided title to
ISO-8859-1 or UTF-8 if appropriate, and report the codepage to the
slave with the new TERM_FN_TITLE_CODEPAGE request. The conversion
can run out of memory, so return a success/error flag, rather than
void. In display_window_title(), check this result and don't update
caches on error.
- Add a NEWS entry for all of this.
This simplifies the callers a little and may help implement
simultaneous support for different charsets on different terminals
of the same type (bug 1064).
Replace almost all uses of enum connection_state with struct
connection_status. This removes the assumption that errno values used
by the system are between 0 and 100000. The GNU Hurd uses values like
ENOENT = 0x40000002 and EMIG_SERVER_DIED = -308.
This commit is derived from my attachments 450 and 467 to bug 1013.
On AMD64 apparently, off_t is long but ELinks detected SIZEOF_OFF_T == 8
and defined OFF_T_FORMAT as "lld", which expects long long and so causes
GCC to warn about a mismatching format specifier. Because --enable-debug
adds -Werror to $CFLAGS, this warning breaks the build. When both
SIZEOF_LONG and SIZEOF_LONG_LONG are 8, ELinks cannot know which type
it should use.
To fix this, do not attempt to find a format specifier for off_t itself.
Instead cast all printed off_t values to a new typedef off_print_T that
is large enough, and replace OFF_T_FORMAT with OFF_PRINT_FORMAT which
is suitable for off_print_T altough not necessarily for off_t. ELinks
already had a similar scheme with time_print_T and TIME_PRINT_FORMAT.
This is an attempt to make it easier to update without
requiring to update all translations.
Copyright info is now set in setup.h using COPYRIGHT_STRING
Use it for the actual I/O only. Previously, defining CONFIG_UTF8 and
enabling UTF-8 used to force many strings to the UTF-8 charset
regardless of the terminal charset option. Now, those strings always
follow the terminal charset. This fixes bug 914 which was caused
because _() returned strings in the terminal charset and functions
then assumed they were in UTF-8. This reduction in the effects of
UTF-8 I/O may also simplify future testing.
straconcat reads the args with va_arg(ap, const unsigned char *),
and the NULL macro may have the wrong type (e.g. int).
Many places pass string literals of type char * to straconcat. This
is in principle also a violation, but I'm ignoring it for now because
if it becomes a problem with some C implementation, then so will the
use of unsigned char * with printf "%s", which is so widespread in
ELinks that I'm not going to try fixing it now.
Don't cast function pointers; calling functions via pointers of
incorrect types is not guaranteed to work. Instead, define the
functions with the desired types, and make them cast the incoming
parameters. Or define wrapper functions if the return types don't
match.
really_exit_prog wasn't being used outside src/dialogs/menu.c,
and I had to change its parameter type, so it's now static.
This change allows screen_driver_change_hook to detect that the
charset has been changed to UTF-8 and set screen_driver.utf8 = 1.
redraw_screen then calls get_screen_driver, which propagates the flag
to terminal.utf8. That in turn avoids an assertion failure in
handle_interlink_event.
Now the currently displayed version of the current document,
rather than the latest version of the current document, will be used
for the document info box and the current_document() Lua function.
This allows code to use document->cached instead of
find_in_cache(document->uri), thereby increasing the likelihood
of getting the correct cache entry.
This should fix Bug 756 - "assertion (cached)->object.refcount >= 0 failed"
after HTTP proxy was changed.
Patches for this were written by me and then later by Jonas.
This commit combines our independent implementations.
The configure script no longer recognizes "CONFIG_UTF_8=yes" lines
in custom features.conf files. They will have to be changed to
"CONFIG_UTF8=yes". This incompatibility was deemed acceptable
because no released version of ELinks supports CONFIG_UTF_8.
The --enable-utf-8 option was not renamed.
... since the latter is for printing int64_T and we don't check for that and
we use PRId64 only for printing values having the off_t types.
Besides off_t has it's own ELinks specific defaults so it should be safer
to use an internal format string. If off_t is 8 bytes use "lld" else use
"ld".
Reported-by: Andy Tanenbaum <ast@cs.vu.nl>
With regular comments in the definition of the structure itself,
and with xgettext:c-format comments in constants of that type,
if xgettext would otherwise guess wrong; so that translators
will know they'll have to double any percent signs they add.
I didn't regenerate PO files, though.
Thus, each caller must now choose the accelerator key and declare the
accelerator contexts (i.e. menus) to which it may add the command.
Also, use only one context for tab_menu.
These changes fix the following bugs in accelerator conflict detection:
* "~Pass frame URI to external command" may be displayed together
with "Pass tab URI to e~xternal command", but that was not
declared.
* "Pass link URI to e~xternal command" was declared as being in
the tab menu, but it is actually displayed in the link menu.
Don't replace UTF-8 bytes with '*'. Probably there is need to do better
check what will be displayed.
Also get_current_link_title is no longer pretty and trivial. (o:
Preparation for using struct terminal in formating functions.
By now distinguish between formating widgets and formating widgets with
displaying was done with term == NULL and term != NULL. I hope I'am
not wrong.
This changes the init target to be idempotent: most importantly it will now
never overwrite a Makefile if it exists. Additionally 'make init' will
generate the .vimrc files. Yay, no more stupid 'added fairies' commits! ;)
the same accelerator key to multiple buttons in a dialog box or
to multiple items in a menu. ELinks already has some support for
this but it requires the translator to run ELinks and manually
scan through all menus and dialogs. The attached changes make it
possible to quickly detect and list any conflicts, including ones
that can only occur on operating systems or configurations that
the translator is not currently using.
The changes have no immediate effect on the elinks executable or
the MO files. PO files become larger, however.
The scheme works like this:
- Like before, accelerator keys in translatable strings are
tagged with the tilde (~) character.
- Whenever a C source file defines an accelerator key, it must
assign one or more named "contexts" to it. The translations in
the PO files inherit these contexts. If multiple strings use
the same accelerator (case insensitive) in the same context,
that's a conflict and can be detected automatically.
- The contexts are defined with "gettext_accelerator_context"
comments in source files. These comments delimit regions where
all translatable strings containing tildes are given the same
contexts. There must be one special comment at the top of the
region; it lists the contexts assigned to that region. The
region automatically ends at the end of the function (found
with regexp /^\}/), but it can also be closed explicitly with
another special comment. The comments are formatted like this:
/* [gettext_accelerator_context(foo, bar, baz)]
begins a region that uses the contexts "foo", "bar", and "baz".
The comma is the delimiter; whitespace is optional.
[gettext_accelerator_context()]
ends the region. */
The scripts don't currently check whether this syntax occurs
inside or outside comments.
- The names of contexts consist of C identifiers delimited with
periods. I typically used the name of a function that sets
up a dialog, or the name of an array where the items of a
menu are listed. There is a special feature for static
functions: if the name begins with a period, then the period
will be replaced with the name of the source file and a colon.
- If a menu is programmatically generated from multiple parts,
of which some are never used together, so that it is safe to
use the same accelerators in them, then it is necessary to
define multiple contexts for the same menu. link_menu() in
src/viewer/text/link.c is the most complex example of this.
- During make update-po:
- A Perl script (po/gather-accelerator-contexts.pl) reads
po/elinks.pot, scans the source files listed in it for
"gettext_accelerator_context" comments, and rewrites
po/elinks.pot with "accelerator_context" comments that
indicate the contexts of each msgid: the union of all
contexts of all of its uses in the source files. It also
removes any "gettext_accelerator_context" comments that
xgettext --add-comments has copied to elinks.pot.
- If po/gather-accelerator-contexts.pl does not find any
contexts for some use of an msgid that seems to contain an
accelerator (because it contains a tilde), it warns. If the
tilde refers to e.g. "~/.elinks" and does not actually mark
an accelerator, the warning can be silenced by specifying the
special context "IGNORE", which the script otherwise ignores.
- msgmerge copies the "accelerator_context" comments from
po/elinks.pot to po/*.po. Translators do not edit those
comments.
- During make check-po:
- Another Perl script (po/check-accelerator-contexts.pl) reads
po/*.po and keeps track of which accelerators have been bound
in each context. It warns about any conflicts it finds.
This script does not access the C source files; thus it does
not matter if the line numbers in "#:" lines are out of date.
This implementation is not perfect and I am not proposing to
add it to the main source tree at this time. Specifically:
- It introduces compile-time dependencies on Perl and Locale::PO.
There should be a configure-time or compile-time check so that
the new features are skipped if the prerequisites are missing.
- When the scripts include msgstr strings in warnings, they
should transcode them from the charset of the PO file to the
one specified by the user's locale.
- It is not adequately documented (well, except perhaps here).
- po/check-accelerator-contexts.pl reports the same conflict
multiple times if it occurs in multiple contexts.
- The warning messages should include line numbers, so that users
of Emacs could conveniently edit the conflicting part of the PO
file. This is not feasible with the current version of
Locale::PO.
- Locale::PO does not understand #~ lines and spews warnings
about them. There is an ugly hack to hide these warnings.
- Jonas Fonseca suggested the script could propose accelerators
that are still available. This has not been implemented.
There are three files attached:
- po/gather-accelerator-contexts.pl: Augments elinks.pot with
context information.
- po/check-accelerator-contexts.pl: Checks conflicts.
- accelerator-contexts.diff: Makes po/Makefile run the scripts,
and adds special comments to source files.