When setting the title or URL of a bookmark from SMJS user scripting,
use update_bookmark() instead of writing directly to struct bookmark.
It triggers the bookmark-update event and sets the bookmarks_dirty
flag.
SpiderMonkey uses UTF-16 and the strings in struct bookmark are in
UTF-8. Previously, the conversions behaved as if the strings had been
in ISO-8859-1.
SpiderMonkey also supports JS_SetCStringsAreUTF8(), which would make
the existing functions convert between UTF-16 and UTF-8, but that
effect is global so I dare not enable it yet. Besides, I don't know
if that function works in all the SpiderMonkey versions that ELinks
claims to work with.
This also makes the bookmark-update event carry strings in UTF-8.
The only current consumer of that event is bookmark_change_hook(),
which ignores the strings, so no changes are needed there.
When the file is being read, Expat provides the strings to ELinks in
UTF-8, so ELinks can put them in struct bookmark without conversions.
Make sure gettext returns any placeholder strings in UTF-8, too.
Replace '\r' with ' ' in bookmark titles and URLs.
When the file is being written, put encoding="UTF-8" in the XML
declaration, and then write out the strings from struct bookmark
without character set conversions. Do replace some characters
with entity references though, by calling add_html_to_string().
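
The entity-reference step can be pictured with the following minimal
sketch. It is not add_html_to_string() itself: the function name, the
fixed-size output buffer and the exact set of escaped characters are
assumptions made for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy @src into @dst (capacity @dstsize bytes,
 * including the terminating NUL), replacing XML metacharacters with
 * entity references.  Bytes >= 0x80 are copied unchanged, so UTF-8
 * sequences pass through without any charset conversion.
 * Returns 0 on success, -1 if the result does not fit. */
static int
escape_xml_sketch(char *dst, size_t dstsize, const char *src)
{
    size_t used = 0;

    for (; *src; src++) {
        char one[2] = { *src, '\0' };
        const char *rep;
        size_t len;

        switch (*src) {
        case '&': rep = "&amp;"; break;
        case '<': rep = "&lt;"; break;
        case '>': rep = "&gt;"; break;
        case '"': rep = "&quot;"; break;
        default:  rep = one; break;
        }

        len = strlen(rep);
        if (used + len + 1 > dstsize)
            return -1;
        memcpy(dst + used, rep, len);
        used += len;
    }

    if (used + 1 > dstsize)
        return -1;
    dst[used] = '\0';
    return 0;
}
```

A caller would pass a title or URL taken from struct bookmark and write
the escaped result into the XBEL file.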
When ELinks is parsing an XML element from an XBEL bookmark file,
it collects the attributes of the element into the current_node->attrs
list. Previously, struct attributes had room for one string only:
the last element of current_node->attrs was the name of the first
attribute, and it was preceded by the value of the first attribute,
the name of the second attribute, the value of the second attribute,
and so on. However, when get_attribute_value() was looking for a
given name, it compared the values as well. So, if you had for
example <bookmark id="href" href="http://elinks.cz/">, then
get_attribute_value("href") would incorrectly return "href".
To fix this confusion, store values in the new member
attributes.value, rather than in attributes.name.
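
A minimal sketch of the fixed lookup, assuming an array instead of the
real linked list. The struct and function names below are hypothetical
stand-ins, not the ELinks definitions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the attribute record: the name and the
 * value live in separate members. */
struct attr_sketch {
    const char *name;
    const char *value;
};

/* Return the value of the attribute called @name, or NULL if there is
 * no such attribute.  Only names are compared, so a value that happens
 * to equal @name cannot be mistaken for an attribute name. */
static const char *
get_attr_value_sketch(const struct attr_sketch *attrs, size_t count,
                      const char *name)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(attrs[i].name, name) == 0)
            return attrs[i].value;
    }
    return NULL;
}
```

With the attributes of <bookmark id="href" href="http://elinks.cz/">,
looking up "href" in this sketch yields the URL rather than the value
of the id attribute.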
Change test/imgmap2.html so it can be used for testing this too.
Debian Iceweasel 3.0.4 does not appear to support such external
client-side image maps. Well, that's one place where ELinks is
superior, I guess. There might be a security problem though if ELinks
were to let scripts of the referring page examine the links in the
image map.
When ELinks runs in an X11 terminal emulator (e.g. xterm), or in GNU
Screen, it tries to update the title of the window to match the title
of the current document. To do this, ELinks sends an "OSC 1 ; Pt BEL"
sequence to the terminal. Unfortunately, xterm expects the Pt string
to be in the ISO-8859-1 charset, making it impossible to display e.g.
Cyrillic characters. In xterm patch #210 (2006-03-12) however, there
is a menu item and a resource that can make xterm take the Pt string
in UTF-8 instead, allowing characters from all around the world.
The downside is that ELinks apparently cannot ask xterm whether the
setting is on or off; so add a terminal._template_.latin1_title option
to ELinks and let the user edit that instead.
Complete list of changes:
- Add the terminal._template_.latin1_title option. But do not add
that to the terminal options window because it's already rather
crowded there.
- In set_window_title(), take a new codepage argument. Use it to
decode the title into Unicode characters, and remove only actual
control characters. For example, CP437 has graphical characters in
the 0x80...0x9F range, so don't remove those, even though ISO-8859-1
has control characters in the same range. Likewise, don't
misinterpret single bytes of UTF-8 characters as control characters.
- In set_window_title(), do not truncate the title to the width of the
window. The font is likely to be different and proportional anyway.
But do truncate before 1024 bytes, an xterm limit.
- In struct itrm, add a title_codepage member to remember which
charset the master said it was going to use in the terminal window
title. Initialize title_codepage in handle_trm(), update it in
dispatch_special() if the master sends the new request
TERM_FN_TITLE_CODEPAGE, and use it in most set_window_title() calls;
but not in the one that sets $TERM as the title, because that string
was not received from the master and should consist of ASCII
characters only.
- In set_terminal_title(), convert the caller-provided title to
ISO-8859-1 or UTF-8 if appropriate, and report the codepage to the
slave with the new TERM_FN_TITLE_CODEPAGE request. The conversion
can run out of memory, so return a success/error flag, rather than
void. In display_window_title(), check this result and don't update
caches on error.
- Add a NEWS entry for all of this.
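
The control-character rule from the list above can be sketched as
follows. This is a simplification that assumes the title is already
valid UTF-8, whereas set_window_title() takes a codepage argument; the
function name and the buffer-based interface are hypothetical.

```c
#include <stddef.h>

/* Copy the UTF-8 string @src into @dst, dropping C0 controls
 * (U+0000..U+001F), DEL (U+007F) and C1 controls (U+0080..U+009F).
 * The decision is made per character, not per byte, so bytes in the
 * 0x80..0x9F range that belong to other characters are kept.
 * Output stops before @dstsize bytes (including the NUL) and never
 * splits a character.  Assumes @src is valid UTF-8.
 * Returns the number of bytes written, excluding the NUL. */
static size_t
filter_title_sketch(char *dst, size_t dstsize, const char *src)
{
    const unsigned char *s = (const unsigned char *) src;
    size_t used = 0;

    if (dstsize == 0)
        return 0;

    while (*s) {
        size_t len, i;
        int control;

        if (*s < 0x80) len = 1;
        else if ((*s & 0xE0) == 0xC0) len = 2;
        else if ((*s & 0xF0) == 0xE0) len = 3;
        else len = 4;

        /* Stop if the input ends in the middle of a sequence. */
        for (i = 1; i < len && s[i] != '\0'; i++)
            ;
        if (i < len)
            break;

        /* C1 controls are encoded as 0xC2 followed by 0x80..0x9F. */
        control = (len == 1 && (*s < 0x20 || *s == 0x7F))
               || (len == 2 && s[0] == 0xC2 && s[1] < 0xA0);

        if (!control) {
            if (used + len + 1 > dstsize)
                break;
            for (i = 0; i < len; i++)
                dst[used++] = (char) s[i];
        }
        s += len;
    }

    dst[used] = '\0';
    return used;
}
```

Passing a 1024-byte buffer as @dst would also give the truncation
described above, without cutting a multibyte character in half.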
This simplifies the callers a little and may help implement
simultaneous support for different charsets on different terminals
of the same type (bug 1064).
look_for_link() used to return 0 both when it found the closing </MAP>
tag, and when it hit the end of the file. In the first case, it also
added *menu to the memory_list; in the second case, it did not. The
caller get_image_map() supposedly distinguished between these cases by
checking whether pos >= eof, and freed *menu separately if so.
However, if the </MAP> was at the very end of the HTML file, so that
not even a newline followed it, then look_for_link() left pos == eof
even though it had found the </MAP> and added *menu to the memory_list.
This made get_image_map() misinterpret the result and mem_free(*menu)
even though *menu had already been freed as part of the memory_list;
thus the crash.
To fix this, make look_for_link() return -1 instead of 0 if it hits
EOF without finding the </MAP>. Then make get_image_map() check the
return value instead of comparing pos to eof. And add a test case,
although not an automated one.
Alternatively, look_for_link() could have been changed to decrement
pos between finding the </MAP> and returning 0. Then, the pos >= eof
comparison in get_image_map() would have been false. That scheme
would however have been a bit more difficult to understand and
maintain, I think.
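
The return-value convention can be illustrated with a much simplified
sketch. The scanner below only looks for a literal, case-sensitive
"</MAP>" and does not parse links or touch any memory_list; both
function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for look_for_link(): scan [*pos, eof) for the
 * closing tag.  Returns 0 and moves *pos past the tag when it is
 * found, or -1 when the end of the buffer is reached first. */
static int
find_map_end_sketch(const char **pos, const char *eof)
{
    static const char tag[] = "</MAP>";
    const size_t taglen = sizeof(tag) - 1;
    const char *p;

    for (p = *pos; (size_t) (eof - p) >= taglen; p++) {
        if (memcmp(p, tag, taglen) == 0) {
            *pos = p + taglen; /* may now be equal to eof */
            return 0;
        }
    }

    *pos = eof;
    return -1;
}

/* Hypothetical caller: decide from the return value alone whether the
 * menu is still owned by the caller.  Returns 1 if the caller must
 * free it, 0 if ownership was handed over when the tag was found. */
static int
caller_must_free_sketch(const char *html, size_t len)
{
    const char *pos = html;

    return find_map_end_sketch(&pos, html + len) < 0;
}
```

In this sketch, a document that ends exactly at the closing tag leaves
pos == eof and still returns 0, so a pos >= eof test would give the
wrong answer where the return value does not.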
Reported by Paul B. Mahol.
(cherry picked from commit a2404407ce)
Before this patch, if you first moved the cursor to link X with
move-cursor-up and similar actions, and then clicked link Y with the
mouse, ELinks would activate link X, i.e. not the one you clicked.
This happened because the NAVIGATE_CURSOR_ROUTING mode was left
enabled and made ELinks ignore the doc_view->vs->current_link
member that ELinks had updated according to the click.
Make ELinks return the session to NAVIGATE_LINKWISE mode, so that
the update takes effect.
Reported by Paul B. Mahol.
(cherry picked from commit 4086418069)
This patch fixes an issue whereby a newline character appearing within
a hidden input field is incorrectly reinterpreted as a space character.
The patch handles almost all cases, and includes a test case.
15 of the 18 tests pass; the remaining three fail because ELinks does
not yet support textarea scripting.
c_strcasecmp and c_strncasecmp were taken from GNU coreutils 6.9,
which is copyrighted by the Free Software Foundation and licensed
under GNU GPL version 2 or later. It seems the programs in coreutils
do not normally read commands interactively. So, including coreutils
code in an interactive program such as ELinks could trigger GPLv2
section 2. c), which would require ELinks to display a copyright
notice and a warranty disclaimer each time it is started. Rewrite
those functions to remove the FSF-copyrighted code and make ELinks
not a work based on GNU coreutils.
Avoiding FSF code has the additional benefit that we won't have to ask
FSF for permission if we want to add a licence exception that allows
linking ELinks with OpenSSL. So it seems a good idea even if my
interpretation of GPLv2 2. c) is overly strict. I haven't checked
though whether there are other FSF-copyrighted portions in ELinks.
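
For illustration, a locale-independent comparison of this kind can be
written in a few lines. The sketch below is an independent example and
not necessarily the code that ELinks now contains; the names are
hypothetical and an ASCII execution character set is assumed.

```c
#include <stddef.h>

/* Lower-case only the ASCII letters A-Z.  Unlike tolower(), the result
 * does not depend on the current locale. */
static int
ascii_tolower_sketch(int c)
{
    return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Compare at most @n bytes of @a and @b, ignoring ASCII case only.
 * Returns a negative, zero or positive value like strncmp(). */
static int
ascii_strncasecmp_sketch(const char *a, const char *b, size_t n)
{
    for (; n > 0; a++, b++, n--) {
        int ca = ascii_tolower_sketch((unsigned char) *a);
        int cb = ascii_tolower_sketch((unsigned char) *b);

        if (ca != cb)
            return ca - cb;
        if (ca == '\0')
            break;
    }
    return 0;
}
```

Because only A-Z are folded, a Turkish locale cannot turn "i" into a
dotted capital, which is the property the Ctrl- key parsing relies on.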
src/config/kbdbind.c (parse_keystroke): If the user types "Ctrl-i",
it should mean "Ctrl-I" rather than "Ctrl-İ", because the Ctrl-
combinations are only well known for ASCII characters. This does not
matter in practice though, because src/terminal/kbd.c converts 0x09
to (KBD_MOD_NONE, KBD_TAB) and not to (KBD_MOD_CTRL, 'I').
src/osdep/beos/beos.c (get_system_env): Changing the locale does not
affect the TERM environment variable, I think, so it should not affect
the interpretation either.
Add #include directives to fix these errors:
[CC] src/intl/gettext/l10nflist.o
cc1: warnings being treated as errors
.../src/intl/gettext/l10nflist.c: In function ‘_nl_normalize_codeset’:
.../src/intl/gettext/l10nflist.c:352: error: implicit declaration of function ‘c_tolower’
[CC] src/dom/css/scanner.o
cc1: warnings being treated as errors
In file included from .../src/dom/scanner.h:4,
from .../src/dom/css/scanner.h:4,
from .../src/dom/css/scanner.c:12:
.../src/dom/string.h: In function ‘dom_string_casecmp’:
.../src/dom/string.h:34: error: implicit declaration of function ‘c_strncasecmp’
Bug 932 is about ELinks letting control characters 0x80...0x9F through
to the terminal. It did not occur with ISO 8859-1, 8859-2, 8859-15,
or 8859-16, because the ELinks mappings for those charsets did not
include those bytes. However, the www.unicode.org versions imported
in the previous commit do include the problematic bytes.
To avoid a possible regression before the ELinks 0.12.0 release,
comment those control-character mappings out again. This workaround
should be reverted after bug 932 has been fixed properly.
Add copyright and licence notices, and a NEWS entry.
The data in the new versions is not entirely the same as what ELinks
used to have:
- Unicode/8859_1.cp: Adds control characters.
- Unicode/8859_2.cp: Adds control characters.
- Unicode/8859_4.cp: Adds some control characters that ELinks assumed
there already.
- Unicode/8859_7.cp: Adds three characters.
- Unicode/8859_15.cp: Adds control characters.
- Unicode/8859_16.cp: Adds control characters and swaps 0xA5 with 0xAB.
- Unicode/koi8_r.cp: Changes 0x95 and adds some control characters
that ELinks assumed there already.
- Unicode/macroman.cp: Changes 0xC6 and removes some control characters
that ELinks assumes there anyway.
Call stacks reported by valgrind:
==14702== at 0x80DD791: read_from_socket (socket.c:945)
==14702== by 0x8104D0C: read_more_http_data (http.c:1180)
==14702== by 0x81052FE: read_http_data (http.c:1388)
==14702== by 0x80DD69B: read_select (socket.c:910)
==14702== by 0x80D27AA: select_loop (select.c:307)
==14702== by 0x80D1ADE: main (main.c:358)
==14702== Address 0x4F4E598 is 56 bytes inside a block of size 81 free'd
==14702== at 0x402210F: free (vg_replace_malloc.c:233)
==14702== by 0x812BED8: debug_mem_free (memdebug.c:484)
==14702== by 0x80D7C82: done_connection (connection.c:479)
==14702== by 0x80D8A44: abort_connection (connection.c:769)
==14702== by 0x80D99CE: cancel_download (connection.c:1053)
==14702== by 0x8110EB6: abort_download (download.c:143)
==14702== by 0x81115BC: download_data_store (download.c:337)
==14702== by 0x8111AFB: download_data (download.c:446)
==14702== by 0x80D7B33: notify_connection_callbacks (connection.c:458)
==14702== by 0x80D781E: set_connection_state (connection.c:388)
==14702== by 0x80D7132: set_connection_socket_state (connection.c:234)
==14702== by 0x80DD78D: read_from_socket (socket.c:943)
read_from_socket() attempted to read socket->fd in order to set
handlers on it, but the socket had already been freed. Incidentally,
socket->fd was -1, which would have resulted in an assertion failure
if valgrind hadn't caught the bug first.
To fix this, add a list of weak references to sockets.
read_from_socket() registers a weak reference on entry and unregisters
it before exit. done_socket() breaks any weak references to the
specified socket. read_from_socket() then checks whether the weak
reference was broken, and doesn't access the socket any more if so.
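
The weak-reference scheme might look roughly like the sketch below. All
type and function names are hypothetical, the list is a plain singly
linked list, and the two callbacks exist only to exercise the pattern.

```c
#include <stddef.h>

struct socket_sketch { int fd; };

/* A weak reference lives on the stack of the function that needs to
 * know whether the socket survived a callback. */
struct socket_weak_ref_sketch {
    struct socket_weak_ref_sketch *next;
    struct socket_sketch *socket; /* NULL once the socket is gone */
};

static struct socket_weak_ref_sketch *weak_refs_sketch;

static void
register_weak_ref_sketch(struct socket_weak_ref_sketch *ref,
                         struct socket_sketch *socket)
{
    ref->socket = socket;
    ref->next = weak_refs_sketch;
    weak_refs_sketch = ref;
}

static void
unregister_weak_ref_sketch(struct socket_weak_ref_sketch *ref)
{
    struct socket_weak_ref_sketch **p;

    for (p = &weak_refs_sketch; *p; p = &(*p)->next) {
        if (*p == ref) {
            *p = ref->next;
            return;
        }
    }
}

/* To be called when a socket is being destroyed: break every weak
 * reference that points to it. */
static void
break_weak_refs_sketch(struct socket_sketch *socket)
{
    struct socket_weak_ref_sketch *ref;

    for (ref = weak_refs_sketch; ref; ref = ref->next) {
        if (ref->socket == socket)
            ref->socket = NULL;
    }
}

/* Example callbacks: one leaves the socket alone, the other stands in
 * for a code path that ends up destroying it. */
static void
harmless_callback_sketch(struct socket_sketch *socket)
{
    (void) socket;
}

static void
destroying_callback_sketch(struct socket_sketch *socket)
{
    break_weak_refs_sketch(socket);
}

/* The pattern used by the reader: returns 1 if the socket may still be
 * accessed after the callback, 0 if it was destroyed meanwhile. */
static int
read_step_sketch(struct socket_sketch *socket,
                 void (*callback)(struct socket_sketch *))
{
    struct socket_weak_ref_sketch ref;
    int alive;

    register_weak_ref_sketch(&ref, socket);
    callback(socket);
    alive = (ref.socket != NULL);
    unregister_weak_ref_sketch(&ref);
    return alive;
}
```

Keeping the reference on the caller's stack means no allocation can
fail, at the cost of requiring every exit path to unregister it.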
This reverts src/{network,sched}/connection.c CVS revision 1.43,
which was made on 2003-07-03 and converted to Git commit
cae65f7941628109b51ffb2e2d05882fbbdc73ef in elinks-history.
It is pointless to check whether (c == d && c->id == d->id).
If c == d, then surely c->id == d->id, and I wouldn't be surprised
to see a compiler optimize that out.
Whereas, by taking the id as a parameter, connection_disappeared()
can check whether the pointer now points to a new struct connection
with a different id.
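
A minimal sketch of the check, assuming a simple singly linked list of
connections. The struct and function below are hypothetical and only
mirror the idea of passing the id separately from the pointer.

```c
#include <stddef.h>

struct connection_sketch {
    struct connection_sketch *next;
    unsigned int id;
};

/* Return 1 if the connection that the caller remembers is gone, i.e.
 * no live connection is located at @conn with the id @id.  The id is
 * passed separately because @conn may by now point to freed memory or
 * to a new connection that reuses the same address. */
static int
connection_disappeared_sketch(const struct connection_sketch *list,
                              const struct connection_sketch *conn,
                              unsigned int id)
{
    const struct connection_sketch *c;

    for (c = list; c; c = c->next) {
        if (c == conn && c->id == id)
            return 0;
    }
    return 1;
}
```

Here the comparison c->id == id is meaningful, because id is the value
the caller saved earlier rather than one read back through the pointer.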
ELinks attempted to display a message box on file_download.term, but
it had already closed that terminal and freed the struct terminal. To
fix this, reset file_download.term pointers to NULL when the terminal
is about to be destroyed. Also, assert in download_data_store() that
file_download.term is either NULL or in the global "terminals" list.
Reported by أحمد المحمودي.
(cherry picked from commit 6e2476ea4d)
never_for_this_site(form) used the statement
"return remember_form(form);". In ELinks 0.11.0, both functions
returned int, so this was OK. In commit 2b7788614f however, the
functions were changed to return void, as required by msg_box().
GCC still accepted the return statement but Sun Studio 11 did not.
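
Standard C does not allow a return statement with an expression in a
function whose return type is void, even if the expression itself has
type void. The portable form is sketched below; the struct and the
function names are hypothetical stand-ins for the real ones.

```c
/* Hypothetical stand-in for the form data. */
struct form_sketch {
    int never;
    int remembered;
};

static void
remember_form_sketch(struct form_sketch *form)
{
    form->remembered = 1;
}

static void
never_for_this_site_sketch(struct form_sketch *form)
{
    form->never = 1;

    /* Not portable C, although some compilers accept it:
     *     return remember_form_sketch(form);
     * Call the function as a plain statement instead. */
    remember_form_sketch(form);
}
```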
The GNU Hurd has a bug that can make select() report an exception in a
pipe even though none has actually occurred. The typical result is
that ELinks closes the pipe through which it internally passes all
input events, such as keypresses. It then no longer reacts to what
the user is trying to do.
Work around the Hurd bug by making set_handlers() check whether the
file descriptor refers to a pipe, and if so, pretend the caller did
not provide any handler for exceptions. This is a minimal change that
avoids slowing down the select() loop itself and does not require
careful analysis of the callers to statically find out which file
descriptors might refer to pipes. The extra stat() calls may slow
ELinks down somewhat, but anyway it'll work better than it did without
the patch, and if the Hurd bug is ever fixed, we can remove the
workaround at that time.
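
The pipe test could be sketched like this. It is an illustration, not
the actual set_handlers() change: the helper names and the handler
type are hypothetical, and fstat() is assumed as the way to examine
the descriptor.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return 1 if @fd refers to a pipe (FIFO), 0 otherwise or on error. */
static int
fd_is_pipe_sketch(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return 0;
    return S_ISFIFO(st.st_mode) ? 1 : 0;
}

typedef void (*handler_sketch)(void *);

/* Example handler, present only so that the sketch can be exercised. */
static void
example_handler_sketch(void *data)
{
    (void) data;
}

/* Pretend that the caller did not provide an exception handler when
 * the descriptor is a pipe; otherwise keep the handler as given. */
static handler_sketch
effective_error_handler_sketch(int fd, handler_sketch error_func)
{
    return fd_is_pipe_sketch(fd) ? NULL : error_func;
}
```

Doing the check when handlers are installed, rather than inside the
select() loop, matches the goal of not slowing the loop itself down.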
Previously, spidermonkey_get_interpreter() and init_smjs() each called
JS_SetErrorReporter on the JSContexts they created. However,
JS_SetErrorReporter actually sets the error reporter of the JSRuntime
associated with the JSContext, and all of our JSContexts use the same
JSRuntime nowadays, so only the error_reporter() of
src/ecmascript/spidermonkey.c was left installed. Because this
error_reporter() asserts that JS_GetContextPrivate(ctx) returns a
non-NULL pointer, and init_smjs() does not set a private pointer for
smjs_ctx, any error in smjs_ctx could cause an assertion failure, at
least in principle.
Fix this by making spidermonkey_runtime_addref() install a shared
error_reporter() when it creates the JSRuntime and the first JSContext.
The shared error_reporter() then checks the JSContext and calls the
appropriate function.
The two error reporters are quite similar to each other. In the
future, we could move the common code into shared functions. I'm not
doing that yet though, because fixing the bug doesn't require it.
make_bittorrent_peer_connection() used to construct a struct uri on
the stack. This was hacky but worked nicely because the struct uri
was not really accessed after make_connection() returned. However,
since commit a83ff1f565, the struct uri
is also needed when the connection is being closed. Valgrind shows:
Invalid read of size 2
at 0x8100764: get_blacklist_entry (blacklist.c:33)
by 0x8100985: del_blacklist_entry (blacklist.c:64)
by 0x80DA579: complete_connect_socket (socket.c:448)
by 0x80DA84A: connected (socket.c:513)
by 0x80D0DDF: select_loop (select.c:297)
by 0x80D00C6: main (main.c:353)
Address 0xBEC3BFAE is just below the stack ptr. To suppress, use: --workaround-gcc296-bugs=yes
To fix this, allocate the struct uri on the heap instead, by
constructing a string and giving that to get_uri(). This string
cannot use the "bittorrent" URI scheme because parse_uri() does not
recognize the host and port fields in that. (The "bittorrent" scheme
has protocol_backend.free_syntax = 1 in order to support strings like
"bittorrent:http://beta.legaltorrents.com/get/159-noisome-beasts".)
Instead, define a new "bittorrent-peer" URI scheme for this purpose.
If the user attempts to use this URI scheme, its handler aborts the
connection with an error; but when make_bittorrent_peer_connection()
uses a bittorrent-peer URI, the handler is not called.
This change also lets get_uri() set the ipv6 flag if peer_info->ip is
an IPv6 address literal.
Reported by Witold Filipczyk.
Separate the formatting of unparsed lines from ftp_process_dirlist()
into a new function ftp_add_unparsed_line(). Check for all possible
out-of-memory errors. Encode HTML metacharacters as entity references
and document how charsets are handled in FTP directory listings.
Add a NEWS entry.
cache_entry.id => cache_entry.cache_id
document.id => document.cache_id
ecmascript_interpreter.onload_snippets_owner => .onload_snippets_cache_id
This is a combination of:
commit 232c07aa7f
bug 1009: id variables renamed, added document_id to the document.
commit 6007043458bf8f14abfc18b9db60785bdc0279f6
Revert addition of document.document_id
fsp_open_session() has a bug where it does not set errno if getaddrinfo fails.
Before the bug 1013 fix, this caused an assertion failure.
After the bug 1013 fix, this caused a "Success" error message.
Now it instead causes "FSP server not found".
Replace almost all uses of enum connection_state with struct
connection_status. This removes the assumption that errno values used
by the system are between 0 and 100000. The GNU Hurd uses values like
ENOENT = 0x40000002 and EMIG_SERVER_DIED = -308.
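
One way to picture the new representation is a struct that keeps the
ELinks-specific state and the system error number in separate members.
The sketch below is an assumption about the general shape only; the
enumerators, member names and helper functions are all hypothetical.

```c
/* Hypothetical ELinks-specific states.  The last one says that the
 * real information is in the syserr member. */
enum connection_basic_sketch {
    BASIC_OK_SKETCH,
    BASIC_INTERRUPTED_SKETCH,
    BASIC_ERRNO_SKETCH
};

struct connection_status_sketch {
    enum connection_basic_sketch basic;
    int syserr; /* errno value; meaningful only with BASIC_ERRNO_SKETCH */
};

static struct connection_status_sketch
connection_state_sketch(enum connection_basic_sketch basic)
{
    struct connection_status_sketch status = { BASIC_OK_SKETCH, 0 };

    status.basic = basic;
    return status;
}

static struct connection_status_sketch
connection_state_for_errno_sketch(int err)
{
    struct connection_status_sketch status = { BASIC_ERRNO_SKETCH, 0 };

    status.syserr = err;
    return status;
}

static int
is_system_error_sketch(struct connection_status_sketch status)
{
    return status.basic == BASIC_ERRNO_SKETCH;
}
```

Since the errno value is stored as is, large or negative values such
as those used by the GNU Hurd need no special range.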
This commit is derived from my attachments 450 and 467 to bug 1013.
It seems GnuTLS is not as good at negotiating a supported protocol as
OpenSSL is. ELinks tries to work around that by retrying with a
different protocol if the SSL library reports an error. However,
ELinks must not automatically retry POST requests where some data may
have already reached the server; POST is not a safe method in HTTP.
So instead, collect the name of the TLS-incapable server in a blacklist
when ELinks e.g. loads an HTML form from it; the actual POST can then
immediately use the protocol that worked.
It's a bit ugly that src/network/socket.c now uses
protocol/http/blacklist.h. It might be better to move the blacklist
files out of the http directory, and perhaps merge them with the
BitTorrent blacklisting code.
Check in refresh_view() whether the tab is still current; if not, skip
the draw_doc() and draw_frames() calls because draw_current_link()
called within them asserts that the tab is current. However, do
always call print_screen_status(), because that handles non-current
tabs correctly too.
I think it was not yet possible to trigger the assertion failure with
setTimeout, because input.value modifications by ECMAScript do not
trigger a redraw (bug 1035).