In uri.post, each file name begins and ends with FILE_CHAR.
Previously, file names were not encoded, and names containing
FILE_CHAR could not be used. Because FILE_CHAR is a control
character, the user cannot directly type it in a file input field,
so ELinks asserted that the field did not contain FILE_CHAR.
However, it is possible to get FILE_CHAR in a file input field
with file name completion (ACT_EDIT_AUTO_COMPLETE), causing the
assertion to fail. Now, ELinks encodes FILE_CHAR as "%02", so it
is no longer ambiguous and the assertion is not needed.
gcc-4.3 -O2 was complaining that http_got_header may use uninitialized
version.major and version.minor. That indeed happened with HTTP/0.9
servers, and the PRE_HTTP_1_1(version) check then had an undefined
result, so http->close could remain 0 even though it should have
become 1; fortunately, it was then set to 1 anyway, because there was
no Content-Length header. The undefined version was also saved in
http->recv_version, but it appears nothing ever reads that. So in the
end, the bug did not cause any symptoms at runtime, but the warning
broke the build on gcc-4.3 if ELinks was configured with --enable-debug.
I am reverting all /dev/fd recognition because of bug 917.
This reverts commit c283f8cfd9,
except src/protocol/file/file.c still needs #include "osdep/osdep.h"
for STRING_DIR_SEP.
I am reverting all copiousoutput support because of bug 917.
This reverts commit 4dc4ea47f2.
Conflicts:
src/network/connection.h: After the original commit, the declaration
of copiousoutput_data had been changed to use the LIST_OF macro.
Also, connection.cgi had been added next to the connection.popen
member added by the original commit.
src/session/download.c: After the original commit, the definition of
copiousoutput_data had been changed to use the INIT_LIST_OF macro.
If the user opens the same file again after it is in the cache, then
ELinks does not always open a new connection, so download->conn can be
NULL in init_type_query(), and download->conn->cgi would crash.
Don't read that, then; instead add a new flag cache_entry.cgi, which
http_got_header() sets or clears as soon as possible after the cache
entry has been created.
(cherry picked from commit 81f8ee1fa2)
CGI scripts are distinguishable from normal files. I hope that this
fixes the bug 991. This commit also reverts the previous revert.
(cherry picked from commit 7ceba1e461)
The comment said "it is not possible to call kill_timer from a timer
handler." Sure, such calls used to crash occasionally, but that was
bug 868 and has already been fixed.
Previously, each progress timer function registered with
start_update_progress() was directly used as the timer function of
progress.timer, so it was responsible of erasing the expired timer ID
from that member. Failing to do this could result in heap corruption.
The progress timer functions normally fulfilled the requirement by
calling update_progress(), but one such function upload_stat_timer()
had to erase the timer ID on its own too.
Now instead, there is a wrapper function progress_timeout(), which
progress.c sets as the timer function of progress.timer. This wrapper
erases the expired timer ID from progress.timer and then calls the
progress timer function registered with start_update_progress(). So
the progress timer function is no longer responsible of erasing the
timer ID and there's no risk that it could fail to do that in some
error situation.
This commit introduces a new risk though. Previously, if the struct
progress was freed while the timer was running, the (progress) timer
function would still be called, and it would be able to detect that
the progress pointer is NULL and recover from this situation. Now,
the timer function progress_timeout() has a pointer to the struct
progress and will dereference that pointer without being able to check
whether the structure has been freed. Fortunately, done_progress()
asserts that the timer is not running, so this should not occur.
Posting a 91762123-byte file to test/cgi/big_file.cgi. The CPU
percentages are from "top" set up to update every 10 seconds and
checked near the end of the transfer, so they are less accurate
than the upload rate, which averages over the whole transfer.
buffer=4096: average 1.7 MiB/s, elinks 62% CPU, python 35% CPU.
buffer=8192: average 2.5 MiB/s, elinks 49% CPU, python 42% CPU.
buffer=16384: average 3.1 MiB/s, elinks 40% CPU, python 55% CPU.
buffer=32768: average 3.8 MiB/s, elinks 33% CPU, python 61% CPU.
buffer=65536: average 4.1 MiB/s, elinks 26% CPU, python 70% CPU.
buffer=131072: average 4.2 MiB/s, elinks 28% CPU, python 67% CPU.
buffer=262144: average 4.4 MiB/s, elinks 26% CPU, python 69% CPU.
I'm choosing 32768 as POST_BUFFER_SIZE because the advantages of
larger buffers don't seem very high and keeping this under 65536
may help anyone trying to port ELinks to DOS.
I'm using the same value for HTTP too, just to keep things consistent
until there is a reason to diverge.
Without this patch, ELinks showed garbage at
<http://www.dwheeler.com/oss_fs_why.html> when bzip2 decompression was
enabled. safe_read() in bzip2_read() did not see all of the body
bytes that ELinks had received from the server. After bzip2_read()
received EAGAIN from safe_read() and returned 0, something skipped
1460 bytes.
decompress_data() apparently assumed that read_encoded() returning 0
meant the end of the file, and returned even though len still was
nonzero, i.e. it had not yet written to the pipe all the data that
the caller (read_chunked_http_data() or read_normal_http_data()) had
provided. The caller did not know this, and discarded the data.
(cherry picked from commit 7e5e05ca60)
Without this patch, ELinks showed garbage at
<http://www.dwheeler.com/oss_fs_why.html> when bzip2 decompression was
enabled. safe_read() in bzip2_read() did not see all of the body
bytes that ELinks had received from the server. After bzip2_read()
received EAGAIN from safe_read() and returned 0, something skipped
1460 bytes.
decompress_data() apparently assumed that read_encoded() returning 0
meant the end of the file, and returned even though len still was
nonzero, i.e. it had not yet written to the pipe all the data that
the caller (read_chunked_http_data() or read_normal_http_data()) had
provided. The caller did not know this, and discarded the data.
Move connection.post_fd to http_post.post_fd.
Make connection.done point to the new done_http_connection(),
which calls the new done_http_post(), which closes post_fd.
So done_connection() no longer needs to do that.
Now that done_http_post() exists, a later commit can add dynamically
allocated data in struct http_post and ensure that it will be freed.
As the comment near the end of this function says, conn->info is
already non-NULL if a HTTPS proxy is being used, and the code in fact
correctly frees the previous info. So there is no need to assert its
nonexistence. I added that bug on 2008-05-22, in commit 291a913d1e.
If ELinks is being linked with SSL library, use its random number
generator.
Otherwise, try /dev/urandom and /dev/prandom. If they do not work,
fall back to rand(), calling srand() only once. This fallback is
mostly interesting for the Hurd and Microsoft Windows.
BitTorrent piece selection and dom/test/html-mangle.c still use rand()
(but not srand()) directly. Those would not benefit from being
unpredictable, I think.
To reduce code duplication, src/protocol/file/cgi.c no longer parses
connection->uri->post on its own but rather calls the new function
http_read_post_data(), provided by src/protocol/http/http.c. The same
code is now also used for POST requests that do not include files.
Conflicts:
NEWS (bug 939 was listed twice)
doc/man/man5/elinks.conf.5 (regenerated)
po/fr.po (only in comments and such)
po/pl.po (only in comments and such)
src/protocol/fsp/fsp.c (the relevant changes were already here)
*fresult pointed to nowhere. On FreeBSD *fresult == NULL
and directories weren't displayed.
Check also if safe_write writes all data.
(cherry picked from commit 06bcc48487)
If the user opens the same file again after it is in the cache, then
ELinks does not always open a new connection, so download->conn can be
NULL in init_type_query(), and download->conn->cgi would crash.
Don't read that, then; instead add a new flag cache_entry.cgi, which
http_got_header() sets or clears as soon as possible after the cache
entry has been created.
CGI scripts are distinguishable from normal files. I hope that this
fixes the bug 991. This commit also reverts the previous revert.
(cherry picked from commit 7ceba1e461)
libsmbclient's stdout and stderr interferred with ELinks's stdout
and stdin. That caused an assertion failure. Now the ELinks uses
different streams for processing of the smb protocol.
This reverts commit 7ceba1e461,
which is causing an assertion to fail if I open the same PDF
twice in a row, even if I cancel the dialog box when ELinks
first asks which program to run:
INTERNAL ERROR at /home/Kalle/src/elinks-0.12/src/session/download.c:980: assertion download && download->conn failed!
Forcing core dump! Man the Lifeboats! Women and children first!
But please DO NOT report this as a segfault!!! It is an internal error, not a
normal segfault, there is a huge difference in these for us the developers.
Also, noting the EXACT error you got above is crucial for hunting the problem
down. Thanks, and please get in touch with us.
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1216698688 (LWP 17877)]
0xb7a02d76 in raise () from /lib/libc.so.6
(gdb) backtrace 6
at /home/Kalle/src/elinks-0.12/src/util/error.c:179
fmt=0x816984c "assertion download && download->conn failed!")
at /home/Kalle/src/elinks-0.12/src/util/error.c:122
cached=0x8253ca8) at /home/Kalle/src/elinks-0.12/src/session/download.c:980
cached=0x8253ca8, frame=0)
at /home/Kalle/src/elinks-0.12/src/session/download.c:1339
at /home/Kalle/src/elinks-0.12/src/session/task.c:493
(More stack frames follow...)
There is a fix available but I don't trust it yet.
This syncs some changes (ie. -> e.g. etc.) from elinks-0.12 or beyond.
I noticed them while updating the web pages, and apologize that I will
not spent the time to attribute it to the individual commits.
(cherry picked from commit 2bfc7b3724,
omitting generated files)
AFAIK, all bugs in it have been fixed. Some bugs may still be lurking
but they are more likely to get caught if compression is enabled.
I also replaced COMP_NOTE with static text because xgettext does not
support macros in the argument of N_.
The bug was reported by Paul B. Mahol on elinks-users. The example is
from the FTP site he provided:
ftp.freebsd.org/pub/FreeBSD/ISO-IMAGES-ia64/
Message-ID: <3a142e750802262008l6fd55be5v44207bc4479dd3fc@mail.gmail.com>
(cherry picked from commit c069403b75)
... so all the tests with responses stretching multiple lines are
actually tested in their entirety.
(cherry picked from commit aa9a847c00,
resolving a conflict due to the use of get_test_opt)
On AMD64 apparently, off_t is long but ELinks detected SIZEOF_OFF_T == 8
and defined OFF_T_FORMAT as "lld", which expects long long and so causes
GCC to warn about a mismatching format specifier. Because --enable-debug
adds -Werror to $CFLAGS, this warning breaks the build. When both
SIZEOF_LONG and SIZEOF_LONG_LONG are 8, ELinks cannot know which type
it should use.
To fix this, do not attempt to find a format specifier for off_t itself.
Instead cast all printed off_t values to a new typedef off_print_T that
is large enough, and replace OFF_T_FORMAT with OFF_PRINT_FORMAT which
is suitable for off_print_T altough not necessarily for off_t. ELinks
already had a similar scheme with time_print_T and TIME_PRINT_FORMAT.
Previously, struct string was used here. However,
bittorrent_fetch_callback does not initialize response.magic,
and parse_bittorrent_tracker_response changes response->source
to point to data that must not be freed. So the util/string.h
functions are not actually safe to use on these objects.
For this reason, it is safer to use a separate type.
The previous check (integer > (off_t) integer * 10) did not detect all
overflows. Examples with 32-bit off_t:
integer = 0x1C71C71D (0x100000000/9 rounded up);
integer * 10 = 0x11C71C722, wraps to 0x1C71C722 which is > integer.
integer = 0x73333333;
integer * 10 = 0x47FFFFFFE, wraps to 0x7FFFFFFE which is > integer.
Examples with 64-bit off_t:
integer = 0x1C71C71C71C71C72 (0x10000000000000000/9 rounded up);
integer * 10 = 0x11C71C71C71C71C74, wraps to 0x1C71C71C71C71C74
which is > integer.
integer = 0x7333333333333333;
integer * 10 = 0x47FFFFFFFFFFFFFFE, wraps to 0x7FFFFFFFFFFFFFFE
which is > integer.
It is unclear to me what effect an undetected overflow would actually
have from the user's viewpoint, so I'm not adding a NEWS entry.
(cherry picked from commit a25fd18e56)
The compression support in ELinks has always been buggy, with some large pages
failing to decompress and containing garbage at the end instead. However,
with the recent attempts to fix the compression support, it has been actually
made *so* buggy that not only these cases seem to occur more often, but in
some cases, the page is just silently chopped and no content visible; in other
cases, "Resource temporarily unavailable" is displayed. Etc.
The compression support got now to the point where it is so awfully unstable
that it is actively harmful to have it enabled by default. I've been burnt by
it several times already and once made a very serious error because of page
being chopped silently.
This change avoids linker warnings when building with Debian tcc
0.9.23-4 + patch from Debian bug 418360:
[LD] src/protocol/bittorrent/lib.o
bittorrent.o: 'BITTORRENT_NULL_ID' defined twice
common.o: 'BITTORRENT_NULL_ID' defined twice
connection.o: 'BITTORRENT_NULL_ID' defined twice
dialogs.o: 'BITTORRENT_NULL_ID' defined twice
peerconnect.o: 'BITTORRENT_NULL_ID' defined twice
peerwire.o: 'BITTORRENT_NULL_ID' defined twice
piececache.o: 'BITTORRENT_NULL_ID' defined twice
tracker.o: 'BITTORRENT_NULL_ID' defined twice
Add a boolean protocol flag which says whether "//" in the path
part of an URI can be safely substituted with "/". Be conservative
and enable it only for file://, ftp:// and nntp[s]://. Other
can be turned on later, if needed.
Generalizes the fix from 58b3b1e752.
This reverts commit 4f0aaa166e
and insert check for the "//" -> "/" change only to occur for
file:// URIs. This fixes the recent reports on broken handling
of relative file URIs starting with "..".
<http://www.wikipedia.org/w/wiki.phtml?search=sue%20lawley>
incorrectly redirects to
<http://en.wikipedia.org/w/wiki.phtml?search=sue%2520lawley>
which searches for "sue%20lawley" rather than "sue lawley".
By using en.wikipedia.org directly, we avoid the server bug.
Prompted by an elinks-users post on 2007-07-27.
I asked on #wikimedia-tech, and www.wikipedia.org does always
redirect to en.wikipedia.org; it does not guess any other
language based on headers or IP addresses or such. Also, the
redirection exists only for compatibility, and skipping it
avoids a few roundtrips to the server. So this change is good
even if the server is eventually fixed.
There were conflicts in src/document/css/ because 0.12.GIT switched
to LIST_OF(struct css_selector) and 0.13.GIT switched to struct
css_selector_set. Resolved by using LIST_OF(struct css_selector)
inside struct css_selector_set.
This patch changes normalize_uri() to no replace "//" with "/" in URIs. This
fixed this bug but will also lead to possibility that duplicate entries can
exist in ELinks' cache. ELinks might be able to detect in another way by
hashing the content or something.
[ From attachment 310 of bug 744. --KON ]
This change avoids linker warnings when building with Debian tcc
0.9.23-4 + patch from Debian bug 418360:
[LD] src/protocol/bittorrent/lib.o
bittorrent.o: 'BITTORRENT_NULL_ID' defined twice
common.o: 'BITTORRENT_NULL_ID' defined twice
connection.o: 'BITTORRENT_NULL_ID' defined twice
dialogs.o: 'BITTORRENT_NULL_ID' defined twice
peerconnect.o: 'BITTORRENT_NULL_ID' defined twice
peerwire.o: 'BITTORRENT_NULL_ID' defined twice
piececache.o: 'BITTORRENT_NULL_ID' defined twice
tracker.o: 'BITTORRENT_NULL_ID' defined twice
And reorder the characters in the string given to strcspn(), to match
their expected order in the URI. This is also how strcspn() is called
elsewhere in uri.c.
Use it for the actual I/O only. Previously, defining CONFIG_UTF8 and
enabling UTF-8 used to force many strings to the UTF-8 charset
regardless of the terminal charset option. Now, those strings always
follow the terminal charset. This fixes bug 914 which was caused
because _() returned strings in the terminal charset and functions
then assumed they were in UTF-8. This reduction in the effects of
UTF-8 I/O may also simplify future testing.
Give them a corresponding Content-Type header. This must go in
cached->head because cached->content_type is supposed to be just
type/subtype. It will also be deduced from cached->head, so don't set
it separately.