Separate the formatting of unparsed lines from ftp_process_dirlist()
to a new function ftp_add_unparsed_line(). Check for all possible
out-of-memory errors. Encode HTML metacharacters as entity references
and document how charsets are handled FTP directory listings.
Add a NEWS entry.
fsp_open_session() has a bug where it does not set errno if getaddrinfo fails.
Before the bug 1013 fix, this caused an assertion failure.
After the bug 1013 fix, this caused a "Success" error message.
Now it instead causes "FSP server not found".
Replace almost all uses of enum connection_state with struct
connection_status. This removes the assumption that errno values used
by the system are between 0 and 100000. The GNU Hurd uses values like
ENOENT = 0x40000002 and EMIG_SERVER_DIED = -308.
This commit is derived from my attachments 450 and 467 to bug 1013.
It seems GnuTLS is not as good at negotiating a supported protocol as
OpenSSL is. ELinks tries to work around that by retrying with a
different protocol if the SSL library reports an error. However,
ELinks must not automatically retry POST requests where some data may
have already reached the server; POST is not a safe method in HTTP.
So instead, collect the name of the TLS-incapable server in a blacklist
when ELinks e.g. loads an HTML form from it; the actual POST can then
immediately use the protocol that worked.
It's a bit ugly that src/network/socket.c now uses
protocol/http/blacklist.h. It might be better to move the blacklist
files out of the http directory, and perhaps merge them with the
BitTorrent blacklisting code.
Conflicts:
NEWS
configure.in
The following files also conflicted, but they had not been manually
edited in the elinks-0.12 branch after the previous merge, so I just
kept the 0.13.GIT versions:
doc/man/man1/elinks.1.in
doc/man/man5/elinks.conf.5
doc/man/man5/elinkskeys.5
po/fr.po
po/pl.po
In uri.post, each file name begins and ends with FILE_CHAR.
Previously, file names were not encoded, and names containing
FILE_CHAR could not be used. Because FILE_CHAR is a control
character, the user cannot directly type it in a file input field,
so ELinks asserted that the field did not contain FILE_CHAR.
However, it is possible to get FILE_CHAR in a file input field
with file name completion (ACT_EDIT_AUTO_COMPLETE), causing the
assertion to fail. Now, ELinks encodes FILE_CHAR as "%02", so it
is no longer ambiguous and the assertion is not needed.
gcc-4.3 -O2 was complaining that http_got_header may use uninitialized
version.major and version.minor. That indeed happened with HTTP/0.9
servers, and the PRE_HTTP_1_1(version) check then had an undefined
result, so http->close could remain 0 even though it should have
become 1; fortunately, it was then set to 1 anyway, because there was
no Content-Length header. The undefined version was also saved in
http->recv_version, but it appears nothing ever reads that. So in the
end, the bug did not cause any symptoms at runtime, but the warning
broke the build on gcc-4.3 if ELinks was configured with --enable-debug.
I am reverting all /dev/fd recognition because of bug 917.
This reverts commit c283f8cfd956c9d6332c9192a263700ebd3881b2,
except src/protocol/file/file.c still needs #include "osdep/osdep.h"
for STRING_DIR_SEP.
I am reverting all copiousoutput support because of bug 917.
This reverts commit 4dc4ea47f2d46490623305d79f46b4d8bdf670fc.
Conflicts:
src/network/connection.h: After the original commit, the declaration
of copiousoutput_data had been changed to use the LIST_OF macro.
Also, connection.cgi had been added next to the connection.popen
member added by the original commit.
src/session/download.c: After the original commit, the definition of
copiousoutput_data had been changed to use the INIT_LIST_OF macro.
If the user opens the same file again after it is in the cache, then
ELinks does not always open a new connection, so download->conn can be
NULL in init_type_query(), and download->conn->cgi would crash.
Don't read that, then; instead add a new flag cache_entry.cgi, which
http_got_header() sets or clears as soon as possible after the cache
entry has been created.
(cherry picked from commit 81f8ee1fa2b94ef40460eb4a42710e1ed9e22c98)
CGI scripts are distinguishable from normal files. I hope that this
fixes the bug 991. This commit also reverts the previous revert.
(cherry picked from commit 7ceba1e46131a96ee78ce999dfbe14e359d8cbcb)
The comment said "it is not possible to call kill_timer from a timer
handler." Sure, such calls used to crash occasionally, but that was
bug 868 and has already been fixed.
Previously, each progress timer function registered with
start_update_progress() was directly used as the timer function of
progress.timer, so it was responsible of erasing the expired timer ID
from that member. Failing to do this could result in heap corruption.
The progress timer functions normally fulfilled the requirement by
calling update_progress(), but one such function upload_stat_timer()
had to erase the timer ID on its own too.
Now instead, there is a wrapper function progress_timeout(), which
progress.c sets as the timer function of progress.timer. This wrapper
erases the expired timer ID from progress.timer and then calls the
progress timer function registered with start_update_progress(). So
the progress timer function is no longer responsible of erasing the
timer ID and there's no risk that it could fail to do that in some
error situation.
This commit introduces a new risk though. Previously, if the struct
progress was freed while the timer was running, the (progress) timer
function would still be called, and it would be able to detect that
the progress pointer is NULL and recover from this situation. Now,
the timer function progress_timeout() has a pointer to the struct
progress and will dereference that pointer without being able to check
whether the structure has been freed. Fortunately, done_progress()
asserts that the timer is not running, so this should not occur.
Posting a 91762123-byte file to test/cgi/big_file.cgi. The CPU
percentages are from "top" set up to update every 10 seconds and
checked near the end of the transfer, so they are less accurate
than the upload rate, which averages over the whole transfer.
buffer=4096: average 1.7 MiB/s, elinks 62% CPU, python 35% CPU.
buffer=8192: average 2.5 MiB/s, elinks 49% CPU, python 42% CPU.
buffer=16384: average 3.1 MiB/s, elinks 40% CPU, python 55% CPU.
buffer=32768: average 3.8 MiB/s, elinks 33% CPU, python 61% CPU.
buffer=65536: average 4.1 MiB/s, elinks 26% CPU, python 70% CPU.
buffer=131072: average 4.2 MiB/s, elinks 28% CPU, python 67% CPU.
buffer=262144: average 4.4 MiB/s, elinks 26% CPU, python 69% CPU.
I'm choosing 32768 as POST_BUFFER_SIZE because the advantages of
larger buffers don't seem very high and keeping this under 65536
may help anyone trying to port ELinks to DOS.
I'm using the same value for HTTP too, just to keep things consistent
until there is a reason to diverge.
Without this patch, ELinks showed garbage at
<http://www.dwheeler.com/oss_fs_why.html> when bzip2 decompression was
enabled. safe_read() in bzip2_read() did not see all of the body
bytes that ELinks had received from the server. After bzip2_read()
received EAGAIN from safe_read() and returned 0, something skipped
1460 bytes.
decompress_data() apparently assumed that read_encoded() returning 0
meant the end of the file, and returned even though len still was
nonzero, i.e. it had not yet written to the pipe all the data that
the caller (read_chunked_http_data() or read_normal_http_data()) had
provided. The caller did not know this, and discarded the data.
(cherry picked from commit 7e5e05ca608ee3abf1343a589857d1b8b1026984)
Without this patch, ELinks showed garbage at
<http://www.dwheeler.com/oss_fs_why.html> when bzip2 decompression was
enabled. safe_read() in bzip2_read() did not see all of the body
bytes that ELinks had received from the server. After bzip2_read()
received EAGAIN from safe_read() and returned 0, something skipped
1460 bytes.
decompress_data() apparently assumed that read_encoded() returning 0
meant the end of the file, and returned even though len still was
nonzero, i.e. it had not yet written to the pipe all the data that
the caller (read_chunked_http_data() or read_normal_http_data()) had
provided. The caller did not know this, and discarded the data.
Move connection.post_fd to http_post.post_fd.
Make connection.done point to the new done_http_connection(),
which calls the new done_http_post(), which closes post_fd.
So done_connection() no longer needs to do that.
Now that done_http_post() exists, a later commit can add dynamically
allocated data in struct http_post and ensure that it will be freed.
As the comment near the end of this function says, conn->info is
already non-NULL if a HTTPS proxy is being used, and the code in fact
correctly frees the previous info. So there is no need to assert its
nonexistence. I added that bug on 2008-05-22, in commit 291a913d1e.
If ELinks is being linked with SSL library, use its random number
generator.
Otherwise, try /dev/urandom and /dev/prandom. If they do not work,
fall back to rand(), calling srand() only once. This fallback is
mostly interesting for the Hurd and Microsoft Windows.
BitTorrent piece selection and dom/test/html-mangle.c still use rand()
(but not srand()) directly. Those would not benefit from being
unpredictable, I think.
To reduce code duplication, src/protocol/file/cgi.c no longer parses
connection->uri->post on its own but rather calls the new function
http_read_post_data(), provided by src/protocol/http/http.c. The same
code is now also used for POST requests that do not include files.