Posting a 91762123-byte file to test/cgi/big_file.cgi. The CPU
percentages are from "top" set up to update every 10 seconds and
checked near the end of the transfer, so they are less accurate
than the upload rate, which averages over the whole transfer.
buffer=4096: average 1.7 MiB/s, elinks 62% CPU, python 35% CPU.
buffer=8192: average 2.5 MiB/s, elinks 49% CPU, python 42% CPU.
buffer=16384: average 3.1 MiB/s, elinks 40% CPU, python 55% CPU.
buffer=32768: average 3.8 MiB/s, elinks 33% CPU, python 61% CPU.
buffer=65536: average 4.1 MiB/s, elinks 26% CPU, python 70% CPU.
buffer=131072: average 4.2 MiB/s, elinks 28% CPU, python 67% CPU.
buffer=262144: average 4.4 MiB/s, elinks 26% CPU, python 69% CPU.
I'm choosing 32768 as POST_BUFFER_SIZE because the advantages of
larger buffers don't seem very high and keeping this under 65536
may help anyone trying to port ELinks to DOS.
I'm using the same value for HTTP too, just to keep things consistent
until there is a reason to diverge.
Without this patch, ELinks showed garbage at
<http://www.dwheeler.com/oss_fs_why.html> when bzip2 decompression was
enabled. safe_read() in bzip2_read() did not see all of the body
bytes that ELinks had received from the server. After bzip2_read()
received EAGAIN from safe_read() and returned 0, something skipped
1460 bytes.
decompress_data() apparently assumed that read_encoded() returning 0
meant the end of the file, and returned even though len still was
nonzero, i.e. it had not yet written to the pipe all the data that
the caller (read_chunked_http_data() or read_normal_http_data()) had
provided. The caller did not know this, and discarded the data.
(cherry picked from commit 7e5e05ca60)
Without this patch, ELinks showed garbage at
<http://www.dwheeler.com/oss_fs_why.html> when bzip2 decompression was
enabled. safe_read() in bzip2_read() did not see all of the body
bytes that ELinks had received from the server. After bzip2_read()
received EAGAIN from safe_read() and returned 0, something skipped
1460 bytes.
decompress_data() apparently assumed that read_encoded() returning 0
meant the end of the file, and returned even though len still was
nonzero, i.e. it had not yet written to the pipe all the data that
the caller (read_chunked_http_data() or read_normal_http_data()) had
provided. The caller did not know this, and discarded the data.
Move connection.post_fd to http_post.post_fd.
Make connection.done point to the new done_http_connection(),
which calls the new done_http_post(), which closes post_fd.
So done_connection() no longer needs to do that.
Now that done_http_post() exists, a later commit can add dynamically
allocated data in struct http_post and ensure that it will be freed.
As the comment near the end of this function says, conn->info is
already non-NULL if a HTTPS proxy is being used, and the code in fact
correctly frees the previous info. So there is no need to assert its
nonexistence. I added that bug on 2008-05-22, in commit 291a913d1e.
If ELinks is being linked with SSL library, use its random number
generator.
Otherwise, try /dev/urandom and /dev/prandom. If they do not work,
fall back to rand(), calling srand() only once. This fallback is
mostly interesting for the Hurd and Microsoft Windows.
BitTorrent piece selection and dom/test/html-mangle.c still use rand()
(but not srand()) directly. Those would not benefit from being
unpredictable, I think.
To reduce code duplication, src/protocol/file/cgi.c no longer parses
connection->uri->post on its own but rather calls the new function
http_read_post_data(), provided by src/protocol/http/http.c. The same
code is now also used for POST requests that do not include files.
Previously, init_type_query would check to make sure that it doesn't create a
duplicate type query and would return NULL if it otherwise would create a
duplicate. Then setup_download_handler would return 0 to do_move.
This patch changes setup_download_handler to return 1 to do_move in this
situation so that do_move stops trying to load the document.
This avoid a crash when loading twice the same file in the same tab when
loading the file opens a type query.