Take a quick stroll through the uncharted corners of the DOM node data
structures:
- Remove unused struct dom_node_id_item.
- Make the document node reference a future struct dom_document.
- Describe ideas for node data, e.g. the entity reference node should use
it for storing the unicode_val_T.
This ensures that the 'writer' process will remove itself when the
main ELinks process for some reason decides to shut down the connection.
Previously, the 'writer' process would complete its task, taking up
unnecessary system resources.
This is mostly an issue when fetching big files. Therefore only file
fetching is fixed. A FIXME was added about also checking the return codes
of write() calls associated with directory listing.
Reported-by: zas
It should be included via elinks.h but apparently some other system header
can prevent this somehow on some systems.
Reported-by: Phillip Pi <ant@zimage.com>
It uses mangleme by Michal Zalewski <lcamtuf@coredump.cx> to generate HTML
which is then fed into the sgml-parser program. By default 100 random HTML
documents are tested, but the test script takes the number of documents
to test as an argument. Useful for torture testing the SGML parser.
It didn't check that both title and title->text were non-NULL. In either
case it now passes the "No title" string to add_bookmark().
Reported by Neuromancer.
Tested with both:
<bookmark href="empty://title"><title></title></bookmark>
<bookmark href="no://title"></bookmark>
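The check can be sketched roughly as follows. The struct text_node type and
the bookmark_title() helper are hypothetical stand-ins for illustration;
only the "No title" fallback comes from the description above.

```c
#include <stddef.h>

/* Hypothetical stand-in for the parser's title node type. */
struct text_node {
	const char *text;
};

/* Return the title text, or "No title" when either the title node or
 * its text is missing. */
static const char *
bookmark_title(const struct text_node *title)
{
	if (!title || !title->text)
		return "No title";

	return title->text;
}
```

The result would then be handed to add_bookmark() in place of the raw
title->text pointer.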
This was caused by the recent change to allocate strings during incremental
parsing, where the node string was set after insertion. A test for this is
in the works.
Fixes: b6b6d3c67e
Simplify commit 8d4f44f2f1, in particular
detecting MIME types for files. It is more consistent to do it the way
it was already done by the session/download code.
Instead, write a NUL byte to stderr when getting FSP files and only set
cache->content_type when the header string is non-empty.
Additionally, move close(stderr) after the fsp_error() call in the
file handling part of do_fsp() so the error message is shown with the
correct type.
Fixes problems with host or protocol parts not being lowercased. This
triggered an assertion failure when trying to download such links. Reported
by lindi-.
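As an illustration only, lowercasing the protocol and host parts in place
could look like the sketch below. The function name is hypothetical, and
the sketch assumes the simple form scheme://host[:port][/path] without a
user:password@ part; it is not the ELinks URI code.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: lowercase the scheme and host of a URI string in
 * place. Assumes scheme://host[:port][/path] with no user info. The
 * path, query and fragment are left untouched. */
static void
lowercase_scheme_and_host(char *uri)
{
	char *sep = strstr(uri, "://");
	char *p;

	if (!sep)
		return;

	/* Scheme: everything before "://". */
	for (p = uri; p < sep; p++)
		*p = (char) tolower((unsigned char) *p);

	/* Host (and port): up to the first '/', '?' or '#'. */
	for (p = sep + 3; *p && *p != '/' && *p != '?' && *p != '#'; p++)
		*p = (char) tolower((unsigned char) *p);
}
```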
Use enum connection_state instead of int in load_uri,
proxy_uri, get_proxy_worker, and get_proxy_uri. See commit
d18809522e. I hope that satisfies TCC.
This changes init_dom_node_() to take an allocated argument saying whether
to allocate or not. If the value is -1, node->allocated will be set to the
value of node->parent->allocated. This way the value is inherited like we
do it in the menu code. It should be a sane default since we eventually
want not to rely on the 'underlying' source of the document and there will
be fewer variables to pass around.
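The inheritance rule can be sketched as below. The trimmed-down struct node
and the init_node_allocated() name are hypothetical; the real
init_dom_node_() takes more arguments. Falling back to 0 for a node without
a parent is an assumption of this sketch, not something stated above.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down node; the real DOM node has more members. */
struct node {
	struct node *parent;
	int allocated;
};

/* Resolve the allocated argument: -1 means "inherit from the parent",
 * any other value is used as a plain boolean. */
static void
init_node_allocated(struct node *node, struct node *parent, int allocated)
{
	node->parent = parent;

	if (allocated == -1) {
		/* Assumption: a root node without a parent defaults to 0. */
		node->allocated = parent ? parent->allocated : 0;
	} else {
		node->allocated = !!allocated;
	}
}
```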
When doing incremental rendering we now require the whole thing to be there
and that there is room for two tokens in the scanner token table. This is
necessary because we have to generate both a processing target token and a
processing data token to make life simpler for the parser.
Remove processing instruction data case label from the main parser loop. It
is safer this way since it already assumes that the processing target token
has been stored.
Check whether there are '=' and value tokens before handling them. If there
is any doubt, the whole attribute structure is 'pushed back' into the
stream. That way incremental parsing will not add the value as a new
attribute because the name token was handled in the previous parsing run.
It is a loop that parses the same small document with various read sizes.
The sgml-parser program is updated to take a --stdin option with the read
size as a required parameter.
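The shape of such a test can be sketched in C as follows. Everything here is
hypothetical: struct sink merely stands in for an incremental parser and
only records what it is fed, so the sketch shows the chunking loop rather
than the real test script or the sgml-parser program.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical consumer standing in for an incremental parser: it only
 * appends each fragment to a buffer and counts the calls. */
struct sink {
	char data[64];
	size_t len;
	int calls;
};

static void
sink_feed(struct sink *sink, const char *buf, size_t len)
{
	if (sink->len + len > sizeof(sink->data))
		return;		/* Sketch only: drop what does not fit. */

	memcpy(sink->data + sink->len, buf, len);
	sink->len += len;
	sink->calls++;
}

/* Feed doc to the sink in fragments of at most read_size bytes. */
static void
feed_in_chunks(struct sink *sink, const char *doc, size_t read_size)
{
	size_t len = strlen(doc);
	size_t off;

	if (read_size == 0)
		return;

	for (off = 0; off < len; off += read_size) {
		size_t n = len - off < read_size ? len - off : read_size;

		sink_feed(sink, doc + off, n);
	}
}

/* Return 1 if every read size from 1 to strlen(doc) reproduces doc. */
static int
check_all_read_sizes(const char *doc)
{
	size_t len = strlen(doc);
	size_t size;

	for (size = 1; size <= len; size++) {
		struct sink sink = { {0}, 0, 0 };

		feed_in_chunks(&sink, doc, size);
		if (sink.len != len || memcmp(sink.data, doc, len))
			return 0;
	}

	return 1;
}
```

With a real parser in place of the sink, the comparison would be made on
the parser output for each read size rather than on the raw bytes.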
That is, add the last parts that save and resume previous incomplete
parsing states. Now the parsing stack push handler checks if the parent has
a resume flag set. When set, the incomplete fragment to resume is restored,
the new source fragment is appended, and parsing continues.