On 2008-09-05, it was reported to elinks-dev that ELinks hits an
internal error (bad alloc_header) when given a specific HTML file.
On 2008-09-09, out-of-range values of document->comb_x and
document->comb_y were noted as the cause of memory corruption.
Update those variables when splitting, aligning, or justifying a line.
Add many assertions to detect the bug if it occurs again.
Previously, the character at (document.comb_x, document.comb_y) was
accessed via the POS macro, which adds part.box.x and part.box.y to
the coordinates. However, if document.comb and document.y are set
at the end of one part and read at the beginning of another, then
the struct screen_char used by the original part should be updated,
even though the new part has a different box. Change comb_{x,y} to
be relative to the document, rather than to the box of a single part.
In document.forms, each struct form has form_num and form_end members
that reserve a subrange of [0, INT_MAX] to that form. Previously,
multiple forms in the list could have form_end == INT_MAX and thus
overlap each other. Prevent that by adjusting form_end of each form
newly added to the list.
Revert 438f039bda,
"check_html_form_hierarchy: Old code was buggy.", which made
check_html_form_hierarchy attach controls to the wrong forms.
Instead, construct the dummy form ("for those Flying Dutchmans") at
form_num == 0 always before adding any real forms to the list.
This prevents the assertion failure by ensuring that every possible
form_control.position is covered by some form, if there are any forms.
Add a function assert_forms_list_ok, which checks that the set of
forms actually covers the [0, INT_MAX] range without overlapping,
as intended. Call that from check_html_form_hierarchy to detect
any corruption.
I have tested this code (before any cherry-picking) with:
- bug 613 attachment 210: didn't crash
- bug 714 attachment 471: didn't crash
- bug 961 attachment 382: didn't crash
- bug 698 attachment 239: all the submit buttons showed the right URLs
- bug 698 attachment 470: the submit button showed the right URL
(cherry picked from commit 386a5d517b)
In document.forms, each struct form has form_num and form_end members
that reserve a subrange of [0, INT_MAX] to that form. Previously,
multiple forms in the list could have form_end == INT_MAX and thus
overlap each other. Prevent that by adjusting form_end of each form
newly added to the list.
Revert 438f039bda,
"check_html_form_hierarchy: Old code was buggy.", which made
check_html_form_hierarchy attach controls to the wrong forms.
Instead, construct the dummy form ("for those Flying Dutchmans") at
form_num == 0 always before adding any real forms to the list.
This prevents the assertion failure by ensuring that every possible
form_control.position is covered by some form, if there are any forms.
Add a function assert_forms_list_ok, which checks that the set of
forms actually covers the [0, INT_MAX] range without overlapping,
as intended. Call that from check_html_form_hierarchy to detect
any corruption.
I have tested this code (before any cherry-picking) with:
- bug 613 attachment 210: didn't crash
- bug 714 attachment 471: didn't crash
- bug 961 attachment 382: didn't crash
- bug 698 attachment 239: all the submit buttons showed the right URLs
- bug 698 attachment 470: the submit button showed the right URL
The previous code displayed the wrong attributes if the combining
characters were at the end of an HTML link. For example:
<a href="#">trickỹ</a> more text <a href="#">second link</a>
(The characters in the first A element are "tricky" and U+0303
COMBINING TILDE.)
Here, when the cursor was not at the first link, ELinks displayed
the y-with-tilde cell as if it were not part of the link.
This happened because ELinks had already changed schar->attr
before set_line saw the space character after the link and
flushed document->combi[].
Combining characters requires a UTF-8 locale.
It slows down rendering. There is still the unresolved issue with
combining characters at the end of a document.
This patch wasn't heavilly tested. Especially a "garbage" input may cause
unpredictable results.
Before, *_html_parser_state() operated with struct html_element *. Now, it is
transparent for the renderer (just void *), so that DOM won't have to provide
this struct but will be able to use something internal.
Backported from master.
Before, *_html_parser_state() operated with struct html_element *. Now, it is
transparent for the renderer (just void *), so that DOM won't have to provide
this struct but will be able to use something internal.
Currently, all DOM, HTML and plain renderers had their own routine for
conversion from text style to screen attribute. This moves text_style and
text_style_format from html/parser.h to renderer.h and introduces new generic
routine get_screen_chracter_template() that is used by all the specific
rendering engines.
This fixes ELinks crashing on this with terminal width e.g. 103:
<p align="justify">
xxxx xxxx xxxx xxxxx xxxxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx
xxxxx xxxx xxxxxxx xxx xxxxxxx xxx xxxx xxxx xxxx xx xxxx x xxx xxxx
xxxx xx xxxx xxxx xxxx xxx—xxx xxxx xx—xxx xxxx<em> </em>x
xxxx </p>
This test was removed for an unknown reason in commit
b1cc717789.
Discovered together with Miciah.
All the needed memory has been allocated before the loop so we can use
copy_screen_chars() directly. This avoids the assertion failure in
copy_chars() for width==0 and should be a bit faster too. According
to ISO/IEC 9899:1999 7.21.1p2, memcpy() doesn't copy anything if n==0
(but the pointers must be valid).
(original 'git cherry-pick' arguments: cherry-pick bug968-att394)
There were conflicts in src/document/css/ because 0.12.GIT switched
to LIST_OF(struct css_selector) and 0.13.GIT switched to struct
css_selector_set. Resolved by using LIST_OF(struct css_selector)
inside struct css_selector_set.
options->cp is still used for this in seven places where html_context
is not easily available. Those should eventually be corrected too,
but I'm checking this change in already because it's better than what
we had before.
Previously, html_special_form_control converted
form_control.default_value to the terminal charset, and init_form_state
then copied the value to form_state.value. However, when CONFIG_UTF8
is defined and UTF-8 I/O is enabled, form_state.value is supposed to
be in UTF-8, rather than in the terminal charset.
This mismatch could not be conveniently fixed in
html_special_form_control because that does not know which terminal is
being used and whether UTF-8 I/O is enabled there. Also, constructing
a conversion table from the document charset to form_state.value could
have ruined renderer_context.convert_table, because src/intl/charsets.c
does not support multiple concurrent conversion tables.
So instead, we now keep form_control.default_value in the document
charset, and convert it in the viewer each time it is needed. Because
the result of the conversion is kept in form_state.value between
incremental renderings, this shouldn't even slow things down too much.
I am not implementing the proper charset conversions for the DOM
defaultValue property yet, because the current code doesn't have
them for other string properties either, and bug 805 is already open
for that.