Before, *_html_parser_state() operated with struct html_element *. Now, it is
transparent for the renderer (just void *), so that DOM won't have to provide
this struct but will be able to use something internal.
Backported from master.
Previously, process_head immediately returned if there was no refresh, never giving the cache-control check further down a chance to run.
Also add new tests:
nocache.html
refresh+nocache.html
Simply search for 'url' marker ignoring anything
before it.
ELinks is now able to follow incorrectly written
meta refresh content attribute with missing ; before
url= parameter.
As an example, try http://akkada.tivi.net.pl/
Simply search for 'url' marker ignoring anything
before it.
ELinks is now able to follow incorrectly written
meta refresh content attribute with missing ; before
url= parameter.
As an example, try http://akkada.tivi.net.pl/
Previously, process_head immediately returned if there was no refresh, never giving the cache-control check further down a chance to run.
Also add new tests:
nocache.html
refresh+nocache.html
Before, *_html_parser_state() operated with struct html_element *. Now, it is
transparent for the renderer (just void *), so that DOM won't have to provide
this struct but will be able to use something internal.
...as struct text_style. This way it might be possible later to
add other default formatting attributes by CSS and it allows
quite a code simplification in the DOM renderer.
Currently, all DOM, HTML and plain renderers had their own routine for
conversion from text style to screen attribute. This moves text_style and
text_style_format from html/parser.h to renderer.h and introduces new generic
routine get_screen_chracter_template() that is used by all the specific
rendering engines.
This fixes ELinks crashing on this with terminal width e.g. 103:
<p align="justify">
xxxx xxxx xxxx xxxxx xxxxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx
xxxxx xxxx xxxxxxx xxx xxxxxxx xxx xxxx xxxx xxxx xx xxxx x xxx xxxx
xxxx xx xxxx xxxx xxxx xxx—xxx xxxx xx—xxx xxxx<em> </em>x
xxxx </p>
This test was removed for an unknown reason in commit
b1cc717789.
Discovered together with Miciah.
All the needed memory has been allocated before the loop so we can use
copy_screen_chars() directly. This avoids the assertion failure in
copy_chars() for width==0 and should be a bit faster too. According
to ISO/IEC 9899:1999 7.21.1p2, memcpy() doesn't copy anything if n==0
(but the pointers must be valid).
(original 'git cherry-pick' arguments: cherry-pick bug968-att394)
There were conflicts in src/document/css/ because 0.12.GIT switched
to LIST_OF(struct css_selector) and 0.13.GIT switched to struct
css_selector_set. Resolved by using LIST_OF(struct css_selector)
inside struct css_selector_set.
Because the renderer no longer does that.
The comment "We don't cope well with entities here" may now be
obsolete but I'm not sure about that so I'm leaving it in.
options->cp is still used for this in seven places where html_context
is not easily available. Those should eventually be corrected too,
but I'm checking this change in already because it's better than what
we had before.
Previously, html_special_form_control converted
form_control.default_value to the terminal charset, and init_form_state
then copied the value to form_state.value. However, when CONFIG_UTF8
is defined and UTF-8 I/O is enabled, form_state.value is supposed to
be in UTF-8, rather than in the terminal charset.
This mismatch could not be conveniently fixed in
html_special_form_control because that does not know which terminal is
being used and whether UTF-8 I/O is enabled there. Also, constructing
a conversion table from the document charset to form_state.value could
have ruined renderer_context.convert_table, because src/intl/charsets.c
does not support multiple concurrent conversion tables.
So instead, we now keep form_control.default_value in the document
charset, and convert it in the viewer each time it is needed. Because
the result of the conversion is kept in form_state.value between
incremental renderings, this shouldn't even slow things down too much.
I am not implementing the proper charset conversions for the DOM
defaultValue property yet, because the current code doesn't have
them for other string properties either, and bug 805 is already open
for that.
This does not yet fix bug 947 for the case where the document is UTF-8
and the terminal is ISO-8859-1. That will require changing charsets.c
too, it seems.
trim_chars was called only in debug mode and the results of the get_attr_val
for value=" something " in debug mode differ from normal and fastmem mode.
[ From commit c4500039b2 on the witekfl
branch. --KON ]
straconcat reads the args with va_arg(ap, const unsigned char *),
and the NULL macro may have the wrong type (e.g. int).
Many places pass string literals of type char * to straconcat. This
is in principle also a violation, but I'm ignoring it for now because
if it becomes a problem with some C implementation, then so will the
use of unsigned char * with printf "%s", which is so widespread in
ELinks that I'm not going to try fixing it now.
When tables were rendered first time html_format_part was called with
document==NULL. <meta http-equiv=Refresh.../> was inside a table,
so document was NULL. Second time the table knew its dimensions
and document was not NULL.
I do not fully understand this code, but I am sure skipping characters
like this is a bug, and correcting it seems to fix bug 826 (too small
table for double-cell characters). I don't see any similar bugs in
other parts of set_hline.
The patch is from bug 826, comment 4, attachment 308. The warning
there about unicode_to_cell(UCS_NO_CHAR) still applies but this patch
does not make the situation worse. I have logged a separate bug 901
about those calls.
The quote_level was decremented unconditionally and could become negative
resulting in a negative index after applying "modulus 2". Reproducable
with an HTML file contianing "</q>".
Reported by paakku.
Recognize all of 
 
 with any number of leading
zeroes. (Previously only and 
 were supported.) All of
these are case insensitive.
Treat each CR+LF combination ( ) as a single newline.
The configure script no longer recognizes "CONFIG_UTF_8=yes" lines
in custom features.conf files. They will have to be changed to
"CONFIG_UTF8=yes". This incompatibility was deemed acceptable
because no released version of ELinks supports CONFIG_UTF_8.
The --enable-utf-8 option was not renamed.
Suggested by Miciah on #elinks.
What was renamed:
add_utf_8 => add_utf8
cp2utf_8 => cp2utf8
encode_utf_8 => encode_utf8
get_translation_table_to_utf_8 => get_translation_table_to_utf8
goto invalid_utf_8_start_byte => goto invalid_utf8_start_byte
goto utf_8 => goto utf8
goto utf_8_select => goto utf8_select
terminal_interlink.utf_8 => terminal_interlink.utf8
utf_8_to_unicode => utf8_to_unicode
What was not renamed:
terminal._template_.utf_8_io option, TERM_OPT_UTF_8_IO
Compatibility with existing elinks.conf files would require an alias.
--enable-utf-8
Because the name of the charset is UTF-8, --enable-utf-8 looks better
than --enable-utf8.
CONFIG_UTF_8
Will be renamed in a later commit.
Unicode/utf_8.cp, table_utf_8, aliases_utf_8
Will be renamed in a later commit.
Drop some code for superscript and subscript handling that was deleted
in commit 65016cdca4, then added back
with the UTF-8 merge in commit 2a6125e3d0,
and then disabled in commit 1b653b9765.
Its return value is now an enum that lets callers know whether an
error occurred. However, this commit changes the callers only
minimally, so they do not yet check the return value.
In html_subscript, html_subscript_close, html_superscript, html_quote, and
html_quote_close, use put_chrs instead of html_context->put_chars_f.
Element handlers should use put_chrs so that it can correctly handle
whitespace and stuff.
Introduce html_subscript_close callback. Draw opening and closing brackets
and carets for subscript and superscript text directly in the element
handlers rather than performing weirdness in the renderer. This both
improves readability and fixes bug 284, misplaced brackets with subscripts.
Add close callbacks html_html_close, html_style_close, and
html_xmp_close. end_element now calls the element close callback instead
of performing special handling for certain tags.
Note: there are ugly bug (feature?) - when there isn't enought room for
whole double-width char two double-chars are displayed. Can be seen on
table with double-width chars reduced as much as possible.
This patch modifies ELinks wrapping behavior slightly.
* The wrap command now toggles line wrapping in HTML mode, as well as
text mode. Note that when the HTML view of a page is wrapped, its
source view is unwrapped, and vice versa.
* Tabs in text-mode lines are now handled correctly.
* Wrapping a line that reaches exactly to the edge of the screen will
no longer produce a blank line in text mode.
* Text within extra-wide table cells is now wrapped to less than the
screen width, to eliminate sideways scrolling.
The last point is only enabled by setting TABLE_LINE_PADDING to a
non-negative number, in the src/setup.h header file, because it is a
significant change of behavior from previous versions.
Revert commit 2f0490cb04
('Eval embedded scripts at once') and follow-up commit
997f61bb32 ('Use document_view instead of
view_state. It is safer probably') because the change causes crashes on
numerous pages and just looks wrong.