Use it for the actual I/O only. Previously, defining CONFIG_UTF8 and
enabling UTF-8 used to force many strings to the UTF-8 charset
regardless of the terminal charset option. Now, those strings always
follow the terminal charset. This fixes bug 914 which was caused
because _() returned strings in the terminal charset and functions
then assumed they were in UTF-8. This reduction in the effects of
UTF-8 I/O may also simplify future testing.
Because the renderer no longer does that.
The comment "We don't cope well with entities here" may now be
obsolete but I'm not sure about that so I'm leaving it in.
options->cp is still used for this in seven places where html_context
is not easily available. Those should eventually be corrected too,
but I'm checking this change in already because it's better than what
we had before.
Previously, html_special_form_control converted
form_control.default_value to the terminal charset, and init_form_state
then copied the value to form_state.value. However, when CONFIG_UTF8
is defined and UTF-8 I/O is enabled, form_state.value is supposed to
be in UTF-8, rather than in the terminal charset.
This mismatch could not be conveniently fixed in
html_special_form_control because that does not know which terminal is
being used and whether UTF-8 I/O is enabled there. Also, constructing
a conversion table from the document charset to form_state.value could
have ruined renderer_context.convert_table, because src/intl/charsets.c
does not support multiple concurrent conversion tables.
So instead, we now keep form_control.default_value in the document
charset, and convert it in the viewer each time it is needed. Because
the result of the conversion is kept in form_state.value between
incremental renderings, this shouldn't even slow things down too much.
I am not implementing the proper charset conversions for the DOM
defaultValue property yet, because the current code doesn't have
them for other string properties either, and bug 805 is already open
for that.
This does not yet fix bug 947 for the case where the document is UTF-8
and the terminal is ISO-8859-1. That will require changing charsets.c
too, it seems.
trim_chars was called only in debug mode and the results of the get_attr_val
for value=" something " in debug mode differ from normal and fastmem mode.
[ From commit c4500039b2 on the witekfl
branch. --KON ]
straconcat reads the args with va_arg(ap, const unsigned char *),
and the NULL macro may have the wrong type (e.g. int).
Many places pass string literals of type char * to straconcat. This
is in principle also a violation, but I'm ignoring it for now because
if it becomes a problem with some C implementation, then so will the
use of unsigned char * with printf "%s", which is so widespread in
ELinks that I'm not going to try fixing it now.
When tables were rendered first time html_format_part was called with
document==NULL. <meta http-equiv=Refresh.../> was inside a table,
so document was NULL. Second time the table knew its dimensions
and document was not NULL.
I do not fully understand this code, but I am sure skipping characters
like this is a bug, and correcting it seems to fix bug 826 (too small
table for double-cell characters). I don't see any similar bugs in
other parts of set_hline.
The patch is from bug 826, comment 4, attachment 308. The warning
there about unicode_to_cell(UCS_NO_CHAR) still applies but this patch
does not make the situation worse. I have logged a separate bug 901
about those calls.
This allows code to use document->cached instead of
find_in_cache(document->uri), thereby increasing the likelihood
of getting the correct cache entry.
This should fix Bug 756 - "assertion (cached)->object.refcount >= 0 failed"
after HTTP proxy was changed.
Patches for this were written by me and then later by Jonas.
This commit combines our independent implementations.
The quote_level was decremented unconditionally and could become negative
resulting in a negative index after applying "modulus 2". Reproducable
with an HTML file contianing "</q>".
Reported by paakku.
Recognize all of 
 
 with any number of leading
zeroes. (Previously only and 
 were supported.) All of
these are case insensitive.
Treat each CR+LF combination ( ) as a single newline.