If there are isolated combining characters, e.g. at the beginning of a paragraph
or table cell:
– if it’s not the first screen column, combine them with whatever character is
printed to their left;
– otherwise, add a no-break space as the base character.
Previously, such combining characters were combined with the last letter
displayed, i.e. the last letter of the previous paragraph or cell.
Signed-off-by: Fabienne Ducroquet <fabiduc@gmail.com>
without waiting for the next non zero width character. This way combining
characters at the end of the document are displayed.
Signed-off-by: Fabienne Ducroquet <fabiduc@gmail.com>
Otherwise, there are unnecessary spaces at the end of lines in tables containing
combining characters.
Signed-off-by: Fabienne Ducroquet <fabiduc@gmail.com>
A combining character sequence where the base character is a space remained
recorded as a space although the initial space was replaced with an internal
code corresponding to the combined character. This caused an internal error when
ELinks tried to split the line at that place and did not find the space.
Signed-off-by: Fabienne Ducroquet <fabiduc@gmail.com>
Use the same functions as for the list-style property since only the "type" part
of the list-style property is supported at this stage.
Signed-off-by: Fabienne Ducroquet <fabiduc@gmail.com>
* Rename P_STAR as P_DISC and P_PLUS as P_SQUARE.
* Delete P_NONE because it was used only as the default flag in init_html_parser
and a list with P_NONE then got bullets, so instead use P_DISC by default (as
per the CSS specification), and P_NO_BULLET for lists with no bullets.
* Use as bullets the characters:
- U+25E6 WHITE BULLET for the circle style;
- U+25AA BLACK SMALL SQUARE (alias square bullet) for the square style;
- U+2022 BULLET for the disc style (default).
Signed-off-by: Fabienne Ducroquet <fabiduc@gmail.com>
INIT_OPTION used to initialize union option_value at compile time by
casting the default value to LIST_OF(struct option) *, which is the
type of the first member. On sparc64 and other big-endian systems
where sizeof(int) < sizeof(struct list_head *), this tended to leave
option->value.number as zero, thus messing up OPT_INT and OPT_BOOL
at least. OPT_LONG however tended to work right.
This would be easy to fix with C99 designated initializers,
but doc/hacking.txt says ELinks must be kept C89 compatible.
Another solution would be to make register_options() read the
value from option->value.tree (the first member), cast it back
to the right type, and write it to the appropriate member;
but that would still require somewhat dubious conversions
between integers, data pointers, and function pointers.
So here's a rather more invasive solution. Add struct option_init,
which is somewhat similar to struct option but has non-overlapping
members for different types of values, to ensure nothing is lost
in compile-time conversions. Move unsigned char *path from struct
option_info to struct option_init, and replace struct option_info
with a union that contains struct option_init and struct option.
Now, this union can be initialized with no portability problems,
and register_options() then moves the values from struct option_init
to their final places in struct option.
In my x86 ELinks build with plenty of options configured in, this
change bloated the text section by 340 bytes but compressed the data
section by 2784 bytes, presumably because union option_info is a
pointer smaller than struct option_info was.
(cherry picked from elinks-0.12 commit e5f6592ee2)
Conflicts:
src/protocol/fsp/fsp.c: All options had been removed in 0.13.GIT.
src/protocol/smb/smb2.c: Ditto.
Fix this GCC 3.4.6 warning, which becomes an error
if configure --enable-debug adds -Werror to CFLAGS:
[CC] src/document/css/apply.o
In file included from /home/Kalle/src/elinks-0.12/src/document/html/internal.h:6,
from /home/Kalle/src/elinks-0.12/src/document/css/apply.c:35:
/home/Kalle/src/elinks-0.12/src/document/html/parser.h:149: warning: parameter has incomplete type
In file included from /home/Kalle/src/elinks-0.12/src/document/css/apply.c:35:
/home/Kalle/src/elinks-0.12/src/document/html/internal.h:125: warning: parameter has incomplete type
Even without this warning, "enum html_special_type;"
would not be standard C89.
(cherry picked from elinks-0.12 commit c9f487cdf4)
Conflicts:
src/document/html/parser.h: 0.13.GIT had more #includes already.
This reverts commit d06cccffd6.
Some people wants URL in the title bar, but some wants the title.
I restored previous version (the title). If you want the URL, create
the patch with a configurable option.
The long term goal is good looking of the Python docs in ELinks, especially
background colors. Every start tag and every text node would have associated
a natural number. Those numbers would be "drawn" in the document instead
of colors. Finally, the screen driver would change numbers into colors.
This will be done in small steps. The next step is to implement this change
in the screen driver.
Add a case for CSS_LIST_ORDINAL (and assert(0)) to the switch in
css_apply_list_style. This change should eliminate a warning from the
compiler reported by Witold that CSS_LIST_ORDINAL is not handled.
Recognise the list-style property and apply it by setting the
appropriate flag on the element's parattr based on the property's value.
Add test/list-style.html with an example of each possible list-style
value (many are unsupported by the HTML engine).
The URL in <meta http-equiv="Refresh" content="42; URL=target.html">
can now freely contain spaces and semicolons. There cannot be other
parameters between the delay and the URL. If the URL is not quoted,
then it spans to the end of the attribute, except not to trailing
spaces. If the URL is quoted, then it ends at the first closing
quotation mark. All this is consistent with Debian Iceweasel 3.5.16.
The HTML parser decoded SGML entity references and numeric character
references in the following attributes, and then the renderer did the
same again:
link/@title
link/@hreflang
link/@type
link/@media
img/@alt
area/@alt
input[@type="image"]/@alt
input[@type="image"]/@name
input[@type="button"]/@value
The result was that e.g. title="&#65;" displayed as "A"
even though it was supposed to display as "A".
Fix by making the HTML parser tell the renderer that the entities have
already been decoded.
After the recent ecmascript_get_interpreter change, I got an assertion
failure in render_document, which calls ecmascript_reset_state and
then asserts that it has set vs->ecmascript != NULL.
ecmascript_reset_state cannot guarantee that because there might not
even be enough free memory for mem_calloc(1, sizeof(struct
ecmascript_interpreter). So, replace the assertion in render_document
with error handling, and likewise in call_onsubmit_and_submit.
In start_document_refresh, use register_bottom_half instead of
install_timer if the timeout is 0 because install_timer asserts that it is
given a delay greater than 0.
Add a test case, test/refresh-0timeout.html. Note that
document.browse.minimum_refresh_time must be set to 0 to reproduce the
assertion failure.
C99 6.7.4p3 and 6.7.4p6 set some constraints on what can be done in
inline functions and how they can be declared. In particular, any
function declared inline must also be defined in the same translation
unit. To comply with that, remove inline specifiers from function
declarations in header files when the functions are not also defined
in those header files.
Sun Studio 11 on Solaris 9 is stricter than C99 and does not allow
references to static identifiers in extern inline functions. Make the
configure script detect this and define NONSTATIC_INLINE accordingly
in config.h. Then use that in the definitions of all non-static
inline functions.
Document the restrictions and this scheme in doc/hacking.txt.
Documentation strings of most options used to contain a "\n" at the
end of each source line. When the option manager displayed these
strings, it treated each "\n" as a hard newline. On 80x24 terminals
however, the option description window has only 60 columes available
for the text (with the default setup.h), and the hard newlines were
further apart, so the option manager wrapped the text a second time,
resulting in rather ugly output where long lones are interleaved with
short ones. This could also cause the text to take up too much
vertical space and not fit in the window.
Replace most of those hard newlines with spaces so that the option
manager (or perhaps BFU) will take care of the wrapping. At the same
time, rewrap the strings in source code so that the source lines are
at most 79 columns wide.
In some options though, there is a list of possible values and their
meanings. In those lists, if the description of one value does not
fit in one line, then continuation lines should be indented. The
option manager and BFU are not currently able to do that. So, keep
the hard newlines in those lists, but rewrap them to 60 columns so
that they are less likely to require further wrapping at runtime.
These functions now expect or return strings in UTF-8:
delete_folder_by_name (sneak in a const, too), bookmark_terminal_tabs,
open_bookmark_folder, and get_auto_save_bookmark_foldername_utf8 (new
function).
This simplifies the callers a little and may help implement
simultaneous support for different charsets on different terminals
of the same type (bug 1064).
look_for_link() used to return 0 both when it found the closing </MAP>
tag, and when it hit the end of the file. In the first case, it also
added *menu to the memory_list; in the second case, it did not. The
caller get_image_map() supposedly distinguished between these cases by
checking whether pos >= eof, and freed *menu separately if so.
However, if the </MAP> was at the very end of the HTML file, so that
not even a newline followed it, then look_for_link() left pos == eof
even though it had found the </MAP> and added *menu to the memory_list.
This made get_image_map() misinterpret the result and mem_free(*menu)
even though *menu had already been freed as part of the memory_list;
thus the crash.
To fix this, make look_for_link() return -1 instead of 0 if it hits
EOF without finding the </MAP>. Then make get_image_map() check the
return value instead of comparing pos to eof. And add a test case,
although not an automated one.
Alternatively, look_for_link() could have been changed to decrement
pos between finding the </MAP> and returning 0. Then, the pos >= eof
comparison in get_image_map() would have been false. That scheme
would however have been a bit more difficult to understand and
maintain, I think.
Reported by Paul B. Mahol.
(cherry picked from commit a2404407ce)
A style-sheet containing the string "url( )" with 1 or more characters of
whitespace in between the parentheses triggered an assertion failure in
scan_css_token.
scan_css_token would find the left parenthesis, find the right parenthesis,
and then scan forwards from the left parenthesis for a non-whitespace
character to find the start of the URL and backwards from the right
parenthesis to find the end of the URL. If there were whitespace and
nothing else, the start would be past the end, the routine would compute a
negative length for the URL, and then the routine would trigger an
assertion failure. Now the routine simply enforces a lower bound of length
0 for the URL.
This patch fixes an issue whereby a newline character appearing within
a hidden input field is incorrectly reinterpreted as a space character.
The patch handles almost all cases, and includes a test case.
15/18 tests pass, but the remainder currently fail due to the fact
that ELinks does not currently support textarea scripting.
On 2008-09-05, it was reported to elinks-dev that ELinks hits an
internal error (bad alloc_header) when given a specific HTML file.
On 2008-09-09, out-of-range values of document->comb_x and
document->comb_y were noted as the cause of memory corruption.
Update those variables when splitting, aligning, or justifying a line.
Add many assertions to detect the bug if it occurs again.
Previously, the character at (document.comb_x, document.comb_y) was
accessed via the POS macro, which adds part.box.x and part.box.y to
the coordinates. However, if document.comb and document.y are set
at the end of one part and read at the beginning of another, then
the struct screen_char used by the original part should be updated,
even though the new part has a different box. Change comb_{x,y} to
be relative to the document, rather than to the box of a single part.
cache_entry.id => cache_entry.cache_id
document.id => document.cache_id
ecmascript_interpreter.onload_snippets_owner => .onload_snippets_cache_id
This is a combination of:
commit 232c07aa7f
bug 1009: id variables renamed, added document_id to the document.
commit 6007043458bf8f14abfc18b9db60785bdc0279f6
Revert addition of document.document_id
Replace almost all uses of enum connection_state with struct
connection_status. This removes the assumption that errno values used
by the system are between 0 and 100000. The GNU Hurd uses values like
ENOENT = 0x40000002 and EMIG_SERVER_DIED = -308.
This commit is derived from my attachments 450 and 467 to bug 1013.
In document.forms, each struct form has form_num and form_end members
that reserve a subrange of [0, INT_MAX] to that form. Previously,
multiple forms in the list could have form_end == INT_MAX and thus
overlap each other. Prevent that by adjusting form_end of each form
newly added to the list.
Revert 438f039bda,
"check_html_form_hierarchy: Old code was buggy.", which made
check_html_form_hierarchy attach controls to the wrong forms.
Instead, construct the dummy form ("for those Flying Dutchmans") at
form_num == 0 always before adding any real forms to the list.
This prevents the assertion failure by ensuring that every possible
form_control.position is covered by some form, if there are any forms.
Add a function assert_forms_list_ok, which checks that the set of
forms actually covers the [0, INT_MAX] range without overlapping,
as intended. Call that from check_html_form_hierarchy to detect
any corruption.
I have tested this code (before any cherry-picking) with:
- bug 613 attachment 210: didn't crash
- bug 714 attachment 471: didn't crash
- bug 961 attachment 382: didn't crash
- bug 698 attachment 239: all the submit buttons showed the right URLs
- bug 698 attachment 470: the submit button showed the right URL
(cherry picked from commit 386a5d517b)
Anything that frees or reallocates struct form_state must now call the
new functions ecmascript_detach_form_state or ecmascript_moved_form_state.
These functions should then clear out any dangling pointers, but that has
not yet been implemented.
In document.forms, each struct form has form_num and form_end members
that reserve a subrange of [0, INT_MAX] to that form. Previously,
multiple forms in the list could have form_end == INT_MAX and thus
overlap each other. Prevent that by adjusting form_end of each form
newly added to the list.
Revert 438f039bda,
"check_html_form_hierarchy: Old code was buggy.", which made
check_html_form_hierarchy attach controls to the wrong forms.
Instead, construct the dummy form ("for those Flying Dutchmans") at
form_num == 0 always before adding any real forms to the list.
This prevents the assertion failure by ensuring that every possible
form_control.position is covered by some form, if there are any forms.
Add a function assert_forms_list_ok, which checks that the set of
forms actually covers the [0, INT_MAX] range without overlapping,
as intended. Call that from check_html_form_hierarchy to detect
any corruption.
I have tested this code (before any cherry-picking) with:
- bug 613 attachment 210: didn't crash
- bug 714 attachment 471: didn't crash
- bug 961 attachment 382: didn't crash
- bug 698 attachment 239: all the submit buttons showed the right URLs
- bug 698 attachment 470: the submit button showed the right URL
Conflicts:
NEWS
configure.in
The following files also conflicted, but they had not been manually
edited in the elinks-0.12 branch after the previous merge, so I just
kept the 0.13.GIT versions:
doc/man/man1/elinks.1.in
doc/man/man5/elinks.conf.5
doc/man/man5/elinkskeys.5
po/fr.po
po/pl.po
Do not retain changed values in form fields when the user reloads. Doing
so can be confusing or even cause data-loss when new default values are
specified in the updated document. For example, when editing an article on
Wikipedia, one loads the edit page for the article, makes and submits
changes, goes back to the edit page to make further modifications, and
reloads to get the new article text. Before this change, reloading the
edit page would not update the textarea on the page with the new article
source, which can lead one (and has led me) to make changes to the original
version of the article by accident.
This fixes bug 620.
(cherry picked from commit 9e1e94bee0)
Do not retain changed values in form fields when the user reloads. Doing
so can be confusing or even cause data-loss when new default values are
specified in the updated document. For example, when editing an article on
Wikipedia, one loads the edit page for the article, makes and submits
changes, goes back to the edit page to make further modifications, and
reloads to get the new article text. Before this change, reloading the
edit page would not update the textarea on the page with the new article
source, which can lead one (and has led me) to make changes to the original
version of the article by accident.
This fixes bug 620.
Handle <script> blocks even when they are contained by blocks with
"display: none" set.
This commit fixes the second problem that Kalle points out in comment 5
to bug 963.
When this option is enabled, elements should be rendered even when the CSS
display attribute is "none". Before this commit, the reverse was true:
when the option was enabled, such elements were _not_ rendered.
I am not changing the default, which is enabled, meaning that by default,
ELinks renders elements regardless of "display: none". Pasky advocates
that this remain the default until ELinks's CSS support improves.
This commit fixes the first problem that Kalle points out in comment 5 to
bug 963.
cached->id => cached->cache_id
document->id => document->cache_id
onload_snippets_owner => onload_snippets_document_id
Added the distinct document->document_id.
Always reset ecmascript when a document changes for example a next chunk
of it is loaded.
The previous code displayed the wrong attributes if the combining
characters were at the end of an HTML link. For example:
<a href="#">trickỹ</a> more text <a href="#">second link</a>
(The characters in the first A element are "tricky" and U+0303
COMBINING TILDE.)
Here, when the cursor was not at the first link, ELinks displayed
the y-with-tilde cell as if it were not part of the link.
This happened because ELinks had already changed schar->attr
before set_line saw the space character after the link and
flushed document->combi[].
Combining characters requires a UTF-8 locale.
It slows down rendering. There is still the unresolved issue with
combining characters at the end of a document.
This patch wasn't heavilly tested. Especially a "garbage" input may cause
unpredictable results.
Because ELinks's CSS support is still so incomplete, some documents still render better if "display: none" is not honoured. Therefore, it is now honoured, unless document.css.ignore_display_none = 0, which is the default.
We have a while loop that checks token && token->type != '}' followed by an if statement that checks token && token->type == }. If the while loop exits, that either token is false or token->type == '}'; therefore, the if statement need only check token.
This patch adds support for:
- option document.css.media
- CSS @import "foo.css" tty;
- CSS @media tty { ... }
- HTML <link rel="stylesheet" media="tty">
- HTML <style media="tty">
This patch is attachments 395 and 396 from bugzilla.elinks.cz, which
are based on attachment 388 from bugzilla.elinks.cz. This new
version of the patch fixes conflicts with recent 0.13.GIT changes,
marks Doxygen commands with at-signs rather than backslashes, and
adds a few comments.
This necessitates that non-pairable elements be briefly pushed on the stack, so that get_css_selector_for_element sees them, and then popped.
It would be possible to push them only when CONFIG_CSS is defined, as they are otherwise not needed (as evidenced by the fact that we've gone so long without bothering to push them). However, the performance hit should be small, the necessary #ifdef/#endif wrappers would be pretty ugly, and ideally, the CSS code will someday be in such a state that it can be considered an integral feature.
Instead, convert the element pointers inside the comparison functions.
The last argument of qsort() is supposed to be of type
int (*)(const void *, const void *). Previously, comp_links() was
defined to take struct link * instead of const void *, and the type
mismatch was silenced by casting the function pointer to void *.
This was in principle not portable because:
(1) The different pointer types may have different representations.
In a word-oriented machine, the const void * might include a byte
selector while the struct link * might not.
(2) Casting a function pointer to a data pointer can lose bits in some
memory models. Apparently this does not occur in POSIX-conforming
systems though, as dlsym() would fail if it did.
This commit also fixes hits_cmp() and compare_dir_entries(), which
had similar problems. However, I'm leaving alias_compare() in
src/intl/gettext/localealias.c unchanged for now, so as not to diverge
from the GNU version.
I also checked the bsearch() calls but they were all okay, apart from
one that used the alias_compare() mentioned above.
start_document_refreshes() performs the NULL-pointer checks that
previously all callers to start_document_refresh() must perform
and then calls start_document_refresh().
Before, *_html_parser_state() operated with struct html_element *. Now, it is
transparent for the renderer (just void *), so that DOM won't have to provide
this struct but will be able to use something internal.
Backported from master.
Previously, process_head immediately returned if there was no refresh, never giving the cache-control check further down a chance to run.
Also add new tests:
nocache.html
refresh+nocache.html
Simply search for 'url' marker ignoring anything
before it.
ELinks is now able to follow incorrectly written
meta refresh content attribute with missing ; before
url= parameter.
As an example, try http://akkada.tivi.net.pl/
Simply search for 'url' marker ignoring anything
before it.
ELinks is now able to follow incorrectly written
meta refresh content attribute with missing ; before
url= parameter.
As an example, try http://akkada.tivi.net.pl/
start_document_refreshes performs the NULL-pointer checks that previously all callers to start_document_refresh must perform and then calls start_document_refresh.
Previously, process_head immediately returned if there was no refresh, never giving the cache-control check further down a chance to run.
Also add new tests:
nocache.html
refresh+nocache.html
After commit b66d2bec67 'document: Unify text style -> screen attribute handling', the case statements for CSS_PT_FONT_WEIGHT, CSS_PT_FONT_STYLE, and CSS_PT_TEXT_DECORATION all have common code and therefore collapse nicely.
Pass the session with some get_opt_* calls. These are the low-hanging fruit. Some places will be difficult because we don't have the session or for other reasons.
change_hook_active_link: pass update_cache_document_options @ses.
Now when changing the global settings, it will not simply copy the new values for the global settings to the document-options cache, but also check session-specific settings. This doesn't really matter yet, since the options in question can't be set on a per-session basis, but is in preparation for future changes.
Before, *_html_parser_state() operated with struct html_element *. Now, it is
transparent for the renderer (just void *), so that DOM won't have to provide
this struct but will be able to use something internal.