There were conflicts in src/document/css/ because 0.12.GIT switched
to LIST_OF(struct css_selector) and 0.13.GIT switched to struct
css_selector_set. Resolved by using LIST_OF(struct css_selector)
inside struct css_selector_set.
Fix warnings:
dom/stack.h:70: Warning: explicit link request to 'pop_dom_node' could not be resolved
dom/stack.h:71: Warning: explicit link request to 'pop_dom_nodes' could not be resolved
dom/stack.h:71: Warning: explicit link request to 'pop_dom_state' could not be resolved
dom/stack.h:115: Warning: explicit link request to 'done_dom_node' could not be resolved
Use @returns instead of \return in src/document/css/parser.c,
and other such things.
Introduce get_opt() to do the tedious work of getting the right
argument for options expecting them and handles both '--opt=arg'
and '--opt arg'. As a side effect it also removes an unneeded
assignment of the source string for stdin.
Works with both bash and dash. This reintroduces the fix to the
test-sgml-parser-basic test, and also fixes test-sgml-parser-incremental
and test-sgml-parser-lines, which Witek has reported as failing.
A problem with \n replacement caused test no. 19 to fail.
Fix it by also allowing expected output to be prepared in a
file by introducing a new backend: test_output_equals_file,
also used test_output_equals.
Tested on Ubuntu Feisty Fawn with both /bin/bash and /bin/sh
where /bin/sh failed before the fix. Reported by Witek.
The error was:
sgml-parser.c: In function 'print_indent':
sgml-parser.c:99: warning: field precision should have type 'int', but argument 2 has type 'long unsigned int'
Take a quick stroll through the unchartered corners of the DOM node data
structures:
- Remove ununsed struct dom_node_id_item.
- Make the document node reference a future struct dom_document.
- Describe ideas for node data, e.g. the entity reference node should use
it for storing the unicode_val_T.
It uses mangleme by Michal Zalewski <lcamtuf@coredump.cx> to generate HTML
which is then fed into the sgml-parser program. By default 100 random HTML
documents are tested. But the test script takes the number of documents
to test against as an argument. Useful for torture testing the SGML parser.
This was cause by the recent change to allocate string during incremental
parsing where the node string was set after insertion. Test for this in the
works.
Fixes: b6b6d3c67e
This changes init_dom_node_() to take an allocated argument saying whether
to allocate or not. If the value is -1, node->allocated will be set to the
value of node->parent->allocated. This way the value is inherited like we
do it in the menu code. It should be a sane default since we eventually
want not to rely on the 'underlying' source of the document and there will
be less variables to pass around.
When doing incremental rendering we now require the whole thing to be there
and that there is room for two tokens in the scanner token table. This is
necessary because we have to generate both a processing target token and a
processing data token to make life simpler for the parser.
Remove processing instruction data case label from the main parser loop. It
is safer this way since it already assumes that the processing target token
has been stored.
Check whether there are '=' and value tokens before handling them. If there
is any doubt the whole attribute structure is 'pushed back' into the
stream. That way incremental parsing will not add the value as a new
attribute because the name token was handled in the previous parsing run.
It is a loop that parses the same small document with various read sizes.
The sgml-parser program is updated to take --stdin option taking a the read
size as a required parameter.
That is, add the last parts that saves and resumes previous incomplete
parsing states. Now the parsing stack push handler checks if the parent has
a resume flag set. When set, the incomplete fragment to resume is restored
and the new source fragment appended and parsing is continued.
It add support for normalizing a DOM document in various ways, such as
removing comments, converting CDATA section nodes to text nodes, cleanup
whitespace, etc.
Use it in the RSS renderer to sanitize the text to be rendered.
This changes the init target to be idempotent: most importantly it will now
never overwrite a Makefile if it exists. Additionally 'make init' will
generate the .vimrc files. Yay, no more stupid 'added fairies' commits! ;)
parse_sgml() now just pushes a text node on the parsing state and the push
handler will now call parse_sgml_plain() and save the return code in
parser->code so parse_sgml() can return it. Much simpler.
- Fix assertion failure by breaking the switch if an error token is next
when previous was a processing instruction.
- Fix <!notation parsing by skipping ident chars instead of spaces.
- Improve checking of processing instruction 'target'-end and what error
string is generated.
- For now put all of the processing instruction data in the error token.
- Remove a DBG()-print.