Jonas Fonseca
ceffe8f1a4
Make the SGML parser ready for (specializing) doctypes other than HTML
2005-12-20 01:04:33 +01:00
Jonas Fonseca
8e769d48a5
Misc cleanups and improvements
2005-12-20 00:01:18 +01:00
Jonas Fonseca
3b412553b6
match_attribute_selectors(): Fix warning about uninitialized attr variable
...
Outspitten on FreeBSD.
2005-12-19 22:13:23 +01:00
Jonas Fonseca
2e2c0a590e
Add basic functionality for selecting DOM nodes based on CSS3 selectors
...
The design should more or less be in place. There are still a lot of things
missing but it should actually be enough for using it in a simple RSS renderer.
Amongst several things, :nth-* pseudo-classes and :not() syntax are not in
place.
2005-12-19 03:44:18 +01:00
Jonas Fonseca
b64e122159
Change order of variables given to foreach_dom_node iterators
2005-12-19 02:57:00 +01:00
Jonas Fonseca
330c0174e5
Rename DOM stack iterators and make them include all states when iterating
...
They are now called: foreach{back,}_dom_stack_state (...) and the immutable
flag together with node type restricted stack searches should ensure that
the document root node is never popped.
2005-12-19 02:51:32 +01:00
Jonas Fonseca
051db70dd4
Add boolean immutable flag to the DOM stack state
...
Can be used to ensure the document root node never leaves the stack while
parsing.
2005-12-19 02:34:26 +01:00
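A minimal sketch of how an immutable flag can keep a root state on a stack. The imm_* names, the fixed-size array and the return conventions are assumptions made for illustration; they are not the actual DOM stack code.

```c
#include <assert.h>
#include <stddef.h>

#define IMM_MAX_DEPTH 16

/* Hypothetical, simplified state: only the fields needed for the sketch. */
struct imm_state {
	int node_type;
	unsigned int immutable:1;	/* Set for states that must never be popped. */
};

struct imm_stack {
	struct imm_state states[IMM_MAX_DEPTH];
	size_t depth;
};

/* Push a state. Returns 1 on success, 0 if the stack is full. */
static int
imm_push(struct imm_stack *stack, int node_type, int immutable)
{
	assert(stack);
	if (stack->depth >= IMM_MAX_DEPTH) return 0;

	stack->states[stack->depth].node_type = node_type;
	stack->states[stack->depth].immutable = !!immutable;
	stack->depth++;
	return 1;
}

/* Pop the top state. Returns 1 if a state was popped, 0 if the stack is
 * empty or the top state is flagged immutable. */
static int
imm_pop(struct imm_stack *stack)
{
	assert(stack);
	if (stack->depth == 0) return 0;
	if (stack->states[stack->depth - 1].immutable) return 0;

	stack->depth--;
	return 1;
}
```

With the root pushed as immutable, repeated pops stop at depth 1 instead of emptying the stack.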
Jonas Fonseca
ee1eba9689
Rename: dom_stack_has_parents() -> dom_stack_is_empty() (with negated value)
2005-12-19 02:15:36 +01:00
Jonas Fonseca
bc338207e7
do_pop_dom_node(): move dom_stack_has_parents() to assertion
...
All callers already check if the stack is empty.
2005-12-19 02:05:43 +01:00
Jonas Fonseca
45861c68e1
pop_dom_state(): Drop unused left-over argument
2005-12-15 22:05:30 +01:00
Jonas Fonseca
ef5d5fc27a
dom_node_cmp(): Only use element or attribute type ID if both are set
2005-12-15 22:02:02 +01:00
Jonas Fonseca
c2d27a33d8
Rename: nav -> stack
2005-12-15 17:24:20 +01:00
Jonas Fonseca
5ef041c051
Redo the assertm() message to just show the type of the node and parent
2005-12-13 20:08:58 +01:00
Jonas Fonseca
d1635d6970
Fix wrong assertion message string
...
You just cannot print dom_string structs with %s.
2005-12-13 16:33:50 +01:00
Jonas Fonseca
f35026ecfb
Add DOM_NODE_UNKNOWN node type for internal purposes only
2005-12-13 04:52:47 +01:00
Jonas Fonseca
5ff0849eb3
set_dom_string(): take length as size_t; -1 means use strlen() to get size
2005-12-12 17:42:26 +01:00
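The "-1 means use strlen()" convention can be sketched as follows. The struct layout and the sketch_ names are assumptions for illustration only and may not match the real set_dom_string(); since the length is a size_t, -1 is compared as (size_t) -1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout; the real struct dom_string may differ. */
struct sketch_dom_string {
	size_t length;
	const char *string;
};

/* Store a string and its length. A length of (size_t) -1 asks the helper
 * to measure the string itself; a NULL string then gets length 0. */
static void
sketch_set_dom_string(struct sketch_dom_string *str, const char *value,
		      size_t length)
{
	assert(str);

	if (length == (size_t) -1)
		length = value ? strlen(value) : 0;

	str->string = value;
	str->length = length;
}
```

Callers that already know the length pass it directly and avoid the extra strlen() call; callers holding a NUL-terminated literal can pass -1.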
Jonas Fonseca
27116d6385
Make the DOM stack and the SGML parser interface more general
...
They now both hold a single private data member. This means the parser now
holds the renderer data.
2005-12-12 17:41:09 +01:00
Jonas Fonseca
458fc5ee79
Review and change dom_string specific uint16_t value to size_t
2005-12-10 22:24:30 +01:00
Jonas Fonseca
cdc749def3
get_dom_node_value(): indent switch
2005-12-10 21:50:40 +01:00
Jonas Fonseca
86c9a37810
Factor out dom_string_casecmp()
2005-12-10 21:49:33 +01:00
Jonas Fonseca
87aad88c96
Use dom_string throughout the DOM stack interface
2005-12-10 21:42:49 +01:00
Jonas Fonseca
0fab644bee
get_dom_node_value(): move non-compliant functionality to the tree renderer
...
Entity references are supposed to have a null value and the string
compression is for improving the tree view.
2005-12-10 21:37:47 +01:00
Jonas Fonseca
52f5276f92
get_dom_node_name(): indent switch statement
2005-12-10 20:05:01 +01:00
Jonas Fonseca
295679a5e6
get_dom_node_name(): return struct dom_string *
...
Also, simplify the rendering a bit for now.
2005-12-10 20:03:43 +01:00
Jonas Fonseca
7d6db6b152
Update the DOM tree renderer to support dom_strings
2005-12-10 19:44:01 +01:00
Jonas Fonseca
2aedeb0a67
get_dom_node_type_name(): return struct dom_string *
2005-12-10 19:28:37 +01:00
Jonas Fonseca
5f69255cbc
get_dom_node_map_entry(): take name as a dom_string
...
Requires that dom_string->length becomes a size_t.
2005-12-10 19:21:12 +01:00
Jonas Fonseca
0546759b4b
Use struct dom_string for node->proc_instruction.instruction
2005-12-10 18:59:17 +01:00
Jonas Fonseca
9935bf2083
Convert some yet unused strings to use the dom_string struct
...
Also remove the unneeded path member from the dom_node_id_item struct. It
was obsoleted by the addition of dom_node->parent.
2005-12-10 18:51:08 +01:00
Jonas Fonseca
ed7a292966
Use struct dom_string for node->attribute.value
2005-12-10 18:42:54 +01:00
Jonas Fonseca
ce3778c3c0
Add struct dom_string
...
In time it should be used for all strings in the DOM engine.
For now it is just used for node->string.
2005-12-10 18:37:47 +01:00
Jonas Fonseca
8aa078393a
Move dom_node_data union outside the dom_node struct
2005-12-08 03:26:34 +01:00
Jonas Fonseca
4480a9a4cd
Remove nodes from the DOM tree when using the SGML stream parser
...
That should free up some short-term memory.
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
2005-12-08 03:02:27 +01:00
Jonas Fonseca
8f97dc8403
done_dom_node(): remove the node from all parent lists
...
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
2005-12-08 02:59:40 +01:00
Jonas Fonseca
93fb17ea2a
Indent switch statement
...
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
2005-12-08 02:32:23 +01:00
Jonas Fonseca
ce5bf8c6f8
Fix DOM node list iterator macros
...
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
2005-12-08 02:04:13 +01:00
Jonas Fonseca
1c2f271782
Add parent member to dom_node
...
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
2005-12-08 01:35:48 +01:00
Jonas Fonseca
69b321cb5b
Replace struct cache_entry member with struct uri member
...
Reduces the number of (unused) dependencies. Also, update the #include
list removing old entries.
2005-12-05 23:47:23 +01:00
Jonas Fonseca
38b8503161
Remove document member from struct sgml_parser
...
The document URI can be accessed from the cache entry.
2005-12-05 23:31:54 +01:00
Jonas Fonseca
c7ad6f967b
Introduce new pop_dom_state()
...
It's basically pop_dom_nodes() without the search part and is now used as a
backend in pop_dom_nodes(). Use it in parse_sgml_document() to avoid two
DOM stack searches in a row.
2005-12-05 19:40:35 +01:00
Jonas Fonseca
1c4a0d67ce
Restore highlighting of element end-tags
...
... by installing a pop-callback for elements and responding to whatever
the parser has put in the end_token parser state member.
2005-12-05 19:33:15 +01:00
Jonas Fonseca
9aebb66bce
Introduce separate push/pop callbacks
...
This should make it possible to do the SAX (parser stream thing) for XBEL.
And will also be used for fixing the end-tag highlighting.
2005-12-05 19:31:42 +01:00
Jonas Fonseca
7a912795e1
Make the parser stream mode work as intended
...
This makes the parser and renderer share the stack, most importantly the
callbacks are now those of the renderer. Disable node attribute rendering
code that worked around the attributes being visited in sorted order.
2005-12-05 19:25:13 +01:00
Jonas Fonseca
2dddf86acc
Initialize the renderer before initializing the parser
...
... so it is ready when/if the parser will push any initial states.
2005-12-05 19:20:48 +01:00
Jonas Fonseca
65b504f093
Remove all traces of the element end-tag highlighting hack
...
End-tags will stay uncolored for the next few commits.
2005-12-05 11:21:08 +01:00
Jonas Fonseca
4c8d871404
Introduce sgml_parser_type for specifying tree and streaming parsers
...
Mostly added for the future, since currently only one parser type is
supported.
2005-12-05 11:19:13 +01:00
Jonas Fonseca
05a61cd16a
Update the DOM stack comment for things to come
2005-12-05 11:16:52 +01:00
Jonas Fonseca
8f25d73013
Use separate data variables for storing DOM stack data
...
They will eventually have to share the stack.
2005-12-05 11:16:05 +01:00
Jonas Fonseca
f85b498375
Add FIXME about using DOM node subtypes when searching the DOM stack
2005-12-05 11:11:40 +01:00
Jonas Fonseca
208515c82f
Fix access to free()d memory in the DOM stack state data
...
There already was one work-around in the code related to clearing of the
state data after popping. Instead of storing a pointer to the state data in
the state we now store the depth of the state (in the stack) and use a
macro to access the state data. The bug occurred when reallocating the
stack state objects and the stack data pointers weren't updated to point to
the newly allocated data.
2005-12-05 11:11:06 +01:00
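The fix described in this commit, storing a depth in each state and resolving the data through a macro, can be sketched roughly as below. All depth_sketch_ names and the two-array layout are assumptions for illustration; the actual DOM stack structures are not shown in this log.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified types; not the actual DOM stack definitions. */
struct depth_sketch_state {
	size_t depth;	/* Index of this state in the stack, not a pointer. */
};

struct depth_sketch_stack {
	struct depth_sketch_state *states;
	unsigned char *state_data;	/* One object_size chunk per state. */
	size_t object_size;
	size_t depth;
};

/* Resolve the data address on every access, so it stays valid even after
 * realloc() has moved the state_data block. */
#define depth_sketch_state_data(stack, state) \
	((void *) &(stack)->state_data[(state)->depth * (stack)->object_size])

/* Push a zeroed state. Returns its depth, or (size_t) -1 on failure. */
static size_t
depth_sketch_push(struct depth_sketch_stack *stack)
{
	size_t n = stack->depth + 1;
	struct depth_sketch_state *states;
	unsigned char *data;

	assert(stack->object_size > 0);

	states = realloc(stack->states, n * sizeof(*states));
	if (!states) return (size_t) -1;
	stack->states = states;

	data = realloc(stack->state_data, n * stack->object_size);
	if (!data) return (size_t) -1;
	stack->state_data = data;

	memset(&data[stack->depth * stack->object_size], 0, stack->object_size);
	states[stack->depth].depth = stack->depth;
	return stack->depth++;
}

/* Release both arrays and reset the stack. */
static void
depth_sketch_done(struct depth_sketch_stack *stack)
{
	free(stack->states);
	free(stack->state_data);
	stack->states = NULL;
	stack->state_data = NULL;
	stack->depth = 0;
}
```

Had each state cached a raw pointer into state_data, that pointer would dangle as soon as a later push moved the block; an index stays valid because it is resolved against the current base address.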