mirror of https://github.com/rkd77/elinks.git synced 2024-12-04 14:46:47 -05:00
235 Commits

Author SHA1 Message Date
Jonas Fonseca
d1e275be52 Make parse_sgml() take buffer as dom_string struct 2005-12-28 15:21:45 +01:00
Jonas Fonseca
11e168aba4 Make init_sgml_parser() take URI as dom_string struct 2005-12-28 15:19:10 +01:00
Jonas Fonseca
71533eef9a Elute all DOM-related code and put it in src/dom 2005-12-28 14:05:14 +01:00
Jonas Fonseca
217f905d88 call_dom_stack_callbacks(): Only acquire the state data if needed 2005-12-27 15:22:06 +01:00
Jonas Fonseca
9e7dfb1ddf Make walk_dom_nodes() use a stack context; reduces the DOM stack state size 2005-12-27 06:04:01 +01:00
Jonas Fonseca
a4831fef2d Make it easier to work with DOM stack contexts outside of the callbacks
The problem is to get access to the context when it is not the first one
and it has to happen outside of the context callbacks. This changes the
memory management so that the context adder returns the context. To further
improve the use of contexts add a context destructor which makes it
possible to unregister (temporary) contexts.
2005-12-27 05:59:12 +01:00
Jonas Fonseca
774aa70c6f Drop (now) unused get_dom_stack_parent() wrapper for get_dom_stack_state() 2005-12-27 04:44:20 +01:00
Jonas Fonseca
af19f811e3 Simplify DOM node popping
.. by turning do_pop_dom_node() into pop_dom_node() and let pop_dom_state()
handle its own logic for reaching the wanted state.
2005-12-27 04:42:28 +01:00
Jonas Fonseca
f5b32f86d1 Drop some dead code related to past rendering of DOM attribute nodes 2005-12-27 03:55:34 +01:00
Jonas Fonseca
b36229c222 Drop some unneeded #includes 2005-12-27 03:33:49 +01:00
Jonas Fonseca
74a0a7b174 Highlight the 'CDATA' part of <![CDATA[ (in a bit hacky way) 2005-12-26 20:51:01 +01:00
Jonas Fonseca
c99b1cc2cc Render content of CDATA sections with orange2 (in lack of something better) 2005-12-26 19:56:07 +01:00
Jonas Fonseca
be8c68c5f7 Fix matching of file:// URLs which of course have no host name 2005-12-25 16:22:20 +01:00
Jonas Fonseca
5dd9061a55 Allocate the string of nodes being 'linkified'
This makes the string actually NUL terminated at the right place and the
matching will never go beyond the text region of the node. One example is
<!--http://elinks.cz/--> which didn't work before.
2005-12-25 16:08:00 +01:00
Jonas Fonseca
2d020e4879 Extend the URL_REGEX to allow more protocols and stuff like %XX in the path part
Thanks to peder for suggestions.
2005-12-25 15:45:47 +01:00
Jonas Fonseca
4fa0937ca5 Make URL inside cdata, comment and text nodes accessible
It depends on the existence of the <regex.h> system header and can be enabled
via document.plain.display_links. The URL regex was supplied by yanek.
2005-12-25 15:16:21 +01:00
Jonas Fonseca
cb90dcd58c Remove unused variable 2005-12-25 05:19:55 +01:00
Jonas Fonseca
f2ba5e7f6b Cleanup and remove unneeded code 2005-12-25 04:40:58 +01:00
Jonas Fonseca
40ae683bfb match_attribute_value(): Actually do the matching
This is still untested like the last patch.
2005-12-25 04:38:30 +01:00
Jonas Fonseca
fc35d9ee33 Implement matching of element relations for DOM selection
It requires searching the select_data stack for all matches of the parent
selector and check the properties of matched nodes.
2005-12-25 03:48:53 +01:00
Jonas Fonseca
1347678988 Introduce get_dom_node_list_index() to lookup the index of a node in a list 2005-12-25 03:46:01 +01:00
Jonas Fonseca
d36b2d8a36 get_dom_select_data(): Move macro nearer to its users 2005-12-25 02:43:08 +01:00
Jonas Fonseca
ce2aa08cb1 Compile fix 2005-12-24 13:07:57 +01:00
Jonas Fonseca
8d30613a7f The child node list can be NULL when matching for the :empty pseudo-class
Also use the children node list and not the attribute list (aka the map).
2005-12-23 01:03:39 +01:00
Jonas Fonseca
3ea1b30fd6 Fix matching of the :root structural pseudo-class
Root nodes either have no parents or are the single child of the document
node.
2005-12-23 00:59:56 +01:00
Jonas Fonseca
755108cf95 Tidyup 2005-12-23 00:53:31 +01:00
Jonas Fonseca
4d6223f6a4 Oops, compile fix 2005-12-23 00:52:52 +01:00
Jonas Fonseca
12d34fd133 Factor out code to new match_element_selector()
'Twill make it easier to do the logic.
2005-12-23 00:51:57 +01:00
Jonas Fonseca
07fc481607 match_attribute_selectors(): Factor out matching of values to own function 2005-12-23 00:11:25 +01:00
Jonas Fonseca
262856273e Drop unused get_dom_node_attributes(), comment get_dom_node_list() 2005-12-22 23:42:23 +01:00
Jonas Fonseca
faa85adf73 dom_select_push_element(): Use dom_node_casecmp() and drop homegrown one 2005-12-22 23:40:11 +01:00
Jonas Fonseca
1f47fabf5e Rename dom_node_cmp() to dom_node_casecmp() and make it public 2005-12-22 23:37:59 +01:00
Jonas Fonseca
b13a21bbc2 dom_node_cmp(): Make it into a general node comparer 2005-12-22 23:35:17 +01:00
Jonas Fonseca
6dfd7a5988 When searching DOM node lists store the subtype in the search struct's node 2005-12-22 23:29:07 +01:00
Jonas Fonseca
558e2736e4 search_dom_stack(): Use dom_string_casecmp() for comparison 2005-12-22 22:28:38 +01:00
Jonas Fonseca
c4a1031b2e Move code for the final source highlight flushing to document pop callback
This requires the document root stack state is made mutable and is popped.
Should make render_dom_document() more generalised. For SGML_PARSER_STREAM
this has the fun property that the stack will magically free the root node.
2005-12-22 12:33:27 +01:00
Jonas Fonseca
4eae1d4882 Add a few comments and remove an obsolete one 2005-12-22 04:00:55 +01:00
Jonas Fonseca
9c720c2cc8 Rename the DOM tree renderer to DOM stack tracer
Use add_dom_stack_tracer(stack) to have stack activity traced. It is only
active when DOM_STACK_TRACE is defined.
2005-12-22 03:55:55 +01:00
Jonas Fonseca
cb64068712 Make it so that the indent string used by the tree renderer needs no init 2005-12-22 03:33:56 +01:00
Jonas Fonseca
f21fcd132f Oops, do not define DOM_TREE_RENDERER by default 2005-12-22 03:21:52 +01:00
Jonas Fonseca
6cb9a841b6 Add FIXME about optimizing walk_dom_nodes() 2005-12-22 03:20:11 +01:00
Jonas Fonseca
eab6c19bbe Add lots of comments and FIXMEs 2005-12-22 03:19:53 +01:00
Jonas Fonseca
25e0a18b7f Misc DOM select fixes
 - ensure done_dom_stack() is called after parsing is done
 - get_dom_select_data(): Use stack->current->data since it is used within
   dom_stack_callback_T
 - dom_select_pop_element(): Use stack->contexts since it is the
   select_data stack and it owns the first context
 - add the select_data context to the right stack
 - fix some comments
2005-12-21 23:26:22 +01:00
Jonas Fonseca
45592ea5a7 Make the DOM tree renderer thing usable without a dom_renderer defined
It now hardcodes the codepage to ASCII when showing entity references.
The tree debug context info is also exported.
2005-12-21 22:32:27 +01:00
Jonas Fonseca
d6c5640f29 Turn the DOM tree renderer into a debug module
Define DOM_TREE_RENDERER and run as:

	ELINKS_LOG=/tmp/dom-dump.txt ./elinks -no-connect <url>

to have a trace of DOM tree dumped into a file.
2005-12-21 14:41:28 +01:00
Jonas Fonseca
fe6637dd7d Fix the DOM tree renderer to work with the new stack interface 2005-12-21 14:05:01 +01:00
Jonas Fonseca
419d9d165a get_dom_stack_state_data(): Make static inline and handle zero object size 2005-12-21 13:56:18 +01:00
Jonas Fonseca
9360f88d65 search_dom_stack(): No need to inline this at least not while debugging 2005-12-21 13:48:37 +01:00
Jonas Fonseca
779a8a4553 Improve comments 2005-12-21 13:46:28 +01:00
Jonas Fonseca
edee14699e Reorder some structs and fix some comments 2005-12-21 04:57:25 +01:00
Jonas Fonseca
2a24ad9099 Introduce enum dom_stack_flag to make init_dom_stack() more obvious 2005-12-21 04:48:50 +01:00
Jonas Fonseca
632f12f82a init_dom_stack(): Drop unused object_size argument 2005-12-21 04:41:35 +01:00
Jonas Fonseca
f8d48e81eb Move the state_objects to the DOM stack contexts
This way all contexts are now separated, almost.
2005-12-21 04:38:04 +01:00
Jonas Fonseca
910c51abaf Remove the now unused DOM stack data member 2005-12-21 03:59:46 +01:00
Jonas Fonseca
da33827771 Use a (for now bogus) DOM stack context for holding DOM select data 2005-12-21 03:57:17 +01:00
Jonas Fonseca
3374f3cbba Drop data member from struct sgml_parser it is at stack->current->data 2005-12-21 01:36:47 +01:00
Jonas Fonseca
c524655362 Add current member to struct dom_stack which holds the current context 2005-12-21 01:32:43 +01:00
Jonas Fonseca
0faa8d7462 Add a data member to struct dom_stack_context (not used yet) 2005-12-21 01:25:50 +01:00
Jonas Fonseca
12a2f96920 Introduce struct dom_stack_context
 - For now it just stores the dom_stack_context_info reference.
 - Rename dom_stack members to contexts and contexts_size.
 - Use mem_alloc_align based reallocation scheme.
2005-12-21 01:15:19 +01:00
Jonas Fonseca
a77242738c Rename: add_dom_stack_callbacks() -> add_dom_stack_context() 2005-12-21 01:04:37 +01:00
Jonas Fonseca
3843be25af Rename: struct dom_stack_callbacks -> struct dom_stack_context_info 2005-12-21 00:58:22 +01:00
Jonas Fonseca
0834e77252 Make the DOM renderer add its own DOM stack callbacks 2005-12-20 21:10:09 +01:00
Jonas Fonseca
625725f0e9 Allow for multiple callbacks to be attached to the DOM stack 2005-12-20 20:27:20 +01:00
Jonas Fonseca
e309de8950 Introduce call_dom_stack_callbacks as a common way to call back 2005-12-20 20:01:18 +01:00
Jonas Fonseca
d6b125fa68 Drop the return value from dom_stack_callback_T
... since the feature with popping the node if the return value is NULL is
not used and it doesn't make a lot of sense with multiple callbacks.
2005-12-20 19:48:33 +01:00
Jonas Fonseca
990c5e0a26 Combine DOM stack push and pop callbacks into one struct 2005-12-20 19:20:04 +01:00
Jonas Fonseca
ec9f41c1cd Retire specialized proc-instruction DOM renderer callback
It now uses the DOM element callback. Before the proc-instruction
attributes were shown twice.
2005-12-20 03:25:51 +01:00
Jonas Fonseca
56d634b946 Add basic support for RSS parsing for application/rss+xml content types
This means the RSS source will be highlighted, but by default the HTML
renderer will be used for the default rendering.
2005-12-20 03:08:13 +01:00
Jonas Fonseca
5777941d06 DOM select: Completely rewrite the parser for nth arguments 2005-12-20 01:50:39 +01:00
Jonas Fonseca
c2e30c8eea get_child_dom_select_node(): Use the foreach_dom_node iterator 2005-12-20 01:48:21 +01:00
Jonas Fonseca
ceffe8f1a4 Make the SGML parser ready for (specializing) doctypes other than HTML 2005-12-20 01:04:33 +01:00
Jonas Fonseca
8e769d48a5 Misc cleanups and improvements 2005-12-20 00:01:18 +01:00
Jonas Fonseca
3b412553b6 match_attribute_selectors(): Fix warning about uninitialized attr variable
Outspitten on FreeBSD.
2005-12-19 22:13:23 +01:00
Jonas Fonseca
2e2c0a590e Add basic functionality for selecting DOM nodes based on CSS3 selectors
The design should more or less be in place. There are still a lot of things
missing but it should actually be enough for using it in a simple RSS renderer.

Amongst several things, :nth-* pseudo-classes and :not() syntax are not in
place.
2005-12-19 03:44:18 +01:00
Jonas Fonseca
b64e122159 Change order of variables given to foreach_dom_node iterators 2005-12-19 02:57:00 +01:00
Jonas Fonseca
330c0174e5 Rename DOM stack iterators and make them include all states when iterating
They are now called: foreach{back,}_dom_stack_state (...) and the immutable
flag together with node type restricted stack searches should ensure that
the document root node never is popped.
2005-12-19 02:51:32 +01:00
Jonas Fonseca
051db70dd4 Add boolean immutable flag to the DOM stack state
Can be used to ensure the document root node never leaves the stack while
parsing.
2005-12-19 02:34:26 +01:00
Jonas Fonseca
ee1eba9689 Rename: dom_stack_has_parents() -> dom_stack_is_empty() (with negated value) 2005-12-19 02:15:36 +01:00
Jonas Fonseca
bc338207e7 do_pop_dom_node(): move dom_stack_has_parents() to assertion
All callers already check if the stack is empty.
2005-12-19 02:05:43 +01:00
Jonas Fonseca
45861c68e1 pop_dom_state(): Drop unused left-over argument 2005-12-15 22:05:30 +01:00
Jonas Fonseca
ef5d5fc27a dom_node_cmp(): Only use element or attribute type ID if both are set 2005-12-15 22:02:02 +01:00
Jonas Fonseca
c2d27a33d8 Rename: nav -> stack 2005-12-15 17:24:20 +01:00
Jonas Fonseca
5ef041c051 Redo the assertm() message to just show the type of the node and parent 2005-12-13 20:08:58 +01:00
Jonas Fonseca
d1635d6970 Fix wrong assertion message string
You just cannot print dom_string structs with %s.
2005-12-13 16:33:50 +01:00
Jonas Fonseca
f35026ecfb Add DOM_NODE_UNKNOWN node type for internal purposes only 2005-12-13 04:52:47 +01:00
Jonas Fonseca
5ff0849eb3 set_dom_string(): take length as size_t; -1 means use strlen() to get size 2005-12-12 17:42:26 +01:00
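The length convention named in the commit above can be sketched as follows. This is a minimal illustration only: the struct layout and function body are assumptions made for the example, not the code from this commit.

```c
#include <assert.h>
#include <string.h>

/* Assumed layout for illustration; the real struct may differ. */
struct dom_string {
	size_t length;
	char *string;
};

/* Point `string` at `value`. Passing (size_t) -1 as `length` asks the
 * function to measure `value` with strlen(); any other length is stored
 * as given, so callers can reference a region that is not NUL-terminated. */
static void
set_dom_string(struct dom_string *string, char *value, size_t length)
{
	assert(string && value);
	string->string = value;
	string->length = (length == (size_t) -1) ? strlen(value) : length;
}
```

Because size_t is unsigned, -1 converts to the largest representable size, which no real string length reaches, so it can serve as a sentinel without a separate flag argument.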
Jonas Fonseca
27116d6385 Make the DOM stack and the SGML parser interface more general
They now both hold a single private data member. This means the parser now
holds the renderer data.
2005-12-12 17:41:09 +01:00
Jonas Fonseca
458fc5ee79 Review and change dom_string specific uint16_t value to size_t 2005-12-10 22:24:30 +01:00
Jonas Fonseca
cdc749def3 get_dom_node_value(): indent switch 2005-12-10 21:50:40 +01:00
Jonas Fonseca
86c9a37810 Factor out dom_string_casecmp() 2005-12-10 21:49:33 +01:00
Jonas Fonseca
87aad88c96 Use dom_string throughout the DOM stack interface 2005-12-10 21:42:49 +01:00
Jonas Fonseca
0fab644bee get_dom_node_value(): move non-compliant functionality to the tree renderer
Entity references are supposed to have a null value and the string
compression is for improving the tree view.
2005-12-10 21:37:47 +01:00
Jonas Fonseca
52f5276f92 get_dom_node_name(): indent switch statement 2005-12-10 20:05:01 +01:00
Jonas Fonseca
295679a5e6 get_dom_node_name(): return struct dom_string *
Also, simplify the rendering a bit for now.
2005-12-10 20:03:43 +01:00
Jonas Fonseca
7d6db6b152 Update the DOM tree renderer to support dom_strings 2005-12-10 19:44:01 +01:00
Jonas Fonseca
2aedeb0a67 get_dom_node_type_name(): return struct dom_string * 2005-12-10 19:28:37 +01:00
Jonas Fonseca
5f69255cbc get_dom_node_map_entry(): take name as a dom_string
Requires that dom_string->length becomes a size_t.
2005-12-10 19:21:12 +01:00
Jonas Fonseca
0546759b4b Use struct dom_string for node->proc_instruction.instruction 2005-12-10 18:59:17 +01:00
Jonas Fonseca
9935bf2083 Convert some yet unused strings to use the dom_string struct
Also remove the unneeded path member from the dom_node_id_item struct. It
was obsoleted by the addition of dom_node->parent.
2005-12-10 18:51:08 +01:00
Jonas Fonseca
ed7a292966 Use struct dom_string for node->attribute.value 2005-12-10 18:42:54 +01:00
Jonas Fonseca
ce3778c3c0 Add struct dom_string
In time it should be used for all strings in the DOM engine.
For now it is just used for node->string.
2005-12-10 18:37:47 +01:00
Jonas Fonseca
8aa078393a Move dom_node_data union outside the dom_node struct 2005-12-08 03:26:34 +01:00
Jonas Fonseca
4480a9a4cd Removes node from the DOM tree when using the SGML stream parser
That should free up some short-term memory.

Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
2005-12-08 03:02:27 +01:00
Jonas Fonseca
8f97dc8403 done_dom_node(): remove the node from all parent lists
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
2005-12-08 02:59:40 +01:00
Jonas Fonseca
93fb17ea2a Indent switch statement
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
2005-12-08 02:32:23 +01:00
Jonas Fonseca
ce5bf8c6f8 Fix DOM node list iterators macros
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
2005-12-08 02:04:13 +01:00
Jonas Fonseca
1c2f271782 Add parent member to dom_node
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
2005-12-08 01:35:48 +01:00
Jonas Fonseca
69b321cb5b Replace struct cache_entry member with struct uri member
Reduces the number of (unused) dependencies. Also, update the #include
list removing old entries.
2005-12-05 23:47:23 +01:00
Jonas Fonseca
38b8503161 Remove document member from struct sgml_parser
The document URI can be accessed from the cache entry.
2005-12-05 23:31:54 +01:00
Jonas Fonseca
c7ad6f967b Introduce new pop_dom_state()
It's basically pop_dom_nodes() without the search part and is now used as a
backend in pop_dom_nodes(). Use it in parse_sgml_document() to avoid two
DOM stack searches in a row.
2005-12-05 19:40:35 +01:00
Jonas Fonseca
1c4a0d67ce Restore highlighting of element end-tags
... by installing a pop-callback for elements and responding to whatever
the parser has put in the end_token parser state member.
2005-12-05 19:33:15 +01:00
Jonas Fonseca
9aebb66bce Introduce separate push/pop callbacks
This should make it possible to do the SAX (parser stream thing) for XBEL.
And will also be used for fixing the end-tag highlighting.
2005-12-05 19:31:42 +01:00
Jonas Fonseca
7a912795e1 Make the parser stream mode work as intended
This makes the parser and renderer share the stack, most importantly the
callbacks are now those of the renderer. Disable node attribute rendering
code that worked around the attributes being visited in sorted order.
2005-12-05 19:25:13 +01:00
Jonas Fonseca
2dddf86acc Initialize the renderer before initializing the parser
... so it is ready when/if the parser will push any initial states.
2005-12-05 19:20:48 +01:00
Jonas Fonseca
65b504f093 Remove all traces of the element end-tag highlighting hack
End-tags will stay uncolored for the next few commits.
2005-12-05 11:21:08 +01:00
Jonas Fonseca
4c8d871404 Introduce sgml_parser_type for specifying tree and streaming parsers
Mostly added for the future, since currently only one parser type is
supported.
2005-12-05 11:19:13 +01:00
Jonas Fonseca
05a61cd16a Update the DOM stack comment for things to come 2005-12-05 11:16:52 +01:00
Jonas Fonseca
8f25d73013 Use separate data variables for storing DOM stack data
They will eventually have to share the stack.
2005-12-05 11:16:05 +01:00
Jonas Fonseca
f85b498375 Add FIXME about using DOM node subtypes when searching the DOM stack 2005-12-05 11:11:40 +01:00
Jonas Fonseca
208515c82f Fix access to free()d memory in the DOM stack state data
There already was one work-around in the code related to clearing of the
state data after popping. Instead of storing a pointer to the state data in
the state we now store the depth of the state (in the stack) and use a
macro to access the state data. The bug occurred when reallocating the
stack state objects and the stack data pointers weren't updated to point to
the newly allocated data.
2005-12-05 11:11:06 +01:00
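The fix described in the commit above (store the state's depth and resolve the data through a macro, instead of caching a pointer into a block that may be reallocated) might look roughly like the sketch below. All names here (struct state, struct stack, get_state_data, push_state) are hypothetical and simplified; they are not the ELinks definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the ELinks structures. */
struct state {
	size_t depth;		/* Index of this state's data slot, not a pointer. */
};

struct stack {
	struct state *states;
	char *state_objects;	/* One object_size-sized slot per state. */
	size_t object_size;
	size_t depth;
};

/* Resolve the slot on every access so it stays valid after realloc(). */
#define get_state_data(stack, state) \
	((void *) &(stack)->state_objects[(state)->depth * (stack)->object_size])

/* Push a zeroed state. Returns its index, or (size_t) -1 if allocation fails. */
static size_t
push_state(struct stack *stack)
{
	struct state *states;
	char *objects;
	size_t depth = stack->depth;

	assert(stack->object_size > 0);

	states = realloc(stack->states, (depth + 1) * sizeof(*states));
	if (!states) return (size_t) -1;
	stack->states = states;

	/* This call may move the whole block; a cached pointer would dangle. */
	objects = realloc(stack->state_objects, (depth + 1) * stack->object_size);
	if (!objects) return (size_t) -1;
	stack->state_objects = objects;

	memset(&objects[depth * stack->object_size], 0, stack->object_size);
	states[depth].depth = depth;
	stack->depth = depth + 1;

	return depth;
}
```

The index costs one multiplication per access, but it removes the need to patch every stored pointer whenever the state objects are moved.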
Jonas Fonseca
3a9cba5695 Split the parser interface up into init, parser and done steps
The parser will eventually have to live across parses for incremental
rendering. Also the renderer and parser need to share the DOM stack.
2005-11-27 09:18:40 +01:00
Laurent MONIN
3dd81f003e Fix trailing whitespaces. 2005-11-25 09:38:32 +01:00
Jonas Fonseca
b42b098fd4 Remove unused root member of struct dom_renderer 2005-11-15 12:11:48 +01:00
Jonas Fonseca
41941c64d6 Improve the DOM stack structures comments 2005-11-15 11:21:01 +01:00
Jonas Fonseca
16481d7baa Move DOM exception enum to separate file
... and remove it from the dom_stack struct.
2005-11-15 11:01:11 +01:00
Jonas Fonseca
bccfbf8647 Rename DOM navigator -> stack
This is really a much more appropriate word since it never ended up being
more than just a stack. The rename also changes the symbol names to use the
much shorter "stack".
2005-11-15 10:43:52 +01:00
Laurent MONIN
df065ead80 Remove now useless $Id: lines. 2005-10-21 09:14:07 +02:00
Jonas Fonseca
c88afeb1c2 path_to_top -> top_builddir 2005-10-20 04:00:35 +02:00
Jonas Fonseca
e39a4342d6 Include $(top_srcdir)/Makefile.lib instead of $(path_to_top)/Makefile.lib
A step towards out of tree builds ...
2005-10-20 01:11:47 +02:00
Jonas Fonseca
1efab31581 Simplify building of and linking with directories
Ditch the building of an archive (.a) in favour of linking all objects in a
directory into a lib.o file. This makes it easy to link in subdirectories
and more importantly keeps the build logic in the local subdirectories.

Note: after updating you will have to rm **/*.a if you do not make clean
before updating.
2005-09-27 21:38:58 +02:00
Jonas Fonseca
b30064c0d0 Rename targets: *-l -> *-local 2005-09-27 21:11:28 +02:00
Jonas Fonseca
50f4b46616 dom_node_cmp(): Minor optimization 2005-09-27 14:39:40 +02:00
Petr Baudis
1f0cd14e91 Converted another bunch of submakefiles to ELBuild 2005-09-16 04:07:37 +02:00
Jonas Fonseca
7462f22635 Remove now obsolete .cvsignore files. 2005-09-15 18:33:20 +02:00
Petr Baudis
0f6d4310ad Initial commit of the HEAD branch of the ELinks CVS repository, as of
Thu Sep 15 15:57:07 CEST 2005. The previous history can be added to this
by grafting.
2005-09-15 15:58:31 +02:00