mirror of https://github.com/rkd77/elinks.git synced 2025-02-02 15:09:23 -05:00

33 Commits

Author SHA1 Message Date
Jonas Fonseca
00c4e0bfa2 Do not attempt to read *string when string == scanner->end
There might be other places that need to be reviewed for this.
2006-01-28 03:23:06 +01:00
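
To illustrate the kind of guard this commit describes, here is a minimal sketch in C. It is not the ELinks code: the function at_tag_end() and its whitespace handling are hypothetical, and only the idea of comparing the pointer against the end of the buffer before dereferencing it is taken from the commit message.

```c
#include <stddef.h>

/* Hypothetical helper, not part of ELinks: report whether the next
 * non-blank character in [string, end) is '>'. The buffer is not
 * assumed to be NUL-terminated, so every dereference of `string` is
 * preceded by a `string < end` check. */
static int
at_tag_end(const char *string, const char *end)
{
	/* Skip blanks, but never step past the end of the buffer. */
	while (string < end && (*string == ' ' || *string == '\t'))
		string++;

	/* When string == end there is nothing left to read, so the
	 * comparison short-circuits before *string is evaluated. */
	return string < end && *string == '>';
}
```

The short-circuit in the return statement is the whole point: without it, a buffer that ends in blanks would cause a read one byte past the scanned data.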
Jonas Fonseca
d92a074e40 Fix parsing of '<a< b>' where the scanner didn't rewind to the proper place
Add test for this tag soup combo.
2006-01-28 03:21:27 +01:00
Jonas Fonseca
bccf5512d6 Force an incomplete token for quoted attribute values when there's no end 2006-01-28 00:56:48 +01:00
Jonas Fonseca
9d91994f3c Postpone updating the scanner->state until incompleteness has been checked
That way the scanner state is meaningful when resuming during incremental
parsing.
2006-01-27 07:41:42 +01:00
Jonas Fonseca
afb45aace5 Add support for scanning comment endings such as '--!>' correctly 2006-01-25 18:18:01 +01:00
Jonas Fonseca
acb1f7e74d Refactor computation of scanner error string length to get_sgml_error_end() 2006-01-07 23:51:19 +01:00
Jonas Fonseca
534a16fff1 Improve error detection 2006-01-07 23:40:21 +01:00
Jonas Fonseca
3835bf8449 A handful of fixes related to error detection
 - Fix assertion failure by breaking the switch if an error token is next
   when previous was a processing instruction.
 - Fix <!notation parsing by skipping ident chars instead of spaces.
 - Improve checking of processing instruction 'target'-end and what error
   string is generated.
 - For now put all of the processing instruction data in the error token.
 - Remove a DBG()-print.
2006-01-07 05:18:43 +01:00
Jonas Fonseca
c993a0012e Add basic support for detecting errors while scanning
It mostly uses the checking for incompleteness already in place. Tested
lightly so it will definitely need some more work.
2006-01-07 04:26:08 +01:00
Jonas Fonseca
7ff2cb2607 Improve a comment a bit 2006-01-07 01:41:07 +01:00
Jonas Fonseca
f8d44ffe32 scan_sgml_tokens(): Drop local variable and use scanner->current
... so lower level scanners can change the next token to use.
2006-01-07 01:25:42 +01:00
Jonas Fonseca
f75ccffbc7 Fix SGML parsing and scanning so that all tests succeed
This includes checking the return token of get_next_dom_scanner_token() and
fixing the calculated size of recovered processing instruction data tokens.
2006-01-02 21:04:51 +01:00
Jonas Fonseca
e78d43f1ac Add mode where the SGML scanner checks for completeness 2006-01-02 17:46:09 +01:00
Jonas Fonseca
fcf7677584 Skip spaces immediately when recognising '<?ident' 2006-01-02 16:58:48 +01:00
Jonas Fonseca
58c31f44a0 Clarify the code a bit 2006-01-02 03:06:47 +01:00
Jonas Fonseca
6e9a18b444 Fix a few bugs in line counting in plain text 2006-01-02 01:49:12 +01:00
Jonas Fonseca
4a766f350b Just for fun also parse <?xml-stylesheet attributes 2005-12-31 03:13:39 +01:00
Jonas Fonseca
a578ed4667 Make the SGML scanner (optionally) keep track of line numbers
A new line is either \n or \f. The main logic for counting lines is in
skip_sgml{,_chars,_space}. For the general case where line numbers are not
wanted the code tries to avoid the extra checks for newlines.

This will be useful for reporting errors when loading the XBEL file.
2005-12-31 02:46:56 +01:00
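
The line-counting approach described in this commit can be sketched roughly as follows. This is an assumption-laden illustration, not the ELinks implementation: struct line_scanner, is_newline() and skip_chars() are hypothetical stand-ins for the real scanner struct and the skip_sgml*() helpers. Only two points come from the commit message: a new line is either \n or \f, and the common case without line numbers should avoid the extra newline checks.

```c
#include <stddef.h>

/* Hypothetical, simplified scanner; not the ELinks dom_scanner. */
struct line_scanner {
	const char *position;	/* Next character to read */
	const char *end;	/* One past the last character */
	int count_lines;	/* Non-zero when line numbers are wanted */
	unsigned int lineno;	/* Current line number, starting at 1 */
};

/* Per the commit message, a new line is either \n or \f. */
static int
is_newline(char c)
{
	return c == '\n' || c == '\f';
}

/* Advance the scanner to the first occurrence of `stop`. Returns a
 * pointer to it, or NULL if the end of the buffer was reached. */
static const char *
skip_chars(struct line_scanner *scanner, char stop)
{
	const char *p = scanner->position;

	if (!scanner->count_lines) {
		/* General case: no per-character newline check. */
		while (p < scanner->end && *p != stop)
			p++;
	} else {
		/* Line tracking requested: count newlines on the way. */
		while (p < scanner->end && *p != stop) {
			if (is_newline(*p))
				scanner->lineno++;
			p++;
		}
	}

	scanner->position = p;
	return p < scanner->end ? p : NULL;
}
```

Splitting the loop on count_lines keeps the newline test out of the inner loop when line numbers are not needed, which is one plausible way to get the behaviour the commit message describes.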
Jonas Fonseca
b23beed031 Rename skip_comment() and skip_cdata_section() to conform to skip_sgml_*() 2005-12-31 02:00:09 +01:00
Jonas Fonseca
0891cda51e Introduce skip_sgml_space() that wraps scan_sgml(..., SGML_SCAN_WHITESPACE) 2005-12-31 01:57:54 +01:00
Jonas Fonseca
7489c134f7 Make non-terminated comments and cdata sections have 'the rest' as content 2005-12-31 01:47:57 +01:00
Jonas Fonseca
8f7f6abc16 Use skip_sgml_chars() in skip_comment() and skip_cdata_section() 2005-12-31 01:40:52 +01:00
Jonas Fonseca
4e10bcf772 Drop useless code for proc. instruction scanning 2005-12-31 01:18:49 +01:00
Jonas Fonseca
e8ff8bd5f0 Fix another off-by-one error similar to the SGML comment parsing 2005-12-31 01:14:52 +01:00
Jonas Fonseca
ab7ba39d42 Introduce skip_sgml_chars() to avoid usage of memchr() 2005-12-31 00:06:12 +01:00
Jonas Fonseca
76a524ddf6 More <?xml and comment tests, fix an off-by-one error in comment skipping 2005-12-29 22:26:39 +01:00
Jonas Fonseca
bd877570d2 Test some more obscure proc. instructions and fix some assertion failures 2005-12-29 21:52:27 +01:00
Jonas Fonseca
57168e1fbc Handle <element path=/to/%61-&\one";files/> as a self-closing tag
Previously, the '/' before '>' would be interpreted as part of the attribute
value.  Hope this is sensible slurping of the markup soup.
2005-12-29 20:38:43 +01:00
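
One way to get the behaviour described in this commit is sketched below. The function unquoted_value_end() is hypothetical and the stop conditions (whitespace or '>') are an assumption about how an unquoted attribute value is delimited. What comes from the commit is only the rule that a '/' directly before '>' belongs to the self-closing tag ending rather than to the value.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of ELinks: return a pointer just past
 * the end of an unquoted attribute value starting at `value`. */
static const char *
unquoted_value_end(const char *value, const char *end)
{
	const char *p = value;

	/* Assumption: an unquoted value runs until whitespace or '>'. */
	while (p < end && *p != '>' && !isspace((unsigned char) *p))
		p++;

	/* If the value stopped at '>' and its last character is '/',
	 * leave the '/' out so it can be scanned as part of '/>'. */
	if (p < end && *p == '>' && p > value && p[-1] == '/')
		p--;

	return p;
}
```

For the markup from the commit message, the value of path would end just before the final '/', leaving '/>' to be recognised as the end of a self-closing tag. A '/' inside the value, or one followed by whitespace, is left alone.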
Jonas Fonseca
1a177491a0 Fix SGML parsing of processing instructions (<?xml ...?>)
It involves adding a new scanner state which is used only to generate a new
processing instruction (PI) data token. This removes some scanner specific
code from the parser and makes handling of PIs more generic. The data of
XML PIs are still parsed as attributes and added to the PI node.

The 6th test now succeeds. Hurrah!
2005-12-29 18:31:49 +01:00
Jonas Fonseca
fb6ca9a390 Use dom_string for storing the name member of dom_scanner_string_mapping 2005-12-28 21:10:05 +01:00
Jonas Fonseca
f1015f8a6a Make files include dom/string.h instead of util/string.h directly 2005-12-28 20:45:55 +01:00
Jonas Fonseca
6e163b186c Make the dom_scanner_token store its string in a dom_string struct 2005-12-28 16:23:36 +01:00
Jonas Fonseca
71533eef9a Elute all DOM-related code and put it in src/dom 2005-12-28 14:05:14 +01:00