elinks

mirror of https://github.com/rkd77/elinks.git synced 2024-11-04 08:17:17 -05:00

Author	SHA1	Message	Date
Kalle Olavi Niemitalo	e45f5a8915	utf8char_len_tab[] is const. This change moves 256 bytes of data into a read-only section, perhaps reducing memory consumption when multiple ELinks processes run in parallel.	2007-01-01 17:18:05 +02:00
Kalle Olavi Niemitalo	cde14dcd18	utf8_to_unicode: Reject characters in the surrogate range. This isn't CESU-8.	2006-12-23 01:48:07 +02:00
Kalle Olavi Niemitalo	114ce8c833	utf8_to_unicode: Reject invalid sequences, such as overlong. Convert each byte of them to UCS_REPLACEMENT_CHARACTER. This may not be the optimal solution but at least it ought to be safe. Also raise an internal error if the value read from utf8char_len_tab[] is out of range. Note that ELinks is still using the RFC 2279 definition of UTF-8 and thus allows characters up to 0x7FFFFFFF, even though RFC 3629 has changed the maximum to 0x10FFFF.	2006-12-20 22:08:34 +02:00
Kalle Olavi Niemitalo	8b8cd57941	Use new macro UCS_ORPHAN_CELL for broken double-cell characters. UCS_ORPHAN_CELL is currently defined as U+0020 SPACE, which was already used before this macro, so the behaviour does not change, but the code seems clearer now. I searched for ' ' and 32 and 0x20 and \x20, and replaced with UCS_ORPHAN_CELL wherever UCS_NO_CHAR was involved. However, some BFU widgets first draw spaces and then overwrite with text; those will require a more complex fix if UCS_ORPHAN_CELL is ever changed to some other character.	2006-11-13 00:49:59 +02:00
Kalle Olavi Niemitalo	7809efa1b5	Names of enum constants should be in upper case. Requested by Miciah.	2006-11-12 14:51:18 +02:00
Kalle Olavi Niemitalo	40b6edc69d	u2cp_: Make the no_nbsp_hack parameter an enum. This is from attachment 279 of bug 811. The change does not yet affect any visible behaviour.	2006-11-12 14:29:09 +02:00
Jonas Fonseca	180c8befac	Fix linker warning on Mac OS X by prefixing locale_charset with "elinks_" /usr/bin/ld: warning multiple definitions of symbol _locale_charset lib.o definition of _locale_charset in section (__TEXT,__text) /usr/lib/libiconv.dylib(localcharset.o) definition of _locale_charset	2006-11-04 08:46:45 +01:00
Kalle Olavi Niemitalo	d050cb67aa	Revert the use of wcwidth() and describe why. This reverts the following commits: - `86ed79deaf` Use wcwidth if available and applicable. - `304f5fa1ea` comment fix (__STDC_ISO_10646__, not __STDC_ISO_10646) - part of `71eebf1cc7` Compensate for glibc not defining wcwidth() when _XOPEN_SOURCE is not set And adds a lengthy comment about LC_CTYPE problems.	2006-10-22 00:05:37 +03:00
Petr Baudis	71eebf1cc7	Compensate for glibc not defining wcwidth() when _XOPEN_SOURCE is not set	2006-10-12 23:43:49 +02:00
Laurent MONIN	09991b59f1	Partial Afrikaans translation was added. Thanks to Friedel Wolff.	2006-10-11 14:39:04 +02:00
Laurent MONIN	e86e1d0fa3	Trim some trailing whitespaces.	2006-09-29 00:07:54 +02:00
Kalle Olavi Niemitalo	304f5fa1ea	comment fix (__STDC_ISO_10646__, not __STDC_ISO_10646)	2006-09-25 22:24:56 +03:00
Kalle Olavi Niemitalo	86ed79deaf	Use wcwidth if available and applicable. (If wchar_t is not Unicode, then it is not applicable.)	2006-09-24 23:56:12 +03:00
Kalle Olavi Niemitalo	4a5af7fd26	Bug 381: Store codepage-to-Unicode mappings as dense arrays. Previously, each mapping between a codepage byte and a Unicode character was stored as a struct table_entry, which listed both the byte and the character. This representation may be optimal for sparse mappings, but codepages map almost every possible byte to a character, so it is more efficient to just have an array that lists the Unicode character corresponding to each byte from 0x80 to 0xFF. The bytes are not stored but rather implied by the array index. The tcvn5712 and viscii codepages have a total of four mappings that do not fit in the arrays, so we still use struct table_entry for those. This change also makes cp2u() operate in O(1) time and may speed up other functions as well. The "sed \| while read" concoction in Unicode/gen-cp looks rather unhealthy. It would probably be faster and more readable if rewritten in Perl, but IMO that goes for the previous version as well, so I suppose whoever wrote it had a reason not to use Perl here. Before: text data bss dec hex filename 38948 28528 3311 70787 11483 src/intl/charsets.o 500096 85568 82112 667776 a3080 src/elinks After: text data bss dec hex filename 31558 28528 3311 63397 f7a5 src/intl/charsets.o 492878 85568 82112 660558 a144e src/elinks So the text section shrank by 7390 bytes. Measured on i686-pc-linux-gnu with: --disable-xbel --disable-nls --disable-cookies --disable-formhist --disable-globhist --disable-mailcap --disable-mimetypes --disable-smb --disable-mouse --disable-sysmouse --disable-leds --disable-marks --disable-css --enable-small --enable-utf-8 --without-gpm --without-bzlib --without-idn --without-spidermonkey --without-lua --without-gnutls --without-openssl CFLAGS="-Os -ggdb -Wall"	2006-09-24 16:55:29 +03:00
Kalle Olavi Niemitalo	0e88f8ba28	Bug 381: New macro is_cp_ptr_utf8(cp_ptr). This does not significantly change the generated code but should make the next commit more readable.	2006-09-24 13:33:58 +03:00
Kalle Olavi Niemitalo	e1fee49fb7	Bug 381: Halve sizeof(struct table_entry). Before: text data bss dec hex filename 54920 28528 3311 86759 152e7 src/intl/charsets.o 516064 85568 82112 683744 a6ee0 src/elinks After: text data bss dec hex filename 38958 28528 3311 70797 1148d src/intl/charsets.o 500112 85568 82112 667792 a3090 src/elinks So the text section shrank by 15962 bytes. Measured on i686-pc-linux-gnu with: --disable-xbel --disable-nls --disable-cookies --disable-formhist --disable-globhist --disable-mailcap --disable-mimetypes --disable-smb --disable-mouse --disable-sysmouse --disable-leds --disable-marks --disable-css --enable-small --enable-utf-8 --without-gpm --without-bzlib --without-idn --without-spidermonkey --without-lua --without-gnutls --without-openssl CFLAGS="-Os -ggdb -Wall"	2006-09-24 12:47:00 +03:00
Kalle Olavi Niemitalo	62d6db44aa	Bug 381: Make codepage data const. Before: text data bss dec hex filename 25726 62992 3343 92061 1679d src/intl/charsets.o 653856 120020 82144 856020 d0fd4 src/elinks After: text data bss dec hex filename 60190 28528 3311 92029 1677d src/intl/charsets.o 688320 85556 82112 855988 d0fb4 src/elinks So 34464 bytes were moved from the data section to the text section and should be more likely to get shared between ELinks processes. Measured on i686-pc-linux-gnu with: --disable-xbel --disable-nls --disable-cookies --disable-formhist --disable-globhist --disable-mailcap --disable-mimetypes --disable-smb --disable-mouse --disable-sysmouse --disable-leds --disable-marks --disable-css --enable-small --enable-utf-8 --without-gpm --without-bzlib --without-idn --without-spidermonkey --without-lua --without-gnutls --without-openssl CFLAGS="-O2 -ggdb -Wall"	2006-09-24 11:59:23 +03:00
Kalle Olavi Niemitalo	9c94a896b7	Internally rename the utf_8 codepage to utf8.	2006-09-17 16:23:17 +03:00
Kalle Olavi Niemitalo	92cb452a9e	Rename CONFIG_UTF_8 to CONFIG_UTF8. The configure script no longer recognizes "CONFIG_UTF_8=yes" lines in custom features.conf files. They will have to be changed to "CONFIG_UTF8=yes". This incompatibility was deemed acceptable because no released version of ELinks supports CONFIG_UTF_8. The --enable-utf-8 option was not renamed.	2006-09-17 16:12:47 +03:00
Kalle Olavi Niemitalo	e8462980e5	Change "utf_8" to "utf8" in most identifiers. Suggested by Miciah on #elinks. What was renamed: add_utf_8 => add_utf8 cp2utf_8 => cp2utf8 encode_utf_8 => encode_utf8 get_translation_table_to_utf_8 => get_translation_table_to_utf8 goto invalid_utf_8_start_byte => goto invalid_utf8_start_byte goto utf_8 => goto utf8 goto utf_8_select => goto utf8_select terminal_interlink.utf_8 => terminal_interlink.utf8 utf_8_to_unicode => utf8_to_unicode What was not renamed: terminal._template_.utf_8_io option, TERM_OPT_UTF_8_IO Compatibility with existing elinks.conf files would require an alias. --enable-utf-8 Because the name of the charset is UTF-8, --enable-utf-8 looks better than --enable-utf8. CONFIG_UTF_8 Will be renamed in a later commit. Unicode/utf_8.cp, table_utf_8, aliases_utf_8 Will be renamed in a later commit.	2006-09-17 16:06:22 +03:00
Kalle Olavi Niemitalo	a01be8bd6b	UTF-8: Stepping functions set count even if an assertion fails. Previously, utf8_step_forward() and utf8_step_backward() left count unchanged if some parameter was invalid. Now, they properly store 0. This flaw had no effect in practice, because all current callers pass count=NULL, and invalid parameters shouldn't be used anyway.	2006-09-03 03:08:56 +03:00
Kalle Olavi Niemitalo	216495188a	UTF-8: New functions for stepping forward and backward in a string.	2006-09-02 21:48:03 +03:00
Kalle Olavi Niemitalo	38fe5b72f7	Define and use macros for handling UTF-16 surrogates.	2006-08-24 23:30:41 +03:00
Kalle Olavi Niemitalo	0748ee8c92	UTF-8: Split UCS_REPLACEMENT_CHARACTER off UCS_NO_CHAR. In the previous version, invalid UTF-8 from a terminal caused UCS_NO_CHAR (0xFFFFFFFD) to be stored in a term_event_key_T, resulting in -3 which was then incidentally treated as an unassigned special key. Now, invalid UTF-8 is instead mapped to UCS_REPLACEMENT_CHARACTER and treated as a character. The fact that handle_interlink_event calls term_send_ucs when it receives invalid UTF-8 makes it pretty clear that this is how it was intended. src/viewer/text/link.c (not changed in this commit) already referred to UCS_REPLACEMENT_CHARACTER in a comment even though it was not previously defined.	2006-08-19 13:35:21 +03:00
Jonas Fonseca	8a8fc52eec	cp_to_unicode: Simplify and cleanup	2006-08-14 02:19:44 +02:00
Kalle Olavi Niemitalo	85d4c5679c	UTF-8 doc: unicode_fold_label_case() may be called for any Unicode character. It cannot be restricted just to characters that have passed check_kbd_label_key(), because hotkeys in strings received from gettext must also be processed with it, and there we don't have a struct term_event for check_kbd_label_key().	2006-08-13 23:41:48 +03:00
Kalle Olavi Niemitalo	b6447ae26b	UTF-8: New function cp_to_unicode().	2006-08-13 23:35:50 +03:00
Kalle Olavi Niemitalo	143d95e927	UTF-8: Reuse the translation table if it is from the same charset. This change makes ELinks more efficient and causes bug 782 to occur far less often. (That does not mean the bug should not be fixed.)	2006-08-12 21:10:39 +02:00
Kalle Olavi Niemitalo	a14074a763	try_document_key: Convert the key to UCS-4, resolving the FIXME. This requires compiling cp2u() in even without CONFIG_UTF_8. I also added an is_kbd_character macro to make try_document_key more resilient to changes in the definition of term_event_key_T.	2006-08-12 16:04:21 +03:00
Kalle Olavi Niemitalo	f7fd49cf28	UTF-8: New function unicode_fold_label_case and a related script.	2006-08-06 20:02:42 +00:00
Kalle Olavi Niemitalo	8a1d7e2fa3	terminal UTF-8: Translate all input via UCS-4, #ifdef CONFIG_UTF_8.	2006-08-06 20:02:41 +00:00
Witold Filipczyk	5fd284d6a2	The value of UCS_NO_CHAR was bad. There must not be a possibility to encode it using utf_8_to_unicode. If every unicode_val_T value could be a result of that function then one must add out param to the utf_8_to_unicode signaling 'true' UCS_NO_CHAR.	2006-07-31 21:23:47 +02:00
Laurent MONIN	1136aefb71	Trim trailing whitespaces.	2006-07-27 09:51:10 +02:00
Witold Filipczyk	d83068ec85	Proper CFLAGS and LDFLAGS for the Python scripting backend. The patch by M. Levinson. I added DIR to the --with-python	2006-07-26 21:27:57 +02:00
Pavol Babincak	a7a7984d89	Merge with http://www.fi.muni.cz/~xbabinc/elinks/elinks-utf8.git/ without ucdata stuff. UTF-8 code cleanup. Added Pavol Babincak to the AUTHORS	2006-07-25 09:59:12 +02:00
Witold Filipczyk	8c3f931ff0	Wide char could be bigger than 0xffff	2006-07-18 20:33:34 +02:00
Witold Filipczyk	44c74ac389	Refactor is_cp_special to is_cp_utf8	2006-07-18 17:51:03 +02:00
Pavol Babincak	9d1008c523	Added utf8_prevchar for moving through a UTF-8 string to the left.	2006-05-01 22:58:51 +02:00
Pavol Babincak	129bd2f444	Added function utf8_ptr2chars for counting number of characters in string.	2006-04-07 22:06:17 +02:00
Pavol Babincak	79d4d74a22	Added functions for manipulating with UTF-8 strings.	2006-03-05 00:10:33 +01:00
Pavol Babincak	c726080def	Double-width glyph support in terminal draw Added unicode_to_cell detect double-width glyphs. Modified terminal draw to correctly accept double-width glyphs.	2006-02-18 20:28:00 +01:00
Pavol Babincak	f9d67aeb73	Added configure option --enable-utf-8 For enabling better UTF-8 support by Witek and Scrool.	2006-02-18 20:28:00 +01:00
Pavol Babincak	0bacd766e2	Added UTF-8 char length lookup table Added lookup table to quick get number of bytes of UTF-8 character from first byte.	2006-02-18 20:27:50 +01:00
Witold Filipczyk	44a1aa9c87	Witekfl's UTF-8 patch v5.	2006-02-18 20:27:46 +01:00
Jonas Fonseca	2748d043f9	Autogenerate .vimrc files and put the master in config/vimrc This changes the init target to be idempotent: most importantly it will now never overwrite a Makefile if it exists. Additionally 'make init' will generate the .vimrc files. Yay, no more stupid 'added fairies' commits! ;)	2006-01-15 18:38:58 +01:00
Laurent MONIN	b7b33bae9b	Fix a memleak that may occur on systems without alloca(), backport from gettext 0.14.5.	2006-01-13 13:58:13 +01:00
Laurent MONIN	f7e435fcf3	Upgrade config.charset to latest version from gnu gettext.	2006-01-13 13:22:09 +01:00
Jonas Fonseca	d2e346436a	Hmm, seem b.delta decided not to become 0x03B4 like it should	2006-01-10 15:39:11 +01:00
Jonas Fonseca	aa75cade23	Reinsert part of comment for nVDash; fixes `8e0eda5e4d`	2006-01-03 23:38:37 +01:00
Jonas Fonseca	8f18d1c6c8	Rebuild the entity table with Unicode/gen-ent	2006-01-03 17:14:33 +01:00
Jonas Fonseca	9c50072c97	Fix more problems when $(srcdir) is empty Thanks to Kalle Olavi Niemitalo and Adam Golebiowski.	2006-01-01 22:54:44 +01:00
Jonas Fonseca	748bab64a7	Make the printed install paths simpler for man5 files when srcdir == builddir	2005-12-30 00:49:01 +01:00
Jonas Fonseca	acf2ec806b	Remove empty lines in start of header files A left over from the CVS Id removal. Also, for a few files, normalize the order in which things are declared in headers.	2005-11-15 11:33:27 +01:00
Jonas Fonseca	a5d205c047	Add missing host variable and fix installation of charset.alias Add variable expanded from @host@ and use $(host) instead of @host@ in the gettext Makefile. Reported by zas.	2005-10-26 19:06:30 +02:00
Jonas Fonseca	9096886a67	@USE_INCLUDED_LIBINTL@ -> $(CONFIG_NLS), @GLIBC21@ -> $(GLIBC21) Should remove all @Makefile.in@ variable traces except for @host@ also in src/intl/gettext/Makefile (which zas will hopefully fix :).	2005-10-26 14:26:35 +02:00
Jonas Fonseca	e82de325a0	@USE_INCLUDED_LIBINTL@ -> CONFIG_NLS	2005-10-26 14:19:32 +02:00
Laurent MONIN	df065ead80	Remove now useless $Id: lines.	2005-10-21 09:14:07 +02:00
Jonas Fonseca	c88afeb1c2	path_to_top -> top_builddir	2005-10-20 04:00:35 +02:00
Jonas Fonseca	db99a74777	Add support for out-of-tree builds Involves prefixing with $(srcdir) to some of the build rule variables. For the builddir we create Makefiles which simply include the srcdir Makefile. Add list make rule to get list of Makefiles to generate (find will get it wrong for builddirs nested in srcdir). There are still a few minor issues like the file paths echoed during make install ...	2005-10-20 03:49:40 +02:00
Jonas Fonseca	e39a4342d6	Include $(top_srcdir)/Makefile.lib instead of $(path_to_top)/Makefile.lib A step towards out of tree builds ...	2005-10-20 01:11:47 +02:00
Jonas Fonseca	94ed6fa754	Finalize and cleanup the denser Makefile format Convert remaining conditional file building to use OBJS-$(CONFIG_FOO) += foo.o one problem with reverse meaning (in util/) fixed with local 'hack'. Cleanup and remove stuff which is now default targets.	2005-09-28 12:38:17 +02:00
Jonas Fonseca	c76586e6b8	Simplify the conditional building Use the very cool 'VAR-$(CONFIG_FOO) += foo.o' feature instead of the more verbose current ifeq($(CONFIG_FOO),yes) wrapping.	2005-09-27 22:49:47 +02:00
Jonas Fonseca	68de9e35d3	Automagically link in subdir lib.o files It is a little ugly since I couldn't get $(wildcard) to expand .o files so it just checks if there are any .c files and then link in the lib.o based on that.	2005-09-27 22:38:00 +02:00
Jonas Fonseca	1efab31581	Simplify building of and linking with directories Ditch the building of an archive (.a) in favour of linking all objects in a directory into a lib.o file. This makes it easy to link in subdirectories and more importantly keeps the build logic in the local subdirectories. Note: after updating you will have to rm */.a if you do not make clean before updating.	2005-09-27 21:38:58 +02:00
Jonas Fonseca	b30064c0d0	Rename targets: -l -> -local	2005-09-27 21:11:28 +02:00
Jonas Fonseca	23497405cb	More installation fixes - Use the mkinstalldirs in $(top_srcdir)/config - Fix buggy scripting of srcdir which broke the local gettext install - Add the local uninstall stuff so we have it around	2005-09-23 20:30:56 +02:00
Jonas Fonseca	c2879b655b	Merge with git+ssh://pasky.or.cz/srv/git/elinks.git	2005-09-23 14:51:51 +02:00
Strahinya Radich	f86a009108	Add Serbian translation	2005-09-23 14:51:14 +02:00
Petr Baudis	bd0f5ba60d	We call $(mkinstalldirs) $(MKINSTALLDIRS) Should fix make install.	2005-09-23 09:08:47 +02:00
Jonas Fonseca	baadeebab1	More CVS -> GIT conversions Update INSTALL to both 'cover' GIT checkouts and requirements for the new build system. Hacking info no longer advises adding CVS Id lines.	2005-09-18 18:47:20 +02:00
Petr Baudis	e000fffaa6	Fix linking gettext - actually build libintl.a. ;-)	2005-09-16 14:29:41 +02:00
Petr Baudis	03c2fb116d	ELBuildized intl/. Final nail in automake's coffin.	2005-09-16 13:38:05 +02:00
Petr Baudis	1fd3bff6f3	Converted src/intl/gettext/Makefile to ELBuild	2005-09-16 04:20:14 +02:00
Jonas Fonseca	7462f22635	Remove now obsolete .cvsignore files.	2005-09-15 18:33:20 +02:00
Jonas Fonseca	da11cf325a	Convert more bits from the .cvsignore files. Now cg-status doesn't list anything here. It might do in the future.	2005-09-15 18:25:37 +02:00
Petr Baudis	0f6d4310ad	Initial commit of the HEAD branch of the ELinks CVS repository, as of Thu Sep 15 15:57:07 CEST 2005. The previous history can be added to this by grafting.	2005-09-15 15:58:31 +02:00

176 Commits