elinks

mirror of https://github.com/rkd77/elinks.git synced 2024-12-04 14:46:47 -05:00

Author	SHA1	Message	Date
Kalle Olavi Niemitalo	62818a39f9	Unicode/gen-case: Upgrade ISC licence to July 2007 version. I had already done this to my other scripts on 2008-09-28 (commit `c67885d880`) but missed Unicode/gen-case. Update it, and list it in COPYING. (Although Unicode/gen-case is part of the source tree, this version of ELinks does not use that file for anything.) (cherry picked from elinks-0.12 commit `c7602eb744`)	2012-11-03 23:01:28 +02:00
Kalle Olavi Niemitalo	12d66ff043	Bug 932: Redisable 0x80...0x9F mappings in some charsets. Bug 932 is about ELinks letting control characters 0x80...0x9F through to the terminal. It did not occur with ISO 8859-1, 8859-2, 8859-15, or 8859-16, because the ELinks mappings for those charsets did not include those bytes. However, the www.unicode.org versions imported in the previous commit do include the problematic bytes. To avoid a possible regression before the ELinks 0.12.0 release, comment those control-character mappings out again. This workaround should be reverted after bug 932 has been fixed properly.	2008-10-11 15:35:34 +03:00
Kalle Olavi Niemitalo	c9ca6fd448	Refresh charsets from www.unicode.org. Add copyright and licence notices, and a NEWS entry. The data in the new versions is not entirely the same as what ELinks used to have: - Unicode/8859_1.cp: Adds control characters. - Unicode/8859_2.cp: Adds control characters. - Unicode/8859_4.cp: Adds some control characters that ELinks assumed there already. - Unicode/8859_7.cp: Adds three characters. - Unicode/8859_15.cp: Adds control characters. - Unicode/8859_16.cp: Adds control characters and swaps 0xA5 with 0xAB. - Unicode/koi8_r.cp: Changes 0x95 and adds some control characters that ELinks assumed there already. - Unicode/macroman.cp: Changes 0xC6 and removes some control characters that ELinks assumes there anyway.	2008-10-11 15:35:09 +03:00
Kalle Olavi Niemitalo	e2d7ce588f	Relicense my Perl scripts to ISC license. The primary motivation for this change is that the disclaimer now refers to the author whereas it used to refer to the copyright holder. The ISC license is the preferred license for new code in OpenBSD: http://www.openbsd.org/policy.html and http://www.openbsd.org/cgi-bin/cvsweb/src/share/misc/license.template?rev=1.2 I am also removing the reference to "the same terms as Perl itself" because those terms are not being distributed with ELinks. Anyway, Perl 5 is dual licensed under the Artistic License and the GNU General Public License (version 1 or later), and the ISC license seems GPL compatible to me.	2008-03-23 13:28:06 +02:00
Kalle Olavi Niemitalo	a6886634bc	Make unicode_7b[] static const. The .data section of src/intl/charsets.o is only 40 bytes now. Inspired by bug 381.	2007-02-03 23:25:16 +02:00
Kalle Olavi Niemitalo	974a5cdffd	Make entities[] static const. Inspired by bug 381.	2007-02-03 19:51:45 +02:00
Kalle Olavi Niemitalo	65645624b4	cp1250, cp1257: Don't map undefined bytes to U+0000.	2007-01-27 09:58:18 +02:00
Kalle Olavi Niemitalo	4a5af7fd26	Bug 381: Store codepage-to-Unicode mappings as dense arrays. Previously, each mapping between a codepage byte and a Unicode character was stored as a struct table_entry, which listed both the byte and the character. This representation may be optimal for sparse mappings, but codepages map almost every possible byte to a character, so it is more efficient to just have an array that lists the Unicode character corresponding to each byte from 0x80 to 0xFF. The bytes are not stored but rather implied by the array index. The tcvn5712 and viscii codepages have a total of four mappings that do not fit in the arrays, so we still use struct table_entry for those. This change also makes cp2u() operate in O(1) time and may speed up other functions as well. The "sed | while read" concoction in Unicode/gen-cp looks rather unhealthy. It would probably be faster and more readable if rewritten in Perl, but IMO that goes for the previous version as well, so I suppose whoever wrote it had a reason not to use Perl here. Before (columns: text, data, bss, dec, hex, filename): 38948 28528 3311 70787 11483 src/intl/charsets.o; 500096 85568 82112 667776 a3080 src/elinks. After (same columns): 31558 28528 3311 63397 f7a5 src/intl/charsets.o; 492878 85568 82112 660558 a144e src/elinks. So the text section shrank by 7390 bytes. Measured on i686-pc-linux-gnu with: --disable-xbel --disable-nls --disable-cookies --disable-formhist --disable-globhist --disable-mailcap --disable-mimetypes --disable-smb --disable-mouse --disable-sysmouse --disable-leds --disable-marks --disable-css --enable-small --enable-utf-8 --without-gpm --without-bzlib --without-idn --without-spidermonkey --without-lua --without-gnutls --without-openssl CFLAGS="-Os -ggdb -Wall"	2006-09-24 16:55:29 +03:00
Kalle Olavi Niemitalo	62d6db44aa	Bug 381: Make codepage data const. Before (columns: text, data, bss, dec, hex, filename): 25726 62992 3343 92061 1679d src/intl/charsets.o; 653856 120020 82144 856020 d0fd4 src/elinks. After (same columns): 60190 28528 3311 92029 1677d src/intl/charsets.o; 688320 85556 82112 855988 d0fb4 src/elinks. So 34464 bytes were moved from the data section to the text section and should be more likely to get shared between ELinks processes. Measured on i686-pc-linux-gnu with: --disable-xbel --disable-nls --disable-cookies --disable-formhist --disable-globhist --disable-mailcap --disable-mimetypes --disable-smb --disable-mouse --disable-sysmouse --disable-leds --disable-marks --disable-css --enable-small --enable-utf-8 --without-gpm --without-bzlib --without-idn --without-spidermonkey --without-lua --without-gnutls --without-openssl CFLAGS="-O2 -ggdb -Wall"	2006-09-24 11:59:23 +03:00
Kalle Olavi Niemitalo	9c94a896b7	Internally rename the utf_8 codepage to utf8.	2006-09-17 16:23:17 +03:00
Kalle Olavi Niemitalo	f7fd49cf28	UTF-8: New function unicode_fold_label_case and a related script.	2006-08-06 20:02:42 +00:00
Jonas Fonseca	d2e346436a	Hmm, seems b.delta decided not to become 0x03B4 like it should	2006-01-10 15:39:11 +01:00
Jonas Fonseca	aa75cade23	Reinsert part of comment for nVDash; fixes `8e0eda5e4d`	2006-01-03 23:38:37 +01:00
Jonas Fonseca	b5065e7a17	Add header about where to get the SGML entity database from unicode.org ... and sum up the local changes made.	2006-01-03 17:20:50 +01:00
Jonas Fonseca	8c684e8c73	Skip entities with unknown unicode (0x????)	2006-01-03 17:12:58 +01:00
Jonas Fonseca	395a64f569	Merge in the public entity set names from the unicode.org database. This also fixes b.delta to have the correct value 0x03B4. The main differences from ELinks' entity database are: - entities not in the unicode database from 1997: Scomma, Tcomma, euro, scomma, tcomma - obsolete entities kept for compatibility: emdash, endash, hibar	2006-01-03 17:10:19 +01:00
Jonas Fonseca	8e0eda5e4d	Merge in the 0x???? chars and fix some incomplete descriptions	2006-01-03 16:48:11 +01:00
Jonas Fonseca	3e6c08ce12	Move the SGML entity database back to the format used by unicode.org	2006-01-03 16:43:31 +01:00
Jonas Fonseca	af089507dc	Remove unneeded Unicode/.gitignore	2005-12-25 02:41:03 +01:00
Laurent MONIN	df065ead80	Remove now useless $Id: lines.	2005-10-21 09:14:07 +02:00
Petr Baudis	06ea255a22	Convert part of the build to the new build system. The root makefile is converted as well as some leaf Makefiles. This also brings in the required infrastructure and adjusts configure.in appropriately. I converted only makefiles containing no configurable stuff, since that'll require more consideration yet.	2005-09-15 21:03:56 +02:00
Jonas Fonseca	7462f22635	Remove now obsolete .cvsignore files.	2005-09-15 18:33:20 +02:00
Jonas Fonseca	e54f78bf3f	Oops, missed the generated stuff in Unicode/.	2005-09-15 18:29:59 +02:00
Petr Baudis	0f6d4310ad	Initial commit of the HEAD branch of the ELinks CVS repository, as of Thu Sep 15 15:57:07 CEST 2005. The previous history can be added to this by grafting.	2005-09-15 15:58:31 +02:00

24 Commits
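
Commit 4a5af7fd26 (Bug 381) describes replacing per-byte table entries with a dense array indexed by the codepage byte. The following is a minimal sketch of that idea, not the actual ELinks code: the names `sparse_entry`, `sparse_table`, `dense_table`, `sparse_lookup`, `dense_lookup`, `check_tables_agree`, and `UCS_UNDEFINED` are all hypothetical, the three mappings are illustrative only, and treating a zero array slot as "undefined" is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel for bytes that have no mapping. */
#define UCS_UNDEFINED 0xFFFFu

/* Sparse form: every entry stores both the byte and the character. */
struct sparse_entry {
    unsigned char byte;
    uint16_t ucs;
};

/* Illustrative mappings only; a real codepage fills almost every slot. */
static const struct sparse_entry sparse_table[] = {
    { 0xA0, 0x00A0 },
    { 0xA4, 0x20AC },
    { 0xE9, 0x00E9 },
};

/* Dense form: one slot per byte 0x80..0xFF; the byte is implied by the
 * index (byte - 0x80). Unlisted slots are zero, which this sketch
 * treats as "undefined" (an assumption, not ELinks behaviour). */
static const uint16_t dense_table[128] = {
    [0xA0 - 0x80] = 0x00A0,
    [0xA4 - 0x80] = 0x20AC,
    [0xE9 - 0x80] = 0x00E9,
};

/* Linear scan over the sparse entries: O(n) in the table size. */
uint16_t sparse_lookup(unsigned char byte)
{
    size_t i;

    for (i = 0; i < sizeof(sparse_table) / sizeof(sparse_table[0]); i++)
        if (sparse_table[i].byte == byte)
            return sparse_table[i].ucs;
    return UCS_UNDEFINED;
}

/* Direct index into the dense array: O(1). Bytes below 0x80 are
 * outside the array and reported as undefined here. */
uint16_t dense_lookup(unsigned char byte)
{
    uint16_t ucs;

    if (byte < 0x80)
        return UCS_UNDEFINED;
    ucs = dense_table[byte - 0x80];
    return ucs ? ucs : UCS_UNDEFINED;
}

/* Both representations should give the same answer for 0x80..0xFF. */
void check_tables_agree(void)
{
    unsigned int b;

    for (b = 0x80; b <= 0xFF; b++)
        assert(sparse_lookup((unsigned char) b)
               == dense_lookup((unsigned char) b));
}
```

Because both tables are `static const`, a compiler can typically place them in read-only storage, which is the effect commit 62d6db44aa measures as bytes moving from the data section to the text section.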