C99 6.7.4p3 and 6.7.4p6 set some constraints on what can be done in
inline functions and how they can be declared. In particular, any
function declared inline must also be defined in the same translation
unit. To comply with that, remove inline specifiers from function
declarations in header files when the functions are not also defined
in those header files.
Sun Studio 11 on Solaris 9 is stricter than C99 and does not allow
references to static identifiers in extern inline functions. Make the
configure script detect this and define NONSTATIC_INLINE accordingly
in config.h. Then use that in the definitions of all non-static
inline functions.
Document the restrictions and this scheme in doc/hacking.txt.
Debian libmozjs-dev 1.9.0.4-2 has JS_ReportAllocationOverflow but
js-1.7.0 reportedly hasn't. Check at configure time whether that
function is available. If not, use JS_ReportOutOfMemory instead.
Reported by Witold Filipczyk.
I didn't read the code of the tre library, but I suppose that when sizes of
wchar_t and unicode_val_T are equal it will work fine.
[ From bug 1060 attachment 508. --KON ]
When the user tells ELinks to search for a regexp, ELinks 0.11.0
passes the regexp to regcomp() and the formatted document to
regexec(), both in the terminal charset. This works OK for unibyte
ASCII-compatible charsets because the regexp metacharacters are all in
the ASCII range. And ELinks 0.11.0 doesn't support multibyte or
ASCII-incompatible (e.g. EBCDIC) charsets in terminals, so it is no
big deal if regexp searches fail in such locales.
ELinks 0.12pre1 attempts to support UTF-8 as the terminal charset if
CONFIG_UTF8 is defined. Then, struct search contains unicode_val_T c
rather than unsigned char c, and get_srch() and add_srch_chr()
together save UTF-32 values there if the terminal charset is UTF-8.
In plain-text searches, is_in_range_plain() compares those values
directly if the search is case sensitive, or folds them to lower case
if the search is case insensitive: with towlower() if the terminal
charset is UTF-8, or with tolower() otherwise. In regexp searches
however, get_search_region_from_search_nodes() still truncates all
values to 8 bits in order to generate the string that
search_for_pattern() then passes to regexec(). In UTF-8 locales,
regexec() expects this string to be in UTF-8 and can't make sense of
the truncated characters. There is also a possible conflict in
regcomp() if the locale is UTF-8 but the terminal charset is not, or
vice versa.
Rejected ways of fixing the charset mismatches:
* When the terminal charset is UTF-8, recode the formatted document
from UTF-32 to UTF-8 for regexp searching. This would work if the
terminal and the locale both use UTF-8, or if both use unibyte
ASCII-compatible charsets, but not if only one of them uses UTF-8.
* Convert both the regexp and the formatted document to the charset of
the locale, as that is what regcomp() and regexec() expect. ELinks
would have to somehow keep track of which bytes in the converted
string correspond to which characters in the document; not entirely
trivial because convert_string() can replace a single unconvertible
character with a string of ASCII characters. If ELinks were
eventually changed to use iconv() for unrecognized charsets, such
tracking would become even harder.
* Temporarily switch to a locale that uses the charset of the
terminal. Unfortunately, it seems there is no portable way to
construct a name for such a locale. It is also possible that no
suitable locale is available; especially on Windows, whose C library
defines MB_LEN_MAX as 2 and thus cannot support UTF-8 locales.
Instead, this commit makes ELinks do the regexp matching with regwcomp
and regwexec from the TRE library. This way, ELinks can losslessly
recode both the pattern and the document to Unicode and rely on the
regexp code in TRE decoding them properly, regardless of locale.
There are some possible problems though:
1. ELinks stores strings as UTF-32 in arrays of unicode_val_T, but TRE
uses wchar_t instead. If wchar_t is UTF-16, as it is on Microsoft
Windows, then TRE will misdecode the strings. It wouldn't be too
hard to make ELinks convert to UTF-16 in this case, but (a) TRE
doesn't currently support UTF-16 either, and it seems possible that
wchar_t-independent UTF-32 interfaces will be added to TRE; and (b)
there seems to be little interest on using ELinks on Windows anyway.
2. The Citrus Project apparently wanted BSD to use a locale-dependent
wchar_t: e.g. UTF-32 in some locales and an ISO 2022 derivative in
others. Regexp searches in ELinks now do not support the latter.
[ Adapted to elinks-0.12 from bug 1060 attachment 506.
Commit message by me. --KON ]
With Sun Studio 11 on Solaris 9, we get "cc: Warning: illegal option
-dynamic"; then, cc proceeds anyway, but the option can prevent the
linker from finding the libraries listed in -l operands. To detect
this, move the -rdynamic check in configure.in down to a place where
the libraries have already been added to $LDFLAGS. So if -rdynamic
interferes with the search for libraries, ELinks won't use it.
Merely moving the test would also change the location of -rdynamic in
$LDFLAGS. Counteract that by making the test add -rdynamic to the
beginning of $LDFLAGS, rather than to the end. This may make the test
more reliable on Solaris.
ELinks used to call the MD5 code in libgnutls-openssl, part of
GNUTLS-EXTRA, which was licensed under GNU GPL version 2 or later.
In GnuTLS 2.2.0 however, the license of GNUTLS-EXTRA has been changed
to GNU GPL version 3 or later. This is no longer compatible with
GNU GPL version 2 as used in the current ELinks, because GPLv2 clause
2. b) requires the whole work to be licensed under GPLv2, and GPLv3
does not allow that.
If anyone is still using a pre-2.2 GnuTLS, he or she can tweak
configure.in to check the version or just assume it's old enough.
There is not much reason to do so though, as including the MD5 code
in ELinks seems to cost only about 4 kilobytes on i686.
(cherry picked from commit 9ca0182ec6)
On Mac OS X 10.5.4, <net/if.h> does not #include <sys/socket.h> but
uses struct sockaddr defined there. Autoconf 2.61 generates a
configure script that warns if the header can be preprocessed but not
compiled. The Autoconf manual cautions that future versions of
Autoconf will treat the file as missing in this case. To let ELinks
detect <net/if.h> even with a future Autoconf, make the test program
#include <sys/socket.h> before <net/if.h>.
The build ID now includes both last tagged version, commit generation
since last tagged version, as well as the leading characters of the
commit ID and a flag for dirty working tree.
(cherry picked from commit c2a0d3b969)
Autoconf and m4 copy the "#" comments from configure.in to configure.
This should make it easier to find the part of configure that was
expanded from a specific check in configure.in.
ELinks does not yet work with Lua 5.1 (see bug 742),
so the configure script should not suggest that it does.
When bug 742 is eventually fixed, ELinks should probably
prefer Lua 5.1 over 5.0, as the newer version seems likely
to be kept installed longer.
(cherry picked from commit dc747a6ec3)
Actually, don't use the cfmakeraw function at all,
and don't look for it during configure either.
(cherry picked from commit 87f1661314
but moved the NEWS entry into the 0.12 section)
Because features.conf now contains CONFIG_UTF8=yes,
configure --enable-utf-8 is not useful, and it is better
to document configure --disable-utf-8.
Reported by witekfl.