This optimizes the binary size for each tool that uses these functions.
Previously, if a program used just a single function, perhaps even a
one-liner, all lookup tables would be linked in statically, bloating
the binary by up to 20K.
All these changes are derived from a local libutf where I do the
primary changes. So I hope that I can merge these things into libutf
sooner or later, as discussed on the ml.
This basically means that we now have an autogenerating typecheck
and case-conversion tool.
Don't freak out when you see the added LOC. Given that we now have
an additional mapping to the uppercase characters, some ranges got
"lost" and have to be written out literally by the generating awk script.
The runetypebody.h was generated by myself using my modified version
of mkrunetype.awk and I'll push the changed version as soon as this
has been discussed on the ml.
If you worry about speed, consider that bsearch is just the right
tool for this job: it needs only O(log n) comparisons, so it can
easily handle even a long array like this.
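As a rough illustration of the lookup described above, here is a minimal
sketch of a bsearch over sorted, non-overlapping rune ranges. The Rune
typedef, the ranges table and the names rangecmp() and inrange() are
assumptions made up for this example; they are not taken from
runetypebody.h or mkrunetype.awk.

```c
#include <stdlib.h>

typedef unsigned int Rune; /* assumption: stand-in for the real Rune type */

/* hypothetical excerpt of a generated table: sorted { first, last } pairs */
static const Rune ranges[][2] = {
	{ 0x41, 0x5A }, /* A-Z */
	{ 0x61, 0x7A }, /* a-z */
	{ 0xC0, 0xD6 },
};

/* compare a single rune (the key) against one { first, last } range */
static int
rangecmp(const void *key, const void *elem)
{
	Rune r = *(const Rune *)key;
	const Rune *range = elem;

	if (r < range[0])
		return -1;
	if (r > range[1])
		return 1;
	return 0; /* first <= r <= last */
}

/* returns 1 if r falls into any range of the table, 0 otherwise */
static int
inrange(Rune r)
{
	return bsearch(&r, ranges, sizeof(ranges) / sizeof(ranges[0]),
	               sizeof(ranges[0]), rangecmp) != NULL;
}
```

With this table, inrange('A') and inrange(0xC9) yield 1, while
inrange('1') yields 0. The table has to stay sorted for bsearch to work.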
This is useful behavior if you want to reorder the lines,
because otherwise two originally separate lines might end up
joined into one, e.g.
$ echo -ne "foo\nbar" | sort
barfoo
We should never mix FILE I/O with raw I/O. Going from raw I/O
to FILE I/O is fine but doing the opposite is extremely tricky and
only works under certain conditions (unbuffered stream + no call
to ungetc()).
We cannot in general detect that truncation happened. At the moment
we use a heuristic that compares the file size before and after a write
happened. If the new file size is smaller than the old, we correctly
handle truncation and dump the entire file to stdout.
If, however, the file was truncated and its new size ends up larger
than or equal to the old size without any reads in between, we will
assume the data was appended to the file.
There is no known way around this other than using inotify or kevent,
which are outside the scope of sbase.
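The size comparison can be sketched roughly as follows. This is only an
illustration of the heuristic under the assumptions stated here, not the
actual code: shrank() is a hypothetical name, and the caller is assumed
to keep the previously observed size in *oldsize.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if the file behind fd is smaller than
 * at the previous check (treated as truncation), 0 if it is the same
 * size or larger (treated as "nothing new" or "appended"), and -1 if
 * fstat() fails. On success *oldsize is updated to the current size.
 */
static int
shrank(int fd, off_t *oldsize)
{
	struct stat st;
	int r;

	if (fstat(fd, &st) < 0)
		return -1;
	r = st.st_size < *oldsize;
	*oldsize = st.st_size;
	return r;
}
```

A file that is truncated and then grows back to at least its old size
between two checks makes shrank() return 0, which is exactly the case
the heuristic cannot tell apart from an append.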