We should never mix FILE I/O with raw I/O. Going from raw I/O
to FILE I/O is fine but doing the opposite is extremely tricky and
only works under certain conditions (unbuffered stream + no call
to ungetc()).
We cannot in general detect that truncation happened. At the moment
we use a heuristic that compares the file size before and after a
write. If the new file size is smaller than the old one, we handle
the truncation correctly and dump the entire file to stdout.
If, however, the file was truncated and then grew to at least its old
size without any reads in between, we will assume that data was
appended to the file.
There is no known way around this other than using inotify or kevent
which is outside the scope of sbase.
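
A minimal sketch of the size comparison, assuming the two sizes were
obtained with fstat(2) before and after the write. The names classify()
and enum change are hypothetical and only illustrate the heuristic.

```c
#include <sys/types.h>

/* Hypothetical result type: what a size-based check can conclude. */
enum change { APPENDED, TRUNCATED };

/*
 * Compare the size seen before the write with the size seen now.
 * Only a file that shrank is recognisable as truncated. A file that
 * was truncated and refilled to at least its old size looks exactly
 * like one that was appended to.
 */
enum change
classify(off_t oldsize, off_t newsize)
{
	return newsize < oldsize ? TRUNCATED : APPENDED;
}
```

The equal-size case falls on the APPENDED side, which is exactly the
blind spot described above.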
and mark it as finished in the README.
Specifically, add a small section on the compression flags, which
are basically an infected GNU limb which should be removed from
the face of the earth as soon as possible.
The algorithm had some areas which had potential for improvement.
This should make cmp(1) faster.
There have been changes to behaviour as well:
1) If argv[0] and argv[1] are the same, cmp(1) returns Same.
2) POSIX specifies the format of the difference-message to be:
"%s %s differ: char %d, line %d\n", file1, file2,
<byte number>, <line number>
However, as cmp(1) operates on bytes, not characters, I changed
it to
"%s %s differ: byte %d, line %d\n", file1, file2,
<byte number>, <line number>
This is one example where the standard just keeps the old format
for backwards-compatibility. As this is harmful, the change
makes sense for the sake of consistency (and because we take
the difference between char and byte very seriously in sbase, as
opposed to GNU coreutils).
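
A rough sketch of how the byte and line numbers in that message can be
derived, assuming both inputs are already in memory. The helpers
firstdiff() and diffmsg() are hypothetical and not the actual cmp(1)
code; size_t and %zu are used here instead of the %d shown in the
format above.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Find the first differing byte in two buffers of length n.
 * Byte and line numbers are 1-based; the line number is one plus the
 * number of newlines before the differing byte.
 * Returns 1 if a difference was found, 0 if the buffers are equal.
 */
int
firstdiff(const char *a, const char *b, size_t n, size_t *byte, size_t *line)
{
	size_t i, l = 1;

	for (i = 0; i < n; i++) {
		if (a[i] != b[i]) {
			*byte = i + 1;
			*line = l;
			return 1;
		}
		if (a[i] == '\n')
			l++;
	}
	return 0;
}

/* Format the difference message with "byte" rather than "char". */
int
diffmsg(char *buf, size_t len, const char *f1, const char *f2,
        size_t byte, size_t line)
{
	return snprintf(buf, len, "%s %s differ: byte %zu, line %zu\n",
	                f1, f2, byte, line);
}
```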
The manpage has been annotated, reflecting the second change, and
sections shortened where possible.
Thus I marked cmp(1) as finished in README.
using readrune() and iswspace().
musl for instance doesn't differentiate between iswspace() and
isspace(), but when it does, the code will be ready.
It goes without saying that GNU coreutils don't use iswspace()[0].
[0]: http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/wc.c
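
For illustration, word counting on top of iswspace() could look roughly
like this. The sketch assumes the input has already been decoded into
wide characters (in sbase that is the job of readrune(), which is not
shown here), and countwords() is a hypothetical name.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Count words in a wide string. A word is a maximal run of
 * characters for which iswspace() is false.
 */
size_t
countwords(const wchar_t *s)
{
	size_t n = 0;
	int inword = 0;

	for (; *s; s++) {
		if (iswspace((wint_t)*s)) {
			inword = 0;
		} else if (!inword) {
			inword = 1;
			n++;
		}
	}
	return n;
}
```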
Use size_t for all counts, fix the manpage and refactor the code.
Here's yet another place where GNU coreutils fail:
sbase:
$ echo "GNU/Turd sucks" | wc -cm
15
coreutils:
$ echo "GNU/Turd sucks" | wc -cm
15 15
Take a bloody guess which behaviour is correct[0].
[0]: http://pubs.opengroup.org/onlinepubs/009604499/utilities/wc.html
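
POSIX lists -c and -m as alternatives, so only one count should be
printed. A sketch of one way to get there, assuming the later of the
two flags simply overrides the earlier one. The helper charmode() is
hypothetical and takes the flag letters as a plain string.

```c
/*
 * Return which of the mutually exclusive flags 'c' (bytes) and 'm'
 * (characters) is in effect, or 0 if neither was given.
 * Assumption: the last one seen wins.
 */
int
charmode(const char *flags)
{
	int mode = 0;

	for (; *flags; flags++)
		if (*flags == 'c' || *flags == 'm')
			mode = *flags;
	return mode;
}
```

With a single mode variable there is no way to end up printing both
counts.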