and mark it as finished in the README.
Specifically, add a small section on the compression flags, which
are basically an infected GNU limb which should be removed from
the face of the earth as soon as possible.
The comparison algorithm had some room for improvement.
Reworking it should make cmp(1) faster.
There have been changes to behaviour as well:
1) If argv[0] and argv[1] are the same, cmp(1) returns Same.
2) POSIX specifies the format of the difference-message to be:
"%s %s differ: char %d, line %d\n", file1, file2,
<byte number>, <line number>
However, as cmp(1) operates on bytes, not characters, I changed
it to
"%s %s differ: byte %d, line %d\n", file1, file2,
<byte number>, <line number>
This is one example where the standard just keeps the old format
for backwards-compatibility. As this is harmful, the change
makes sense for the sake of consistency (and because we take
the difference between char and byte very seriously in sbase, as
opposed to GNU coreutils).
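To illustrate why "byte" is the accurate word in that message, here is a
minimal Python sketch (not the sbase C code). The helper names
first_difference() and format_difference() are made up for this example,
and the case where one file is a prefix of the other is deliberately left
out.

```python
def first_difference(a: bytes, b: bytes):
    """Return (byte_number, line_number), both 1-based, of the first
    differing byte, or None if the common prefix is identical.

    Hypothetical helper for illustration; EOF handling is omitted.
    """
    line = 1
    for n, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            return n, line
        if x == 0x0A:  # a newline seen in both inputs starts a new line
            line += 1
    return None


def format_difference(file1: str, file2: str, byte_no: int, line_no: int) -> str:
    """Build the message in the changed format, which says 'byte'."""
    return "%s %s differ: byte %d, line %d\n" % (file1, file2, byte_no, line_no)
```

With UTF-8 input such as "ä\nb" and "ä\nc", the first difference is at
byte 4 on line 2, although it is only the third character. Calling that
offset "char 4" would be wrong.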
The manpage has been annotated, reflecting the second change, and
sections shortened where possible.
Thus I marked cmp(1) as finished in README.
Use size_t for all counts, fix the manpage and refactor the code.
Here's yet another place where GNU coreutils fail:
sbase:
$ echo "GNU/Turd sucks" | wc -cm
15
coreutils:
$ echo "GNU/Turd sucks" | wc -cm
15 15
Take a bloody guess which behaviour is correct[0].
[0]: http://pubs.opengroup.org/onlinepubs/009604499/utilities/wc.html
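The synopsis in [0] lists -c and -m as alternatives ([-c|-m]), so only
one count should be printed. The Python sketch below shows one way to
honour that, assuming the flag given last wins. That rule is an
assumption for this example, not something the commit states, and
count() is a hypothetical helper.

```python
def count(data: bytes, flags: str) -> int:
    """Return a single count for the given -c/-m flags.

    Assumption: when both flags are given, the last one decides.
    'c' counts bytes, 'm' counts characters (input taken as UTF-8).
    """
    mode = "c"
    for f in flags:
        if f in ("c", "m"):
            mode = f  # a later flag overrides an earlier one
    if mode == "m":
        return len(data.decode("utf-8"))
    return len(data)
```

For the ASCII input above both modes give 15, so the output is a single
"15" either way. The two modes only diverge once multibyte characters
are involved.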
and mark it as finished in the README.
Previously, it would only parse octal mode strings. Given
we have the parsemode()-function in util.h anyway, why not
also use it?
and mark it as finished in the README.
This is another example showing how broken the GNU coreutils are:
$ echo -e "äää\tüüü\tööö" | gnu-expand -t "5,10,20"
äää    üüü    ööö
$ echo -e "äää\tüüü\tööö" | sbase-expand -t "5,10,20"
äää  üüü  ööö
This is due to the fact that they are still not UTF-8-aware and
actually see "ä" as two single characters, so the "äää" counts as
6 columns and is padded with 4 spaces up to the tab stop at 10.
The correct way however is to count "äää" as 3 columns and pad it
with 2 spaces up to the tab stop at 5.
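The difference can be reproduced with a small Python sketch (not the
sbase or coreutils code). expand() is a hypothetical helper. Its
count_bytes switch mimics counting every UTF-8 byte as a column, and the
character mode assumes one column per code point, which ignores
combining and wide characters.

```python
def expand(text: str, tabstops, count_bytes: bool = False) -> str:
    """Replace tabs with spaces up to the next stop in `tabstops`.

    count_bytes=False: each character advances the column by 1.
    count_bytes=True:  each character advances the column by its
                       UTF-8 length (the non-UTF-8-aware behaviour).
    A tab past the last stop becomes a single space.
    """
    out = []
    col = 0
    for ch in text:
        if ch == "\t":
            # first tab stop strictly to the right of the current column
            nxt = next((t for t in tabstops if t > col), None)
            pad = 1 if nxt is None else nxt - col
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += len(ch.encode("utf-8")) if count_bytes else 1
    return "".join(out)
```

With the tab stops 5,10,20 and the input from above, the character mode
pads "äää" with 2 spaces and the byte mode pads it with 4.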
One can only imagine how this silently breaks a lot of code around
the world.
WHAT WERE THEY THINKING?