One major milestone is to have the sbase-tools supporting UTF-8.
Tools like cut(1) with the -n flag don't make sense otherwise.
And while the gnu coreutils cut(1) blatantly ignores such an
important aspect, we will not tolerate this madness and mark it
as a TODO in the main README.
Since most tools inherently support UTF-8 anyway, this just concerns
tools which mangle with text or search in it in special ways.
and mark it as finished in README.
One small rationale on the way the manpage is set up: Looking at
the coreutils manpage, it does not invite to be a quick reference
guide, whereas I wrote this manpage to be short and concise in regard
to the information the advanced user needs.
No one needs to explain what an octal number is. That's not part of
the scope of this manpage.
Also, nobody wants to read a block of text just to find out how
to build an octal mode string.
to mark tools considered finished.
Finished doesn't mean work has stopped on these, but that these
programs are in a satisfying state according to the current suckless
coding practices, this includes having a
1) mandoc manpage
2) clean code
In most cases, 1) was the failing criterion. So in the interest of
finishing more tools and if you want to, well-written mandoc man-
pages are very much appreciated.
This is one aspect which I think has blown up the complexity of many
tr-implementations around today.
Instead of complicating the set-theory-based parser itself (he should
still be relying on one rune per char, not multirunes), I added a
preprocessor, which basically scans the code for upcoming '\'s, reads
what he finds, substitutes the real character onto '\'s index and shifts
the entire following array so there are no "holes".
What is left to reflect on is what to do with octal sequences.
I have a local implementation here, which works fine, but imho,
given tr is already so focused on UTF-8, we might as well ignore
POSIX at this point and rather implement the unicode UTF-8 code points,
which are way more contemporary and future-proof.
Reading in \uC3A4 as a an array of 0xC3 and 0xA4 is not the issue,
but I'm still struggling to find a way to turn it into a well-formed
byte sequence. Hit me with a mail if you have a simple solution for
that.
It's standard behaviour to map a whole class of matched objects
to the last element of a given simple set2 instead of just passing
it through.
Also, error out more strictly when the user gives us bogus sets.
tr(1) always used to be a saddening part of sbase, which was
inherently broken and crufted.
But to be fair, the POSIX-standard doesn't make it very simple.
Given the current version was unfixable and broken by design, I
sat down and rewrote tr(1) very close to the concept of set theory
and the POSIX-standard with a few exceptions:
- UTF-8: not allowed in POSIX, but in my opinion a must. This
finally allows you to work with UTF-8 streams without
problems or unexpected behaviour.
- Equivalence classes: Left out, even GNU coreutils ignore them
and depending on LC_COLLATE, which sucks.
- Character classes: No experiments or environment-variable-trickery.
Just plain definitions derived from the POSIX-
standard, working as expected.
I tested this thoroughly, but expect problems to show up in some
way given the wide range of input this program has to handle.
The only thing left on the TODO is to add support for literal
expressions ('\n', '\t', '\001', ...) and probably rethinking
the way [_*n] is unnecessarily restricted to string2.
1) No limit on number of months (removed MONTHMAX)
2) Strings printed to stdout rather than copied to an internal buffer
3) Rewritten date calculation algorithms
The list is getting too long. Leave the maintainers at the top
and add a section of authors/contributors below the license statement.
Helps maintain the license at the top of the file.
Consider the following code:
pw = getpwuid(uid);
if (!pw) {
if (errno)
...
else
...
}
If the entry was not found then as per POSIX errno is not set
because that is not considered to be a failing condition. errno
is only set if an internal error occurred.
If errno happened to be non-zero before the getpwuid() call
because of a previous error then we'll report a bogus error.
In this case, we have to set errno to zero before the call to
getpwuid().
However in ls(1) we only really care if the password entry was found
and we do not report any errors so setting errno to 0 is not necessary.
There's no point free-ing memory when the kernel can do it for us.
Just reuse the already allocated memory to hold lines.
Thanks Truls Becken for pointing this out.