While auditing du(1) I realized that there's no way the over 100 lines
of procedures in du() would pass the audit.
Instead, I decided to rewrite this section using recurse() from libutil.
However, the issue was that you'd need some kind of payload to count
the number of bytes in the subdirectories and use them in the higher
hierarchies.
The solution is to add a "void *data" data pointer to each recurse-
function-prototype, which we might also be able to use in other
recurse-applications.
recurse() itself had to be augmented with a recurse_samedev-flag, which
basically prevents recurse from leaving the current device.
Now, let's take a closer look at the audit:
1) Removing the now unnecessary util-functions push, pop, xrealpath,
rename print() to printpath(), localize some global variables.
2) Only pass the block count to nblks instead of the entire stat-
pointer.
3) Fix estrtonum to use the minimum of LLONG_MAX and SIZE_MAX.
4) Use idiomatic argv+argc-loop
5) Report proper exit-status.
1) Add check to parselist() to warn about an empty list.
2) Remove all "cut: "-prefixes from error-messages and other style
changes.
3) != -1 --> >= 0 and check for ferror on fp after getline.
4) Update usage with argv0.
5) argv-centric loop refactor
6) Properly report exit-status.
7) Add empty line before return.
1) Use the LIMIT()-macro in util.h instead of defining our own.
2) Drop nextline() and finish(), not needed anymore. Use
fputs in printline instead of printf.
--> BUGFIX: Finish exited with status 1, but actually should
exit with status 0 if ferror(f) == 0.
3) Don't use /dev/fd/0 and use idiomatic <stdin> and fp = stdin
instead.
4) Refactor loop to use getline() instead of some handrolled
nextline-function.
--> BUGFIX: Line-length was limited to LINE_MAX before, now
it's factually unlimited.
5) Combine diff >= 0 and diff <= 0 into one loop with a beginning
continue-condition (diff && i == (diff < 0)).
6) BUGFIX: If diff == 0, don't print one buffer after EOFing on the
other.
1) Remove the return-value-enum, which is not necessary for a simple
program like this.
2) Don't disallow both l and s to be specified. This is undefined
behaviour defined by POSIX, so we don't start demanding things
from the user.
3) Replace exit() with return (we are in main).
4) Refactor main loop to never return in the loop, but actually
set the same-value and break, which increases readability.
5) Remove the final fclose()'s. The OS will take care of them, no
need to become cleansy here.
6) Use idiomatic return-value using same. This concludes the
increase of readability in the main-loop.
After a short correspondence with Otto Moerbeek it turned out
mallocarray() is only in the OpenBSD-Kernel, because the kernel-
malloc doesn't have realloc.
Userspace applications should rather use reallocarray with an
explicit NULL-pointer.
Assuming reallocarray() will become available in c-stdlibs in the
next few years, we nip mallocarray() in the bud to allow an easy
transition to a system-provided version when the day comes.
1) Reorder local variables.
2) Cleanup error messages, use %zu for size_t.
3) combine putchar(' ') and fputs to substitute printf(" %s", s).
4) Fix usage().
5) argv-argc-usage-fix.
6) Add empty line before return.
A function used only in the OpenBSD-Kernel as of now, but it surely
provides a helpful interface when you just don't want to make sure
the incoming pointer to erealloc() is really NULL so it behaves
like malloc, making it a bit more safer.
Talking about *allocarray(): It's definitely a major step in code-
hardening. Especially as a system administrator, you should be
able to trust your core tools without having to worry about segfaults
like this, which can easily lead to privilege escalation.
How do the GNU coreutils handle this?
$ strings -n 4611686018427387903
strings: invalid minimum string length -1
$ strings -n 4611686018427387904
strings: invalid minimum string length 0
They silently overflow...
In comparison, sbase:
$ strings -n 4611686018427387903
mallocarray: out of memory
$ strings -n 4611686018427387904
mallocarray: out of memory
The first out of memory is actually a true OOM returned by malloc,
whereas the second one is a detected overflow, which is not marked
in a special way.
Now tell me which diagnostic error-messages are easier to understand.
Stateless and I stumbled upon this issue while discussing the
semantics of read, accepting a size_t but only being able to return
ssize_t, effectively lacking the ability to report successful
reads > SSIZE_MAX.
The discussion went along and we came to the topic of input-based
memory allocations. Basically, it was possible for the argument
to a memory-allocation-function to overflow, leading to a segfault
later.
The OpenBSD-guys came up with the ingenious reallocarray-function,
and I implemented it as ereallocarray, which automatically returns
on error.
Read more about it here[0].
A simple testcase is this (courtesy to stateless):
$ sbase-strings -n (2^(32|64) / 4)
This will segfault before this patch and properly return an OOM-
situation afterwards (thanks to the overflow-check in reallocarray).
[0]: http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man3/calloc.3
Allows dropping a local variable if the explicit PID is not needed
and it makes it clearer what happens.
Also, one should always strive for consistency for cases like these.
Quoting POSIX[0]:
"Care should be taken, also, to call _exit() rather than exit() if exec cannot be used, since
exit() flushes and closes standard I/O channels, thereby damaging the parent process' standard
I/O data structures. (Even with fork(), it is wrong to call exit(), since buffered data would
then be flushed twice.)"
[0]: http://pubs.opengroup.org/onlinepubs/009695399/functions/vfork.html
Similar to the chgrp(1)-audit:
1) Refactor manpage so it's actually fun to read
2) BUGFIX: Call (l)chown properly when the H-flag is specified
(only when depth > 0)
3) BUGFIX: Call (l)chown properly when the h-flag is specified
(only when depth = 0).
4) BUGFIX: Only recurse() in chgrp() when the initial chownf()
succeeds.
5) Style fixes, argv-basing.
6) Rename status to ret for consistency.
7) Add blank line before return.
1) Refactor manpage so it's actually fun to read.
2) BUGFIX: Call (l)chown properly when the H-flag is specified
(only when depth > 0).
3) BUGFIX: Call (l)chown properly when the h-flag is specified
(only when depth = 0).
4) BUGFIX: Only recurse() in chgrp() when the initial chownf()
succeeds.
5) Style fixes, argv-basing.
6) Rename status to ret for consistency.
7) Add blank line before return.
1) Update manpage with the num-syntax.
2) Use size_t for years and derivatives.
3) Use putchar instead of printf wherever possible.
4) Update usage().
5) Style changes.
1) Refactor manpage.
2) De-globalize local values.
3) update usage().
4) sort local variable declarations.
5) fix wrong argument in strtonum (3 -> 1).
6) argc-argv style, boolean style.
7) check bytes > 0 before accessing b.lines[i][bytes - 1]
relying on len only makes sense but let's not push it.
7) don't break on maxlen > (chars - 1) / 2. This didn't even
make sense.
8) _correctly_ calculate cols and rows in a readable way.
9) Rewrite loop over rows and cols in a readable way and
using putchar in a loop instead of printf-magic or fputs
where not necessary.
- Also fix a few typos, style and section order.
- Changed the text "800 characters per line" to "800 bytes per line" as col
doesn't seem to support UTF-8 right now.
1) don't mix declarations and code (leave recursion alone for now as I
plan on changing/using recurse)
2) change **argv to *argv[]
3) check for error on fork()
We'll probably develop this outside of sbase. A simple script that
parses /etc/magic and generates magic.h would be sufficient.
The table can be huge and we do not want to bloat up binary size
only for file(1).
1) Clarify behaviour when the f-flag is given and a target is in its
own way.
2) Fix usage()-style.
3) Group local variable declarations.
4) reorder args
5) argc style, other boolean style changes
6) improve error messages
7) set argv[argc - 1] to NULL to allow argv-centric loop later
8) BUGFIX: POSIX specifies that when with the f-flag there's a
situation where a file stands in its own way for linking it
should be ignored.
9) Add weprintf() where possible, so we don't pussy out when there's
a small issue. This is sbase ffs!
1) Update manpage, refactor the HLP-section and other wordings.
2) BUGFIX: If chmod() fails, don't recurse.
3) Rewrite the arg-loop, fixing several issues:
BUGFIX: Handle multi-flags (e.g. -RH)
BUGFIX: Properly handle the termination flag --, error on e.g. --x
BUGFIX: Error out on an empty flag -.
4) Refactor logic after the arg-loop, which is now simpler thanks
to argv-incremention.
1) No need for strchr() in mkdirp or a while-loop. Rewrite it in
a sane and readable way.
2) fix usage according to the manpage.
3) order includes, don't align local variables.
4) argc-style-fix.
5) BUGFIX: Don't try to chmod() *argv when mkdir() / mkdirp() failed.
6) Add newline before return in two places.
1) use arg.h
2) !strcmp
3) **argv to *argv[]
4) fix test to check if basename(argv0) == "[" but avoid
basename(3p) as it may change the contents of the string
passed to it and I didn't want to make a copy.
1) Use (s)size_t in head().
2) BUGFIX: only check buf[len - 1] when len > 0, else there would
be an overflow when getline returns 0 (which can happen) and a
very potential segmentation fault.
3) fix error-messages.
4) update usage().
5) argv-argc-style.
6) clear up the main loop with if (newline).
7) add newline before return.