Commit Graph

37 Commits

Author SHA1 Message Date
Michael Forney 371f3cb5ec find: Flush stdout before executing commands
The commands may produce output themselves, so this is required to
ensure that our -print output is actually written before a command's
output, and also that we don't end up with partially written lines
due to stdout buffering intermixed with the output of other commands.

Other implementations of find(1) do the same.
2021-09-10 22:48:47 -07:00
Michael Forney c331811c91 find: Add spawn helper function 2021-09-10 22:43:54 -07:00
Michael Forney e6b6f34506 find: Fix buffer overflow in token stack
The stack is used for two purposes: storing operators for the
shunting yard algorithm, and storing primitives when arranging the
operators into a tree. The number of operators is bounded by the
number of arguments, since we only insert at most one extra operator
per primitive. However, the number of primitives may be as high as
argc + 1, since -print may have been added implicitly.

This can triggered with an empty expression, `find .`, since in
this case argc is 0, but we still try to store -print in the stack.

Detected with musl's WIP allocator, mallocng-draft.
2020-05-12 20:01:43 -07:00
Michael Forney 48d04ae446 find: Make -H and -L flag handling clearer 2018-09-25 19:16:19 -07:00
Evan Gates d2bd40a589 find: remove VLAs
Laslo: Use ereallocarray and fix the style a bit
2016-12-27 12:46:06 +01:00
Evan Gates 0b27c0c9a0 find: estrdup before basename
"The basename() function may modify the string pointed to by path..."
Thanks POSIX

Laslo: Changed the style a bit
2016-12-27 12:37:57 +01:00
Mattias Andrée 96ccf5172d find: check whether readdir failed, and properly check timestamps
Signed-off-by: Mattias Andrée <>
2016-10-05 18:48:10 +02:00
Quentin Rameau 6e7743eb56 Cleanup usage() across sbase
Some tools didn't use argv0 for tool name, or usage() at all.
2015-12-21 18:07:25 +00:00
sin 2366164de7 No need for semicolon after ARGEND
This is also the style used in Plan 9.
2015-11-01 10:18:55 +00:00
Wolfgang Corcoran-Mathe 5dab8bbfa9 find: NULL terminate braces array in exec primary 2015-07-23 11:51:05 +01:00
Wolfgang Corcoran-Mathe 6e7cbdd918 find: Improve prompt spacing with -ok 2015-06-18 14:24:15 +01:00
Wolfgang Corcoran-Mathe cf769f2d8d find: Fix flushing input buffer with -ok
The original flush-stdin loop (with fgets()) hung until the user
entered some extraneous characters for it to kill.

emg's FIXME about nulls still applies.
2015-06-18 14:24:15 +01:00
Evan Gates 1490b37bd2 NULL terminate braces array in get_ok_arg 2015-06-18 14:24:14 +01:00
FRIGN d23cc72490 Simplify return & fshut() logic
Get rid of the !!()-constructs and use ret where available (or introduce it).

In some cases, there would be an "abort" on the first fshut-error, but we want
to close all files and report all warnings and then quit, not just the warning
for the first file.
2015-05-26 16:41:43 +01:00
sin f90354bfc9 find: Allow using multiple paths 2015-05-15 12:07:43 +01:00
Evan Gates 86f9ce55a8 need implicit -a between ) and ! 2015-05-13 23:43:33 +01:00
Evan Gates 1dbb848037 intptr_t is an optional type, use intmax_t 2015-04-14 16:42:43 +01:00
FRIGN 11e2d472bf Add *fshut() functions to properly flush file streams
This has been a known issue for a long time. Example:

printf "word" > /dev/full

wouldn't report there's not enough space on the device.
This is due to the fact that every libc has internal buffers
for stdout which store fragments of written data until they reach
a certain size or on some callback to flush them all at once to the
You can force the libc to flush them with fflush(). In case flushing
fails, you can check the return value of fflush() and report an error.

However, previously, sbase didn't have such checks and without fflush(),
the libc silently flushes the buffers on exit without checking the errors.
No offense, but there's no way for the libc to report errors in the exit-

GNU coreutils solve this by having onexit-callbacks to handle the flushing
and report issues, but they have obvious deficiencies.
After long discussions on IRC, we came to the conclusion that checking the
return value of every io-function would be a bit too much, and having a
general-purpose fclose-wrapper would be the best way to go.

It turned out that fclose() alone is not enough to detect errors. The right
way to do it is to fflush() + check ferror on the fp and then to a fclose().
This is what fshut does and that's how it's done before each return.
The return value is obviously affected, reporting an error in case a flush
or close failed, but also when reading failed for some reason, the error-
state is caught.

the !!( ... + ...) construction is used to call all functions inside the
brackets and not "terminating" on the first.
We want errors to be reported, but there's no reason to stop flushing buffers
when one other file buffer has issues.
Obviously, functionales come before the flush and ret-logic comes after to
prevent early exits as well without reporting warnings if there are any.

One more advantage of fshut() is that it is even able to report errors
on obscure NFS-setups which the other coreutils are unable to detect,
because they only check the return-value of fflush() and fclose(),
not ferror() as well.
2015-04-05 09:13:56 +01:00
Hiltjo Posthuma 7ebf02d749 find: fgetc() returns int 2015-03-27 22:48:05 +01:00
Evan Gates cf5114a133 untypedef expr, find, test, as is existing style in sbase 2015-03-17 20:04:26 +00:00
FRIGN 93fd817536 Add estrlcat() and estrlcpy()
It has become a common idiom in sbase to check strlcat() and strlcpy()

if (strl{cat, cpy}(dst, src, siz) >= siz)
        eprintf("path too long\n");

However, this was not carried out consistently and to this very day,
some tools employed unchecked calls to these functions, effectively
allowing silent truncations to happen, which in turn may lead to
security issues.
To finally put an end to this, the e*-functions detect truncation
automatically and the caller can lean back and enjoy coding without
trouble. :)
2015-03-17 11:24:49 +01:00
FRIGN 833c2aebb4 Remove mallocarray(...) and use reallocarray(NULL, ...)
After a short correspondence with Otto Moerbeek it turned out
mallocarray() is only in the OpenBSD-Kernel, because the kernel-
malloc doesn't have realloc.
Userspace applications should rather use reallocarray with an
explicit NULL-pointer.

Assuming reallocarray() will become available in c-stdlibs in the
next few years, we nip mallocarray() in the bud to allow an easy
transition to a system-provided version when the day comes.
2015-03-11 10:50:18 +01:00
FRIGN 3c33abc520 Implement mallocarray()
A function used only in the OpenBSD-Kernel as of now, but it surely
provides a helpful interface when you just don't want to make sure
the incoming pointer to erealloc() is really NULL so it behaves
like malloc, making it a bit more safer.

Talking about *allocarray(): It's definitely a major step in code-
hardening. Especially as a system administrator, you should be
able to trust your core tools without having to worry about segfaults
like this, which can easily lead to privilege escalation.

How do the GNU coreutils handle this?
$ strings -n 4611686018427387903
strings: invalid minimum string length -1
$ strings -n 4611686018427387904
strings: invalid minimum string length 0

They silently overflow...

In comparison, sbase:

$ strings -n 4611686018427387903
mallocarray: out of memory
$ strings -n 4611686018427387904
mallocarray: out of memory

The first out of memory is actually a true OOM returned by malloc,
whereas the second one is a detected overflow, which is not marked
in a special way.
Now tell me which diagnostic error-messages are easier to understand.
2015-03-10 22:19:19 +01:00
FRIGN 3b825735d8 Implement reallocarray()
Stateless and I stumbled upon this issue while discussing the
semantics of read, accepting a size_t but only being able to return
ssize_t, effectively lacking the ability to report successful
reads > SSIZE_MAX.
The discussion went along and we came to the topic of input-based
memory allocations. Basically, it was possible for the argument
to a memory-allocation-function to overflow, leading to a segfault
The OpenBSD-guys came up with the ingenious reallocarray-function,
and I implemented it as ereallocarray, which automatically returns
on error.
Read more about it here[0].

A simple testcase is this (courtesy to stateless):
$ sbase-strings -n (2^(32|64) / 4)

This will segfault before this patch and properly return an OOM-
situation afterwards (thanks to the overflow-check in reallocarray).

2015-03-10 21:23:36 +01:00
Evan Gates d21a958d88 bug and style fixes in find
1) don't mix declarations and code (leave recursion alone for now as I
plan on changing/using recurse)
2) change **argv to *argv[]
3) check for error on fork()
2015-03-06 18:31:14 +00:00
Evan Gates 3eed1fced6 a bunch of cleanup 2015-02-25 10:11:55 +00:00
Evan Gates e427364a28 use struct literal instead of filling each field manually 2015-02-25 10:11:47 +00:00
Evan Gates 4671b4c974 no need for array of function pointers for comparisons, just use the necessary function pointer itself 2015-02-25 10:11:35 +00:00
Evan Gates 49d80b46bb add space for END token 2015-02-24 10:23:18 +00:00
Evan Gates 91bb8ed7d6 remove extra get_n_arg declarations (copy pasta) 2015-02-24 10:23:17 +00:00
Evan Gates 6abce61877 insert implicit -a after primary before ! 2015-02-21 09:25:05 +00:00
Evan Gates 639a74a537 find: Change execv to execvp
- Make globals static
- Fix a comment
- Change some data types
- Rearrange struct members from largest to smallest
  (no affect due to small structs, good practice)
2015-02-21 09:24:31 +00:00
sin b08bb6aad6 Use st_mtime as opposed to st_mtim.tv_sec
This is more portable and fixes a build issue on NetBSD.  We ignore
nanoseconds in this case so there's no functional difference.
2015-02-20 15:47:50 +00:00
Roberto E. Vargas Caballero b8fbaa9b0d Remove bit fields
Before removing bit fields:

$ size find

   text	   data	    bss	    dec	    hex	filename

  16751	    968	     48	  17767	   4567	find

After removing bit fields:
$ size find
   text    data     bss     dec     hex filename
  16527     968      68   17563    449b find

This is an example where bit fields uses more memory
than integers or char. There is going to be only one
gflags struct, so the waste in instructions is bigger
than the space saved by bit fields. In the case of Permarg,
Sizearg, Execarg there is only one bit field, so at least
one unsigned is used, so there is no any gain.
2015-02-20 13:46:56 +00:00
Hiltjo Posthuma 6c7ff5fda5 code-style: unindent one level of switch 2015-02-20 13:29:38 +01:00
sin 028b0c206f Fix style
This broke sbase-box because it assumes that main(...) starts on
a separate line.
2015-02-20 10:24:11 +00:00
Evan Gates 76e6aacd60 Add initial find(1) implementation
No manpage yet.
2015-02-20 10:17:16 +00:00