sbase

Author	SHA1	Message	Date
sin	4c1f9ecdd2	Align end of comment	2015-11-13 14:24:09 +00:00
sin	996f992ac6	enmasse: For the special case of 2 args, do not distinguish between dirs and files This produces consistent error messages when moving or copying a file or dir to itself.	2015-11-13 14:21:07 +00:00
FRIGN	dae33f42d4	Fix typo in libutil/fshut.c	2015-10-26 16:53:28 +00:00
FRIGN	8be7c42863	Make strtol() parsing even stricter in parseoffset() Be strict about what we pass to it and how we handle errors. The base-check is done by strtol anyway. Also improve error-reporting.	2015-09-30 19:44:11 +01:00
FRIGN	870a75076d	Harden parseoffset() even more 1) Check for NULL. 2) Check for empty strings. 3) Clarify error-messages.	2015-09-30 19:44:10 +01:00
FRIGN	64929039e9	Don't forget to scale in parseoffset()	2015-09-30 19:44:10 +01:00
FRIGN	007df69fc5	Add parseoffset() This is a utility function to allow easy parsing of file or other offsets, automatically taking in regard suffixes, proper bases and so on, for instance used in split(1) -b or od -j, -N(1). Of course, POSIX is very arbitrary when it comes to defining the parsing rules for different tools. The main focus here lies on being as flexible and consistent as possible. One central utility-function handling the parsing makes this stuff a lot more trivial.	2015-09-30 19:44:10 +01:00
Hiltjo Posthuma	53be158979	code-style: whitespace fixes	2015-09-30 19:44:10 +01:00
Michael Forney	1d28fbd6cf	mv, cp: Preserve nanosecond timestamps Otherwise, we run into problems in a typical autoconf-based build system: - config.status is created at some point between two seconds. - config.status is run, generating Makefile by first writing to a file in /tmp, and then mv-ing it to Makefile. - If this mv happens before the beginning of the next second, Makefile will be created with the same tv_sec as config.status, but with tv_nsec = 0. - When make runs, it sees that Makefile is older than config.status, and re-runs config.status to generate Makefile.	2015-05-16 13:34:00 +01:00
FRIGN	0545d32ce9	Handle '-' consistently In general, POSIX does not define /dev/std{in, out, err} because it does not want to depend on the dev-filesystem. For utilities, it thus introduced the '-'-keyword to denote standard input (and output in some cases) and the programs have to deal with it accordingly. Sadly, the design of many tools doesn't allow strict shell-redirections and many scripts don't even use this feature when possible. Thus, we made the decision to implement it consistently across all tools where it makes sense (namely those which read files). Along the way, I spotted some behavioural bugs in libutil/crypt.c and others where it was forgotten to fshut the files after use.	2015-05-16 13:34:00 +01:00
Hiltjo Posthuma	29649762b3	libutil/getlines: fix potential crash linelen was uninitialized if for example: $ > empty $ sort ls.c empty	2015-05-08 16:38:06 +01:00
Hiltjo Posthuma	3f01706837	libutil/getlines: use known line length also style: linelen = length of getline(), this was slightly confusing.	2015-05-07 18:18:36 +01:00
Hiltjo Posthuma	adf9f47525	Revert "libutil/getlines: use known line length" This reverts commit c69a70ddfd5c2b1514d9efd1c7a0fcbee5b0d2e7.	2015-05-07 18:18:36 +01:00
Hiltjo Posthuma	bd67e7d92d	libutil/getlines: use known line length	2015-05-07 18:18:35 +01:00
sin	2deb40290e	Use off_t in humansize() as it is more descriptive and applicable	2015-04-29 16:42:49 +01:00
Dionysis Grigoropoulos	2d6cde1862	humansize: Use uintmax_t for size du(1) breaks on 32-bit size_t for files greater than 4G.	2015-04-28 11:36:58 +01:00
FRIGN	5595af5742	Convert humansize() to accept a size_t instead of a double General convention is to use size_t to store sizes of all kinds. Internally, the function uses double anyway, but at least this doesn't clobber up the API any more and there's a chance in the future to make this function a bit cleaner and not use this dirty static buffer hack any more.	2015-04-25 11:43:14 +01:00
sin	10b57e8a3d	Actually print <space> to stream in putword() too	2015-04-21 18:00:47 +01:00
sin	c914a2feca	Update putword() to accept a FILE *	2015-04-21 18:00:47 +01:00
sin	b9d60bee87	Move mkdirp() to libutil	2015-04-20 18:04:08 +01:00
sin	31af8555a7	Add LICENSE header to fshut.c	2015-04-20 18:04:08 +01:00
FRIGN	f83d7bc647	Add SILENT flag to recurse() recurse() is getting smarter every day. I expect it to pass the Turing test in a few months. Along the way, it was reported that "rm -f" on nonexistent files reports their absence as an internal recurse()-error. So recurse() knows when to shut up, I added the SILENT flag to fix all these things.	2015-04-20 11:12:40 +01:00
FRIGN	7b2465c101	Add maxdepth to recurse() This also makes more sense.	2015-04-20 11:12:40 +01:00
FRIGN	e14d9412f8	Properly handle recursion in recurse() The restructuring of recurse() in the last few weeks actually broke the recursion-flags in different tools. As a long-term goal, the recursor should have a field "maxdepth" which should be "1" for the non-Rflag-case. "0" stands for unlimited.	2015-04-20 11:12:40 +01:00
sin	bb2c0cff45	Fix function definition style for fshut.c	2015-04-05 09:16:50 +01:00
FRIGN	3eee8e1509	Remove DEBUG-define for eprintf.c Prepend program name only when fmt doesn't begin with "usage".	2015-04-05 09:13:56 +01:00
FRIGN	0c470f5563	Remove fflush-check from fshut() Basically, it's a conflict between POSIX and ISO C what to do when input streams are passed to fflush(). POSIX mandates that the seeking-position should be synced, but ISO C says it's undefined behaviour. We love POSIX, but the standard-documents specify that in all conflict cases, ISO C wins, so this breaks with EBADF on the BSDs. musl and glibc follow POSIX behaviour, which makes sense, but involves numerous portability concerns. To get around this, we just don't check fflush() and rely on the fact that no implementation sets ferror on the file-stream in fflush if it is an input stream, so every issue caught in fflush() is caught later with ferror() and fclose(). Add a comment to fshut() because this stuff is so complicated, it took us a day to figure out.	2015-04-05 09:13:56 +01:00
FRIGN	11e2d472bf	Add *fshut() functions to properly flush file streams This has been a known issue for a long time. Example: printf "word" > /dev/full wouldn't report there's not enough space on the device. This is due to the fact that every libc has internal buffers for stdout which store fragments of written data until they reach a certain size or on some callback to flush them all at once to the kernel. You can force the libc to flush them with fflush(). In case flushing fails, you can check the return value of fflush() and report an error. However, previously, sbase didn't have such checks and without fflush(), the libc silently flushes the buffers on exit without checking the errors. No offense, but there's no way for the libc to report errors in the exit-condition. GNU coreutils solve this by having onexit-callbacks to handle the flushing and report issues, but they have obvious deficiencies. After long discussions on IRC, we came to the conclusion that checking the return value of every io-function would be a bit too much, and having a general-purpose fclose-wrapper would be the best way to go. It turned out that fclose() alone is not enough to detect errors. The right way to do it is to fflush() + check ferror on the fp and then do an fclose(). This is what fshut does and that's how it's done before each return. The return value is obviously affected, reporting an error in case a flush or close failed, but also when reading failed for some reason, the error-state is caught. The !!( ... + ...) construction is used to call all functions inside the brackets and not "terminating" on the first. We want errors to be reported, but there's no reason to stop flushing buffers when one other file buffer has issues. Obviously, functionales come before the flush and ret-logic comes after to prevent early exits as well without reporting warnings if there are any. One more advantage of fshut() is that it is even able to report errors on obscure NFS-setups which the other coreutils are unable to detect, because they only check the return-value of fflush() and fclose(), not ferror() as well.	2015-04-05 09:13:56 +01:00
Hiltjo Posthuma	27f258dd34	libutil/getlines: style fix	2015-03-29 21:55:34 +02:00
Hiltjo Posthuma	9f97430143	libutil/getlines: fix crash with no lines because b->lines and b->nlines would be 0 with no lines read. reproduce: printf '' | sort or cols bug was introduced by commit: 66a5ea722d	2015-03-29 21:48:49 +02:00
Hiltjo Posthuma	a9bedca038	fix some signed/unsigned warnings and style fixes	2015-03-27 22:48:05 +01:00
FRIGN	9144d51594	Check getline()-return-values properly It's not useful when 0 is returned anyway, so be sure that we have a string with length > 0, this also solves some indexing-gotchas like "len - 1" and so on. Also, add checked getline()'s whenever it has been forgotten and clean up the error-messages.	2015-03-27 14:49:48 +01:00
FRIGN	b6b977f63d	Audit tar(1), add DIRFIRST-flag to recurse() I've been wanting to do this for a while now, as tar(1) used to be one of the messiest and cruftiest tools. First off, before walking through the audit, I'll talk about what the DIRFIRST-flag for recurse() does. It basically calls fn() on the first-level-dir before calling its subentries. It's necessary here, because otherwise the order of the tar-files would've been wrong (it would try to create dir/file before creating dir/). Now, to the audit: 1) Update manpage, fix mistake that compression is also available for compressing. It's only available for extracting. 2) Define the major, minor and makedev macros from glibc by ourselves. No need to rely on them, as they are common sense. decomp() 3) Simple refactorization. putoctal() 4) Add a truncation check for snprintf(). archive() 5) BUGFIX: Add checks to any checkable function, don't blindly call them, this is harmful and there are 100 ways to exploit that. 6) Use estrlcpy() instead of snprintf() wherever possible, fix alignment. 7) BUGFIX: Terminate the result-buffer of readlink(), check if it even succeeded. 8) Fix sizeof()-formatting. unarchive() 9) BUGFIX: Add checks to any checkable function, don't blindly call them, this is harmful and there are 100 ways to exploit that. 10) BUGFIX: strtoul can happily return negative numbers. Add checks for that and also if the full string has been processed. 11) Remove calls to perror(). We have eprintf, use it. 12) BUGFIX: "minor = strtoul(h->mode, 0, 8);". We need h->minor of course. 13) Fix typo "usupported", remove fprintf-call. print() 14) Check fread(). xt() 15) Get rid of snprintf-magic. Use estrlcat(). 16) BUGFIX: check for ferror() on the tarfile. usage() 17) Update it. The old usage() was like 1000 years old. main() 18) Add DIRFIRST-flag to the recursor. 19) Don't print usage() when a mode is re-set. We allow this in general. 20) Add function checks and fix error messages. 21) Add tarfilename-global for proper error-messages.	2015-03-21 01:30:47 +01:00
FRIGN	58098575e7	Audit cp() in libutil 1) Rename cp_HLPflag -> cp_follow for consistency. 2) Use function-pointers for stat to clear up the code. 3) BUGFIX: TERMINATE THE RESULT BUFFER OF READLINK !!! It's something I noticed earlier and it actually led to some pretty insane behaviour on our side using glibc (musl somehow magically solves this). Basically, symlinks used to contain the data of the file they pointed to. I wondered for weeks where this came from and now this has finally been solved. 4) BUGFIX: Do not unconditionally unlink target-files. Even GNU coreutils do it wrong. The basic idea is this: If fflag == 0 --> don't touch target files if they exist. If fflag == 1 --> unlink all and don't error out when we try to unlink a file which doesn't exist. 5) Use estrlcpy and estrlcat instead of snprintf for path building. 6) Make it clearer what happens in preserve.	2015-03-19 17:57:12 +01:00
FRIGN	3111908b03	Refactor recurse() again Okay, why yet another recurse()-refactor? The last one added the recursor-struct, which simplified things on the user-end, but there was still one thing that bugged me a lot: Previously, all fn()'s were forced to (l)stat the paths themselves. This does not work well when you try to keep up with H-, L- and P-flags at the same time, as each utility-function would have to set the right function-pointer for (l)stat every single time. This is not desirable. Furthermore, recurse should be easy to use and not involve trouble finding the right (l)stat-function to do it right. So, what we needed was a stat-argument for each fn(), so it is directly accessible. This was impossible to do though when the fn()'s are still directly called by the programs to "start" the recurse. Thus, the fundamental change is to make recurse() the function to go, while designing the fn()'s in a way they can "live" with st being NULL (we don't want a null-pointer-deref). What you can see in this commit is the result of this work. Why all this trouble instead of using nftw? The special thing about recurse() is that you tell the function when to recurse() in your fn(). You don't need special flags to tell nftw() to skip the subtree, just to give an example. The only single downside to this is that now, you are not allowed to unconditionally call recurse() from your fn(). It has to be a directory. However, that is a cost I think is easily outweighed by the advantages. Another thing is the history: I added a procedure at the end of the outermost recurse to free the history. This way we don't leak memory. A simple optimization on the side: - if (h->dev == st.st_dev && h->ino == st.st_ino) + if (h->ino == st.st_ino && h->dev == st.st_dev) First compare the likely difference in inode-numbers instead of checking the unlikely condition that the device-numbers are different.	2015-03-19 01:08:19 +01:00
FRIGN	b3e8b17235	Audit concat() in libutil Be more pedantic about the error-checking, fread can also return values > 0 even though there has been a read-error. We want to write the last incoming data and then bail.	2015-03-18 22:58:42 +01:00
FRIGN	a68c2a9e6e	Remove apathmax() and implicitly agetcwd() pathconf() is just an insane interface to use. All sane operating-systems set sane values for PATH_MAX. Due to the by-runtime-nature of pathconf(), it actually weakens the programs depending on its values. Given over 3 years it has still not been possible to implement a sane and easy to use apathmax()-utility-function, and after discussing this on IRC, we'll dump this garbage. We are careful enough not to overflow PATH_MAX and even if, any user is able to set another limit in config.mk if he so desires.	2015-03-18 15:20:35 +01:00
FRIGN	93fd817536	Add estrlcat() and estrlcpy() It has become a common idiom in sbase to check strlcat() and strlcpy() using if (strl{cat, cpy}(dst, src, siz) >= siz) eprintf("path too long\n"); However, this was not carried out consistently and to this very day, some tools employed unchecked calls to these functions, effectively allowing silent truncations to happen, which in turn may lead to security issues. To finally put an end to this, the e*-functions detect truncation automatically and the caller can lean back and enjoy coding without trouble. :)	2015-03-17 11:24:49 +01:00
FRIGN	9fd4a745f8	Add history and config-struct to recurse For loop detection, a history is mandatory. In the process of also adding a flexible struct to recurse, the recurse-definition was moved to fs.h. The motivation behind the struct is to allow easy extensions to the recurse-function without having to change the prototypes of all functions in the process. Adding flags is really simple as well now. Using the recursor-struct, it's also easier to see which defaults apply to a program (for instance, which type of follow, ...). Another change was to add proper stat-lstat-usage in recurse. It was wrong before.	2015-03-13 00:29:48 +01:00
FRIGN	af61ba738c	Refactor recurse() Instead of allocating a buffer on each run, build a buf on the stack.	2015-03-12 13:22:37 +01:00
FRIGN	01de5df8e6	Audit du(1) and refactor recurse() While auditing du(1) I realized that there's no way the over 100 lines of procedures in du() would pass the audit. Instead, I decided to rewrite this section using recurse() from libutil. However, the issue was that you'd need some kind of payload to count the number of bytes in the subdirectories and use them in the higher hierarchies. The solution is to add a "void *data" data pointer to each recurse-function-prototype, which we might also be able to use in other recurse-applications. recurse() itself had to be augmented with a recurse_samedev-flag, which basically prevents recurse from leaving the current device. Now, let's take a closer look at the audit: 1) Removing the now unnecessary util-functions push, pop, xrealpath, rename print() to printpath(), localize some global variables. 2) Only pass the block count to nblks instead of the entire stat-pointer. 3) Fix estrtonum to use the minimum of LLONG_MAX and SIZE_MAX. 4) Use idiomatic argv+argc-loop 5) Report proper exit-status.	2015-03-11 23:21:52 +01:00
FRIGN	833c2aebb4	Remove mallocarray(...) and use reallocarray(NULL, ...) After a short correspondence with Otto Moerbeek it turned out mallocarray() is only in the OpenBSD-Kernel, because the kernel-malloc doesn't have realloc. Userspace applications should rather use reallocarray with an explicit NULL-pointer. Assuming reallocarray() will become available in c-stdlibs in the next few years, we nip mallocarray() in the bud to allow an easy transition to a system-provided version when the day comes.	2015-03-11 10:50:18 +01:00
FRIGN	3c33abc520	Implement mallocarray() A function used only in the OpenBSD-Kernel as of now, but it surely provides a helpful interface when you just don't want to make sure the incoming pointer to erealloc() is really NULL so it behaves like malloc, making it a bit safer. Talking about *allocarray(): It's definitely a major step in code-hardening. Especially as a system administrator, you should be able to trust your core tools without having to worry about segfaults like this, which can easily lead to privilege escalation. How do the GNU coreutils handle this? $ strings -n 4611686018427387903 strings: invalid minimum string length -1 $ strings -n 4611686018427387904 strings: invalid minimum string length 0 They silently overflow... In comparison, sbase: $ strings -n 4611686018427387903 mallocarray: out of memory $ strings -n 4611686018427387904 mallocarray: out of memory The first out of memory is actually a true OOM returned by malloc, whereas the second one is a detected overflow, which is not marked in a special way. Now tell me which diagnostic error-messages are easier to understand.	2015-03-10 22:19:19 +01:00
FRIGN	3b825735d8	Implement reallocarray() Stateless and I stumbled upon this issue while discussing the semantics of read, accepting a size_t but only being able to return ssize_t, effectively lacking the ability to report successful reads > SSIZE_MAX. The discussion went along and we came to the topic of input-based memory allocations. Basically, it was possible for the argument to a memory-allocation-function to overflow, leading to a segfault later. The OpenBSD-guys came up with the ingenious reallocarray-function, and I implemented it as ereallocarray, which automatically returns on error. Read more about it here[0]. A simple testcase is this (courtesy of stateless): $ sbase-strings -n (2^(32|64) / 4) This will segfault before this patch and properly return an OOM-situation afterwards (thanks to the overflow-check in reallocarray). [0]: http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man3/calloc.3	2015-03-10 21:23:36 +01:00
sin	7d36a35649	Fix off-by-one in apathmax() as the path is relative to "/" 1) Use size_t * instead of long * 2) Fallback to PATH_MAX instead of BUFSIZ 3) Header cleanup	2015-03-06 23:50:39 +00:00
FRIGN	0b9c02cd22	Use path[len] instead of *(path + len) Maybe it's time to go to bed...	2015-03-03 00:31:27 +01:00
FRIGN	903d43bbb8	Use dynamic array in recurse() instead of PATH_MAX-array Thanks Evan!	2015-03-03 00:11:41 +01:00
FRIGN	8dc92fbd6c	Refactor enmasse() and recurse() to reflect depth The HLP-changes to sbase have been a great addition of functionality, but they kind of "polluted" the enmasse() and recurse() prototypes. As this will come in handy in the future, knowing at which "depth" you are inside a recursing function is an important functionality. Instead of having a special HLP-flag passed to enmasse, each sub- function needs to provide it on its own and can calculate results based on the current depth (for instance, 'H' implies 'P' at depth > 0). A special case is recurse(), because it actually depends on the follow-type. A new flag "recurse_follow" brings consistency into what used to be spread across different naming conventions (fflag, HLP_flag, ...). This also fixes numerous bugs with the behaviour of HLP in the tools using it.	2015-03-02 22:50:38 +01:00
FRIGN	933ed8c00b	Rename unused flag in rm() Before somebody gets the wrong idea again like I did.	2015-03-02 14:36:26 +01:00
FRIGN	286df29e7d	Make already audited tools argv-centric instead of argc-centric This has already been suggested by Evan Gates <evan.gates@gmail.com> and he's totally right about it. So, what's the problem? I wrote a testing program asshole.c with int main(void) { execl("/path/to/sbase/echo", "echo", "test"); return 0; } and checked the results with glibc and musl. Note that the sentinel NULL is missing from the end of the argument list. glibc calculates an argc of 5, musl 4 (instead of 2) and thus mess up things anyway. The powerful arg.h also focuses on argv instead of argc as well, but ignoring argc completely is also the wrong way to go. Instead, a more idiomatic approach is to check *argv only and decrement argc on the go. While at it, I rewrote yes(1) in an argv-centric way as well. All audited tools have been "fixed" and each following audited tool will receive the same treatment.	2015-03-02 14:19:26 +01:00
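
The fshut() commits above (11e2d472bf and 0c470f5563) describe a close-time check for streams: flush, look at ferror(), then fclose(), and combine several results with `!!( ... + ...)` so that every stream gets closed. The following is a minimal sketch of that idea for output streams, not the code from libutil/fshut.c. The names `shut_stream` and `shut_both` and the error text are hypothetical.

```c
#include <stdio.h>

/*
 * Hypothetical sketch, not the sbase source: flush an output stream,
 * check its error indicator, then close it.
 * Returns 0 on success and 1 if anything went wrong.
 */
static int
shut_stream(FILE *fp, const char *name)
{
	int ret = 0;

	/*
	 * The return value of fflush() is ignored on purpose, as the
	 * commit message describes. A failed write during the flush
	 * sets the stream's error indicator, which ferror() reports.
	 */
	fflush(fp);

	if (ferror(fp))
		ret = 1;
	if (fclose(fp) == EOF)
		ret = 1;
	if (ret)
		fprintf(stderr, "%s: I/O error\n", name);

	return ret;
}

/*
 * Close two streams. Unlike ||, the + operator does not short-circuit,
 * so both streams are always closed; !! folds the sum back to 0 or 1.
 */
static int
shut_both(FILE *a, const char *aname, FILE *b, const char *bname)
{
	return !!(shut_stream(a, aname) + shut_stream(b, bname));
}
```

A caller would typically use the result as part of its exit status, for example `return shut_both(stdin_copy, "<in>", stdout, "<stdout>");`, where the stream names are again only illustrative.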
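
The reallocarray() commits above (3b825735d8 and 3c33abc520) are about catching an overflow in the `nmemb * size` multiplication before it reaches the allocator. A rough sketch of such a check follows; it is neither the OpenBSD implementation nor sbase's ereallocarray(), and the name `checked_reallocarray` is made up for this example.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not the sbase or OpenBSD source: allocate or
 * resize room for nmemb elements of size bytes each. Fails with
 * ENOMEM when nmemb * size would not fit into a size_t.
 */
static void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
	/* nmemb * size overflows exactly when nmemb > SIZE_MAX / size */
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(ptr, nmemb * size);
}
```

Without the check, a request such as `SIZE_MAX / 4 + 1` elements of 4 bytes each would wrap around to a tiny allocation and later writes would run past it. With the check, the caller sees an ordinary allocation failure.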
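
Two of the audit commits above (58098575e7 and b6b977f63d) fix the same bug: readlink() does not NUL-terminate its result, so the buffer has to be terminated by the caller after checking the return value. The sketch below shows one way to do that. The function name `read_link_target` is hypothetical and the code is not taken from sbase.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the sbase source: read the target of the
 * symlink at path into buf and NUL-terminate it.
 * Returns 0 on success and -1 on error. A target longer than
 * bufsiz - 1 bytes is silently truncated by readlink().
 */
static int
read_link_target(const char *path, char *buf, size_t bufsiz)
{
	ssize_t r;

	if (bufsiz == 0)
		return -1;

	/* Leave one byte free for the terminator. */
	r = readlink(path, buf, bufsiz - 1);
	if (r < 0)
		return -1;

	/* readlink() does not write a terminating NUL byte itself. */
	buf[r] = '\0';
	return 0;
}
```

A caller that needs to detect truncation can compare the resulting length against `bufsiz - 1` and retry with a larger buffer.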


81 Commits