90 Commits

Author SHA1 Message Date
FRIGN b6b977f63d Audit tar(1), add DIRFIRST-flag to recurse()
I've been wanting to do this for a while now, as tar(1) used to
be one of the messiest and cruftiest tools.
First off, before walking through the audit, I'll talk about
what the DIRFIRST-flag for recurse() does.
It basically calls fn() on the first-level-dir before calling it on
its subentries. It's necessary here, because otherwise the order
of the tar-files would've been wrong (it would try to create
dir/file before creating dir/).
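
The ordering described above can be illustrated with a small sketch. This is not the recurse() code from libutil: the in-memory struct node, walk(), record() and struct visit_log are hypothetical names, and the tree stands in for a real directory. It only shows that a "directory first" flag turns a post-order visit into a pre-order one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a directory tree. */
struct node {
	const char *name;
	const struct node *children; /* NULL for plain files */
	size_t nchildren;
};

typedef void (*visit_fn)(const char *name, void *data);

/* Hypothetical log used to record the order of the visits. */
struct visit_log {
	char buf[64];
};

/* Append "name " to the log, never writing past the buffer. */
static void
record(const char *name, void *data)
{
	struct visit_log *l = data;

	strncat(l->buf, name, sizeof(l->buf) - strlen(l->buf) - 1);
	strncat(l->buf, " ", sizeof(l->buf) - strlen(l->buf) - 1);
}

/*
 * Visit n and everything below it. With dirfirst set, a directory
 * is reported before its entries, otherwise after them.
 */
static void
walk(const struct node *n, int dirfirst, visit_fn fn, void *data)
{
	size_t i;

	if (n->children && dirfirst)
		fn(n->name, data);
	for (i = 0; i < n->nchildren; i++)
		walk(&n->children[i], dirfirst, fn, data);
	if (!n->children || !dirfirst)
		fn(n->name, data);
}
```

Walking a tree that holds dir/ with a single entry dir/file and passing record() as the callback logs dir/ first when dirfirst is set, which is the order an archive needs for extraction.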

Now, to the audit:
1)  Update manpage, fix the mistaken claim that compression is also
    available when archiving. It's only available for extracting.
2)  Define the major, minor and makedev macros from glibc by ourselves.
    No need to rely on them, as they are common sense.

3)  Simple refactorization.

4)  Add a truncation check for snprintf().

5)  BUGFIX: Add checks to any checkable function, don't blindly call
    them, this is harmful and there are 100 ways to exploit that.
6)  Use estrlcpy() instead of snprintf() wherever possible, fix
7)  BUGFIX: Terminate the result-buffer of readlink(), check if
    it even succeeded.
8)  Fix sizeof()-formatting.

9)  BUGFIX: Add checks to any checkable function, don't blindly call
    them, this is harmful and there are 100 ways to exploit that.
10) BUGFIX: strtoul can happily return negative numbers. Add checks
    for that and also if the full string has been processed.
11) Remove calls to perror(). We have eprintf, use it.
12) BUGFIX: "minor = strtoul(h->mode, 0, 8);". We need h->minor of
13) Fix typo "usupported", remove fprintf-call.

14) Check fread().

15) Get rid of snprintf-magic. Use estrlcat().
16) BUGFIX: check for ferror() on the tarfile.

17) Update it. The old usage() was like 1000 years old.

18) Add DIRFIRST-flag to the recursor.
19) Don't print usage() when a mode is re-set. We allow this in
20) Add function checks and fix error messages.
21) Add tarfilename-global for proper error-messages.
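
Point 10 is easy to get wrong, so here is a minimal sketch of the kind of check it describes. parse_octal() is a hypothetical helper, not the code in tar.c, and it assumes the field has already been copied into a NUL-terminated string.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a NUL-terminated octal string. Returns 0 and stores the
 * value in *out on success, -1 on any malformed input.
 */
static int
parse_octal(const char *s, unsigned long *out)
{
	char *end;
	unsigned long v;

	/* strtoul() skips leading whitespace and then accepts a sign */
	while (isspace((unsigned char)*s))
		s++;
	if (*s == '-' || *s == '+' || *s == '\0')
		return -1;

	errno = 0;
	v = strtoul(s, &end, 8);
	/* reject overflow, "no digits" and trailing garbage */
	if (errno == ERANGE || end == s || *end != '\0')
		return -1;

	*out = v;
	return 0;
}
```

With this, "644" parses to 0644, while "-1", "12x", "8" and the empty string are all rejected instead of silently yielding a number.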
2015-03-21 01:30:47 +01:00
FRIGN 3111908b03 Refactor recurse() again
Okay, why yet another recurse()-refactor?
The last one added the recursor-struct, which simplified things
on the user-end, but there was still one thing that bugged me a lot:
Previously, all fn()'s were forced to (l)stat the paths themselves.
This does not work well when you try to keep up with H-, L- and P-
flags at the same time, as each utility-function would have to set
the right function-pointer for (l)stat every single time.

This is not desirable. Furthermore, recurse should be easy to use
and not involve trouble finding the right (l)stat-function to do it
So, what we needed was a stat-argument for each fn(), so it is
directly accessible. This was impossible to do though when the
fn()'s are still directly called by the programs to "start" the
Thus, the fundamental change is to make recurse() the function to
go, while designing the fn()'s in a way they can "live" with st
being NULL (we don't want a null-pointer-deref).
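
One way a callback could cope with st being NULL is sketched below. entry_is_dir() is a hypothetical name and this is not one of the fn()'s from sbase; falling back to lstat() is an assumption made only for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>

/*
 * Returns 1 if path is a directory, 0 if it is not and -1 if it
 * could not be examined. st may be NULL, in which case the entry
 * is looked up here instead of being dereferenced blindly.
 */
static int
entry_is_dir(const char *path, const struct stat *st)
{
	struct stat own;

	if (!st) {
		if (lstat(path, &own) < 0)
			return -1;
		st = &own;
	}
	return S_ISDIR(st->st_mode) ? 1 : 0;
}
```

When the caller hands in a stat result, the path is not touched again, so the choice between stat() and lstat() stays in one place.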

What you can see in this commit is the result of this work. Why
all this trouble instead of using nftw?
The special thing about recurse() is that you tell the function
when to recurse() in your fn(). You don't need special flags to
tell nftw() to skip the subtree, just to give an example.

The only single downside to this is that now, you are not allowed
to unconditionally call recurse() from your fn(). It has to be
a directory.
However, that is a cost I think is easily weighed up by the

Another thing is the history: I added a procedure at the end of
the outermost recurse to free the history. This way we don't leak

A simple optimization on the side:

-		if (h->dev == st.st_dev && h->ino == st.st_ino)
+		if (h->ino == st.st_ino && h->dev == st.st_dev)

First compare the likely difference in inode-numbers instead of
checking the unlikely condition that the device-numbers are
2015-03-19 01:08:19 +01:00
FRIGN 9fd4a745f8 Add history and config-struct to recurse
For loop detection, a history is mandatory. In the process of also
adding a flexible struct to recurse, the recurse-definition was moved
to fs.h.
The motivation behind the struct is to allow easy extensions to the
recurse-function without having to change the prototypes of all
functions in the process.
Adding flags is really simple as well now.
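
A history for loop detection can be as small as a linked list of device and inode pairs. The sketch below uses hypothetical names (struct history, history_push(), history_seen(), history_free()) and is not the layout used in fs.h.

```c
#include <stdlib.h>
#include <sys/types.h>

/* One visited directory, identified by device and inode. */
struct history {
	struct history *prev;
	dev_t dev;
	ino_t ino;
};

/* Returns 1 if (dev, ino) was already recorded, 0 otherwise. */
static int
history_seen(const struct history *h, dev_t dev, ino_t ino)
{
	for (; h; h = h->prev)
		if (h->ino == ino && h->dev == dev)
			return 1;
	return 0;
}

/*
 * Record a new entry in front of prev. Returns the new head, or
 * NULL if malloc() failed, so the caller must keep prev until the
 * result has been checked.
 */
static struct history *
history_push(struct history *prev, dev_t dev, ino_t ino)
{
	struct history *h = malloc(sizeof(*h));

	if (!h)
		return NULL;
	h->prev = prev;
	h->dev = dev;
	h->ino = ino;
	return h;
}

/* Release the whole list. */
static void
history_free(struct history *h)
{
	struct history *prev;

	for (; h; h = prev) {
		prev = h->prev;
		free(h);
	}
}
```

Before descending into a directory, a walker would call history_seen() with the values from its stat result and skip the directory on a hit.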

Using the recursor-struct, it's also easier to see which defaults
apply to a program (for instance, which type of follow, ...).

Another change was to add proper stat-lstat-usage in recurse. It
was wrong before.
2015-03-13 00:29:48 +01:00
FRIGN 01de5df8e6 Audit du(1) and refactor recurse()
While auditing du(1) I realized that there's no way the over 100 lines
of procedures in du() would pass the audit.
Instead, I decided to rewrite this section using recurse() from libutil.
However, the issue was that you'd need some kind of payload to count
the number of bytes in the subdirectories and use them in the higher
The solution is to add a "void *data" data pointer to each recurse-
function-prototype, which we might also be able to use in other
recurse() itself had to be augmented with a recurse_samedev-flag, which
basically prevents recurse from leaving the current device.
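
The "void *data" idea is the usual C callback payload. A minimal sketch, with the hypothetical names struct entry, for_each() and add_blocks() and an array standing in for the directory walk:

```c
#include <stddef.h>

/* Hypothetical stand-in for one file and its block count. */
struct entry {
	const char *name;
	long blocks;
};

typedef void (*entry_fn)(const struct entry *e, void *data);

/* Call fn on every entry, handing the same payload to each call. */
static void
for_each(const struct entry *es, size_t n, entry_fn fn, void *data)
{
	size_t i;

	for (i = 0; i < n; i++)
		fn(&es[i], data);
}

/* Callback: data points to a running total of blocks. */
static void
add_blocks(const struct entry *e, void *data)
{
	long *total = data;

	*total += e->blocks;
}
```

The caller owns the total and passes its address, so the walker itself never needs to know what is being counted.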

Now, let's take a closer look at the audit:
1) Remove the now unnecessary util-functions push, pop, xrealpath,
   rename print() to printpath(), localize some global variables.
2) Only pass the block count to nblks instead of the entire stat-
3) Fix estrtonum to use the minimum of LLONG_MAX and SIZE_MAX.
4) Use idiomatic argv+argc-loop
5) Report proper exit-status.
2015-03-11 23:21:52 +01:00
Hiltjo Posthuma 066a0306a1 fork: no need to _exit() on the error case 2015-03-10 20:05:18 +01:00
FRIGN a8bd21c0ab Use switch with fork()
Allows dropping a local variable if the explicit PID is not needed
and it makes it clearer what happens.
Also, one should always strive for consistency for cases like these.
2015-03-09 15:01:29 +01:00
FRIGN 6f207dac5f Don't return but _exit after failed exec*() and fork()
Quoting POSIX[0]:
"Care should be taken, also, to call _exit() rather than exit() if exec cannot be used, since
exit() flushes and closes standard I/O channels, thereby damaging the parent process' standard
I/O data structures. (Even with fork(), it is wrong to call exit(), since buffered data would
then be flushed twice.)"
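
Put together with the switch-style fork() from the commits above, the pattern looks roughly like this. run_and_wait() is a hypothetical helper written for illustration, not code from sbase, and the exit status 127 for a failed exec is the shell convention rather than something the commit prescribes.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] with the given arguments and wait for it.
 * Returns the child's exit status, or -1 if fork() or waitpid()
 * failed or the child did not exit normally.
 */
static int
run_and_wait(char *const argv[])
{
	int status;
	pid_t pid;

	switch ((pid = fork())) {
	case -1:
		/* still in the parent: a plain return is fine */
		return -1;
	case 0:
		execvp(argv[0], argv);
		/* exec failed: leave without flushing stdio twice */
		_exit(127);
	default:
		break;
	}

	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The pid variable is kept here only because waitpid() needs it; where the parent does not care about the explicit PID, switch (fork()) works without it.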

2015-03-09 01:12:59 +01:00
FRIGN c8f2b068f6 Fix segmentation fault in tar(1) 2015-03-03 11:26:59 +01:00
FRIGN 8dc92fbd6c Refactor enmasse() and recurse() to reflect depth
The HLP-changes to sbase have been a great addition of functionality,
but they kind of "polluted" the enmasse() and recurse() prototypes.
Knowing at which "depth" you are inside a recursing function is
important functionality that will also come in handy in the future.

Instead of having a special HLP-flag passed to enmasse, each sub-
function needs to provide it on its own and can calculate results
based on the current depth (for instance, 'H' implies 'P' at
depth > 0).
A special case is recurse(), because it actually depends on the
follow-type. A new flag "recurse_follow" brings consistency into
what used to be spread across different naming conventions (fflag,
HLP_flag, ...).
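
The depth rule can be written down in a few lines. follows_symlink() is a hypothetical helper and the follow types are represented by the plain characters 'H', 'L' and 'P' for the sake of the example, which need not match the actual recurse_follow values.

```c
/*
 * Decide whether a symlink met at the given depth is followed.
 * 'L' always follows, 'P' never does, and 'H' follows only for
 * the operands themselves (depth 0), i.e. it behaves like 'P'
 * at depth > 0.
 */
static int
follows_symlink(int follow, int depth)
{
	switch (follow) {
	case 'L':
		return 1;
	case 'H':
		return depth == 0;
	default:
		return 0;
	}
}
```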

This also fixes numerous bugs with the behaviour of HLP in the
tools using it.
2015-03-02 22:50:38 +01:00
sin 29be7a3f23 tar: Style fix 2015-02-17 09:13:15 +00:00
FRIGN 8e016fad91 Make the tar(1)-header fixed again
This is clearer.
2015-02-16 20:01:33 +01:00
FRIGN eb17f2cc9c Refactor tar(1) 2015-02-16 19:47:36 +01:00
sin 8f068589fb Fix recurse() prototype and convert char to int flags 2015-02-16 16:23:12 +00:00
Tai Chi Minh Ralph Eastwood 82bc92da51 recurse: add symlink dereferencing flags -H and -L 2015-02-16 15:53:55 +00:00
FRIGN 31572c8b0e Clean up #includes 2015-02-14 21:12:23 +01:00
sin 2f6ffc9ec9 No need to specify "rb" and "wb" in fopen, use "r" and "w" 2015-02-01 15:55:30 +00:00
sin 63d7f29bd9 Fix build 2015-01-26 16:14:45 +00:00
sin 2334c04952 tar: Remove support for old syntax (we now require '-' to parse flags) 2015-01-26 16:14:05 +00:00
sin 1412d07b7d tar: No need to use -f for gzip 2015-01-26 16:03:46 +00:00
sin 7fbb858bcd tar: Add support for -z and -j by invoking external programs
Only extraction is supported at the moment.
2015-01-26 15:59:47 +00:00
FRIGN ec8246bbc6 Un-boolify sbase
It actually makes the binaries smaller, the code easier to read
(gems like "val == true", "val == false" are gone) and actually
predictable in the sense that we actually know what we're
working with (one bitwise operator was quite adventurous and
should now be fixed).

This is also more consistent with the other suckless projects
around which don't use boolean types.
2014-11-14 10:54:20 +00:00
FRIGN 7d2683ddf2 Sort includes and more cleanup and fixes in util/ 2014-11-14 10:54:10 +00:00
FRIGN eee98ed3a4 Fix coding style
It was about damn time. Consistency is very important in such a
big codebase.
2014-11-13 18:08:43 +00:00
sin 9750071b97 Fix stupid GCC warning
tar.c:239:9: warning: missing braces around initializer [-Wmissing-braces]

I believe this is an unresolved bug in GCC.
2014-11-03 10:21:05 +00:00
Michael Forney 7ed4866556 tar: Implement -m flag
This changes the default behavior to adjust mtimes to what is present in
the file header.
2014-11-01 22:34:29 +00:00
Michael Forney e1f87da43e tar: Handle archives with the prefix field
Also, handle names and prefixes that fill the entire field (and have no
NUL byte) by using a precision specifier.
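
A sketch of the precision-specifier trick, assuming a simplified header that only carries the 100-byte name and 155-byte prefix fields of the ustar format. struct header and header_path() are hypothetical names, not the ones in tar.c.

```c
#include <stdio.h>
#include <string.h>

/* Simplified header: only the two fields this example needs. */
struct header {
	char name[100];
	char prefix[155];
};

/*
 * Build "prefix/name" (or just "name") in buf. The fields may lack
 * a terminating NUL, so "%.*s" limits how far printf reads.
 * Returns 0 on success, -1 on error or truncation.
 */
static int
header_path(const struct header *h, char *buf, size_t len)
{
	int n;

	if (h->prefix[0])
		n = snprintf(buf, len, "%.*s/%.*s",
		             (int)sizeof(h->prefix), h->prefix,
		             (int)sizeof(h->name), h->name);
	else
		n = snprintf(buf, len, "%.*s",
		             (int)sizeof(h->name), h->name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A name field filled with 100 bytes and no NUL yields a 100-character string instead of running into the next field.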
2014-11-01 22:34:19 +00:00
Michael Forney 0e8a8c9426 tar: Support typeflag '\0' when extracting
POSIX recommends that "For backwards-compatibility, a typeflag value of
binary zero ( '\0' ) should be recognized as meaning a regular file when
extracting files from the archive".
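
In code this amounts to one extra comparison. is_regular_typeflag() is a hypothetical helper for illustration.

```c
/*
 * Returns 1 if typeflag denotes a regular file: '0' is the normal
 * value, binary zero is accepted for backwards compatibility.
 */
static int
is_regular_typeflag(char typeflag)
{
	return typeflag == '0' || typeflag == '\0';
}
```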
2014-11-01 22:34:08 +00:00
Michael Forney 453ce96d44 tar: Don't crash when get{pw,gr}uid fails 2014-11-01 22:33:55 +00:00
sin 0c5b7b9155 Stop using EXIT_{SUCCESS,FAILURE} 2014-10-02 23:46:59 +01:00
Hiltjo Posthuma 953ebf3573 code style
Signed-off-by: Hiltjo Posthuma <>
2014-06-01 18:02:30 +01:00
sin e37e2782a9 Only use major()/minor() if they are available in tar(1)
Otherwise silently ignore them in the archive case.  This is the
same in principle as what we do in the unarchive case.
2014-01-30 16:17:25 +00:00
sin 0a7791a25c Use recurse() in tar(1) instead of ftw(3) 2014-01-30 14:55:38 +00:00
sin c83aef2cda Use preprocessor conditionals to check if makedev() is present
makedev() is not portable and is typically implemented as a
macro.  If it exists use it, otherwise silently ignore character
and block devices.
2014-01-28 17:22:48 +00:00
sin b5a511dacf Exit with EXIT_SUCCESS/EXIT_FAILURE instead of 0 and 1
Fixed for consistency purposes.
2013-10-07 16:44:22 +01:00
David Galos b5b7db4009 tar: Check inode AND dev before ignoring a file. Thanks, Lars Lindqvist! 2013-07-28 12:12:03 -04:00
Roberto E. Vargas Caballero f636ac791b Avoid infinite loop in tar
When the tar file is written into one of the directories being
archived, the archive function enters an infinite loop because of
the tar file being written. This patch avoids this case by checking
the inode of the tar file before adding it to the archive.
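
A sketch of such a check, already comparing the device as well, which is what the later fix by David Galos adds. same_file() is a hypothetical name, not the function in tar.c.

```c
#include <sys/stat.h>

/*
 * Two stat results refer to the same file only if both the inode
 * and the device match; inode numbers alone are not unique across
 * filesystems.
 */
static int
same_file(const struct stat *a, const struct stat *b)
{
	return a->st_ino == b->st_ino && a->st_dev == b->st_dev;
}
```

An archiver would stat the output file once and skip every entry for which same_file() returns 1.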
2013-07-20 13:18:39 -04:00
David Galos 9f8deb4b23 Tar compiles on BSD, thanks Roberto E. Vargas Caballero. Also remove tons of trailing whitespace. 2013-07-20 01:27:42 -04:00
sin 43c4213631 Remove trailing whitespace 2013-07-20 00:56:04 -04:00
David Galos c5f10c4b06 Fixing idiotic mistake in tar 2013-07-18 11:52:01 -04:00
David Galos 2c75eb98d9 Adding tar. 2013-07-18 11:15:35 -04:00