Commit Graph

100 Commits

Author SHA1 Message Date
Michael Forney 8ca79a2993 linecmp: Handle NUL bytes properly
Test case:

if [ "$(printf 'a\na\0b' | ./sort -u)" = "$(printf 'a\na\0b')" ] ; then
	echo pass
else
	echo fail
fi
2016-07-09 10:09:50 +01:00
Mattias Andrée 44a6d65832 *sum: support - when using -c
Signed-off-by: Mattias Andrée <maandree@kth.se>
2016-03-26 08:18:47 +00:00
FRIGN 515525997c Fix linecmp() to return correct values 2016-03-11 15:38:36 +00:00
FRIGN 3debc5e064 Add linecmp() 2016-03-10 08:48:09 +00:00
FRIGN 19c0ca9830 Properly increment line lenght on edge-case in getlines() 2016-03-10 08:48:09 +00:00
FRIGN eb9bda8787 Support NUL-containing lines in sort(1)
For sort(1) we need memmem(), which I imported from OpenBSD.
Inside sort(1), the changes involved working with the explicit lengths
given by getlines() earlier and rewriting some of the functions.

Now we can handle NUL-characters in the input just fine.
2016-03-10 08:48:09 +00:00
FRIGN e4810f1cdb Support NUL-containing lines in cols(1)
This required an architectural change in getlines() by also storing
the line length.
2016-03-10 08:48:09 +00:00
FRIGN a88906b423 Rever the strmem() addition and add a TODO element
strmem() was not very well thought out. The thing is the following:
If the string contains a zero character, we want to match it, and not
stop right there in place.

The "real" solution is to use memmem() where needed and replace all
functions that assume zero-terminated-strings from standard input, which
could lead to early string-breakoffs.
This requires a strict tracking of string lengths.
2016-02-26 09:54:46 +00:00
FRIGN 3396088666 Implement strmem() and use it in join(1)
We want our delimiters to also contain 0 characters and have them
handled gracefully.
To accomplish this, I wrote a function strmem(), which looks for a
certain, arbitrarily long memory subset in a given string.
memmem() is a GNU extension and forces you to call strlen every time.
2016-02-26 09:54:46 +00:00
Mattias Andrée a392cd475e add sha512-224sum (SHA512/224) and sha512-256sum (SHA512/256)
Signed-off-by: Mattias Andrée <maandree@kth.se>
2016-02-24 10:40:57 +00:00
Mattias Andrée ae1da536bb add sha224sum and sha384sum
Signed-off-by: Mattias Andrée <maandree@kth.se>
2016-02-24 10:15:16 +00:00
FRIGN 6adb9b8ccd Fix compilation error 2016-02-21 08:52:48 +00:00
FRIGN 70adb1252d Do a range check on the resulting octal 2016-02-21 08:52:48 +00:00
FRIGN 41a600e1b8 Allow \0ooo octal escapes
Yeah well, the old topic. POSIX allows \0123 and \123 octals in
different tools, in printf, depending on %b or other things.
We'll just keep it simple and just allow 4 digits. the 0 does not make
a difference anyway.
2016-02-21 08:52:48 +00:00
FRIGN e40fc2b176 Check argv0 in xvprintf()
You never know, given printf'ing NULL-strings might crash the
program, we shouldn't just pass argv0 blindly to it.
2015-12-21 14:13:36 +00:00
sin 3b1b50cffa Do not indent label 2015-12-21 09:55:01 +00:00
FRIGN 94e92d9cc0 Refactor eprintf.c
When we move the exit() out of venprintf(), we can reuse it for
weprintf(), which basically had duplicate code.
I also renamed venprintf() to xvprintf (extended vprintf) so it's
more obvious what it actually does.
2015-12-21 09:34:13 +00:00
sin 67157b7c4e Use SSIZE_MAX for overflow check in parseoffset()
There's no such thing as OFF_MAX so we can't use that.  off_t is
signed, so use SSIZE_MAX which will typically match the range of
off_t.
2015-11-26 10:35:46 +00:00
sin d24584466c Revert "enmasse: For the special case of 2 args, do not distinguish between dirs and files"
This reverts commit a564a67c4ea70e90a4dc543814458e4903869d3e.

Not as trivial as I thought.  This breaks cp when used as:
cp -r /foo/bar /baz

The old code expands this to:
cp -r /foo/bar /baz/bar
2015-11-13 14:31:34 +00:00
sin 4c1f9ecdd2 Align end of comment 2015-11-13 14:24:09 +00:00
sin 996f992ac6 enmasse: For the special case of 2 args, do not distinguish between dirs and files
This produces consistent error messages when moving or copying a file or dir to
itself.
2015-11-13 14:21:07 +00:00
FRIGN dae33f42d4 Fix typo in libutil/fshut.c 2015-10-26 16:53:28 +00:00
FRIGN 8be7c42863 Make strtol() parsing even stricter in parseoffset()
Be strict about what we pass to it and how we handle errors.
The base-check is done by strtol anyway.
Also improve error-reporting.
2015-09-30 19:44:11 +01:00
FRIGN 870a75076d Harden parseoffset() even more
1) Check for NULL.
2) Check for empty strings.
3) Clarify error-messages.
2015-09-30 19:44:10 +01:00
FRIGN 64929039e9 Don't forget to scale in parseoffset() 2015-09-30 19:44:10 +01:00
FRIGN 007df69fc5 Add parseoffset()
This is a utility function to allow easy parsing of file or other
offsets, automatically taking in regard suffixes, proper bases and
so on, for instance used in split(1) -b or od -j, -N(1).
Of course, POSIX is very arbitrary when it comes to defining the
parsing rules for different tools.
The main focus here lies on being as flexible and consistent as
possible. One central utility-function handling the parsing makes
this stuff a lot more trivial.
2015-09-30 19:44:10 +01:00
Hiltjo Posthuma 53be158979 code-style: whitespace fixes 2015-09-30 19:44:10 +01:00
Michael Forney 1d28fbd6cf mv, cp: Preserve nanosecond timestamps
Otherwise, we run into problems in a typical autoconf-based build
system:

  - config.status is created at some point between two seconds.
  - config.status is run, generating Makefile by first writing to a file
    in /tmp, and then mv-ing it to Makefile.
  - If this mv happens before the beginning of the next second, Makefile
    will be created with the same tv_sec as config.status, but with
    tv_nsec = 0.
  - When make runs, it sees that Makefile is older than config.status,
    and re-runs config.status to generate Makefile.
2015-05-16 13:34:00 +01:00
FRIGN 0545d32ce9 Handle '-' consistently
In general, POSIX does not define /dev/std{in, out, err} because it
does not want to depend on the dev-filesystem.
For utilities, it thus introduced the '-'-keyword to denote standard
input (and output in some cases) and the programs have to deal with
it accordingly.

Sadly, the design of many tools doesn't allow strict shell-redirections
and many scripts don't even use this feature when possible.

Thus, we made the decision to implement it consistently across all
tools where it makes sense (namely those which read files).

Along the way, I spotted some behavioural bugs in libutil/crypt.c and
others where it was forgotten to fshut the files after use.
2015-05-16 13:34:00 +01:00
Hiltjo Posthuma 29649762b3 libutil/getlines: fix potential crash
linelen was uninitialized if for example:

$ > empty
$ sort ls.c empty
2015-05-08 16:38:06 +01:00
Hiltjo Posthuma 3f01706837 libutil/getlines: use known line length
also style: linelen = length of getline(), this was slightly confusing.
2015-05-07 18:18:36 +01:00
Hiltjo Posthuma adf9f47525 Revert "libutil/getlines: use known line length"
This reverts commit c69a70ddfd5c2b1514d9efd1c7a0fcbee5b0d2e7.
2015-05-07 18:18:36 +01:00
Hiltjo Posthuma bd67e7d92d libutil/getlines: use known line length 2015-05-07 18:18:35 +01:00
sin 2deb40290e Use off_t in humansize() as it is more descriptive and applicable 2015-04-29 16:42:49 +01:00
Dionysis Grigoropoulos 2d6cde1862 humansize: Use uintmax_t for size
du(1) breaks on 32-bit size_t for files greater than 4G.
2015-04-28 11:36:58 +01:00
FRIGN 5595af5742 Convert humansize() to accept a size_t instead of a double
General convention is to use size_t to store sizes of all kinds.
Internally, the function uses double anyway, but at least this
doesn't clobber up the API any more and there's a chance in the
future to make this function a bit cleaner and not use this dirty
static buffer hack any more.
2015-04-25 11:43:14 +01:00
sin 10b57e8a3d Actually print <space> to stream in putword() too 2015-04-21 18:00:47 +01:00
sin c914a2feca Update putword() to accept a FILE * 2015-04-21 18:00:47 +01:00
sin b9d60bee87 Move mkdirp() to libutil 2015-04-20 18:04:08 +01:00
sin 31af8555a7 Add LICENSE header to fshut.c 2015-04-20 18:04:08 +01:00
FRIGN f83d7bc647 Add SILENT flag to recurse()
recurse() is getting smarter every day. I expect it to pass the Turing
test in a few months.
Along the way, it was reported that "rm -f" on nonexistant files reports
their missing as an internal recurse()-error.
So recurse() knows when to shut up, I added the SILENT flag to fix all
these things.
2015-04-20 11:12:40 +01:00
FRIGN 7b2465c101 Add maxdepth to recurse()
This also makes more sense.
2015-04-20 11:12:40 +01:00
FRIGN e14d9412f8 Properly handle recursion in recurse()
The restructuring of recurse() in the last few weeks actually broke
the recursion-flags in different tools.
As a long-term goal, the recursor should have a field "maxdepth"
which should be "1" for the non-Rflag-case. "0" stands for unlimited.
2015-04-20 11:12:40 +01:00
sin bb2c0cff45 Fix function definition style for fshut.c 2015-04-05 09:16:50 +01:00
FRIGN 3eee8e1509 Remove DEBUG-define for eprintf.c
Prepend program name only when fmt doesn't begin with "usage".
2015-04-05 09:13:56 +01:00
FRIGN 0c470f5563 Remove fflush-check from fshut()
Basically, it's a conflict between POSIX and ISO C what do to when
input streams are passed to fflush().
POSIX mandates that the seeking-position should be synced, but ISO C
says it's undefined behaviour.
We love POSIX, but the standard-documents specify that in all conflict
cases, ISO C wins, so this breaks with EBADF on BSD's.

musl and glibc follow POSIX behaviour, which makes sense, but involves
numerous portability concerns.

To get around this, we just don't check fflush() and rely on the fact
that no implementation sets ferror on the file-stream in fflush if it
is an input stream, so every issue caught in fflush() is caught later
with ferror() and fclose().

Add a comment to fshut() because this stuff is so complicated, it
took us a day to figure out.
2015-04-05 09:13:56 +01:00
FRIGN 11e2d472bf Add *fshut() functions to properly flush file streams
This has been a known issue for a long time. Example:

printf "word" > /dev/full

wouldn't report there's not enough space on the device.
This is due to the fact that every libc has internal buffers
for stdout which store fragments of written data until they reach
a certain size or on some callback to flush them all at once to the
kernel.
You can force the libc to flush them with fflush(). In case flushing
fails, you can check the return value of fflush() and report an error.

However, previously, sbase didn't have such checks and without fflush(),
the libc silently flushes the buffers on exit without checking the errors.
No offense, but there's no way for the libc to report errors in the exit-
condition.

GNU coreutils solve this by having onexit-callbacks to handle the flushing
and report issues, but they have obvious deficiencies.
After long discussions on IRC, we came to the conclusion that checking the
return value of every io-function would be a bit too much, and having a
general-purpose fclose-wrapper would be the best way to go.

It turned out that fclose() alone is not enough to detect errors. The right
way to do it is to fflush() + check ferror on the fp and then to a fclose().
This is what fshut does and that's how it's done before each return.
The return value is obviously affected, reporting an error in case a flush
or close failed, but also when reading failed for some reason, the error-
state is caught.

the !!( ... + ...) construction is used to call all functions inside the
brackets and not "terminating" on the first.
We want errors to be reported, but there's no reason to stop flushing buffers
when one other file buffer has issues.
Obviously, functionales come before the flush and ret-logic comes after to
prevent early exits as well without reporting warnings if there are any.

One more advantage of fshut() is that it is even able to report errors
on obscure NFS-setups which the other coreutils are unable to detect,
because they only check the return-value of fflush() and fclose(),
not ferror() as well.
2015-04-05 09:13:56 +01:00
Hiltjo Posthuma 27f258dd34 libutil/getlines: style fix 2015-03-29 21:55:34 +02:00
Hiltjo Posthuma 9f97430143 libutil/getlines: fix crash with no lines
because b->lines and b->nlines would be 0 with no lines read.

reproduce: printf '' | sort or cols

bug was introduced by commit: 66a5ea722d
2015-03-29 21:48:49 +02:00
Hiltjo Posthuma a9bedca038 fix some signed/unsigned warnings and style fixes 2015-03-27 22:48:05 +01:00