After setting up qemu and testing od(1) in a Big Endian environment,
I found out that the conditional in the printing function was not
Instead, it's supposed to be way simpler. While at it, we don't need
HOST_BIG_ENDIAN any more. Just set big_endian properly in main()
and be done with it.
The -e and -E flags allow the user to override the host endianness
and force od(1) to handle input according to a little (-e) or big (-E)
The previous handling was broken as bitshifts alone are already
Looking at the old code, it became clear that the desired
functionality with the t-flag could not be added unless the
underlying data-structures were reworked.
Thus the only way to be successful was to rewrite the whole thing.
od(1) allows giving arbitrarily many type-specs per call, both via
-t x1o2... and -t x1 -t o2 and intermixed.
This fortunately is easy to parse.
Now, to be flexible, it should not only support types of integral
length. Erroring out like this is inacceptable:
$ echo -n "shrek"| od -t u3
od: invalid type string ‘u3’;
this system doesn't provide a 3-byte integral type
Thus, this new od(1) just collects the bytes until shortly before
printing, when the numbers are written into a long long with the
The bytes per line are just the lcm of all given type-lengths and >= 16.
They are equal to 16 for all types that are possible to print using
the old od(1)'s.
Endianness is of course also supported, needs some testing though,
especially on Big Endian systems.
The one specified by mdoc is hard to read for non-native
speakers from countries which read the date day-first (like
Germany, Greece, North-Korea, Swamp,...).
This is also consistent with how we generally specify dates
The logic is simple, it's just a pain in the ass to fill the
Some lines had to be commented out, as glibc/musl apparently
have not fully implemented the mandatory variables for the
2013 corrigendum of POSIX 2008.
Also added a manpage and the necessary entries in README.
I also removed it from the TODO.
If this flag is not given, od(1) automatically replaces duplicate
adjacent lines with an '*' for each reoccurence.
If this flag is set, thus, no such filtering occurs.
In this case this would mean having to somehow keep the last printed
line in some backbuffer, building the next line and then doing the
necessary comparisons. This basically means that we duplicate the
functionality provided with uniq(1).
So instead of
$ od -t a > dump
you'd rather do
$ od -t a | uniq -f 1 -c > dump
Skipping the first field is necessary, as the addresses obviously differ.
Now, I was thinking hard why this flag even exists. If POSIX mandated
to add the address before the asterisk, so we know the offset of duplicate
occurrences, this would make sense. However, this is not the case.
Using uniq(1) also gives nicer output:
~ $ echo "111111111111111111111111111111111111111111111111" | od -t a -v | uniq -f 1 -c
3 0000000 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 0000060 nl
in comparison to
$ echo "111111111111111111111111111111111111111111111111" | od -t a
0000000 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Before working on od(1), I didn't even know it would filter out
duplicate adjacent lines like that. This is also a matter of
Concluding, the v-flag is implicitly set and users urged to just
use the existing tools provided by the system.
I don't think we would break scripts either. Firstly, it's rather
unlikely to have duplicate lines exactly matching the line-length of
od(1). Secondly, even if a script did that specifically, in the worst
case there would be a counting error or something.
Given od(1) is mostly used interactively, we can safely assume this
feature is for the benefit of the users.
Ditch this legacy POSIX crap!
Please enter the commit message for your changes. Lines starting
This is a utility function to allow easy parsing of file or other
offsets, automatically taking in regard suffixes, proper bases and
so on, for instance used in split(1) -b or od -j, -N(1).
Of course, POSIX is very arbitrary when it comes to defining the
parsing rules for different tools.
The main focus here lies on being as flexible and consistent as
possible. One central utility-function handling the parsing makes
this stuff a lot more trivial.