Commit Graph

1044 Commits

Author SHA1 Message Date
FRIGN
dc3a2ffc4a Handle empty format string in printf(1) 2015-02-15 15:20:32 +01:00
FRIGN
5c8a9b375f Update escape-sequence information in tr.1 2015-02-15 14:56:49 +01:00
FRIGN
bafd41e1cf Add printf(1)
This is a particularly interesting program.
I managed to implement everything according to POSIX except how
octal escapes are specified in the standard, which is yet another
format compared to the one demanded for tr(1).
This not only confuses people, it also adds unnecessary cruft
for no real gain.
So in order to be able to use unescape() easily and for consistency,
I used our initial format \o[oo] instead of \0[ooo].

Marked as optional is UTF-8 support for %c in the POSIX specification.
Given how well-developed libutf has become, doing this here was more
or less trivial, putting us yet again ahead of the competition.
2015-02-15 14:46:58 +01:00
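
As an illustration of the \o[oo] format described in this commit, here is a minimal sketch of decoding one such escape, assuming it means one to three octal digits directly after the backslash. The name octal_escape() is hypothetical and is not the unescape() function from the repository.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not the sbase unescape(): decode one octal escape
 * of the form \o[oo] (one to three octal digits) at the start of s.
 * Stores the decoded byte in *out and returns the number of input
 * characters consumed, or 0 if s does not start with such an escape.
 */
size_t
octal_escape(const char *s, unsigned char *out)
{
	size_t i;
	unsigned int val = 0;

	if (s[0] != '\\')
		return 0;
	/* accumulate at most three octal digits */
	for (i = 1; i <= 3 && s[i] >= '0' && s[i] <= '7'; i++)
		val = val * 8 + (unsigned int)(s[i] - '0');
	if (i == 1)
		return 0; /* no digit followed the backslash */
	/* values above 0377 are truncated to one byte; a real tool may reject them */
	*out = (unsigned char)val;
	return i;
}
```

Under this reading, "\101" decodes to 'A', while "\0101" consumes only "\010" and leaves the final "1" as literal text, which is where the format differs from \0[ooo].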
FRIGN
d7a438b2f8 Add \e, \", \' and hex-escapes (\xH[H]) to unescape()
So the users control the program, and the program doesn't
control the users.
2015-02-14 22:55:37 +01:00
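
A rough sketch of what handling these escapes could look like, rewriting a string in place. The function unescape_sketch() is a hypothetical stand-in written for this illustration; the actual unescape() may differ in interface and behaviour.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of a hexadecimal digit; the caller has already checked isxdigit(). */
static int
hexval(unsigned char c)
{
	if (isdigit(c))
		return c - '0';
	return tolower(c) - 'a' + 10;
}

/*
 * Hypothetical stand-in for unescape(): rewrites s in place, handles only
 * \e, \", \' and \xH[H], copies every other escape through unchanged,
 * and returns the new length.
 */
size_t
unescape_sketch(char *s)
{
	char *r = s, *w = s; /* read and write positions; w never passes r */
	int v;

	while (*r) {
		if (r[0] != '\\') {
			*w++ = *r++;
		} else if (r[1] == 'e') {
			*w++ = '\033'; /* ESC */
			r += 2;
		} else if (r[1] == '"' || r[1] == '\'') {
			*w++ = r[1];
			r += 2;
		} else if (r[1] == 'x' && isxdigit((unsigned char)r[2])) {
			v = hexval((unsigned char)r[2]);
			r += 3;
			if (isxdigit((unsigned char)*r)) /* optional second digit */
				v = v * 16 + hexval((unsigned char)*r++);
			*w++ = (char)v;
		} else {
			*w++ = *r++; /* unknown escape: keep the backslash */
			if (*r)
				*w++ = *r++; /* and the character after it */
		}
	}
	*w = '\0';
	return (size_t)(w - s);
}
```

Because the output is never longer than the input, the rewrite can safely share one buffer.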
FRIGN
bf518929b9 Remove runetype and to*rune section from TODO 2015-02-14 21:56:19 +01:00
sin
3e1e54051e Add some missing includes 2015-02-14 20:15:01 +00:00
FRIGN
31572c8b0e Clean up #includes 2015-02-14 21:12:23 +01:00
sin
71fb259d02 Remove evil and unused assert.h from cols(1) 2015-02-14 18:26:45 +00:00
sin
d4830dba30 Fix fgetrune on systems where char is unsigned by default (ARM)
Store the result in an int and do the comparison.  This is always
safe without using strange constructs like "signed char".

wc(1) would go into an infinite loop when executed on an ARM
system.
2015-02-13 15:42:54 +00:00
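
The general pitfall behind this fix can be shown with plain getc(). This is a generic sketch, not the fgetrune() code; count_bytes() is a name made up for the example.

```c
#include <stdio.h>

/*
 * Count the bytes of a stream. The return value of getc() is kept in an
 * int: if it were stored in a plain char on a platform where char is
 * unsigned, EOF (-1) would become 255 and "c != EOF" would stay true
 * forever, which matches the infinite loop described above.
 */
long
count_bytes(FILE *fp)
{
	int c; /* int, not char */
	long n = 0;

	while ((c = getc(fp)) != EOF)
		n++;
	return n;
}
```

Keeping the value in an int also avoids the opposite problem on platforms where char is signed, where a 0xFF data byte could be mistaken for EOF.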
sin
9f1f8d5dd8 Update uuencode manpage and mark as complete in README 2015-02-13 11:38:52 +00:00
sin
f7b100ecdd Fix section order in uudecode.1 2015-02-13 11:37:17 +00:00
sin
bf4a39bb8c Clarify that -m is an extension to the POSIX specification for uudecode 2015-02-13 11:32:54 +00:00
sin
e8cd6947ec Update uudecode manpage and mark as complete 2015-02-13 11:30:55 +00:00
sin
c4e951366f Update README for uudecode and uuencode 2015-02-13 11:21:52 +00:00
sin
7768918d6a uudecode: Style fix 2015-02-13 11:20:23 +00:00
sin
d61add5dee uuencode: Style fix 2015-02-13 11:20:22 +00:00
Tai Chi Minh Ralph Eastwood
b64b51dc91 uudecode: fix flushing (again) through rewrite 2015-02-13 11:20:22 +00:00
Tai Chi Minh Ralph Eastwood
b907e8747d uuencode: refactor by removing extraneous #include 2015-02-13 11:20:22 +00:00
Tai Chi Minh Ralph Eastwood
6d2cbf7a3f uudecode: fix flushing in corner case 2015-02-13 11:20:21 +00:00
Tai Chi Minh Ralph Eastwood
ec02816d3e uuencode: add support for base64 and -o to stdout 2015-02-13 11:20:21 +00:00
FRIGN
93d0178852 uudecode(1) also needs the m-flag 2015-02-13 10:12:16 +01:00
FRIGN
066540b3c5 Update README with chgrp(1) and chown(1) status 2015-02-12 23:35:24 +01:00
FRIGN
8cac5a9ef5 Also add proper error-reporting to chown(1) 2015-02-12 21:57:57 +01:00
FRIGN
c965539b66 Add h-flag to chown(1) and chgrp(1) 2015-02-12 21:56:06 +01:00
FRIGN
ab9b240dc6 Fix warnings and update isalpharune() 2015-02-12 17:08:02 +01:00
FRIGN
4f6d696894 Also add "B"-type characters to isspacerune() 2015-02-12 16:48:22 +01:00
FRIGN
24d6cb90e7 Amend isspacerune() properly with WS and S Unicode characters 2015-02-12 16:41:57 +01:00
FRIGN
ce11e1f195 Add section for laces in lowerrune and upperrune and more ranges
This is a special third kind of structure found in Unicode, besides
singletons and ranges.
This dramatically reduces the number of explicit singletons in the
lookup tables.
Also, I changed the awk-script so that it can sort trivial
translations as well, breaking down the LOC even more.

The binary size of tr dropped from 67K to 51K.
2015-02-12 16:18:02 +01:00
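
To illustrate why laces save table space, here is a small sketch. It assumes that a lace is a run of code points in which uppercase and lowercase letters alternate, starting with an uppercase one; the table layout, the Rune typedef and lace_toupper() are all hypothetical and not taken from the generated tables.

```c
#include <stddef.h>

typedef unsigned int Rune; /* assumption: stand-in for the libutf type */

/* Hypothetical lace table: runs where upper (first) and lower alternate. */
static const Rune laces[][2] = {
	{ 0x0100, 0x012F }, /* Latin Extended-A: U+0100 upper, U+0101 lower, ... */
};

/* Map r to uppercase if it lies in a lace; return it unchanged otherwise. */
Rune
lace_toupper(Rune r)
{
	size_t i;

	for (i = 0; i < sizeof(laces) / sizeof(laces[0]); i++) {
		if (r >= laces[i][0] && r <= laces[i][1]) {
			/* an odd offset from the start of the lace means lowercase */
			return ((r - laces[i][0]) & 1) ? r - 1 : r;
		}
	}
	return r;
}
```

A single two-element entry covers all 48 code points of that run, where a singleton table would need one entry per letter pair.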
sin
113caaf677 Make getlines() less verbose
Thanks Roberto for the suggestion.
2015-02-12 14:34:07 +00:00
FRIGN
9565eef895 Refactor uppercase-inclusion in libutf
Previously, the to*rune function would have to juggle two
arrays, and it somehow evaded me that it is actually way simpler
to just add another entry to the arrays if needed.
Binary size goes slightly down, e.g. tr statically linked against
musl: 68072 -> 67688

Behind the scenes though the conversion should be a bit faster and,
more importantly, the scary case-conversion function is simplified
and easier to understand.

It also drops nearly half the LOC in upperrune.c and lowerrune.c.
2015-02-12 12:28:45 +01:00
FRIGN
73577f10a0 Scrap chartorunearr(), introducing utftorunestr()
Interface and function as proposed by cls.

The reasoning behind this function is that cls expressed his
interest to keep memory allocation out of libutf, which is a
very good motive.
This simplifies the function a lot and should also increase the
speed a bit, but the most important factor here is that there's
no malloc anywhere in libutf, making it a lot smaller and more
robust with a smaller attack-surface.

Look at the paste(1) and tr(1) changes for an idiomatic way to
allocate the right amount of space for the Rune-array.
2015-02-11 21:32:09 +01:00
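
The split of responsibilities described here (the caller allocates, the decoder only writes) might look roughly like the sketch below. count_runes(), decode_into() and runes_from() are hypothetical names and do not reproduce the real utftorunestr() interface; the decoder does no validation, so malformed input yields meaningless runes, although it never writes past the buffer sized by count_runes().

```c
#include <stddef.h>
#include <stdlib.h>

typedef unsigned int Rune; /* assumption: stand-in for the libutf type */

/* Number of runes decode_into() will emit: bytes that are not 10xxxxxx. */
static size_t
count_runes(const char *s)
{
	size_t n = 0;

	for (; *s; s++)
		if (((unsigned char)*s & 0xC0) != 0x80)
			n++;
	return n;
}

/*
 * Hypothetical decoder: writes into a buffer owned by the caller and
 * never allocates. Emits one rune per non-continuation byte and returns
 * the number of runes written, excluding the 0 terminator.
 */
static size_t
decode_into(const char *s, Rune *out)
{
	const unsigned char *p = (const unsigned char *)s;
	size_t n = 0;
	Rune r;
	int extra;

	while (*p) {
		if ((*p & 0xC0) == 0x80) { /* stray continuation byte: skip */
			p++;
			continue;
		}
		if (*p < 0x80) { r = *p; extra = 0; }
		else if (*p < 0xE0) { r = *p & 0x1F; extra = 1; }
		else if (*p < 0xF0) { r = *p & 0x0F; extra = 2; }
		else { r = *p & 0x07; extra = 3; }
		p++;
		while (extra-- > 0 && (*p & 0xC0) == 0x80)
			r = (r << 6) | (*p++ & 0x3F);
		out[n++] = r;
	}
	out[n] = 0;
	return n;
}

/* The caller sizes the array: one slot per rune plus a terminator. */
Rune *
runes_from(const char *s, size_t *len)
{
	Rune *r = malloc((count_runes(s) + 1) * sizeof(*r));

	if (r)
		*len = decode_into(s, r);
	return r;
}
```

Only runes_from(), which plays the role of the calling tool, touches malloc; the two library-style helpers stay allocation-free.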
FRIGN
1c462012e4 Rename variable Rune * p -> r in fgetrune() 2015-02-11 21:14:28 +01:00
FRIGN
7c578bf5b0 Scrap writerune(), introducing fputrune()
Interface and function as proposed by cls.
Code is also shorter, everything else analogous to fgetrune().
2015-02-11 20:58:00 +01:00
FRIGN
a5ae899a48 Scrap readrune(), introducing fgetrune()
Interface as proposed by cls, but internally rewritten after a few
considerations.
The code is much shorter and to the point, aligning itself with other
standard functions. It should also be much faster, which is not bad.
2015-02-11 20:16:49 +01:00
sin
4888bae455 uniq: Add standards section to manpage and update README 2015-02-11 15:55:58 +00:00
sin
2e5a02dd26 uniq is now complete, update README 2015-02-11 15:27:19 +00:00
Tai Chi Minh Ralph Eastwood
5c811577a2 uniq.1: add [input [output]] information 2015-02-11 15:26:59 +00:00
Tai Chi Minh Ralph Eastwood
70694a318c uniq: add support for writing output files 2015-02-11 15:26:57 +00:00
FRIGN
f9846a9a6b Split up is*rune() and to*rune() functions into individual source files
This optimizes the binary size for each tool that uses these functions.
Previously, if a program just used one single function, maybe even a
one-liner, it would statically compile in all lookup-tables, bloating
the binary by up to 20K.
All these changes are derived from a local libutf where I do the
primary changes. So I hope that I can merge these things into libutf
sooner or later, as discussed on the ml.
2015-02-11 15:48:18 +01:00
FRIGN
471cf8f5bc Use runetypebody.h-functions in wc(1) 2015-02-11 15:48:18 +01:00
sin
17dad35015 uniq: Fix typo in usage 2015-02-11 12:50:39 +00:00
sin
b2370171e6 uniq: Match usage with manpage 2015-02-11 12:21:31 +00:00
sin
5f06185b1b uniq: Fixup program usage and manpage
Remove -i as it is not required by POSIX.  We'll add it if we
hit scripts that require it.
2015-02-11 12:19:38 +00:00
FRIGN
5836ef72e3 Use runetypebody.h-functions in tr(1)
That's one small step for a man, one giant leap for mankind.
2015-02-11 13:12:27 +01:00
FRIGN
02ec321419 Add missing is*rune() functions and tolowerrune() and toupperrune()
This basically means that we now have an autogenerating typecheck
and case-conversion tool.
Don't freak out when you see the added LOC. Given we now have
an additional mapping to the uppercase-characters, some ranges got
"lost" and have to be written literally by the generating awk-script.

The runetypebody.h was generated by myself using my modified version
of mkrunetype.awk and I'll push the changed version as soon as this
has been discussed on the ml.
If you worry about speed, consider that bsearch is just the right
tool for this job and can even handle a long array like this.
2015-02-11 13:12:27 +01:00
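
For readers unfamiliar with the approach, a lookup of this kind can be sketched with the standard bsearch() over a sorted table of ranges. The three ranges, the Rune typedef and the names rangecmp() and in_ranges() are illustrative assumptions; the real runetypebody.h tables are generated and far longer.

```c
#include <stdlib.h>

typedef unsigned int Rune; /* assumption: stand-in for the libutf type */

/* Illustrative sorted, non-overlapping ranges; not the generated tables. */
static const Rune ranges[][2] = {
	{ 0x0041, 0x005A }, /* A-Z */
	{ 0x0061, 0x007A }, /* a-z */
	{ 0x00C0, 0x00D6 }, /* U+00C0 to U+00D6, accented Latin capitals */
};

/* bsearch comparator: key is a Rune, elem is a { lo, hi } pair. */
static int
rangecmp(const void *key, const void *elem)
{
	Rune r = *(const Rune *)key;
	const Rune *range = elem;

	if (r < range[0])
		return -1;
	return r > range[1]; /* 1 if above the range, 0 if inside it */
}

/* Return nonzero if r falls inside any of the ranges. */
int
in_ranges(Rune r)
{
	return bsearch(&r, ranges, sizeof(ranges) / sizeof(ranges[0]),
	               sizeof(ranges[0]), rangecmp) != NULL;
}
```

Since the number of comparisons grows only logarithmically with the table size, even a table with thousands of entries needs about a dozen comparisons per lookup.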
sin
26bc079ecc uniq: Style fix 2015-02-11 12:02:33 +00:00
sin
6d4a7989cd readlink: Use eprintf() to report errors 2015-02-11 11:58:13 +00:00
sin
a29d31e94b Update readlink in README 2015-02-11 11:54:58 +00:00
sin
3f3e15b314 readlink: Use strlcat() instead of strncat() 2015-02-11 11:51:57 +00:00
sin
aed987a9af Update README 2015-02-11 10:57:00 +00:00