* sprinkle const, static
* account for lineno in unput
* Add an EMPTY string that is used when a non-const empty string is needed.
* make inputFS static and dynamically allocated
* Simplify and in the process avoid -Wwritable-strings
* make fs const to avoid -Wwritable-strings
An input/output error indicates a fatal condition, even if it
occurs when closing a file. Awk should not return success on I/O
error, but treat I/O errors as it already treats write errors.
Test case:
$ (trap '' PIPE; awk 'BEGIN { print "hi"; }'; echo "E $?" >&2) | :
awk: i/o error occurred while closing /dev/stdout
source line number 1
E 2
The test case pipes a line into a dummy command that reads no
input, with SIGPIPE ignored so we rely on awk's own I/O checking.
No write error is detected, because the pipe is buffered; the
broken pipe is only detected as an I/O error on closing stdout.
Before this commit, "E 0" was printed (indicating status 0/success)
because an I/O error merely produced a warning. A shell script
was unable to detect the I/O error using the exit status.
On case-insensitive file systems (i.e.: macOS), T.concat and
t.concat are the same file, so these conflicted. This commit
renames T.concat to avoid the conflict.
Further simplify printf % parsing by eating the length specifiers
during the copy phase, and substitute 'j' when finalizing the format.
Add some more tests for this.
* More cleanups:
- sprinkle const
- add a macro (setptr) that cheats const to temporarily NUL terminate strings
remove casts from allocations
- use strdup instead of strlen+strcpy
- use x = malloc(sizeof(*x)) instead of x = malloc(sizeof(type of *x)))
- add -Wcast-qual (and casts through unitptr_t in the two macros we
cheat (xfree, setptr)).
* More cleanups:
- add const
- use bounded sscanf
- use snprintf instead of sprintf
* More cleanup:
- use snprintf/strlcat instead of sprintf/strcat
- use %j instead of %l since we are casting to intmax_t/uintmax_t
* Merge the 3 copies of the code that evaluated array strings with separators
and convert them to keep track of lengths and use memcpy instead of strcat.
* Fix formats for 32 bit machines broken by previous commit.
We use intmax_t to provide maximum range for both 32 and 64 bit machines.
* More cleanups:
- sprinkle const
- add a macro (setptr) that cheats const to temporarily NUL terminate strings
remove casts from allocations
- use strdup instead of strlen+strcpy
- use x = malloc(sizeof(*x)) instead of x = malloc(sizeof(type of *x)))
- add -Wcast-qual (and casts through unitptr_t in the two macros we
cheat (xfree, setptr)).
* More cleanups:
- add const
- use bounded sscanf
- use snprintf instead of sprintf
* More cleanup:
- use snprintf/strlcat instead of sprintf/strcat
- use %j instead of %l since we are casting to intmax_t/uintmax_t
* Merge the 3 copies of the code that evaluated array strings with separators
and convert them to keep track of lengths and use memcpy instead of strcat.
- sprinkle const
- add a macro (setptr) that cheats const to temporarily NUL terminate strings
remove casts from allocations
- use strdup instead of strlen+strcpy
- use x = malloc(sizeof(*x)) instead of x = malloc(sizeof(type of *x)))
- add -Wcast-qual (and casts through unitptr_t in the two macros we
cheat (xfree, setptr)).
As it is never dereferenced in the n == -1 case it shouldn't cause any
problems. However, UBSAN complains about this, so it is required to run
the tests when compiling with -fsanitize=undefined.
When matching a caret, the expression `f->gototab[s][c] = f->curstat;` in
cgoto() will index the 2D-array gototab with [s][261]. However, gototab
is declared as being of size [NSTATES][NCHARS], so [32][259]. Therefore,
this assignment will write to the state for character 0x1.
I'm not sure how to create a regression test for this, but increasing the
array size to HAT+1 values fixes the error and the tests still pass.
I found this issue while running awk on a CHERI system with sub-object
protection enabled. On x86, this can be reproduced by compiling awk
with -fsanitize=undefined.
Support POSIX-specified C-style escape sequences "\a" (alarm)
and "\v" (vertical tab) in command line arguments and regular
expressions, further to the support for them in strings added on
Apr 9, 1989. These now no longer match as literal "a" and "v"
characters (as they don't on gawk and mawk).
IOW, lex.c already supported these (lines 390-391 as of 4e343460);
the support needed to be added to b.c and tran.c.
Relevant POSIX reference:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html#tag_20_06_13_04