IEEE Std 1003.1-2008 mandates that -F str be treated the same as -v
FS=str. For a null string, this was not the case. Since awk(1) documents
that a null string for FS has a specific behavior, make -F '' behave
consistently with -v FS="".
PR:
upstream issue: https://github.com/onetrueawk/awk/issues/127
Sponsored by: Netflix
If awk prints an error message while compile_time is still set to
ERROR_PRINTING, don't try to print the context since there is none.
This can happen due to a problem with, e.g., unknown command line
options.
When awk reaches EOF parsing the program file, curpfile is incremented.
However, cursource() uses curpfile without checking it against npfile
which can cause an out of bounds access of pfile[] if there is a syntax
error at the end of the program file.
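A minimal sketch of the kind of bounds check described above. The function name, its parameters, and the choice to fall back to the last file name are assumptions made for illustration, not the actual awk source:

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for cursource(): pfile holds npfile program
 * file names and curpfile is the index of the file being parsed.
 */
static const char *
cursource_sketch(char *const *pfile, size_t npfile, size_t curpfile)
{
	if (npfile == 0)
		return NULL;		/* no program files were given */
	if (curpfile >= npfile)
		curpfile = npfile - 1;	/* EOF already advanced the index */
	return pfile[curpfile];
}
```

With two program files, an index of 2 (one past the end) yields the name of the second file instead of reading beyond the array.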
* Cast to uschar when storing a char in an int that will be used as an index.
Fixes a heap underflow when the input char has the high bit set and
FS is a regex.
* Add regress test for underflow when RS is a regex and input is 8-bit.
* In closeall(), skip stdin and flush std{err,out} instead of closing.
Otherwise awk could fclose(stdin) twice (it may appear more than once)
and closing stderr means awk cannot report errors with other streams.
For example, "awk 'BEGIN { getline < "-" }' < /dev/null" will call
fclose(stdin) twice, with undefined results.
* If closefile() is called on std{in,out,err}, freopen() /dev/null instead.
Otherwise, awk will continue trying to perform I/O on a closed stdio
stream, the behavior of which is undefined.
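The closefile() behavior described in the last item can be sketched roughly as follows. The function name and return convention are hypothetical, and the sketch assumes a system that provides /dev/null:

```c
#include <stdio.h>

/*
 * Hypothetical sketch: ordinary streams are closed, while the three
 * standard streams are re-pointed at /dev/null so that later I/O on
 * them stays well defined. Returns 0 on success, EOF on failure.
 */
static int
closefile_sketch(FILE *fp)
{
	const char *mode;

	if (fp == stdin)
		mode = "r";
	else if (fp == stdout || fp == stderr)
		mode = "w";
	else
		return fclose(fp);

	/* freopen() flushes and closes the old stream before reopening it. */
	return freopen("/dev/null", mode, fp) == NULL ? EOF : 0;
}
```

After closefile_sketch(stdin), further reads from stdin simply report end of file rather than touching a closed stream.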
Commit 0d8778bbbb reintroduced a regression that was fixed in commit
97a4b7ed21. The length of SUBSEP needs to be rechecked after calling
execute(), in case SUBSEP itself has been changed.
Co-authored-by: Tim van der Molen <tim@kariliq.nl>
The optimization in commit 1d6ddfd9c0 reintroduced the regression that
was fixed in commit e26237434f.
Co-authored-by: Tim van der Molen <tim@kariliq.nl>
POSIX specifies a dprintf function that operates on an fd instead of
a stdio stream. Using upper case for macros is more idiomatic too.
We no longer need to use an extra set of parentheses for debugging
printf statements.
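A sketch of what an upper-case variadic debug macro can look like. The dbg flag and the trace_value() helper are assumptions for illustration; the real macro definition may differ in detail:

```c
#include <stdio.h>

static int dbg;	/* debug flag; stands in for awk's own debug setting */

/* Variadic macro: call as DPRINTF("x=%d\n", x), no doubled parentheses. */
#define DPRINTF(...)	do { if (dbg) printf(__VA_ARGS__); } while (0)

/* Hypothetical caller showing the macro in use. */
static int
trace_value(int x)
{
	DPRINTF("trace_value: x=%d\n", x);
	return x;
}
```

Because the macro takes its arguments directly, the name no longer collides with the POSIX dprintf() function and call sites read like ordinary printf() calls.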
The errcheck() function treats an errno of ERANGE or EDOM as something
to report, so set errno to zero before invoking the function being
checked; otherwise a stale errno value from an earlier call results in
a false positive. This could happen simply because input line fields
looked enough like floating-point input to trigger ERANGE.
Reported by Jordan Geoghegan, fix from Philip Guenther.
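The pattern described above can be sketched as follows. errcheck_sketch() is a simplified stand-in for errcheck(), and twice(), checked_call(), and stale_errno_demo() are hypothetical helpers invented for this illustration:

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-in for errcheck(): report EDOM/ERANGE and return 1. */
static double
errcheck_sketch(double x, const char *name)
{
	if (errno == EDOM || errno == ERANGE) {
		fprintf(stderr, "%s: errno %d reported\n", name, errno);
		errno = 0;
		x = 1;
	}
	return x;
}

/* Hypothetical function under check; it never sets errno itself. */
static double
twice(double x)
{
	return 2 * x;
}

/* Clear errno first so only errors raised by fn itself are reported. */
static double
checked_call(double (*fn)(double), double arg, const char *name)
{
	errno = 0;
	return errcheck_sketch(fn(arg), name);
}

/* An out-of-range conversion leaves ERANGE behind, then twice() runs. */
static double
stale_errno_demo(void)
{
	(void)strtod("1e999", NULL);	/* overflows, sets errno to ERANGE */
	return checked_call(twice, 2.0, "twice");
}
```

Without the errno = 0 line in checked_call(), the ERANGE left behind by strtod() would be blamed on twice() and its result replaced.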
* LC_NUMERIC radix issue.
According to https://pubs.opengroup.org/onlinepubs/7990989775/xcu/awk.html
the period character is the radix character recognized in processing awk
programs. Print the period character on output as well, since this is
what other awk implementations do, and it makes sense from an
interoperability point of view.
* print "T.builtin" in the error message
* Fix backslash continuation line handling.
* Keep track of RS processing so we apply the regex properly only once
per record.
* - enhance fpe handler to print the error type
- cleanup argument parsing
- dynamically allocate program filename array
* bison now uses enums instead of #defines; make it work with that.
* We need to use either the enums or the defines but not both. This
is because bison -y will create both enums and #defines, while bison
without -y produces only the enums, and byacc produces just #defines.
* fix indentation
* Set the tokentype when we have a match in the scan, and reset it later
when we decide that the match was bad. Fixes nbyacc.
* - for portability, don't use pattern rules
- try to move both flavors of generated names for portability
* Amend tests for the new error messages