Commit Graph

221 Commits (master)

Author SHA1 Message Date
ozan yigit 075624a72a minor edit, maintainer email. 2022-01-24 18:59:18 -05:00
ozan yigit b9c01f5122 Merge branch 'mpinjr-fix-ioerrors' into staging 2021-12-26 13:44:58 -05:00
Miguel Pineiro Jr 7816d47dc8 When closing, don't flush input streams
They don't need it.
2021-12-16 20:07:10 -05:00
Miguel Pineiro Jr 99f6a43296 Fix error handling in closefile and closeall
printstat and awkprintf are very clear: print statement errors
are fatal.

In Jan 2020 [1], to prevent fatal print errors from masquerading
as fclose warnings, every WARNING in closefile and closeall became
FATAL. This broke awk's close and getline functions.

close no longer returns if there's an error, unless the stream
doesn't exist.

getline read errors still return -1, but they are no longer
ignorable. Eventually, one of the closing functions will inspect the
stream with ferror and call FATAL.

In Jul 2020 [2], fatal stdout write errors which had been detectable by
closefile for a few months became invisible, a consequence of switching
standard streams from fclose (which reports flush errors) to freopen
(which ignores them). The Jan 2020 changes which broke getline and
close were themselves partially broken.

The solution is to finish printing before closing. That is, flush
and check ferror on every stream opened for writing before calling fclose,
pclose, or freopen. A failure to write print statement data is
fatal. A failure to close a flushed stream is a warning. They must
be handled separately.

Every redirected print statement is finished in printstat or awkprintf.

The same is not true of unredirected print statements. To finish
these, stdout must be flushed at some point after the final such
statement. Any problem with that flush is fatal.

Though only stdout needs it, let's defensively finish every stream
opened for writing, so this bug won't recur if someone changes how
redirected streams are flushed.

Write errors on stderr by the implementation are never fatal. When
closing, we only warn of them. Write errors from an application
attempting a redirected print to /dev/stderr are as immediately fatal
as every other redirected print statement.
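
The finish-then-close rule described in this message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `finish_and_close`, the `writing` flag, and the `close_result` codes are hypothetical names (the real code calls FATAL and WARNING), and only the fclose case is shown, not pclose or freopen.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical result codes; the real code calls FATAL and WARNING instead. */
enum close_result { CLOSE_OK, CLOSE_WARN, CLOSE_FATAL };

/*
 * Finish printing, then close. `writing` says whether fp was opened for
 * output; input streams are not flushed. A flush or write error means
 * print data was lost (fatal). A failure of fclose on a stream that was
 * already flushed cleanly is only worth a warning.
 */
static enum close_result finish_and_close(FILE *fp, int writing)
{
	assert(fp != NULL);
	if (writing) {
		int flush_failed = (fflush(fp) == EOF);

		if (flush_failed || ferror(fp)) {
			fclose(fp);	/* best effort; the write error wins */
			return CLOSE_FATAL;
		}
	}
	return fclose(fp) == EOF ? CLOSE_WARN : CLOSE_OK;
}
```

Keeping the two checks separate is the point: the write failure is detected before fclose has a chance to report it as a mere close failure.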

[1] fed1a562c3
[2] b82b649aa6
2021-12-08 23:06:02 -05:00
Miguel Pineiro Jr 1d780ac4f8 Delete leading spaces surrounding closefile
Aesthetics and convention aside, they confuse git diff into
misidentifying closefile as filename.
2021-12-08 22:07:49 -05:00
ozan yigit 01749f04cf Revert "resolve parsing of a slash character within a cclass "/[/]/" without escape"
This reverts commit d91c473c7c.
2021-11-25 13:29:49 -05:00
ozan yigit cfe6b6b99d Revert "version and FIXES updated."
This reverts commit 52fb5d07d3.
2021-11-25 13:29:16 -05:00
ozan yigit 52fb5d07d3 version and FIXES updated. 2021-11-23 16:10:27 -05:00
ozan yigit d91c473c7c resolve parsing of a slash character within a cclass "/[/]/" without escape 2021-11-23 16:09:51 -05:00
ozan yigit c50ef66d11 updated 2021-11-03 22:44:35 -04:00
ozan yigit a4ca5ea32f updated version date 2021-11-03 22:41:23 -04:00
ozan yigit 14c3fe42d2 updated FIXES, added getline corruption tests 2021-11-03 22:34:20 -04:00
Todd C. Miller 1debe1993f awkgetline: do not access uninitialized data on EOF
getrec() returns 0 on EOF and leaves the contents of buf unchanged.
Fixes #133.
2021-11-01 12:03:32 -06:00
ozan yigit 275a80ff33 Heap buffer overflow from PR #83 fixed in #121 2021-10-12 00:06:51 -04:00
Anonymous AWK fan 40f0527d5d Fix 2021-10-09 19:23:05 +01:00
ozan yigit f9affa922c fix -F "str" vs -v FS="str" when str is null 2021-07-27 14:00:36 -04:00
ozan yigit dfd28d2e93 Merge branch 'fs-inconsistency' of into bsdimp-fs-inconsistency 2021-07-26 21:50:22 -04:00
ozan yigit aa8731ea81 PR #112, #116, #117 2021-07-25 14:37:03 -04:00
ozan yigit 3913329120 Merge branch 'fix-readrec' of into staging 2021-07-24 15:10:25 -04:00
ozan yigit 30fb6ef0da Merge branch 'fix-RS' of into staging 2021-07-24 15:06:21 -04:00
Warner Losh 45dab2a7e0 awk: Make -F '' and -v FS="" behave the same
IEEE Std 1003.1-2008 mandates that -F str be treated the same as -v
FS=str. For a null string, this was not the case. Since awk(1) documents
that a null string for FS has a specific behavior, make -F '' behave
consistently with -v FS="".

upstream issue:
Sponsored by:		Netflix
2021-07-20 08:10:50 -06:00
Miguel Pineiro Jr 92f9e8a9be Fix readrec's definition of a record
I botched readrec's definition of a record, when I implemented
RS regular expression support. This is the relevant hunk from the
old diff:

-	return c == EOF && rr == buf ? 0 : 1;
+	isrec = *buf || !feof(inf);
+	   dprintf( ("readrec saw <%s>, returns %d\n", buf, isrec) );
+	return isrec;

Problem #1

Unlike testing with EOF, `*buf || !feof(inf)` is blind to stdio
errors. This can cause an infinite loop in which each iteration
fabricates an empty record.

The following demonstration uses standard terminal access control
policy to produce a persistent error condition. Note that the "i/o
error" message does not come from readrec(). It's produced much later
by closeall() at shutdown.

$ trap '' SIGTTIN && awk 'END {print NR}' &
[1] 33517
$ # After fg, type ^D
$ fg
trap '' SIGTTIN && awk 'END {print NR}'
awk: i/o error occurred on /dev/stdin
 input record number 13847376, file
 source line number 1

Each time awk tries to read the terminal from the background,
while ignoring SIGTTIN, the read fails with EIO, getc returns EOF,
the stream's end-of-file indicator remains clear, and `!feof`
erroneously promotes the empty buffer to an empty record.  So long
as the error persists, the stream's position does not advance and
end-of-file is never set.

Problem #2

When RS is a regex, `*buf || !feof(inf)` can't see an empty record's
terminator at the end of a stream.

$ echo a | awk 1 RS='a\n'

That pipeline should have found one empty record and printed a blank
line, but `*buf || !feof(inf)` considers reaching the end of the
stream the conclusion of a fruitless search. That's only correct when
the terminator is a single character, because a regex RS search can
set the end-of-file marker even when it succeeds.

The Fix

`isrec` must be 0 **iff** no record is found. The correct definition
of "no record" is a failure to find a record terminator and a
failure to find any data (possibly from a final, unterminated
record). Conceptually, for any RS:

isrec = (noTERM && noDATA) ? 0 : 1

noDATA is an expression that's true if `buf` is empty, false otherwise.

When RS is null or a single character, noTERM is an expression
that is true when the sought-after character is not found, false
otherwise. Since the search for a single character can only end with
that character or EOF, noTERM is `c == EOF`.

isrec = (c == EOF && rr == buf) ? 0 : 1
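
As an illustration of the single-character case, here is a minimal sketch of a record reader built around that return expression. It is not readrec itself: `read_record` and its parameters are hypothetical, and over-long records are simply truncated to keep the example short.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read one record terminated by the single character `sep` into buf.
 * Returns 0 only when no terminator was found (c == EOF) and no data
 * was read (rr == buf); otherwise returns 1.
 */
static int read_record(FILE *inf, int sep, char *buf, size_t bufsize)
{
	char *rr = buf;
	int c;

	assert(inf != NULL && buf != NULL && bufsize > 0);
	while ((c = getc(inf)) != EOF && c != sep) {
		if ((size_t)(rr - buf) < bufsize - 1)
			*rr++ = (char)c;	/* silently truncate long records */
	}
	*rr = '\0';
	return (c == EOF && rr == buf) ? 0 : 1;
}
```

Because getc returns EOF for read errors as well as for end-of-file, a persistent error ends the loop here instead of fabricating empty records.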

When RS is a regular expression, noTERM is an expression that is
true if a match for RS is not found, false otherwise. This is simply
the inverse of the result of the function that conducts the search:

isrec = (found == 0 && *buf == '\0') ? 0 : 1
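
The multi-character case can be sketched in the same spirit. Loud assumptions: a literal string terminator stands in for the regular expression, the input is an in-memory string rather than a stdio stream, and `next_record` is a hypothetical name. Only the final return expression mirrors the definition above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the next record from *cursor into buf, using the literal string
 * `term` as a stand-in for a regular expression RS. Advances *cursor
 * past the record and its terminator. Returns 0 iff neither a
 * terminator nor any data was found.
 */
static int next_record(const char **cursor, const char *term,
		       char *buf, size_t bufsize)
{
	const char *start, *hit;
	size_t len;
	int found;

	assert(bufsize > 0 && term[0] != '\0');
	start = *cursor;
	hit = strstr(start, term);
	found = (hit != NULL);
	len = found ? (size_t)(hit - start) : strlen(start);
	*cursor = found ? hit + strlen(term) : start + strlen(start);
	if (len > bufsize - 1)
		len = bufsize - 1;	/* truncated copy; cursor still advances */
	memcpy(buf, start, len);
	buf[len] = '\0';
	return (found == 0 && *buf == '\0') ? 0 : 1;
}
```

With the input "a\n" and the terminator "a\n", the first call reports one empty record and the second reports none, which matches the behavior the `echo a | awk 1 RS='a\n'` example expects.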
2021-04-23 20:08:58 -04:00
Miguel Pineiro Jr feaf62d159 Fix regular expression RS ^-anchoring
RS ^-anchoring needs to know if it's reading the first record of a file.
Unfortunately, innew, the flag that the main i/o loop uses to track
this, didn't make it from NetBSD unscathed. This commit restores the
last of the wayward lines.

Without this fix, when reading the first record of an input file named
on the command line, the regular expression machinery will be
misconfigured, precluding a successful match.

Relevant commits:
1. 643a5a3dad (Initial import)
2. ffee7780fe (Restoring innew)
2021-04-16 20:31:36 -04:00
Todd C. Miller d54b703cae Fix size computation in replace_repeat() for special_case REPEAT_WITH_Q.
This resulted in the NUL terminator being written to the end of the
buffer which was not the same as the end of the string.  That in
turn caused garbage bytes from malloc() to be processed.  Also
change the NUL termination to be less error prone by writing the
NUL immediately after the last byte copied.
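
The termination style described here can be illustrated with a small sketch. This is not replace_repeat: `build_repeated` and its parameters are hypothetical, chosen only to show a write pointer that always marks the end of the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Build prefix + middle repeated `times` times + suffix in `out`.
 * The write pointer `p` always marks the end of the string, so the NUL
 * goes immediately after the last byte copied, not at out[outsize - 1].
 * Returns the string length, or (size_t)-1 if out is too small.
 */
static size_t build_repeated(char *out, size_t outsize, const char *prefix,
			     const char *middle, int times, const char *suffix)
{
	size_t plen = strlen(prefix), mlen = strlen(middle), slen = strlen(suffix);
	size_t need;
	char *p = out;
	int i;

	assert(times >= 0);
	need = plen + mlen * (size_t)times + slen + 1;	/* +1 for the NUL */
	if (need > outsize)
		return (size_t)-1;
	memcpy(p, prefix, plen);
	p += plen;
	for (i = 0; i < times; i++) {
		memcpy(p, middle, mlen);
		p += mlen;
	}
	memcpy(p, suffix, slen);
	p += slen;
	*p = '\0';
	return (size_t)(p - out);
}
```

If the size estimate is ever larger than what is actually copied, the string still ends where the data ends, so no uninitialized bytes become part of it.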

Reproducible with the following under valgrind:
echo '#!/usr/bin/awk' | awk \
'/^#! ?\/.*\/[a-z]{0,2}awk/ {sub(/^#! ?\/.*\/[a-z]{0,2}awk/,"#! awk"); print}'
2021-03-02 12:58:50 -07:00
Arnold D. Robbins c0f4e97e45 Fix compiling with g++. 2021-02-15 20:33:15 +02:00
ozan s. yigit 178f660b5a Change T.errmsg print to file fail test.
We cannot have a test that destroys eg. /etc/passwd if someone
runs it as root.
2021-01-10 15:24:37 -05:00
ozan s. yigit 1fd5fa38cc Fix a decision bug with trailing stuff in lib.c:is_valid_number
after dec 18 changes. updated FIXES, adjusted version date.
2021-01-06 18:37:48 -05:00
ozan s. yigit 7d1848cfa6 Merge branch 'staging' for 2020-12-25 16:55:02 -05:00
ozan s. yigit fdc0388333 updated: new maintainer 2020-12-25 16:53:55 -05:00
Arnold D. Robbins 8909e00b57 Inf and NaN values fixed and printing improved. "This time for sure!" 2020-12-18 11:57:48 +02:00
Arnold D. Robbins 982a574e32 Update FIXES and version. 2020-12-15 14:49:18 +02:00
Michael Forney 38e525fb7b
Include <strings.h> for strcasecmp (#99)
Though some implementations include this header indirectly through
string.h by default, the POSIX header that declares strcasecmp is
<strings.h>.
2020-12-15 14:46:30 +02:00
Arnold D. Robbins 6535bd6c35 Update FIXES and version in main.c. 2020-12-08 09:20:58 +02:00
Arnold Robbins cc9e9b68d1
Rework floating point conversions. (#98) 2020-12-08 08:05:22 +02:00
Arnold D. Robbins e508d2861c Update version and FIXES. 2020-12-03 19:33:11 +02:00
Todd C. Miller feb247a852
Don't print extra newlines on error before awk starts parsing. (#97)
If awk prints an error message while compile_time is still set
to ERROR_PRINTING, don't try to print the context since there is
none.  This can happen due to a problem with, e.g., unknown command
line options.
2020-12-03 19:30:36 +02:00
Arnold D. Robbins a2a41a8e35 Add .TF macro to man page. Closes Issue #96. 2020-11-24 19:14:26 +02:00
Arnold D. Robbins 3b42cfaf73 Make it compile with g++. 2020-10-13 20:52:43 +03:00
Arnold D. Robbins 9804285af0 Additional fixes for DJGPP. 2020-08-16 18:48:05 +03:00
Arnold D. Robbins 9c63cb6ccd Update FIXES and version in main.c. 2020-08-07 13:15:17 +03:00
Chris b785141019
printf: The argument p shall be a pointer to void. (#93) 2020-08-07 13:10:20 +03:00
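
For readers unfamiliar with the rule in this commit title: the %p conversion expects an argument of type pointer to void, so other pointer types are cast explicitly. A minimal sketch, with the hypothetical helper `format_address`:

```c
#include <assert.h>
#include <stdio.h>

/* Format the address held in p; %p requires an argument of type void *. */
static int format_address(char *out, size_t outsize, int *p)
{
	assert(out != NULL && outsize > 0);
	return snprintf(out, outsize, "%p", (void *)p);
}
```

The printed form of the address is implementation-defined, so only the cast is of interest here.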
Arnold D. Robbins 1b3984634f Fix Issue #92; see FIXES. 2020-08-04 10:02:26 +03:00
Arnold D. Robbins 9b80a7c137 Update version and FIXES. 2020-07-30 17:15:58 +03:00
Arnold D. Robbins 07f0438423 Move exclusively to bison as parser generator. 2020-07-30 17:12:45 +03:00
Todd C. Miller 453ce8642b
Avoid accessing pfile[] out of bounds on syntax error at EOF. (#90)
When awk reaches EOF parsing the program file, curpfile is incremented.
However, cursource() uses curpfile without checking it against npfile
which can cause an out of bounds access of pfile[] if there is a syntax
error at the end of the program file.
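
The kind of bounds check this message calls for might look like the sketch below. The `progfiles` struct and `current_source` are hypothetical stand-ins; only the names pfile, npfile, and curpfile come from the message, and the actual fix may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for awk's program-file bookkeeping. */
struct progfiles {
	const char **pfile;	/* program file names */
	size_t npfile;		/* number of entries in pfile */
	size_t curpfile;	/* file being parsed; equals npfile after EOF */
};

/* Return the current source name, or NULL when there is none. */
static const char *current_source(const struct progfiles *pf)
{
	assert(pf != NULL);
	if (pf->curpfile < pf->npfile)
		return pf->pfile[pf->curpfile];
	return NULL;	/* past the last file: do not index pfile[] */
}
```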
2020-07-29 21:31:29 +03:00
Tim van der Molen e22bb7c625
Fix the T.errmsg test (#91)
Co-authored-by: Tim van der Molen <>
2020-07-29 21:29:46 +03:00
Todd C. Miller 22ee26b925
Cast to uschar when storing a char in an int that will be used as an index (#88)
* Cast to uschar when storing a char in an int that will be used as an index.
Fixes a heap underflow when the input char has the high bit set and
FS is a regex.

* Add regress test for underflow when RS is a regex and input is 8-bit.
2020-07-29 21:27:45 +03:00
Todd C. Miller b82b649aa6
Avoid using stdio streams after they have been closed. (#89)
* In closeall(), skip stdin and flush std{err,out} instead of closing.
Otherwise awk could fclose(stdin) twice (it may appear more than once)
and closing stderr means awk cannot report errors with other streams.
For example, "awk 'BEGIN { getline < "-" }' < /dev/null" will call
fclose(stdin) twice, with undefined results.

* If closefile() is called on std{in,out,err}, freopen() /dev/null instead.
Otherwise, awk will continue trying to perform I/O on a closed stdio
stream, the behavior of which is undefined.
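
A minimal sketch of the second point, assuming a POSIX system where /dev/null exists. `close_or_neuter` is a hypothetical name, and the open modes are an assumption rather than something stated in the message.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Close fp, unless it is a standard stream: those are redirected to
 * /dev/null so that later I/O on them stays well defined.
 * Returns 0 on success, EOF on failure.
 */
static int close_or_neuter(FILE *fp)
{
	assert(fp != NULL);
	if (fp == stdin)
		return freopen("/dev/null", "r", fp) == NULL ? EOF : 0;
	if (fp == stdout || fp == stderr)
		return freopen("/dev/null", "w", fp) == NULL ? EOF : 0;
	return fclose(fp);
}
```

As the December 2021 entry in this log points out, freopen does not report flush errors, so an output stream should be flushed and checked before it is handed to a function like this.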
2020-07-27 10:03:58 +03:00
Arnold D. Robbins 2a4146ec30 Add a note about low-level maintenance. 2020-07-02 21:39:56 +03:00
Arnold D. Robbins b2554a9e3d Add regression script for bugs-fixed directory. 2020-07-02 21:35:06 +03:00