198 Commits

Author SHA1 Message Date
Warner Losh
45dab2a7e0 awk: Make -F '' and -v FS="" behave the same
IEEE Std 1003.1-2008 mandates that -F str be treated the same as -v
FS=str. For a null string, this was not the case. Since awk(1) documents
that a null string for FS has a specific behavior, make -F '' behave
consistently with -v FS="".

PR:
upstream issue:		https://github.com/onetrueawk/awk/issues/127
Sponsored by:		Netflix
2021-07-20 08:10:50 -06:00
Arnold D. Robbins
c0f4e97e45 Fix compiling with g++. 2021-02-15 20:33:15 +02:00
ozan s. yigit
178f660b5a Change T.errmsg print to file fail test.
We cannot have a test that destroys, e.g., /etc/passwd if someone
runs it as root.
2021-01-10 15:24:37 -05:00
ozan s. yigit
1fd5fa38cc Fix a decision bug with trailing stuff in lib.c:is_valid_number
after the Dec 18 changes. Updated FIXES, adjusted version date.
2021-01-06 18:37:48 -05:00
ozan s. yigit
7d1848cfa6 Merge branch 'staging' for README.md 2020-12-25 16:55:02 -05:00
ozan s. yigit
fdc0388333 updated: new maintainer 2020-12-25 16:53:55 -05:00
Arnold D. Robbins
8909e00b57 Inf and NaN values fixed and printing improved. "This time for sure!" 2020-12-18 11:57:48 +02:00
Arnold D. Robbins
982a574e32 Update FIXES and version. 2020-12-15 14:49:18 +02:00
Michael Forney
38e525fb7b
Include <strings.h> for strcasecmp (#99)
Though some implementations include this header indirectly through
string.h by default, the POSIX header that declares strcasecmp is
strings.h[0].

[0] https://pubs.opengroup.org/onlinepubs/9699919799/functions/strcasecmp.html
2020-12-15 14:46:30 +02:00
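A minimal sketch of what commit 38e525fb7b describes: include <strings.h> explicitly rather than relying on <string.h> to pull in the declaration. The helper name is_nan_word is hypothetical and does not come from the awk sources.

```c
#include <stdbool.h>
#include <strings.h>   /* POSIX header that declares strcasecmp() */

/* Hypothetical helper: case-insensitive comparison against "nan". */
bool is_nan_word(const char *s)
{
    return strcasecmp(s, "nan") == 0;
}
```

With the explicit include, is_nan_word("NaN") compiles without an implicit-declaration warning on systems whose <string.h> does not expose strcasecmp.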
Arnold D. Robbins
6535bd6c35 Update FIXES and version in main.c. 2020-12-08 09:20:58 +02:00
Arnold Robbins
cc9e9b68d1
Rework floating point conversions. (#98) 2020-12-08 08:05:22 +02:00
Arnold D. Robbins
e508d2861c Update version and FIXES. 2020-12-03 19:33:11 +02:00
Todd C. Miller
feb247a852
Don't print extra newlines on error before awk starts parsing. (#97)
If awk prints an error message while compile_time is still set
to ERROR_PRINTING, don't try to print the context since there is
none.  This can happen due to a problem with, e.g., unknown command
line options.
2020-12-03 19:30:36 +02:00
Arnold D. Robbins
a2a41a8e35 Add .TF macro to man page. Closes Issue #96. 2020-11-24 19:14:26 +02:00
Arnold D. Robbins
3b42cfaf73 Make it compile with g++. 2020-10-13 20:52:43 +03:00
Arnold D. Robbins
9804285af0 Additional fixes for DJGPP. 2020-08-16 18:48:05 +03:00
Arnold D. Robbins
9c63cb6ccd Update FIXES and version in main.c. 2020-08-07 13:15:17 +03:00
Chris
b785141019
printf: The argument p shall be a pointer to void. (#93) 2020-08-07 13:10:20 +03:00
Arnold D. Robbins
1b3984634f Fix Issue #92; see FIXES. 2020-08-04 10:02:26 +03:00
Arnold D. Robbins
9b80a7c137 Update version and FIXES. 2020-07-30 17:15:58 +03:00
Arnold D. Robbins
07f0438423 Move exclusively to bison as parser generator. 2020-07-30 17:12:45 +03:00
Todd C. Miller
453ce8642b
Avoid accessing pfile[] out of bounds on syntax error at EOF. (#90)
When awk reaches EOF parsing the program file, curpfile is incremented.
However, cursource() uses curpfile without checking it against npfile
which can cause an out of bounds access of pfile[] if there is a syntax
error at the end of the program file.
2020-07-29 21:31:29 +03:00
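The bounds problem described in commit 453ce8642b can be sketched as follows. This is an illustration only: the table contents and the function name cursource_checked are assumptions, and the actual fix in awk may be shaped differently.

```c
#include <stddef.h>

/* Hypothetical stand-ins for awk's program-file table. */
static const char *pfile[] = { "a.awk", "b.awk" };
static const size_t npfile = sizeof(pfile) / sizeof(pfile[0]);

/*
 * Return the name of the current program file, or NULL once the
 * index has been incremented past the last entry (as happens at EOF).
 */
const char *cursource_checked(size_t curpfile)
{
    if (curpfile < npfile)
        return pfile[curpfile];
    return NULL;
}
```

A caller that reports a syntax error at EOF would then test for NULL instead of reading pfile[npfile].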
Tim van der Molen
e22bb7c625
Fix the T.errmsg test (#91)
Co-authored-by: Tim van der Molen <tim@kariliq.nl>
2020-07-29 21:29:46 +03:00
Todd C. Miller
22ee26b925
Cast to uschar when storing a char in an int that will be used as an index (#88)
* Cast to uschar when storing a char in an int that will be used as an index.
Fixes a heap underflow when the input char has the high bit set and
FS is a regex.

* Add regress test for underflow when RS is a regex and input is 8-bit.
2020-07-29 21:27:45 +03:00
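The cast described in commit 22ee26b925 matters because plain char may be signed, so a byte with the high bit set can turn into a negative index. A minimal sketch, assuming a uschar typedef like the one the commit names; the table and the function mark_seen are hypothetical.

```c
#include <limits.h>

typedef unsigned char uschar;

/* One slot per possible byte value. */
static int seen[UCHAR_MAX + 1];

/* Record a byte and return how often it has been seen so far. */
int mark_seen(const char *p)
{
    /* Without the cast, bytes >= 0x80 can become negative where char is signed. */
    int c = (uschar)*p;

    return ++seen[c];
}
```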
Todd C. Miller
b82b649aa6
Avoid using stdio streams after they have been closed. (#89)
* In closeall(), skip stdin and flush std{err,out} instead of closing.
Otherwise awk could fclose(stdin) twice (it may appear more than once)
and closing stderr means awk cannot report errors with other streams.
For example, "awk 'BEGIN { getline < "-" }' < /dev/null" will call
fclose(stdin) twice, with undefined results.

* If closefile() is called on std{in,out,err}, freopen() /dev/null instead.
Otherwise, awk will continue trying to perform I/O on a closed stdio
stream, the behavior of which is undefined.
2020-07-27 10:03:58 +03:00
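The second point of commit b82b649aa6 can be sketched roughly as below. The function name close_or_redirect is hypothetical, and the sketch assumes a system where /dev/null exists; awk's closefile() carries more bookkeeping than is shown here.

```c
#include <stdio.h>

/*
 * Hypothetical helper: close an ordinary stream, but point a standard
 * stream at /dev/null so that later I/O on it stays well defined.
 * Returns 0 on success, EOF on failure.
 */
int close_or_redirect(FILE *fp)
{
    const char *mode;

    if (fp == stdin)
        mode = "r";
    else if (fp == stdout || fp == stderr)
        mode = "w";
    else
        return fclose(fp);

    return freopen("/dev/null", mode, fp) != NULL ? 0 : EOF;
}
```

After close_or_redirect(stdin), a later read from stdin simply reports end of file rather than touching a closed stream.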
Arnold D. Robbins
2a4146ec30 Add a note about low-level maintenance. 2020-07-02 21:39:56 +03:00
Arnold D. Robbins
b2554a9e3d Add regression script for bugs-fixed directory. 2020-07-02 21:35:06 +03:00
Tim van der Molen
ee5b49bb33
Fix regression with changed SUBSEP in subscript (#86)
Commit 0d8778bbbb reintroduced a regression that was fixed in commit
97a4b7ed21. The length of SUBSEP needs to be rechecked after calling
execute(), in case SUBSEP itself has been changed.

Co-authored-by: Tim van der Molen <tim@kariliq.nl>
2020-07-02 21:22:15 +03:00
Tim van der Molen
cc19af1308
Fix concatenation regression (#85)
The optimization in commit 1d6ddfd9c0 reintroduced the regression that
was fixed in commit e26237434f.

Co-authored-by: Tim van der Molen <tim@kariliq.nl>
2020-07-02 21:21:10 +03:00
Arnold D. Robbins
f232de85f6 Update FIXES and date in main.c. 2020-06-25 21:36:24 +03:00
Arnold D. Robbins
0f25df0619 Merge branch 'staging' 2020-06-25 21:34:50 +03:00
awkfan77
e5a89e63fe
Fix onetrueawk#83 (#84) 2020-06-25 21:33:52 +03:00
Todd C. Miller
292d39f7b7
Rename dprintf to DPRINTF and use C99 cpp variadic arguments. (#82)
POSIX specifies a dprintf function that operates on an fd instead of
a stdio stream.  Using upper case for macros is more idiomatic too.
We no longer need to use an extra set of parentheses for debugging
printf statements.
2020-06-25 21:32:34 +03:00
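A sketch of the kind of macro commit 292d39f7b7 describes, using C99 __VA_ARGS__. The flag name dbg, the do/while wrapper, and the example function are assumptions made for illustration; the macro in awk itself may be written differently.

```c
#include <stdio.h>

/* Hypothetical debug flag; nonzero enables debugging output. */
int dbg = 0;

/* C99 variadic macro: the format string is part of __VA_ARGS__. */
#define DPRINTF(...) do { if (dbg) fprintf(stderr, __VA_ARGS__); } while (0)

/* Example call site: no extra set of parentheses around the arguments. */
int set_field_count(int n)
{
    DPRINTF("setting field count to %d\n", n);
    return n;
}
```

The older style would have required a call such as dprintf(("...", n)) with doubled parentheses, and the lowercase name would clash with the POSIX dprintf function.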
Arnold D. Robbins
cef5180110 Fix Issue 78 and apply PR 80. 2020-06-12 14:30:03 +03:00
Todd C. Miller
b2de1c4ee7
Clear errno before using errcheck() to avoid spurious errors. (#80)
The errcheck() function treats an errno of ERANGE or EDOM as something
to report, so make sure errno is set to zero before invoking a
function to check so that a previous such errno value won't result
in a false positive.  This could happen simply due to input line fields
that looked enough like floating-point input to trigger ERANGE.

Reported by Jordan Geoghegan, fix from Philip Guenther.
2020-06-12 14:16:12 +03:00
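The false positive described in commit b2de1c4ee7 can be reproduced in miniature. This sketch uses strtod rather than awk's errcheck(), and the function parse_checked is hypothetical; it only shows why errno has to be cleared before the call being checked.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse s as a double and report through *failed whether errno holds
 * ERANGE or EDOM afterwards. If clear_first is zero, a stale errno left
 * by some earlier call is mistaken for a failure of this one.
 */
double parse_checked(const char *s, int clear_first, int *failed)
{
    double d;

    if (clear_first)
        errno = 0;
    d = strtod(s, NULL);
    *failed = (errno == ERANGE || errno == EDOM);
    return d;
}
```

With errno already set to ERANGE, parsing "1.5" without clearing reports a failure even though the conversion succeeded; clearing first removes the spurious report.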
Arnold D. Robbins
754cf93645 In fldbld(), check that inputFS is set. 2020-06-05 12:25:15 +03:00
Arnold D. Robbins
1107437dce Fix test for use of noreturn. 2020-05-15 15:12:15 +03:00
Arnold D. Robbins
93e5dd87a1 Fix noreturn for old compilers. 2020-04-16 20:56:49 +03:00
Arnold D. Robbins
c3d8f9c500 Update FIXES and version date. 2020-04-05 21:14:46 +03:00
awkfan77
bb538fe67e
Replace __attribute__((__noreturn__)) with _Noreturn. (#77)
* Replace __attribute__((__noreturn__)) with _Noreturn.

* Change _Noreturn to noreturn and #include <stdnoreturn.h>
2020-04-05 21:10:52 +03:00
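For reference, the C11 spelling adopted in commit bb538fe67e looks like this in isolation. The functions fatal and checked_div are hypothetical examples, not code from awk.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdnoreturn.h>   /* C11: defines noreturn as a macro for _Noreturn */

/* Hypothetical fatal-error helper; it never returns to its caller. */
noreturn void fatal(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    exit(2);
}

/* The compiler can assume the division is reached only when b != 0. */
int checked_div(int a, int b)
{
    if (b == 0)
        fatal("division by zero");
    return a / b;
}
```

Unlike __attribute__((__noreturn__)), this form does not depend on GCC-style extensions, although it does require a compiler that provides <stdnoreturn.h>.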
Arnold D. Robbins
2017c2e6ea Fixes from Christos Zoulas. 2020-02-28 13:47:42 +02:00
Arnold D. Robbins
92b775b3ec Merge branch 'master' into staging 2020-02-28 13:24:19 +02:00
zoulasc
ffee7780fe
3 more fixes (#75)
* LC_NUMERIC radix issue.

According to https://pubs.opengroup.org/onlinepubs/7990989775/xcu/awk.html
The period character is the character recognized in processing awk
programs.  Make it so that during output we also print the period
character, since this is what other awk implementations do, and it
makes sense from an interoperability point of view.

* print "T.builtin" in the error message

* Fix backslash continuation line handling.

* Keep track of RS processing so we apply the regex properly only once
per record.
2020-02-28 13:23:54 +02:00
enh-google
7b245a0266
Fix hwasan global overflow. (#76)
* Fix hwasan global overflow.

Crash found with https://source.android.com/devices/tech/debug/hwasan
but also detectable by regular ASan. Here's an ASan crash:

==215690==ERROR: AddressSanitizer: global-buffer-overflow on address
  0x55d90f8da140 at pc 0x55d90f8b7503 bp 0x7ffd3dae6100 sp 0x7ffd3dae60f8
  READ of size 4 at 0x55d90f8da140 thread T0
    #0 0x55d90f8b7502 in word /tmp/awk/lex.c:496
    #1 0x55d90f8b939f in yylex /tmp/awk/lex.c:191
    #2 0x55d90f894ab9 in yyparse /tmp/awk/awkgram.tab.c:2366
    #3 0x55d90f89edc2 in main /tmp/awk/main.c:216
    #4 0x7ff263a78bba in __libc_start_main ../csu/libc-start.c:308
    #5 0x55d90f8945a9 in _start (/tmp/awk/a.out+0x115a9)

0x55d90f8da141 is located 0 bytes to the right of global variable
'infunc' defined in 'awkgram.y:35:6' (0x55d90f8da140) of size 1

SUMMARY: AddressSanitizer: global-buffer-overflow /tmp/awk/lex.c:496 in word
Shadow bytes around the buggy address:
  0x0abba1f133d0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0abba1f133e0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0abba1f133f0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0abba1f13400: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
  0x0abba1f13410: 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
=>0x0abba1f13420: 04 f9 f9 f9 f9 f9 f9 f9[01]f9 f9 f9 f9 f9 f9 f9
  0x0abba1f13430: 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
  0x0abba1f13440: 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9
  0x0abba1f13450: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9
  0x0abba1f13460: f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9
  0x0abba1f13470: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9

And here's the stack trace from hwasan:

  Stack Trace:
  RELADDR           FUNCTION         FILE:LINE
  00000000000168d4  word             external/one-true-awk/lex.c:496:18
  000000000002d1ec  yyparse          y.tab.c:2460:16
  000000000001c82c  main             external/one-true-awk/main.c:179:2
  00000000000b41a0  __libc_init      bionic/libc/bionic/libc_init_dynamic.cpp:151:8

As it says, we're doing a 4-byte read from a 1-byte global.

`infunc` is declared as an int but defined as a bool.

Signed-off-by: Evgenii Stepanov <eugenis@google.com>

* Add ASan cflags to makefile.

They're not used by default, but this way they're close to hand next
time they're wanted.
2020-02-28 13:18:29 +02:00
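The mismatch behind commit 7b245a0266 (a variable declared as int in one file but defined as bool in another) goes undetected when the two files never see each other's declaration. A sketch of the usual remedy, shown in a single file for brevity; the comments marking a shared header and a source file, and the function enter_function, are assumptions about layout rather than a description of the actual patch.

```c
#include <stdbool.h>

/* In a shared header: the one declaration every source file includes. */
extern bool infunc;

/* In exactly one source file: the definition, with the same type. */
bool infunc = false;

/* Hypothetical user of the flag; reads and writes exactly sizeof(bool) bytes. */
bool enter_function(void)
{
    infunc = true;
    return infunc;
}
```

If the declaration said int while the definition said bool, a compiler that sees both in one translation unit would reject the conflict instead of letting a 4-byte read hit a 1-byte object.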
Arnold D. Robbins
91eaf7f701 Small fix to the man page. 2020-02-20 19:53:39 +02:00
Arnold D. Robbins
e92c8e4d0e Update FIXES, version. 2020-02-19 20:47:40 +02:00
zoulasc
c2c8ecbedf
More minor fixes: (#73)
* More minor fixes:

- add missing initializers
- fix sign-compare warnings
- fix shadowed variable
2020-02-19 20:44:49 +02:00
Arnold D. Robbins
ed6ff8c1cb Small cleanups before merge to master. 2020-02-18 21:26:24 +02:00
zoulasc
94e4c04561
argument parsing cleanups, dynamic program file allocation, fpe error enhancement. (#72)
* - enhance fpe handler to print the error type
- cleanup argument parsing
- dynamically allocate program filename array

* bison uses enums now, not #define's, make it work with that.

* We need to use either the enums or the defines but not both. This
is because bison -y will create both enums and #defines, while bison
without -y produces only the enums, and byacc produces just #defines.

* fix indentation

* Set the tokentype when we have a match in the scan, and reset it later
when we decide that the match was bad. Fixes nbyacc.

* - don't use pattern rules for portability
- try to move both flavors of generated names for portability

* Amend tests for the new error messages
2020-02-18 21:20:27 +02:00
Arnold D. Robbins
e9c99065fd Update README.md PR instructions. 2020-02-07 09:32:41 +02:00