Commit Graph

207 Commits

Author SHA1 Message Date
Arnold D. Robbins
92b775b3ec Merge branch 'master' into staging 2020-02-28 13:24:19 +02:00
zoulasc
ffee7780fe
3 more fixes (#75)
* LC_NUMERIC radix issue.

According to https://pubs.opengroup.org/onlinepubs/7990989775/xcu/awk.html
The period character is the character recognized in processing awk
programs.  Make it so that during output we also print the period
character, since this is what other awk implementations do, and it
makes sense from an interoperability point of view.

* print "T.builtin" in the error message

* Fix backslash continuation line handling.

* Keep track of RS processing so we apply the regex properly only once
per record.
2020-02-28 13:23:54 +02:00
enh-google
7b245a0266
Fix hwasan global overflow. (#76)
* Fix hwasan global overflow.

Crash found with https://source.android.com/devices/tech/debug/hwasan
but also detectable by regular ASan. Here's an ASan crash:

==215690==ERROR: AddressSanitizer: global-buffer-overflow on address
  0x55d90f8da140 at pc 0x55d90f8b7503 bp 0x7ffd3dae6100 sp 0x7ffd3dae60f8
  READ of size 4 at 0x55d90f8da140 thread T0
    #0 0x55d90f8b7502 in word /tmp/awk/lex.c:496
    #1 0x55d90f8b939f in yylex /tmp/awk/lex.c:191
    #2 0x55d90f894ab9 in yyparse /tmp/awk/awkgram.tab.c:2366
    #3 0x55d90f89edc2 in main /tmp/awk/main.c:216
    #4 0x7ff263a78bba in __libc_start_main ../csu/libc-start.c:308
    #5 0x55d90f8945a9 in _start (/tmp/awk/a.out+0x115a9)

0x55d90f8da141 is located 0 bytes to the right of global variable
'infunc' defined in 'awkgram.y:35:6' (0x55d90f8da140) of size 1

SUMMARY: AddressSanitizer: global-buffer-overflow /tmp/awk/lex.c:496 in word
Shadow bytes around the buggy address:
  0x0abba1f133d0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0abba1f133e0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0abba1f133f0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0abba1f13400: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
  0x0abba1f13410: 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
=>0x0abba1f13420: 04 f9 f9 f9 f9 f9 f9 f9[01]f9 f9 f9 f9 f9 f9 f9
  0x0abba1f13430: 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
  0x0abba1f13440: 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9
  0x0abba1f13450: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9
  0x0abba1f13460: f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9
  0x0abba1f13470: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9

And here's the stack trace from hwasan:

  Stack Trace:
  RELADDR           FUNCTION         FILE:LINE
  00000000000168d4  word             external/one-true-awk/lex.c:496:18
  000000000002d1ec  yyparse          y.tab.c:2460:16
  000000000001c82c  main             external/one-true-awk/main.c:179:2
  00000000000b41a0  __libc_init      bionic/libc/bionic/libc_init_dynamic.cpp:151:8

As it says, we're doing a 4-byte read from a 1-byte global.

`infunc` is declared as an int but defined as a bool.

Signed-off-by: Evgenii Stepanov <eugenis@google.com>

* Add ASan cflags to makefile.

They're not used by default, but this way they're easily at hand next
time they're wanted.
2020-02-28 13:18:29 +02:00
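The mismatch described above (declared as an int, defined as a bool) goes away when the declaration and the definition share one type. A minimal sketch, assuming a layout in which a shared header carries the extern declaration; the helper name in_function_body is hypothetical and not taken from the awk source:

```c
#include <stdbool.h>

/* In a shared header (assumed layout): the one declaration every file sees. */
extern bool infunc;

/* In exactly one source file: the definition, with the same type. */
bool infunc = false;

/* Hypothetical reader, standing in for the lexer code that consults the flag.
 * Because the declared type matches the definition, the read covers
 * sizeof(bool) bytes and never runs past the object. */
int in_function_body(void)
{
    return infunc ? 1 : 0;
}
```

With an `extern int infunc;` declaration instead, the same read would cover sizeof(int) bytes of a smaller object, which is the overflow the sanitizers reported.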
Arnold D. Robbins
91eaf7f701 Small fix to the man page. 2020-02-20 19:53:39 +02:00
Arnold D. Robbins
e92c8e4d0e Update FIXES, version. 2020-02-19 20:47:40 +02:00
zoulasc
c2c8ecbedf
More minor fixes: (#73)
* More minor fixes:

- add missing initializers
- fix sign-compare warnings
- fix shadowed variable
2020-02-19 20:44:49 +02:00
Arnold D. Robbins
ed6ff8c1cb Small cleanups before merge to master. 2020-02-18 21:26:24 +02:00
zoulasc
94e4c04561
argument parsing cleanups, dynamic program file allocation, fpe error enhancement. (#72)
* - enhance fpe handler to print the error type
- cleanup argument parsing
- dynamically allocate program filename array

* bison uses enums now, not #define's, make it work with that.

* We need to use either the enums or the defines but not both. This
is because bison -y will create both enums and #defines, while bison
without -y produces only the enums, and byacc produces just #defines.

* fix indentation

* Set the tokentype when we have a match in the scan, and reset it later
when we decide that the match was bad. Fixes nbyacc.

* - don't use pattern rules for portability
- try to move both flavors of generated names for portability

* Amend tests for the new error messages
2020-02-18 21:20:27 +02:00
Arnold D. Robbins
e9c99065fd Update README.md PR instructions. 2020-02-07 09:32:41 +02:00
Arnold D. Robbins
e6fe674b40 Restore zoulas fixes, stage 3. 2020-02-06 22:38:30 +02:00
Arnold D. Robbins
cd552112a7 Restore zoulas fixes, stage 2. 2020-02-06 22:32:55 +02:00
Arnold D. Robbins
5068d20ef6 Restore zoulas fixes, step 1. 2020-02-06 22:27:31 +02:00
Arnold D. Robbins
d7a7e4d147 Revert zoulas changes until we can keep tests passing. 2020-02-06 22:08:20 +02:00
Arnold D. Robbins
8447cc9d47 Update version and FIXES. 2020-02-06 21:47:31 +02:00
Arnold D. Robbins
3c755f73f4 Fix closeall for portability. 2020-02-06 21:45:46 +02:00
zoulasc
110bdc6b3e
misc fixes (#69)
* Add a test for German case folding.

* Add a function to copy a string with a larger allocation
  (to be used by the case folding routines)
* Add printf attributes to the printf-like functions and fix one format
  warning
* Cleanup the tempfree macro
* make more functions static
* rename fp to frp (FRame Pointer) to avoid shadowing with fp (File Pointer).
* add more const
* fix indent in UPLUS case
* add locale-aware case folding
* make nfiles size_t
* fix bugs in file closing:
    - compare fclose to EOF and pclose to -1
    - use nfiles instead of FOPEN_MAX in closeall
    - don't close files we did not open (0,1,2) fpurge/fflush instead

* - use NUL instead of 0 for char comparisons
- add ISWS() macro
- use continue; instead of ;

* Check for existence of the German locale before using it.

* Add missing parentheses, thanks Arnold.
2020-02-06 21:25:36 +02:00
Arnold D. Robbins
768d6b5886 Get tests working again. 2020-01-31 08:54:10 +02:00
Arnold D. Robbins
78c79c06d0 Fix a{0}, update tests. 2020-01-31 08:40:11 +02:00
Arnold D. Robbins
e2d71a98a4 Merge branch 'fix-int-expr-zero' 2020-01-31 08:25:51 +02:00
Michael Forney
69325710b1
Use MB_LEN_MAX instead of MB_CUR_MAX to avoid VLA (#70)
MB_CUR_MAX is the maximum number of bytes in a multibyte character
for the current locale, and might not be a constant expression.
MB_LEN_MAX is the maximum number of bytes in a multibyte character
for any locale, and always expands to a constant-expression.
2020-01-31 08:23:34 +02:00
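The difference between the two macros can be sketched as follows. This is an illustration only: the helper encode_wc is hypothetical and does not appear in the awk source. Sizing the buffer with MB_LEN_MAX gives an ordinary fixed-size array, whereas MB_CUR_MAX in the same position could make it a variable-length array.

```c
#include <limits.h>   /* MB_LEN_MAX */
#include <stdlib.h>   /* wctomb, wchar_t, size_t */
#include <string.h>   /* memcpy */

/* Hypothetical helper: encode one wide character into the caller's buffer.
 * Returns the number of bytes written, or -1 if the character cannot be
 * encoded or does not fit in outsize bytes. */
int encode_wc(wchar_t wc, char *out, size_t outsize)
{
    char buf[MB_LEN_MAX];   /* constant expression: a plain array, not a VLA */
    int n = wctomb(buf, wc);

    if (n < 0 || (size_t)n > outsize)
        return -1;
    memcpy(out, buf, (size_t)n);
    return n;
}
```

The cost is a buffer that may be a few bytes larger than the current locale strictly needs.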
Arnold D. Robbins
a3e9e8285e Fix for a{0} bug. 2020-01-24 11:16:31 +02:00
Arnold D. Robbins
4d9b12969e Update version info. 2020-01-24 11:15:30 +02:00
zoulasc
6a8770929d Small fixes (#68)
* sprinkle const, static
* account for lineno in unput
* Add an EMPTY string that is used when a non-const empty string is needed.
* make inputFS static and dynamically allocated
* Simplify and in the process avoid -Wwritable-strings
* make fs const to avoid -Wwritable-strings
2020-01-24 11:11:59 +02:00
Arnold D. Robbins
5a18f63b8d Set the close-on-exec flag for file and pipe redirections. 2020-01-22 02:10:59 -07:00
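On POSIX systems the close-on-exec flag is set per descriptor with fcntl. A minimal sketch of the general technique, not the code from this commit; the helper name set_cloexec is hypothetical:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for fileno() under strict C modes */
#endif
#include <fcntl.h>
#include <stdio.h>

/* Hypothetical helper: mark the descriptor behind fp as close-on-exec so
 * that it is not inherited by programs started later via exec.
 * Returns 0 on success, -1 on failure. */
int set_cloexec(FILE *fp)
{
    int fd = fileno(fp);
    int flags;

    if (fd < 0)
        return -1;
    flags = fcntl(fd, F_GETFD);          /* read the current descriptor flags */
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1)
        return -1;
    return 0;
}
```

Reading the flags first and OR-ing in FD_CLOEXEC preserves any other descriptor flags already set.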
Arnold D. Robbins
de6284e037 Fix Issue 60; sub/gsub follow POSIX if POSIXLY_CORRECT in the environment. 2020-01-19 20:37:33 +02:00
Arnold D. Robbins
df6ccd2982 Add TODO file. 2020-01-17 14:08:59 +02:00
Arnold D. Robbins
3ed74525f6 Update date in version. 2020-01-17 14:03:52 +02:00
Martijn Dekker
fed1a562c3 Make I/O errors fatal instead of mere warnings (#63)
An input/output error indicates a fatal condition, even if it
occurs when closing a file. Awk should not return success on I/O
error, but treat I/O errors as it already treats write errors.

Test case:

$ (trap '' PIPE; awk 'BEGIN { print "hi"; }'; echo "E $?" >&2) | :
awk: i/o error occurred while closing /dev/stdout
 source line number 1
E 2

The test case pipes a line into a dummy command that reads no
input, with SIGPIPE ignored so we rely on awk's own I/O checking.
No write error is detected, because the pipe is buffered; the
broken pipe is only detected as an I/O error on closing stdout.

Before this commit, "E 0" was printed (indicating status 0/success)
because an I/O error merely produced a warning. A shell script
was unable to detect the I/O error using the exit status.
2020-01-17 14:02:57 +02:00
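The behavior described above, where an error that only surfaces at close time still produces a failing exit status, can be sketched like this. It is an illustration under assumptions, not the patch itself: the helper close_checked is hypothetical, and the status value 2 is taken from the "E 2" output in the test case.

```c
#include <stdio.h>

/* Hypothetical helper: close fp and report whether any I/O error occurred,
 * either earlier (recorded on the stream) or while flushing at close time.
 * Returns 0 on success, or 2 so the caller can use it as an exit status. */
int close_checked(FILE *fp, const char *name)
{
    int failed = ferror(fp);      /* error already recorded on the stream? */

    if (fclose(fp) == EOF)        /* buffered data is written out here */
        failed = 1;
    if (failed) {
        fprintf(stderr, "i/o error occurred while closing %s\n", name);
        return 2;
    }
    return 0;
}
```

Because output is buffered, a write to a broken pipe may appear to succeed; checking the result of fclose is what catches it.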
Martijn Dekker
2976507cc1 rename T.concat to T.csconcat to avoid case-insensitive conflict (#64)
On case-insensitive file systems (e.g., macOS), T.concat and
t.concat are the same file, so these conflicted. This commit
renames T.concat to avoid the conflict.
2020-01-10 12:13:26 +02:00
Arnold D. Robbins
944989bf68 Minor fixes. 2020-01-06 00:01:46 -07:00
Arnold D. Robbins
c7eeb57210 Fix merging of concatenated string constants. 2020-01-05 21:18:36 +02:00
Arnold D. Robbins
a1aad88728 Correct text in README.md. 2020-01-01 22:45:04 +02:00
Arnold D. Robbins
140802c128 Small formatting cleanups in b.c. 2020-01-01 22:42:50 +02:00
Arnold D. Robbins
3358f3f36b Cleanups from valgrind. 2020-01-01 22:42:20 +02:00
Arnold D. Robbins
7db55ba13f Bug fix in interval expressions. 2019-12-27 12:03:35 +02:00
Arnold D. Robbins
1951e01288 More edits in README.md. 2019-12-11 21:15:38 +02:00
Arnold D. Robbins
eda30ac8c5 Add last updated date to README.md. 2019-12-11 20:53:26 +02:00
Arnold D. Robbins
bab7b07f01 Move README to README.md. 2019-12-11 20:52:54 +02:00
Arnold D. Robbins
0b82bc6eb4 Small edits, update version and FIXES. 2019-12-11 09:24:38 +02:00
zoulasc
a96aebbbd6 Fix printf format conversions. (#59)
Further simplify printf % parsing by eating the length specifiers
during the copy phase, and substitute 'j' when finalizing the format.
Add some more tests for this.
2019-12-11 09:17:34 +02:00
zoulasc
af86dacfad Fix memory corruption manifested on 32 bit binaries (#58)
* Don't update gototab entries for HAT (corrupts memory)
2019-12-09 09:00:45 +02:00
Arnold D. Robbins
416c6db5ee Update version and FIXES. 2019-12-08 21:43:32 +02:00
zoulasc
ff5d67610c Fix printf formats for integers (#57)
* More cleanups:
- sprinkle const
- add a macro (setptr) that cheats const to temporarily NUL terminate strings
- remove casts from allocations
- use strdup instead of strlen+strcpy
- use x = malloc(sizeof(*x)) instead of x = malloc(sizeof(type of *x))
- add -Wcast-qual (and casts through uintptr_t in the two macros we
  cheat (xfree, setptr)).

* More cleanups:
- add const
- use bounded sscanf
- use snprintf instead of sprintf

* More cleanup:
- use snprintf/strlcat instead of sprintf/strcat
- use %j instead of %l since we are casting to intmax_t/uintmax_t

* Merge the 3 copies of the code that evaluated array strings with separators
and convert them to keep track of lengths and use memcpy instead of strcat.

* Fix formats for 32 bit machines broken by previous commit.
We use intmax_t to provide maximum range for both 32 and 64 bit machines.
2019-12-08 21:41:27 +02:00
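The pairing of the 'j' length modifier with intmax_t, together with bounded formatting through snprintf, can be sketched as follows. The helper format_int is hypothetical and only illustrates the idea; it is not the awk implementation.

```c
#include <stdint.h>   /* intmax_t */
#include <stdio.h>    /* snprintf */

/* Hypothetical helper: format a numeric value as a decimal integer.
 * The cast to intmax_t matches the 'j' length modifier, so the format is
 * correct on both 32-bit and 64-bit machines. The caller must ensure that
 * value lies within the range of intmax_t. Returns what snprintf returns:
 * the length the full result needs, even if the output was truncated. */
int format_int(char *buf, size_t bufsize, double value)
{
    return snprintf(buf, bufsize, "%jd", (intmax_t)value);
}
```

Unlike sprintf, snprintf never writes more than bufsize bytes, and a return value of bufsize or more tells the caller that truncation happened.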
Arnold D. Robbins
108224b484 Convert variables to bool and enum. 2019-11-10 21:19:18 +02:00
Arnold D. Robbins
c879fbf013 From Ori Bernstein, ori@eigenstate.org, for FS="" in multibyte locale. 2019-11-08 14:40:18 +02:00
Arnold D. Robbins
0e1bebcc09 Small fixes in the test suite. 2019-11-08 14:36:37 +02:00
Arnold D. Robbins
b73bfabb42 Typo fix in FIXES. 2019-11-08 14:31:05 +02:00
Arnold D. Robbins
2a8f1758b9 Update FIXES and main.c version. 2019-10-25 11:03:02 -04:00
Arnold D. Robbins
938b26cfae Don't use 'j' flag on %x. Remove an unnecessary cast. 2019-10-25 11:01:17 -04:00
zoulasc
0d8778bbbb more cleanups (#55)
* More cleanups:
- sprinkle const
- add a macro (setptr) that cheats const to temporarily NUL terminate strings
- remove casts from allocations
- use strdup instead of strlen+strcpy
- use x = malloc(sizeof(*x)) instead of x = malloc(sizeof(type of *x))
- add -Wcast-qual (and casts through uintptr_t in the two macros we
  cheat (xfree, setptr)).

* More cleanups:
- add const
- use bounded sscanf
- use snprintf instead of sprintf

* More cleanup:
- use snprintf/strlcat instead of sprintf/strcat
- use %j instead of %l since we are casting to intmax_t/uintmax_t

* Merge the 3 copies of the code that evaluated array strings with separators
and convert them to keep track of lengths and use memcpy instead of strcat.
2019-10-25 10:59:09 -04:00
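The last item above, tracking lengths and using memcpy instead of strcat, can be sketched as a join routine. This is an illustration under assumptions: the helper join_subscripts and its signature are hypothetical and are not the merged function from the commit.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join n strings with sep between them.
 * Returns a newly allocated string the caller must free, or NULL if
 * allocation fails. */
char *join_subscripts(const char *const *parts, size_t n, const char *sep)
{
    size_t seplen = strlen(sep);
    size_t total = 0;
    size_t len = 0;
    size_t i;
    char *buf;

    /* First pass: compute the exact size needed. */
    for (i = 0; i < n; i++)
        total += strlen(parts[i]) + (i + 1 < n ? seplen : 0);

    buf = malloc(total + 1);
    if (buf == NULL)
        return NULL;

    /* Second pass: append at a known offset instead of rescanning with strcat. */
    for (i = 0; i < n; i++) {
        size_t plen = strlen(parts[i]);

        memcpy(buf + len, parts[i], plen);
        len += plen;
        if (i + 1 < n) {
            memcpy(buf + len, sep, seplen);
            len += seplen;
        }
    }
    buf[len] = '\0';
    return buf;
}
```

Each strcat call has to walk the destination to find its end, so keeping the running length avoids that repeated scan.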