Commit Graph

33 Commits

Author SHA1 Message Date
Anonymous AWK fan 40f0527d5d Fix https://github.com/onetrueawk/awk/issues/121 2021-10-09 19:23:05 +01:00
Todd C. Miller d54b703cae Fix size computation in replace_repeat() for special_case REPEAT_WITH_Q.
This resulted in the NUL terminator being written to the end of the
buffer which was not the same as the end of the string.  That in
turn caused garbage bytes from malloc() to be processed.  Also
change the NUL termination to be less error prone by writing the
NUL immediately after the last byte copied.

Reproducible with the following under valgrind:
echo '#!/usr/bin/awk' | awk \
'/^#! ?\/.*\/[a-z]{0,2}awk/ {sub(/^#! ?\/.*\/[a-z]{0,2}awk/,"#! awk"); print}'
2021-03-02 12:58:50 -07:00
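The termination strategy described in the message above can be sketched as follows. This is a minimal illustration, not the code from replace_repeat(): the helper name `join_repeated` and its parameters are hypothetical. It only shows the idea of writing the NUL immediately after the last byte copied rather than at the end of the allocated buffer. Overflow of the size computation is not checked in this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from b.c): build prefix + count copies
 * of atom + suffix. `out` tracks the write position, so the terminator
 * always lands right after the last byte copied.
 */
char *join_repeated(const char *prefix, const char *atom, size_t count,
                    const char *suffix)
{
    size_t plen = strlen(prefix), alen = strlen(atom), slen = strlen(suffix);
    size_t size = plen + alen * count + slen + 1;   /* +1 for the NUL */
    char *buf = malloc(size);
    char *out = buf;

    if (buf == NULL)
        return NULL;
    memcpy(out, prefix, plen);
    out += plen;
    for (size_t i = 0; i < count; i++) {
        memcpy(out, atom, alen);
        out += alen;
    }
    memcpy(out, suffix, slen);
    out += slen;
    /* Terminate at the write position, not at buf[size - 1]; this stays
     * correct even if `size` was overestimated. */
    *out = '\0';
    assert((size_t)(out - buf) < size);
    return buf;
}
```

Calling `join_repeated("x", "ab", 3, "y")` yields the string `xabababy`; the caller frees the result.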
Arnold D. Robbins 3b42cfaf73 Make it compile with g++. 2020-10-13 20:52:43 +03:00
Arnold D. Robbins 07f0438423 Move exclusively to bison as parser generator. 2020-07-30 17:12:45 +03:00
Todd C. Miller 22ee26b925 Cast to uschar when storing a char in an int that will be used as an index (#88)
* Cast to uschar when storing a char in an int that will be used as an index.
Fixes a heap underflow when the input char has the high bit set and
FS is a regex.

* Add regress test for underflow when RS is a regex and input is 8-bit.
2020-07-29 21:27:45 +03:00
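The cast described above can be illustrated with a short sketch. The `uschar` name comes from the commit message and is assumed here to be a typedef for `unsigned char`; the table `seen` and the helper functions are hypothetical. On platforms where plain `char` is signed, a byte such as 0xE9 read through `char` becomes a negative int and would index before the start of the table.

```c
#include <assert.h>

typedef unsigned char uschar;   /* assumed definition of the name used in the message */

static int seen[256];           /* hypothetical table indexed by input byte */

/* Convert through unsigned char so the index is always in 0..255,
 * even when plain char is signed and the high bit is set. */
int byte_index(const char *p)
{
    int c = (uschar)*p;
    assert(c >= 0 && c < 256);
    return c;
}

void mark_seen(const char *p)
{
    seen[byte_index(p)] = 1;
}

int was_seen(int c)
{
    return c >= 0 && c < 256 && seen[c];
}
```

Without the cast, `int c = *p;` can be negative for 8-bit input, which is the kind of underflow the commit message describes.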
Arnold D. Robbins 0f25df0619 Merge branch 'staging' 2020-06-25 21:34:50 +03:00
awkfan77 e5a89e63fe Fix onetrueawk#83 (#84) 2020-06-25 21:33:52 +03:00
Todd C. Miller 292d39f7b7 Rename dprintf to DPRINTF and use C99 cpp variadic arguments. (#82)
POSIX specifies a dprintf function that operates on an fd instead of
a stdio stream.  Using upper case for macros is more idiomatic too.
We no longer need to use an extra set of parentheses for debugging
printf statements.
2020-06-25 21:32:34 +03:00
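A C99 variadic debugging macro of the kind described above might look like the sketch below. The flag `dbg` and the function `trace_sum` are hypothetical, and the exact definition in awk's headers may differ. The older style mentioned in the message presumably passed the whole argument list as one parenthesized macro argument, as in `dprintf( ("x=%d\n", x) );`.

```c
#include <stdio.h>

static int dbg = 0;   /* hypothetical debug flag */

/* C99 variadic macro: the arguments are forwarded as-is, so call sites
 * need only one set of parentheses. The do/while wrapper makes the
 * macro behave like a single statement. */
#define DPRINTF(...) do { if (dbg) fprintf(stderr, __VA_ARGS__); } while (0)

/* Hypothetical caller showing the call syntax. */
int trace_sum(int a, int b)
{
    int sum = a + b;
    DPRINTF("trace_sum: %d + %d = %d\n", a, b, sum);
    return sum;
}
```

The upper-case name also avoids a clash with the POSIX `dprintf` function, which writes to a file descriptor.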
Arnold D. Robbins a3e9e8285e Fix for a{0} bug. 2020-01-24 11:16:31 +02:00
zoulasc 6a8770929d Small fixes (#68)
* sprinkle const, static
* account for lineno in unput
* Add an EMPTY string that is used when a non-const empty string is needed.
* make inputFS static and dynamically allocated
* Simplify and in the process avoid -Wwritable-strings
* make fs const to avoid -Wwritable-strings
2020-01-24 11:11:59 +02:00
Arnold D. Robbins 944989bf68 Minor fixes. 2020-01-06 00:01:46 -07:00
Arnold D. Robbins 140802c128 Small formatting cleanups in b.c. 2020-01-01 22:42:50 +02:00
Arnold D. Robbins 7db55ba13f Bug fix in interval expressions. 2019-12-27 12:03:35 +02:00
zoulasc af86dacfad Fix memory corruption manifested on 32 bit binaries (#58)
* Don't update gototab entries for HAT (corrupts memory)
2019-12-09 09:00:45 +02:00
Arnold D. Robbins 108224b484 Convert variables to bool and enum. 2019-11-10 21:19:18 +02:00
zoulasc 6589208eaf More cleanups: (#53)
- sprinkle const
- add a macro (setptr) that cheats const to temporarily NUL terminate strings
- remove casts from allocations
- use strdup instead of strlen+strcpy
- use x = malloc(sizeof(*x)) instead of x = malloc(sizeof(type of *x))
- add -Wcast-qual (and casts through uintptr_t in the two macros we
  cheat (xfree, setptr)).
2019-10-24 09:40:15 -04:00
zoulasc c16e8696d7 Amended the all pull request. (#51)
* [from NetBSD]
- dynamic state allocation
- centralize vector growth
- centralize int array allocation
- no casts for void * functions

* - add missing allocation
- revert change loop in pmatch
2019-10-17 13:04:45 -04:00
Arnold D. Robbins 5ff28208db Fix compile warnings. 2019-10-07 15:50:53 +03:00
Arnold D. Robbins 643a5a3dad Add RS as regex code, ifdefed-out, from NetBSD. 2019-09-10 12:19:48 +03:00
Alexander Richardson cbf924342b Fix out-of-bounds access in gototab array for caret character (#47)
When matching a caret, the expression `f->gototab[s][c] = f->curstat;` in
cgoto() will index the 2D-array gototab with [s][261]. However, gototab
is declared as being of size [NSTATES][NCHARS], so [32][259]. Therefore,
this assignment will write to the state for character 0x1.
I'm not sure how to create a regression test for this, but increasing the
array size to HAT+1 values fixes the error and the tests still pass.

I found this issue while running awk on a CHERI system with sub-object
protection enabled. On x86, this can be reproduced by compiling awk
with -fsanitize=undefined.
2019-09-10 09:54:11 +03:00
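The sizing fix described above can be sketched in isolation. The constants are taken from the commit message (32 states, NCHARS of 259, caret index 261) and should be read as assumptions about b.c at that revision. The accessor functions are hypothetical and exist only to make the bounds explicit.

```c
#define NSTATES 32
#define NCHARS  (256 + 3)
#define HAT     261             /* index used for the caret, per the message */

/* With a row width of NCHARS, gototab[s][HAT] is past the end of the row.
 * A row width of HAT + 1 makes it a valid element. */
static int gototab[NSTATES][HAT + 1];

/* Hypothetical checked setter: refuses indexes outside the declared bounds. */
int set_goto(int s, int c, int next)
{
    if (s < 0 || s >= NSTATES || c < 0 || c > HAT)
        return -1;
    gototab[s][c] = next;
    return 0;
}

int get_goto(int s, int c)
{
    return gototab[s][c];
}
```

As the message notes, the original out-of-bounds write is reported by `-fsanitize=undefined`; a build with that flag is a reasonable way to check the resized array.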
Arnold D. Robbins 795a06b58c Remove trailing whitespace on lines in all files. 2019-07-28 05:51:52 -06:00
Martijn Dekker 5b602ca8a2 Add support for "\a" and "\v" to regex and command line args (#44)
Support POSIX-specified C-style escape sequences "\a" (alarm)
and "\v" (vertical tab) in command line arguments and regular
expressions, further to the support for them in strings added on
Apr 9, 1989. These now no longer match as literal "a" and "v"
characters (as they don't on gawk and mawk).

IOW, lex.c already supported these (lines 390-391 as of 4e343460);
the support needed to be added to b.c and tran.c.

Relevant POSIX reference:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html#tag_20_06_13_04
2019-07-26 11:46:58 +03:00
pfg 650d868ec4 MFV r315425: one-true-awk: have calloc(3) do the multiplication.
MFC after:	3 days

git-svn-id: svn+ssh://svn.freebsd.org/base/head@315426 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
2019-07-16 22:12:26 +03:00
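A small sketch of what "have calloc(3) do the multiplication" means in practice. The function name `alloc_ints` is hypothetical. With `malloc(n * sizeof(int))` the product can wrap around silently for very large `n`; passing both factors to `calloc` leaves the overflow check to the allocator, which current libc implementations are expected to perform.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate n zero-initialized ints.
 * calloc receives the count and the element size separately, so the
 * caller never computes n * sizeof(int) itself. */
int *alloc_ints(size_t n)
{
    return calloc(n, sizeof(int));
}
```

The returned memory is zero-filled, so callers that previously cleared a `malloc` result by hand can drop that step.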
pfg 524219409a MFV r300961: one-true-awk: replace 0 with NULL for pointers
Also remove a redundant semicolon.

Also had to rebase on upstream pull.

git-svn-id: svn+ssh://svn.freebsd.org/base/head@301289 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
2019-07-16 22:11:57 +03:00
Arnold D. Robbins 28dacbd66b Allow unmatched right paren in regexes. Fixes Issue #40. 2019-06-04 23:53:31 -06:00
Martijn Dekker e6ecf52e04 Merge remote-tracking branch 'upstream/master' into interval-expr
main.c: bump version to 20190305
2019-03-05 03:45:40 +01:00
Martijn Dekker 0619d5d537 repeat(): add FATAL calls for errors that should be impossible 2019-02-21 22:38:16 +01:00
Leonardo Taccari 031aac816d Merge branch 'master' into cc_func-avoid-undefined-behaviour 2019-01-28 17:34:58 +01:00
Martijn Dekker 8a2222286c backport ERE interval/repetition expressions from Apple awk-24
The lack of POSIX interval expressions[*] (a.k.a. bounds, a.k.a.
repetition expressions) in regular expressions is listed under BUGS
in 'awk.1'. Apple's version of onetrueawk has supported these since
at least 2009, judging by the date stamp on their src/b.c in:
https://opensource.apple.com/tarballs/awk/awk-24.tar.gz

A bug report prompted NetBSD to swiftly integrate this code into
their awk. This commit is based on that NetBSD diff.
http://gnats.netbsd.org/53885
f3e4c4ca1d

b.c:
- Backport POSIX-standard interval expressions support in regular
  expressions via NetBSD from Apple awk-24 (20070501).

main.c:
- Bump version ID.

FIXES:
- Add note and credit for this feature.

awk.1: section BUGS:
- Remove line saying interval expressions are not supported.

_________
[*] http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04_06
2019-01-23 09:12:27 +00:00
Cody Peter Mello a6392ef31c Fix regular expressions containing [[:cntrl:]] 2018-11-12 10:25:44 -08:00
Leonardo Taccari 05014f5b9e avoid undefined behaviour when using ctype(3) functions in relex()
Because NCHARS is (256+3), cc->cc_func(i) was called with 256, 257
and 258 as arguments, leading to possible undefined behaviour (at
least on NetBSD with a non-C locale (e.g. `en_US.UTF-8') this led to
only honoring one `[:...:]' character class in bracket expressions).

Fix #11
2018-08-29 18:06:33 +02:00
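The constraint behind this fix is that the ctype(3) functions are only defined for arguments representable as `unsigned char` (or equal to EOF). A minimal sketch of a loop that respects it follows. NCHARS is taken from the message; `count_class` is a hypothetical helper, not the code in relex().

```c
#include <ctype.h>
#include <limits.h>

#define NCHARS (256 + 3)   /* value quoted in the commit message */

/* Count the byte values that belong to a character class.
 * The loop runs over the same range as the table, but stops calling the
 * ctype function once i is no longer representable as unsigned char. */
int count_class(int (*cc_func)(int))
{
    int n = 0;

    for (int i = 0; i < NCHARS; i++) {
        if (i > UCHAR_MAX)
            break;              /* 256..258 are internal values, not characters */
        if (cc_func(i))
            n++;
    }
    return n;
}
```

In the default "C" locale, `count_class(isdigit)` counts the ten decimal digits.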
Arnold D. Robbins 32093f5bbf Fix multiple long-standing bugs, improve test suite. 2018-08-22 20:40:26 +03:00
Brian Kernighan 87b94932e6 initial commit for github 2012-12-22 10:35:39 -05:00