The single-line macro argument parsing was completely broken as a
comma would not be recognized as an argument separator.
In the process of fixing this, make a fair bit of code cleanups.
Note: reverse tokens for smacro->expansion doesn't actually make any
sense anymore, might reconsider that.
This checkin also removes the distinction between "magic" and plain
smacros; the only difference is which specific expand method is being
invoked.
Finally, extend the allocating-string functions such that *all* the
allocating string functions support querying the length of the string
a posteori.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
We have to call expand_one_smacro() recursively, otherwise we will not
expand smacros which point to other smacros. We cannot simply do this
by looping after token pasting, because we need to make sure we don't
recursively expand the same smacro.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
For constructs like TIMES xx RESB yy merge the TIMES and RESB and feed
a single reservation to the backend; this can (obviously) be
dramatically faster.
Add byte count in listings for <incbin> and repeat count to <rept>; to
make them more reasonable in length shorten to <bin ...> and <rep ...>
respectively, and don't require leading zeroes in bin/rep/res count.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
TOKEN_ID is from enum pp_token_type, but struct Type has enum
token_type. TOK_ID seems to be a matched one.s
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Split the code for getting a line of tokens from the code that sets
verror and detokenizes the resulting string.
While we are at it, merge the handling of EOF and ^Z into the general
loop in read_line().
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The smacro expansion code was virtually impossible to understand, and
was leading to very strange failures. Clean it up, and do much better
handling of magic macros. This should also allow for recursive
macros, but recursive macros are extremely tricky in that it is very
hard to keep them from recursing forever, unless there is at least one
argument which is never expanded. They are not currently implemented.
Even so, I believe token pasting makes it possible to create infinite
loops; e.g.:
%define foo foo %+
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
In nasm_unquote_cstr(), disallow any control character, not just
NUL. This will matter when allowing quoting symbols.
Merge nasm_unquote() and nasm_unquote_cstr().
Strings can now be concatenated, C style: adjacent quoted strings
(including whitespace-separated) are merged into a single string.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
All directives which create single-line macros now have %i... variants
to define case-insensitive versions. Case insensitive rather sucks,
but at least this way it is consistent.
Single-line macro parameters can now be evaluated as a number, as done
by %assign. To do so, declare a parameter starting with =, for
example:
%define foo(x,=y) mov [x],macro_array_y
... would evaluate y as a number but leave x as a string.
NOTE: it would arguably be better to have this as a per-instance
basis, but it is easily handled by having a secondary macro called
with the same argument twice.
Finally, add a more consistent method for defining "magic" macros,
which need to be evaluated at runtime. For now, it is only used by the
special macros __FILE__, __LINE__, __BITS__, __PTR__, and __PASS__.
__PTR__ is a new macro which evaluates to word, dword or qword
matching the value of __BITS__.
The magic macro framework, however, provides a natural hook for a
future plug-in infrastructure to hook into a scripting language.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
"compiler.h" already includes a bunch of common include files. There
is absolutely no reason to duplicate them in individual files, and in
fact it robs us of central control of how these files are used.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
For almost everything we should use "nctype.h". Right now we don't
have a nasm_toupper() to use <ctype.h> for things that need toupper().
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
There is absolutely no reason not to include <string.h> globally, and
with the inline function for mempcpy() we need it there anyway.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
With buffered warnings, most warnings *must* be issued on every pass,
so ERR_PASS1 is simply wrong in most cases.
ERR_PASS1 now means "force this warning to be output even in
pass_first(). This is to be used for the case where the warning is
only executed in pass_first() code; this is highly discouraged as it
means the warnings will not appear in the list file and subsequent
passes may make the warning suddenly vanish.
ERR_PASS2 just as before suppresses an error or warning unless we are
in pass_final().
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The use of pass0, pass1, pass2, and "pass" passed as an argument is
really confusing and already caused a severe bug in the 2.14.01
release cycle. Clean them up and be far more explicit about what
various passes mean.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
We want to strongly encourage writers of warnings to create warning
categories, so remove the flagless nasm_warn() and change nasm_warnf()
to nasm_warn().
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
For debugging preprocessed code, it is useful to be able to ignore
%line directives rather than having to filter them out externally.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
It is extremely desirable to allow the user fine-grained control of
warnings, but this has been complicated by the fact that a warning
class has had to be defined in no less than three places (error.h,
error.c, nasmdoc.src) before it can be used in source code. Instead,
use a script to define these via magic comments at the point of use.
This hopefully will encourage creating new classes as needed.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Change the severity parameter to the error function from "int" to an
unsigned typedef, currently uint32_t.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The prefix ERR_WARN_ is unnecessarily long and may be a disincentive
to create new warning categories. Change it to WARN_*, it is still
plenty distinctive.
This is equivalent to nasm-2.14.xx checkin 77f53ba6d4cb90e5a7e09b33357ed7c1fe9f6b9d.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
ERR_HERE is used to mark messages of the form "... here" so that we
can emit sane output to the list file with filename and line number,
instead of a nonsensical "here" which could point almost anywhere.
This patch contains some changes from the one in the master branch to
make the code cleaner.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The prefix ERR_WARN_ is unnecessarily long and may be a disincentive
to create new warning categories. Change it to WARN_*, it is still
plenty distinctive.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The currently-unused strtbl was basically a slightly different version
of strlist, with the find and linearize capabilities. Merge these two
together by augmenting strlist to have the same capabilities.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Add binary key support to the hash table interface. Clean up the
interface to contain less extraneous crud.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
ERR_HERE is used to mark messages of the form "... here" so that we
can emit sane output to the list file with filename and line number,
instead of a nonsensical "here" which could point almost anywhere.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Create our own ctype table where we can do the tests we want to do
cheaply, instead of calling ctype functions and then adding additional
tests all over the code.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
readnum returns 64bit number which may become
a negative integer upon conversion which in
turn lead to out of bound array access.
Fix it by explicit conversion with bounds check
| POC6:2: error: parameter count `2222222222' is out of bounds [0; 2147483647]
https://bugzilla.nasm.us/show_bug.cgi?id=3392528
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
* commit '9a1216a1efa0ccb48e5df97acc763ea3de71e0ce':
NASM 2.14
nasmdoc.src: fix compound word
doc: Add a description for a useful case of mangling symbols
preproc: Don't access out of bound data on malformed input
rdstrnum: Make sure we dont shift out of bound
preproc: Fix out of bound access on malformed input
doc: Clarify %include search directory semantics
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
There are a number of places still where we test text
data which is potentially may be an empty string. This
is known to happen on fuzzer input but usually doesn't
take place in regular valid programs. Surely we need
to revisit preprocessor code for this kind of errors.
https://bugzilla.nasm.us/show_bug.cgi?id=3392525
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
A fuzzer revealed a problem in preproc code.
https://bugzilla.nasm.us/show_bug.cgi?id=3392521
Reported-by: ganshuitao <ganshuitao@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Use a hash table to enforce uniqueness in a string list. It is still
an ordered list, however, and can be walked in insertion order.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
All include paths to nasm must already have a trailing separator
prefix which is uncommon among tools. Change to using nasm_catfile
which gives a more normal behaviour.
https://bugzilla.nasm.us/show_bug.cgi?id=3392205
Signed-off-by: night199uk <night199uk@hermitcrabslab.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
* nasm-2.14.xx: (83 commits)
NASM 2.14rc16
doc: Update changes
preproc: expand_smacro -- Fix nil dereference on error path
eval: Eliminate division by zero
doc: Update changes
opflags: Convert is_class and is_reg_class to helpers
preproc: Fix out of range access in expand mmacro
doc: Update changes
parser: Fix sigsegv on certain equ instruction parsing
labels: Make sure nil label is never passed
labels: Don't nil dereference if no label provided
macho: Add warning message in macho_output()
macho/reloc: Fix addr size sensitive conditions
macho/reloc: Fix macho_output() to get the offset adjustments by add_reloc()
macho/reloc: Fixed offset adjustment in add_reloc()
macho/reloc: Allow absolute relocation when forcing a symbol reference
macho/reloc: Adjust SUB relocation information
macho/reloc: Fixed in handling GOT/GOTLOAD/TLV relocations
macho/reloc: Simplified relocation for REL/BRANCH
macho/sym: Record initial symbol number always
...
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
On specially crafetd malformed input file the params
might be zapped (say due to invalid syntax) so we might
access out of bound having nil dereference in best case.
Note the later code in this helper uses tok_isnt_ helper
which already has similar check.
https://bugzilla.nasm.us/show_bug.cgi?id=3392518
Reported-by: Jordan Zebor <j.zebor@f5.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Nearly all instances of nasm_fatal() and nasm_panic() take a flags
argument of zero. Simplify the code by making nasm_fatal and
nasm_panic default to no flags, and add an alternate version if flags
really are desired. This also means that every call site doesn't have
to initialize a zero argument.
Furthermore, ERR_NOFILE is now often not necessary, as the error code
will no longer cause a null reference if there is no current
file. Therefore, we can remove many instances of ERR_NOFILE which only
deprives the user of information.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Make all limit counters 64 bits, in case someone really has a usage
for an insanely large program. The globallines limit was omitted, add
it to the list of configurable limits.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>