The size calculation done in len_extops() (called by insn_size()) for
EOT_DB_RESERVE (i.e. uninitialized storage "?" token) does not take
into account the element size (e->elem), thus calculating a wrong
size for any Dx larger than DB (DW, DQ, etc).
The bug is silent, but it makes NASM error out if a "Dx ?" (larger
than DB) is followed by any label because the label offset gets
mismatched in the final code generation stage:
$ cat test.asm
[section .bss]
DW ?
x:
$ nasm test.asm
test.asm:3: error: label `x' changed during code generation [-w+error=label-redef-late]
See also: https://stackoverflow.com/q/70012188/3889449
Signed-off-by: Marco Bonelli <marco@mebeim.net>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Andrew reported that we may access unitialized memory
> SUMMARY: MemorySanitizer: use-of-uninitialized-value nasm/asm/parser.c:982:41 in parse_line
It turns out that in case of malformed data the expression is terminator
itself so we should not "lookup ahead" for next one. Thus test for first
expression initially and if test passes check for terminator.
Reported-by: Andrew Bao <xiaobaozidi@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
In commit 2469b8b6 we occasionally bring the ability
to read unitialized memory due to refactoring. Fix it
doing needed test inside the function and setting up
an error message if needed.
Side note: passing 7 arguments into the function means
we have to decompose this helper somehow, such number
of arguments is a way over the top.
Bugzilla: https://bugzilla.nasm.us/show_bug.cgi?id=3392751
Reported-by: Marco <mvanotti@protonmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Add a {rex} prefix to force REX encoding (typically a redundant 40h
prefix).
For prefix parsing, we can use t_inttwo to encode the prefix slot
number.
Give more verbose error messages for encoding mismatches.
Make the pasting behavior of TOKEN_QMARK, TOKEN_HERE and TOKEN_BASE
match the NASM 2.15 behavior: ? is a keyword and pastes as an ID, $
and $$ are treated as operators (which doesn't seem to make much
sense, but it is the current legacy behavior.)
Reported-by: C. Masloch <pushbx@ulukai.org>
Bugzilla: https://bugzilla.nasm.us/show_bug.cgi?id=3392733
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
If macro is undefined while it's being expanded, use after free occurs,
since the MMacro instance is released, but it is still used to proceed
the expansion.
This change forbids macro undefinition: non-fatal error is raised and
the MMacro instance is not released if it is being processed by NASM
preprocessor.
Consider the following example:
| $ cat test.asm
| %macro m 0
| %unmacro m 0
| %endmacro
| m
| $ ./nasm test.asm
| test.asm:4: error: `%unmacro' can't undefine the macro being expanded
| test.asm:2: ... from macro `m' defined here
Fixes BR3392531 and BR3392716.
Signed-off-by: Igor Munkin <imun@cpan.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Better point out explicitly that SMacro::next member
is untouched, thus do not use SMacro::next and an array.
CID 1432925
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
When we process a TOKEN_QMARK we also need to advance p, in order to
get the proper start for the next token.
This fixes travis test br3392707.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The operation of the ',' and ')' tokens are very similar, except for:
',' issues a error if the processed parameter is greedy;
')' sets the "done" variable.
The code would incorrectly set "done" for a ',' token. This fixes
travis test br3392711.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
When pasting and stripping %+ and whitespace tokens, we either need to
set *nextp in the loop, or treat next as a separate variable and
update *nextp after the loop finishes. This implements the second
option.
This fixes travis test "amx".
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Add the %eval() preprocessor function. It evaluates each of its
arguments like a number and expands to a comma-separated lists of the
evaluated arguments.
To support this, add the concept of "true varadic" macros, which are
only used internally. True varadic macros differ from greedy macros in
that the parameter list is still parsed as individual parameters and
provided to the expansion function. As this isn't meaningful for
user-defined macros, there is no way to specify it from a directive.
Add back the %isnfoo() functions. Although one could just as well write
!%isfoo(), it doesn't cost much to provide them, and might help avoid
programmer confusion.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
We need the ability to produce consistent output for our own tests,
anyway, so make this a user-accessible feature. This was requested in
BR 3392635.
This obsoletes the NASM_TEST_RUN environment variable; simply use the
normal NASMENV environment variable instead.
The .obj tests in travis needed to be updated in order to remove the
rather pointless suffix " CONST" from the NASM signatures.
Reported-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
%ifid $ and %ifid $$ has traditionally been false, revert to that
behavior.
Reported-by: Mike Hommey <mh+anfz@glandium.org>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
NASM 2.15.04
Conflicts:
asm/listing.h
asm/pptok.pl
asm/preproc.c
version
This doesn't pass travis test 3392711, which is using an extremely odd
construct of %?? in the middle of an argument sequence for an smacro
while not being in a macro itself, and expecting it to expand to the
macro name. This seems to *really* confuse the master branch.
Resolve this later...
At least DWARF can encode C-style macros. In doing so, it wants the
file include hierarchy, so give the debug format backend the option of
receiving that information from the preprocessor.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The warning files are generated by a script, but the scripts is fast
enough run every time a C file is updated. To prevent having to
rebuild every file, however, make the generation script only actually
modify the file if it has changed.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The mempcpy helper returns *last* byte pointer thus when
we call set_text_free we have to pass a pointer to the
start of the string.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Make sure the data being freed get double
freed after -- the pointers must be zapped
(actually nasm_free and free_tlist support
being called with NULL pointer as an argument).
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
BNDMK, BNDLDX, and BNDSTX are split-SIB (MIB) instructions, but do
*not* require a SIB encoding. However, TILELOAD* and TILESTORE* *do*
require a SIB in all cases. Split the MIB flag into MIB (split
address) and SIB (SIB required) flags.
This fixes travis test mpx.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
We need to add the byte offset into the floating-point value to get
the correct result for these floating point to integer conversions.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Add a new macro vprintf_func() for vprintf-style functions, and add
printf_func() and vprintf_func() attribute arguments whereever
meaningful.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
After issuing an error message for a missing %stacksize argument, need
to quit rather than continuing to try to access the pointer.
Fold uses of tok_text() while we are at it.
Reported-by: Suhwan <prada960808@gmail.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
An eop may have a data buffer associated with it as part of the same
memory allocation. Therefore, we need to move "subexpr" up instead of
merging it into "eop".
This *partially* resolves BR 3392707, but that test case still
triggers a violation when using -gcv8.
Reported-by: Suhwan <prada960808@gmail.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
-Lw really is only useful to debug NASM crashes, and can hugely slow
down the assembler. Make -L+ simply imply full verbosity; if NASM
crashes use -Lw+ instead.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Set the hash size scaling constant to 1.6, signifying 3.2 times the
hash load. This both reduces the convergence time and makes it less
likely (< 25%) that a non-entry will require a secondary comparison,
and after all, in most of our use cases non-entries are by far the
more common.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The %? and %?? tokens are ambiguous when used inside a multi-line
macro. Add tokens %*? and %*?? that only expand during single-macro
expansion.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Support generating bfloat16 constants. This is a bit awkward, as "DW"
already generates IEEE half precision constants; therefore there is no
longer a single floating-point format for each size. This requires
some replumbing.
Fortunately bfloat16 fits in 64 bits, so support generating them with
a macro that uses __?bfloat16?__() to convert to integers first before
passing them to DW.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
If macros are nolisted, *or* they don't have any filename associated
with them, it is absolutely pointless to try to descend into them for
error messages, so just don't, even if -Lb is provided.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The previous code to fix whitespace around and multiple %+ symbols in
a row (checkin 122c5fb759) had some
seriously broken pointer handling when zapping tokens. This could
cause paste_tokens() to go into an infinite loop because it would
attach %+ to another token and then immediately break them apart
again, over and over.
Reported-by: <alexfru@gmail.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The different token codes between the preprocessor and the assembler
is a completely unnecessary headache. Furthermore, lumping all the
operators under TOK_OTHER in the preprocessor causes a whole bunch of
unnecessary headaches.
In combining them, the only tricky part is that PP_CONCAT_MASK() is no
longer usable, as the range of token codes is too large. Replace with
dedicated category masks.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
To handle escape codes in filename strings after # markers correctly,
we need nasm_unquote() to be aware that it is using C escapes;
otherwise things like "foo`bar" will break.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
In 41e9682efe we've
changed the nasm_quote arguments still not all callers
were converted which could lead to nil dereference.
[hpa: no need to call strlen() for the asm/preproc.c chunk]
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add the first "preprocessor functions". These are simply "magic"
single-line macros with a suitable expansion function. The first
application is functions equal to the %if directives, e.g.
%ifdef blah == %if %isdef(blah) except can be used anywhere (not just
in %if statements like defined() in C.)
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
In 41e9682efe we've
changed the nasm_quote arguments still not all callers
were converted which could lead to nil dereference.
[hpa: no need to call strlen() for the asm/preproc.c chunk]
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
In 41e9682efe we've
changed the nasm_quote arguments still not all callers
were converted which could lead to nil dereference.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Simplify the code generators by merging the two hash constant arrays
into one. The hash is effectively the same, although the order of the
constants differ (possibly in a way which makes the indexing easier.)
The main difference is the amount of code is necessary to generate
each of the output C files.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>