This is a WIP checkpoint; not all tests pass yet.
More matching changes, and hopefully something much closer to what
really is desired now. The number of required patterns is now much
smaller.
However, a lot of *changes* are needed to the patterns.
Since some patterns are repeated all over the place, clean up the
x86/addflags.pl script and make it able to generate macro-based
common patterns; first use being the patterns for the "basic 8"
arithmetic patterns.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Work through a number of changes toward making matching a lot saner,
both to reduce the number of patterns to generate for APX but also to
make a number of code patterns simpler.
This replaces a fair number of byte codes.
Improve a number of error messages, especially related to overflows.
Move process_insn() from nasm.c to assemble.c, as it really is the
primary entry point to the assembler module.
Reorder some prefixes. In particular, F2/F3 override 66 when used as a
mandatory prefix, so it makes more sense for them to be closer to the
opcode.
Move a lot more information into struct insn. It is better to have it
in one place; memory consumption is not an issue because struct insn
is transient information.
Get rid of "optimization levels" and replace it with a mask of
flags. That was already halfway done; complete the job.
Replace seg:offset in struct out_data with a struct location. It would
be better to extend this to more places, too.
The ARx and SMx flags are now explicit bitmasks, instead of having a
couple of hard-coded ranges.
Add __func__ to assert or panic messages.
Because of prefix and message changes, a number of travis tests had to
be audited and updated.
Fix a number of instruction patterns which had .128 when they ought to
be .lig. This is no longer a minor issue with the disassembler: for
AVX10, the pattern vector length determines how SAE/RC are encoded,
and there is no valid 128-bit encoding. However, with .lig the 512-bit
encoding can be used.
Separate "o64nw" into two pieces: opsize 64 and "nw" = "REX.w not necessary". The
latter can be included in non-64-bit patterns. "o64" still set REX.W
since that is still the common thing.
New "osz" bytecode: emit an OSP *or* REX.W depending on the current
mode and operand size. Useful for special cases like "nop" where "o64
nop" probably wants to be encoded as "48 90".
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Don't do an out-of-range check for the operands, even
temporarily. Setting the operand pointer to NULL will help catch
errors when accessing non-operands, too.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Implement the JMPABS instruction, which can also be specified as JMP
ABS for consistency. Since ABS is already a keyword, this does not
pollute the namespace.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
a
Support generating code for APX instruction and add support for the
{nf} prefix.
No disassembler support yet, and only a handful instructions encoded.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Change the byte code format and the byte code compiler to be able to
generate various kinds of APX-format instructions.
THE NEW BYTE CODES ARE NOT YET IMPLEMENTED IN THE ASSEMBLER OR
DISASSEMBLER.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
{dfv=} is basically a constant (immediate). Treat it as such during
parsing, except that if "naked" (not in an expression), it has special
matching properties and does not need a terminal comma.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The parser state does not just necessarily include the position of the
buffer, but make it possible to maintain additional state.
Furthermore, add an explicit ability to push back a token.
All of this might make it easier at some point in the future to keep
track of horizontal position, although that will require lots of
changes to the preprocessor.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Clean up the handling of prefixes in general. Allow a set of braced
prefixes to follow the instruction; this is required for things like
{dfv=} but might also be a nicer syntax for things like {rex}.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
NASM would try to "eat the comma token" in db expressions, even for
cases where the token was not a comma. Fix that and error out
properly.
To give better error messages, track where in the input string a token
starts or ends. This information is only valid as long as the input
string is kept, but that is just fine for error messages during
parsing.
Reported-by: Peter Cordes <pcordes@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Instead of handling conditional instructions ad hoc, generate
individual instruction patterns as normal. This simplifies the code
and makes CMPccXADD support simpler (otherwise it would be necessary
to hack in the handling of a condition code in the middle of an
instruction.)
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add support for AVX512-FP16 instructions and the associated
handling. Allow "mapN" syntax as well as "mN" syntax to match the
documentation.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Andrew reported that we may access unitialized memory
> SUMMARY: MemorySanitizer: use-of-uninitialized-value nasm/asm/parser.c:982:41 in parse_line
It turns out that in case of malformed data the expression is terminator
itself so we should not "lookup ahead" for next one. Thus test for first
expression initially and if test passes check for terminator.
Reported-by: Andrew Bao <xiaobaozidi@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Add a {rex} prefix to force REX encoding (typically a redundant 40h
prefix).
For prefix parsing, we can use t_inttwo to encode the prefix slot
number.
Give more verbose error messages for encoding mismatches.
An eop may have a data buffer associated with it as part of the same
memory allocation. Therefore, we need to move "subexpr" up instead of
merging it into "eop".
This *partially* resolves BR 3392707, but that test case still
triggers a violation when using -gcv8.
Reported-by: Suhwan <prada960808@gmail.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Support generating bfloat16 constants. This is a bit awkward, as "DW"
already generates IEEE half precision constants; therefore there is no
longer a single floating-point format for each size. This requires
some replumbing.
Fortunately bfloat16 fits in 64 bits, so support generating them with
a macro that uses __?bfloat16?__() to convert to integers first before
passing them to DW.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The autoconf process automatically generates macros for function
attributes, including empty placeholders. Said empty placeholders also
propagate automatically into config/unconfig.h for the compilers which
don't support autoconf.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
clang, unlike gcc, will warn on inline functions which are
unused. This can happen if a function is either intended to be used in
the future, or it is only used under certain config options. Mark
those functions with the "unused" attribute; not only does it quiet
the warning, but it also documents it for the user.
Shuffle around the warning options in configure and add a few more
that are specific to clang.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Haiku apparently wants to include <float.h> rather than
"float.h". Rename float.[ch] to floats.[ch] to avoid unnecessary
namespace confusion.
Reported-by: <alaviss0+nasm@gmail.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Add support for complex data (Dx) statement expressions involving both
initialized and uninitialized data. In addition, we have support for
overriding the size of each element on an individual item and/or list
basis.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
We need to create a separate paragraph if the help text had used \c
anyway. Putting the enabled/disabled separately for all entries makes
it read a lot cleaner anyway.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
rdsrc.pl requires blank lines around \c paragraph, but warnings.pl
would strip them. Create a *!- prefix to force a blank line.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Very limited MASM emulation.
The parser has been extended to emulate the PTR keyword if the
corresponding macro is enabled, and the syntax displacement[index] for
memory operations is now recognized.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
This allows the K instructions to be specified without a size suffix
as long as the operands are sized; this matches the way most other x86
instructions work. As this is not the syntax specified in the SDM,
don't use it for disassembly.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
"compiler.h" already includes a bunch of common include files. There
is absolutely no reason to duplicate them in individual files, and in
fact it robs us of central control of how these files are used.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
For almost everything we should use "nctype.h". Right now we don't
have a nasm_toupper() to use <ctype.h> for things that need toupper().
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
There is absolutely no reason not to include <string.h> globally, and
with the inline function for mempcpy() we need it there anyway.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
There is space in the token table to explicitly encode the size
corresponding to a size token. We might as well do so...
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
With buffered warnings, most warnings *must* be issued on every pass,
so ERR_PASS1 is simply wrong in most cases.
ERR_PASS1 now means "force this warning to be output even in
pass_first(). This is to be used for the case where the warning is
only executed in pass_first() code; this is highly discouraged as it
means the warnings will not appear in the list file and subsequent
passes may make the warning suddenly vanish.
ERR_PASS2 just as before suppresses an error or warning unless we are
in pass_final().
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The use of pass0, pass1, pass2, and "pass" passed as an argument is
really confusing and already caused a severe bug in the 2.14.01
release cycle. Clean them up and be far more explicit about what
various passes mean.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
We want to strongly encourage writers of warnings to create warning
categories, so remove the flagless nasm_warn() and change nasm_warnf()
to nasm_warn().
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
It is extremely desirable to allow the user fine-grained control of
warnings, but this has been complicated by the fact that a warning
class has had to be defined in no less than three places (error.h,
error.c, nasmdoc.src) before it can be used in source code. Instead,
use a script to define these via magic comments at the point of use.
This hopefully will encourage creating new classes as needed.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The prefix ERR_WARN_ is unnecessarily long and may be a disincentive
to create new warning categories. Change it to WARN_*, it is still
plenty distinctive.
This is equivalent to nasm-2.14.xx checkin 77f53ba6d4cb90e5a7e09b33357ed7c1fe9f6b9d.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
* nasm-2.14.xx: (83 commits)
NASM 2.14rc16
doc: Update changes
preproc: expand_smacro -- Fix nil dereference on error path
eval: Eliminate division by zero
doc: Update changes
opflags: Convert is_class and is_reg_class to helpers
preproc: Fix out of range access in expand mmacro
doc: Update changes
parser: Fix sigsegv on certain equ instruction parsing
labels: Make sure nil label is never passed
labels: Don't nil dereference if no label provided
macho: Add warning message in macho_output()
macho/reloc: Fix addr size sensitive conditions
macho/reloc: Fix macho_output() to get the offset adjustments by add_reloc()
macho/reloc: Fixed offset adjustment in add_reloc()
macho/reloc: Allow absolute relocation when forcing a symbol reference
macho/reloc: Adjust SUB relocation information
macho/reloc: Fixed in handling GOT/GOTLOAD/TLV relocations
macho/reloc: Simplified relocation for REL/BRANCH
macho/sym: Record initial symbol number always
...
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
We should check for bounds when accessing nasm_reg_flags.
Seems this bug was for long time already.
https://bugzilla.nasm.us/show_bug.cgi?id=3392516
Reported-by: Jordan Zebor <j.zebor@f5.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
While configuring optimization in a level is conventional,
a certain optimization tends to conflict with some pragma.
For example, jump match conflicts with Mach-O's
"subsections-via-symbols" macro.
This configurability will workaround such conflicts.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Support the +n syntax for multiple contiguous registers, and emit it
in the output from ndisasm as well.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
We can be in absolute space and still end up with segment-relative
references. This is in fact the meaning of absolute.segment. Make
sure we define the labels appropriately.
Reported-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Nearly all instances of nasm_fatal() and nasm_panic() take a flags
argument of zero. Simplify the code by making nasm_fatal and
nasm_panic default to no flags, and add an alternate version if flags
really are desired. This also means that every call site doesn't have
to initialize a zero argument.
Furthermore, ERR_NOFILE is now often not necessary, as the error code
will no longer cause a null reference if there is no current
file. Therefore, we can remove many instances of ERR_NOFILE which only
deprives the user of information.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>