It is somewhat counterintuitive, but the correct flag for the memory
operand is "OSIZE". The "nw" flag takes care of promoting the default
operand size to 64 bits in 64-bit mode.
Fixes: https://github.com/netwide-assembler/nasm/issues/130
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Quick and easy way to run the same test for 16-, 32- and 64-bit output
without mixing them together in one binary output file.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
-- TCMMIMFP16PS, TCMMRLFP16PS instructions
-- AMX.asm fix: Similar to GATHER instructions, 3-operand AMX instructions cannot have the same operand more than once
Checked with XED version: [v2025.06.08]
If a line is suppressed, the %if or %rep condition must never be
evaluated. Test for it, and add the exitrep test to travis.
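The rule above can be sketched in C as a short-circuit on the enclosing
state. This is an illustration only, not the NASM preprocessor code: the
names cond_stack, push_if, pop_if, is_emitting and the counting_*
callbacks are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 16

/* Hypothetical condition callback, standing in for %if evaluation. */
typedef bool (*cond_fn)(void *ctx);

struct cond_stack {
    bool emitting[MAX_DEPTH]; /* emitting[i]: is nesting level i active? */
    size_t depth;
};

/* True when lines at the current nesting level are being emitted. */
static bool is_emitting(const struct cond_stack *cs)
{
    return cs->depth == 0 || cs->emitting[cs->depth - 1];
}

/*
 * Enter an %if-like block. The condition is evaluated only when the
 * enclosing context is emitting; in a suppressed region && short-circuits
 * and cond() is never called. Returns false if the stack is full.
 */
static bool push_if(struct cond_stack *cs, cond_fn cond, void *ctx)
{
    if (cs->depth >= MAX_DEPTH)
        return false;
    bool active = is_emitting(cs) && cond(ctx);
    cs->emitting[cs->depth++] = active;
    return true;
}

/* Leave the innermost block. */
static void pop_if(struct cond_stack *cs)
{
    if (cs->depth > 0)
        cs->depth--;
}

/* Example conditions that count how often they are evaluated. */
static bool counting_true(void *ctx)  { ++*(int *)ctx; return true; }
static bool counting_false(void *ctx) { ++*(int *)ctx; return false; }
```

Nesting counting_true inside a block opened with counting_false leaves
the evaluation counter at 1, which is the property the new test checks
for.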
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Don't redo the whole clone and compile if one wants to re-run the
test. Only rebuild the NASM files.
Minor script cleanups.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The assembler can't know if something is a colonless label or a
misspelled instruction, so print both when complaining about a missing
instruction.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The MMX and early SSE PSHUF* instructions were annotated SM0-1, which
is unnecessary (no ambiguity) but broke the tighter SM matching the
assembler now uses.
This is almost certainly underspecified now, but the MMX and early SSE
instruction patterns need to be tidied up anyway, and this is the
least impactful change that seems to fix the problem.
This unbreaks compiling ffmpeg.
Reported-by: Yongjie Sheng (Intel) <sheng.yongjie@outlook.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Far jmp and call are special in many ways... not the least because of
the old legacy syntax of putting the size on the segment instead of
the offset.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Using "extern" or "required" after the definition should be
interpreted as "global", just as if "extern" or "required" had been
specified before the definition.
Unfortunately the code did not correctly handle the case of upgrading
from LOCAL to GLOBAL via an EXTERN or REQUIRED directive, only from
EXTERN or REQUIRED to GLOBAL via definition or a GLOBAL or COMMON
directive.
Fix.
Reported-by: E. C. Masloch <pushbx@ulukai.org>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
A pattern for XCHG was incompletely macroized. This caused a
fallthrough to the next pattern, reversing the operands, but would
probably have generated incorrect code in at least some cases.
Beef up the xchg test.
Reported-by: E. C. Masloch <pushbx@ulukai.org>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Convenience preprocessor functions that allow for efficient packing
of binary data in source code.
Move some functions that have previously been local but are more
generally useful into more accessible places.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
It is sometimes just too convenient to be able to convert between
strings and bytes at will. At one point I was considering making
something with the full power of the db (et al) directives, but that
is a much bigger change...
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Add the %find() and %findi() functions to look for a string in a
list. This is useful for picking apart the contents of the
__?DEFAULT?__ macro, for example.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Making DEFAULT ABS the default for 64-bit mode was a real
mistake. Issue a warning so we can eventually change it.
Support making FS: and GS: references also be REL by default.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Make it a little easier to run bench tests which include multiple bit
sizes, and add the SRC define to make SRC/BIN tests easier to run on
the bench.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
JMPABS does not need .w1 and in fact is documented to NOT have or
require it.
Add jump-over emulation for the !APX case, similar to the jump-over
for long conditional branches in < 386.
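For reference, the pre-386 jump-over that this is modelled on can be
sketched as below. The sketch covers only the 16-bit conditional branch
case (an inverted short Jcc skipping a near JMP); the byte sequence used
for the JMPABS emulation is not shown here, and the function name and
signature are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Emit "J<inverted cc> $+5 ; JMP NEAR target" for 16-bit code.
 *   cc        4-bit condition code (0x4 = E/Z, 0x5 = NE/NZ, ...)
 *   insn_addr offset of the first emitted byte
 *   target    offset of the real branch target
 * Returns the number of bytes written, which is always 5.
 */
static size_t emit_jcc_jump_over(uint8_t out[5], uint8_t cc,
                                 uint16_t insn_addr, uint16_t target)
{
    uint16_t next = (uint16_t)(insn_addr + 5); /* address after the JMP */
    uint16_t rel  = (uint16_t)(target - next); /* rel16, wraps mod 64K */

    out[0] = (uint8_t)(0x70 | ((cc ^ 1) & 0x0f)); /* short Jcc, inverted */
    out[1] = 0x03;                                /* skip the 3-byte JMP */
    out[2] = 0xE9;                                /* JMP rel16 */
    out[3] = (uint8_t)(rel & 0xff);
    out[4] = (uint8_t)(rel >> 8);
    return 5;
}
```

Flipping the low bit of the condition code inverts the condition, so a
"je" at 0x100 with target 0x300 becomes 75 03 E9 FB 01.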
Move JMP ABS patterns ahead of regular jumps; otherwise JMP ABS syntax
doesn't work.
Prefer JMPABS in the disassembler, since that is the documented form.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
As with C and other languages, the current trend is to be able to
probe for features rather than relying on version numbers. This is
motivated in part by the intent of bumping the major version number to
3.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Check a few more corner cases, including $ and $$, as well as parsing
in the assembler (dd) and the preprocessor (%assign).
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The use of $ prefixes for hexadecimal numbers conflicts with
the use of $ to escape symbols. Add a directive to disable
$ for hexadecimal numbers so that those escapes work OK.
As a result, allow escaped symbols to start with a digit.
Add a warning that this syntax is deprecated.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
With these changes, the disassembler correctly decodes the ccmp.asm
and apx.asm tests.
Fix rebuilding the main tools from test/Makefile.in.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
With ndisasm now built separately, make it easier to explicitly make
nasm and ndisasm from the test directory.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
- Significantly overhauled the disassembler internals to make
better use of the information already in the instruction template
and to reduce the implementation differences with the assembler
- Add APX support to the disassembler
- Fix problem with disassembler truncating addresses of jumps
- Fix generation of invalid EAs in 16-bit mode
- Fix array overrun for types in a few modules
- Fix invalid ND flag on near JMP
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
- rex2.w is used as an opcode extension (JMPABS), not rex2.x1 as an
earlier version of the spec had.
- Segment prefixes used as Jcc hints are valid in 64-bit mode.
- Avoid duplicate warning messages for ignored/invalid prefixes.
* emit_prefixes() is called twice during code generation.
- Add the UDB #UD opcode in 64-bit mode; SALC is 16/32-bit only.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Factor the objects ONLY needed for the disassembler into a
separate library. This allows building the assembler even while
the disassembler is not yet buildable; this makes working on
the disassembler easier.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The late cleanup of macros can cause severe memory hogging with nested
%reps. Instead, implement proper reference counting for mmacros.
Adds some other minor cleanups as well, notably delete_*() are
designed to update or null the pointer that is passed to them.
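A minimal sketch of the two ideas, reference counting plus a release
function that nulls the caller's pointer, is given below. The struct and
function names are hypothetical and do not match the actual preprocessor
code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a multi-line macro object. */
struct mmacro_sketch {
    unsigned int refcount;
    /* macro body, parameters, etc. would live here */
};

/* Allocate a macro with one reference held by the caller. */
static struct mmacro_sketch *mmacro_new(void)
{
    struct mmacro_sketch *m = calloc(1, sizeof *m);
    if (m)
        m->refcount = 1;
    return m;
}

/* Take an additional reference. */
static struct mmacro_sketch *mmacro_ref(struct mmacro_sketch *m)
{
    if (m)
        m->refcount++;
    return m;
}

/*
 * Drop one reference and null the pointer that was passed in, so the
 * caller cannot use it afterwards. The object is freed as soon as the
 * last reference goes away rather than in a late cleanup pass.
 */
static void mmacro_unref(struct mmacro_sketch **mp)
{
    struct mmacro_sketch *m;

    if (!mp)
        return;
    m = *mp;
    *mp = NULL;
    if (m && --m->refcount == 0)
        free(m);
}
```

With nested %reps, each expansion would hold its own reference and
release it when done, so memory is returned as the inner loops finish.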
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
More matching and macrofication work.
Improve some error and warning messages.
Update some travis tests for better messages and added optimizations.
Fix duplicated warning messages for the same out-of-range value
problem.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
This is a WIP checkpoint; not all tests pass yet.
More matching changes, and hopefully something much closer to what
really is desired now. The number of required patterns is now much
smaller.
However, a lot of *changes* are needed to the patterns.
Since some patterns are repeated all over the place, clean up the
x86/addflags.pl script and make it able to generate macro-based
common patterns; first use being the patterns for the "basic 8"
arithmetic patterns.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Work through a number of changes toward making matching a lot saner,
both to reduce the number of patterns to generate for APX and to
make a number of code patterns simpler.
This replaces a fair number of byte codes.
Improve a number of error messages, especially related to overflows.
Move process_insn() from nasm.c to assemble.c, as it really is the
primary entry point to the assembler module.
Reorder some prefixes. In particular, F2/F3 override 66 when used as a
mandatory prefix, so it makes more sense for them to be closer to the
opcode.
Move a lot more information into struct insn. It is better to have it
in one place; memory consumption is not an issue because struct insn
is transient information.
Get rid of "optimization levels" and replace them with a mask of
flags. That was already halfway done; complete the job.
Replace seg:offset in struct out_data with a struct location. It would
be better to extend this to more places, too.
The ARx and SMx flags are now explicit bitmasks, instead of having a
couple of hard-coded ranges.
Add __func__ to assert or panic messages.
Because of prefix and message changes, a number of travis tests had to
be audited and updated.
Fix a number of instruction patterns which had .128 when they ought to
be .lig. This is no longer a minor issue with the disassembler: for
AVX10, the pattern vector length determines how SAE/RC are encoded,
and there is no valid 128-bit encoding. However, with .lig the 512-bit
encoding can be used.
Separate "o64nw" into two pieces: opsize 64 and "nw" = "REX.w not
necessary". The latter can be included in non-64-bit patterns. "o64"
still sets REX.W since that is still the common thing.
New "osz" bytecode: emit an OSP *or* REX.W depending on the current
mode and operand size. Useful for special cases like "nop" where "o64
nop" probably wants to be encoded as "48 90".
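The prefix choice described for "osz" can be sketched as below, assuming
it only has to pick between no prefix, the operand size prefix (66) and
REX.W (48). The function name and return convention are hypothetical;
the real bytecode handler may differ.

```c
/*
 * Pick the prefix byte needed to get operand size `opsize` (16, 32 or
 * 64) in mode `bits` (16, 32 or 64).
 * Returns 0x66 for an operand size prefix, 0x48 for REX.W, 0 if no
 * prefix is needed, or -1 if the combination is not encodable.
 */
static int osz_prefix(int bits, int opsize)
{
    int default_size;

    if (bits != 16 && bits != 32 && bits != 64)
        return -1;

    /* The default operand size is 32 in both 32- and 64-bit mode. */
    default_size = (bits == 16) ? 16 : 32;

    if (opsize == 64)
        return bits == 64 ? 0x48 : -1; /* REX.W exists only in 64-bit mode */
    if (opsize != 16 && opsize != 32)
        return -1;

    return opsize == default_size ? 0 : 0x66;
}
```

For "o64 nop" in 64-bit mode this yields 0x48, giving the "48 90"
encoding mentioned above.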
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add MOVSX[D] -> CBW/CWDE/CDQE optimization patterns when the suitable
form of the AX register is referenced.
Add MOVZX reg64,rm32 pattern which converts to a 32-bit MOV.
Add MOVZXD reg64,rm32 alias pattern for consistency.
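The MOVSX[D] mapping can be sketched as follows, assuming "suitable
form" means that both operands are the accumulator (AL/AX/EAX/RAX) and
the source is the next smaller size. The function is hypothetical and
only names the replacement instruction; it does not reflect how the
instruction patterns are written.

```c
#include <stddef.h>

/*
 * Given the destination and source sizes in bits, with both operands
 * assumed to be forms of the accumulator register, return the shorter
 * equivalent instruction, or NULL if there is none.
 */
static const char *movsx_accumulator_alias(int dst_bits, int src_bits)
{
    if (dst_bits == 16 && src_bits == 8)
        return "cbw";   /* movsx ax, al    sign-extends AL into AX */
    if (dst_bits == 32 && src_bits == 16)
        return "cwde";  /* movsx eax, ax   sign-extends AX into EAX */
    if (dst_bits == 64 && src_bits == 32)
        return "cdqe";  /* movsxd rax, eax sign-extends EAX into RAX */
    return NULL;        /* e.g. movsx eax, al has no one-byte equivalent */
}
```

The MOVZX reg64,rm32 conversion relies on a different property: in
64-bit mode a 32-bit register write already zeroes the upper half of
the destination, so a plain 32-bit MOV does the same job.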
Signed-off-by: H. Peter Anvin <hpa@zytor.com>