The shift and rotate patterns are "interesting" in the following way:
1. Even though only 4/5/6 bits of the input are ever used, for the
regular instructions the input is specified as the CL register, but
for the -X instructions as a size-matching register. This makes the
optimization patterns "interesting."
2. The sequencing of the legacy, VEX -X, APX EVEX, and APX -X versions.
For #1, allow any size register to contain the shift count.
For #2, split up the macro generation of the patterns, and add a new
"$xmacro" macro to deal with the combinatorics of generating all the
-X patterns. Written directly in Perl since it seemed easier than
trying to make anything more general for what is very much a special
case...
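
As a rough illustration of point 1 (this is not NASM code): if only the
low 4, 5 or 6 bits of the count are significant, the width of the
register holding the count cannot change the result. The helper names
below are hypothetical, and the mask widths simply follow the "4/5/6
bits" wording above rather than the exact architectural rule for each
instruction.

```python
# Illustrative model only; all names are hypothetical, not part of NASM.
SIGNIFICANT_BITS = {16: 4, 32: 5, 64: 6}  # operand size -> count bits used


def effective_count(count_reg_value: int, opsize: int) -> int:
    """Keep only the low bits of the count that matter for this operand size."""
    return count_reg_value & ((1 << SIGNIFICANT_BITS[opsize]) - 1)


def low_part(value: int, width: int) -> int:
    """Value seen when a register is accessed at a narrower width."""
    return value & ((1 << width) - 1)


rcx = 0x1234_5678_9ABC_DE25
# Reading the count as CL, CX, ECX or RCX gives the same effective count.
counts = {effective_count(low_part(rcx, w), 64) for w in (8, 16, 32, 64)}
```

Under this model `counts` holds a single value, which is why accepting
any register size for the count is harmless.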
Reported-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Most instructions support contracted forms, but those had been overlooked.
Signed-off-by: Henrik Gramner <henrik@gramner.com>
[ hpa: manual merge ]
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The rounding specifier should be applied to src2, not src1.
Furthermore, VCVTSI2SD with a 32-bit source operand does not
support specifying a rounding mode (as no rounding can occur).
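
The "no rounding can occur" argument can be checked with a small
sketch: a double has a 53-bit significand, so every 32-bit integer
converts exactly, while a 64-bit integer may not. This uses Python
floats (IEEE doubles on common platforms) as a stand-in for the
hardware conversion; the helper name is hypothetical.

```python
def converts_exactly(n: int) -> bool:
    """True if n survives a round trip through a double unchanged."""
    return int(float(n)) == n


# Spot checks at the edges of the signed 32-bit range.
int32_samples = [-(2**31), -1, 0, 0x7FFF_FFFE, 2**31 - 1]
all_exact_32 = all(converts_exactly(n) for n in int32_samples)

# A 64-bit source can need more than 53 bits, so rounding is required.
needs_rounding_64 = not converts_exactly(2**53 + 1)
```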
Signed-off-by: Henrik Gramner <henrik@gramner.com>
[ hpa: manual merge ]
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
According to the June edition of the SDM, the EVEX form of the
VCVTPS2PH instruction only exists with mmmmm equal to 0f38, and map5
only exists for the VCVTPS2PHX instruction.
It is somewhat counterintuitive, but the correct flag for the memory
operand is "OSIZE". The "nw" flag takes care of promoting the default
operand size to 64 bits in 64-bit mode.
Fixes: https://github.com/netwide-assembler/nasm/issues/130
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Officially the syntax for TEST is "rm,reg"; however TEST is
commutative in every aspect, and as such "reg,mem" is an equivalent
form that NASM has also supported in the past.
Reinstate it properly.
Fixes: https://bugzilla.nasm.us/show_bug.cgi?id=3392962
Reported-by: E. C. Masloch <pushbx@ulukai.org>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
-- TCMMIMFP16PS, TCMMRLFP16PS instructions
-- AMX.asm fix: Similar to GATHER instructions, 3-operand AMX instructions cannot have the same operand more than once
Checked with XED version: [v2025.06.08]
see Intel® Architecture Instruction Set Extensions and Future Features Programming Reference, March 2025 319433-057
Otherwise (without this correction) it conflicts with VPERMI2PS.
SPDX is an international standard for documenting software license
requirements. Remove the existing headers and replace with a brief
SPDX preamble.
See: https://spdx.dev/use/specifications/
The script used to convert the files is added to "tools", and the
file header templates in headers/ are updated.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Error out if an encoding position is invalid, such as when an "r"
operand matches an "xmmrm" operand.
Document the instruction encoding symbols; there are too many of them
by now.
Add symbols 'n' and 'w' meaning immediates that are supposed to be
encoded as if they were 'm' memory addresses and 'v' register numbers,
respectively; this is necessary to indicate a validation exception.
Remove broken ARPL "memory-like" encoding. It probably never worked
anyway.
This verification caught two bugs already:
- VPMASKMOV[DQ] cannot omit the second operand.
- Incorrect operand encoding order for VREDUCESH.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
For the disassembler to work correctly, NOPs should be at the very end
of the database file.
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
The MMX and early SSE PSHUF* instructions were annotated SM0-1, which
is unnecessary (no ambiguity) but broke the tighter SM matching the
assembler now uses.
This is almost certainly underspecified now, but the MMX and early SSE
instruction patterns need to be tidied up anyway, and this is the
least impactful change that seems to fix the problem.
This unbreaks compiling ffmpeg.
Reported-by: Yongjie Sheng (Intel) <sheng.yongjie@outlook.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
PF and CF are always set to the same value; allow the programmer to
specify either or both.
Allow EQU to take a {dfv} expression without needing parens.
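
One way to picture the PF/CF aliasing is a lookup table in which both
names map to the same bit. The bit values and the function name below
are assumptions made for illustration; they are not taken from NASM or
from the instruction encoding.

```python
# Hypothetical bit assignment, for illustration only.
DFV_BITS = {"of": 8, "sf": 4, "zf": 2, "cf": 1}
DFV_BITS["pf"] = DFV_BITS["cf"]  # PF shares CF's bit


def dfv_value(*flags: str) -> int:
    """OR together the bits for the named flags (case-insensitive)."""
    value = 0
    for flag in flags:
        value |= DFV_BITS[flag.lower()]
    return value
```

With this table, `dfv_value("pf")`, `dfv_value("cf")` and
`dfv_value("pf", "cf")` all produce the same value.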
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Add all the missing instructions / instruction variants that are
specified in the 2025 June Intel ISE.
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
The memory operand size for all VSM4KEY4 versions is specified as 128
bits, while the ymm register version should use a 256-bit size.
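
The intended rule can be sketched as the memory operand matching the
width of the vector register form. The table and helper below are
hypothetical and only model that rule; they do not reflect how the
instruction database expresses it.

```python
# Vector register class -> register width in bits.
VECTOR_BITS = {"xmm": 128, "ymm": 256, "zmm": 512}


def mem_operand_bits(dest_reg: str) -> int:
    """Memory operand size implied by the destination register name."""
    reg_class = dest_reg.lower().rstrip("0123456789")
    return VECTOR_BITS[reg_class]
```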
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
The control and debug registers always use the default operand
size. It is probably easiest to just encode it explicitly for now.
Control registers are particularly weird because of the AMD "lock as
REX.R" hack...
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Far jmp and call are special in many ways... not least because of
the old legacy syntax of putting the size on the segment instead of
the offset.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Do not force the operand size for K registers and "ko#"
encodings. This resolves BR 3392957.
Reported-by: ig <glucksmann@avast.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
A pattern for XCHG was incompletely macroized. This caused a
fallthrough to the next pattern, reversing the operands, but would
probably have generated incorrect code in at least some cases.
Beef up the xchg test.
Reported-by: E. C. Masloch <pushbx@ulukai.org>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Instead of doing more ad hoc hacks in preinsns.pl, add explicit macro
flags for the arithmetic instructions. This also allows folding CMP
back into the standard arithmetic instructions.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
ADC and SBB don't support the {nf} prefix. They are the only ones in
the arithmetic instruction group that behave this way.
Add a flag that will warn when an instruction wants to use {nf} but
doesn't support it.
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Add the database entry for CTESTcc and the relevant test cases. The
syntax is basically CCMPSCC without two syntax variants.
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
The encodings of 'true' and 'false' variants of the CCMPSCC instructions
were swapped. Correct that in the preprocessor script.
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Add database entries and test cases for WRSSD, WRSSQ, WRUSSD and WRUSSQ
instructions.
Fix whitespace in legacy database entries.
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>