Along with C and other languages, the current trend is to be able to
probe for features rather than relying on version numbers. This is
motivated in part by the intent of bumping the major version number to
3.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Add the %null(), %note(), %warning(), %error(), and %fatal()
functions. They behave like the corresponding directives, then expands
to nothing.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
$$ is TOKEN_BASE, not a symbol. If this is done incorrectly, ppscan
chokes on $$ as it ends up being an invalid symbol.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Segment references can come from two places: either an explicit SEG
operator or from a far expression. In the former case, the segment can
have a programmer-provided offset, but in the latter case, it
cannot.
The fix for bug 3392949 fixed the latter case, but broke the
former. This patch hopefully makes both work.
Rename out_segment() to out_farseg() and add a comment to explain the
logic behind the difference.
Reported-by: E. C. Masloch <pushbx@ulukai.org>
Fixes: https://bugzilla.nasm.us/show_bug.cgi?id=3392950
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
It is possible for m->mstk.mmac to point back to itself. In that case,
don't terminate the macro debug invocation just yet.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Micro-optimize the set_bit() function. Using size_t (in particular,
using an unsigned type) produces better code.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The documentation actually states that $ as a hex prefix is only valid
when followed by a digit, which at least makes the syntax conflict
less complicated. Actually match the documentation.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
To avoid repeated warnings, there is no warning issued for when
tokenizing a number starting with $ in the preprocessor, but issue a
warning if such a number is *consumed* (used in arithmetic) in the
preprocessor.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The use of $ prefixes for hexadecimal numbers conflicts with
the use of $ to escape symbols. Add a directive to disable
$ for hexadecimal numbers so that those escapes work OK.
As a result, allow escaped symbols to start with a digit.
Add a warning that this syntax is deprecated.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
For a symbol to start with $, it needs to be escaped with a second
dollar sign: $$. This was not handled correctly, instead $$ was seen
as TOKEN_BASE.
Fix this.
Reported-by: E. C. Masloch <pushbx@ulukai.org>
Fixes: https://bugzilla.nasm.us/show_bug.cgi?id=3392922
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
If VEX.V is an immediate, it should not be subject to register range
checks.
If the WW flag is set, REX_W needs to be OR'd in, not XOR'd, because
the map might have the W bit set for matching purposes.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
- Significantly overhauled the disassembler internals to make
better use of the information already in the instruction template
and to reduce the implementation differences with the assembler
- Add APX support to the disassembler
- Fix problem with disassembler truncating addresses of jumps
- Fix generation of invalid EAs in 16-bit mode
- Fix array overrun for types in a few modules
- Fix invalid ND flag on near JMP
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
- rex2.w is used as a opcode extension (JMPABS), not rex2.x1 as an
earlier version of the spec had.
- Segment prefixes used as Jcc hints are valid in 64-bit mode.
- Avoid duplicate warning messages for ignored/invalid prefixes.
* emit_prefixes() is called twice during code generation.
- Add the UDB #UD opcode in 64-bit mode; SALC is 16/32-bit only.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The late cleanup of macros can cause severe memory hogging with nested
%reps. Instead, implement proper reference counting for mmacros.
Adds some other minor cleanups as well, notably delete_*() are
designed to update or null the pointer that is passed to it.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
It is good to have a way to test for the existence of macro functions,
and since they are really just a special case of single-line macros,
allow %ifdef to test for them instead of coming up with something
entirely new.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The format wasn't actually uleb128 because it was accidentally
bigendian (like UTF-8). That is just begging for confusion in the
future, if and when the uleb128 code gets librarized.
Fix it now.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The very simple compression scheme used for the builtin macro sets no
longer works adequately, and in fact it generates incorrect output
now.
Drop the whole idea of an ad hoc compression scheme and just use
zlib. For the case where there is no system zlib available, include a
(subset of) the zlib distribution. The configure script can be set to
force this included zlib if desired (e.g. for testing.)
Unfortunately this turned out to be a pretty painful can of worms in
terms of complexity. On the other hand having zlib available might be
useful at some point in the future.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Add a function to test for the existence of a file, and a function
query the real operating system path, if available.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Implement preprocessor function equivalents of the %pathsearch and
%depend directives.
Simplify the incbin standard macro by using these functions.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
When throwing one of the "instruction expected" error messages, print
what was encountered instead.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Remove the legacy output entry point. It has proven impossible to find
the time to completely port the backends all at once.
Instead, always generate the legacy output data, but put them into the
out_data structure. Then add a macro to explode these arguments into
separate variables, equivalent to the old function arguments. This
also centralizes the type definitions for these variables.
Most importantly, it means that the entire struct out_data is now
always available, which means that backends that need the additional
information available in that structure, such as the specific
instruction template, can access that information without needing to
revamp the entire backend code all at once.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
This code incorrectly would try to use "path" as the hash key instead
of full->path, causing the key in struct hash_insert to diverge from
the one used in hash_add(). Fix that.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Under some circumstances, such as:
- Certain uses of %exitrep in syntactically invalid code;
- %unmacro of a *alias* to a macro currently being expanded;
... it is possible for an mmacro to get freed while it is still in
use. Although inefficient, the easiest way to avoid this is to not
free mmacros until the end of pass cleanup, when named mmacros are
also freed.
To support this, use the existing ->next field in the MMacro structure
to keep a list of anonymous or removed MMacros. Don't free ->name at
this point, though, since that is currently used to distinguish
between %rep's and %macro's. (This needs to be cleaned up to support
constructs such as %while or %for, but that is for later.)
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Add a optimization frameword for operand narrowing (where the operand
size doesn't matter beyond a certain range because only certain bits
are referenced.)
Add a macro *and* matching facility for dealing with segment selectors, which are
typically rm16/r32/r64, but exactly how that is applied varies
depending on if a datum is read or written.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
More work on cleaning up instruction patterns, fixing matchig corner
cases, and tidying up the organization of insns.dat.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
More matching and macrofication work.
Improve some error and warning messages.
Update some travis tests for better messages and added optimizations.
Fix duplicated warning messages for the same out-of-range value
problem.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Tag pseudo-instructions explicitly and don't set any CPU level flag
for those.
Change insnsa.c to have (length, pointer) rather than using an ever
increasing in size sentinel at the end of each table. This also means
that empty tables (Dx, INCBIN) can be omitted entirely.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add more instruction macros and fix problems. Adjust some matching
problems.
Remove all FUTURE tags from the instruction list, and add a bunch of
new CPUID tags. Hopefully a small step toward actually getting CPU
feature selection working properly in the future.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
"nw" now means: 64-bit operand size is the default, o32 is not
permitted in 64-bit mode.
"osz" means: instruction size determined by prefixes, otherwise the
mode default.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
This is a WIP checkpoint; not all tests pass yet.
More matching changes, and hopefully something much closer to what
really is desired now. The number of required patterns is now much
smaller.
However, a lot of *changes* are needed to the patterns.
Since some patterns are repeated all over the place, clean up the
x86/addflags.pl script and make it able to generate macro-based
common patterns; first use being the patterns for the "basic 8"
arithmetic patterns.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Work through a number of changes toward making matching a lot saner,
both to reduce the number of patterns to generate for APX but also to
make a number of code patterns simpler.
This replaces a fair number of byte codes.
Improve a number of error messages, especially related to overflows.
Move process_insn() from nasm.c to assemble.c, as it really is the
primary entry point to the assembler module.
Reorder some prefixes. In particular, F2/F3 override 66 when used as a
mandatory prefix, so it makes more sense for them to be closer to the
opcode.
Move a lot more information into struct insn. It is better to have it
in one place; memory consumption is not an issue because struct insn
is transient information.
Get rid of "optimization levels" and replace it with a mask of
flags. That was already halfway done; complete the job.
Replace seg:offset in struct out_data with a struct location. It would
be better to extend this to more places, too.
The ARx and SMx flags are now explicit bitmasks, instead of having a
couple of hard-coded ranges.
Add __func__ to assert or panic messages.
Because of prefix and message changes, a number of travis tests had to
be audited and updated.
Fix a number of instruction patterns which had .128 when they ought to
be .lig. This is no longer a minor issue with the disassembler: for
AVX10, the pattern vector length determines how SAE/RC are encoded,
and there is no valid 128-bit encoding. However, with .lig the 512-bit
encoding can be used.
Separate "o64nw" into two pieces: opsize 64 and "nw" = "REX.w not necessary". The
latter can be included in non-64-bit patterns. "o64" still set REX.W
since that is still the common thing.
New "osz" bytecode: emit an OSP *or* REX.W depending on the current
mode and operand size. Useful for special cases like "nop" where "o64
nop" probably wants to be encoded as "48 90".
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Don't do an out-of-range check for the operands, even
temporarily. Setting the operand pointer to NULL will help catch
errors when accessing non-operands, too.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Merge a bunch of common code in the handling of modr/m
generation. Make the handing of compressed disp8 simpler and more
transparent by exporting a the shift factor for the compressed
immediate in ea_data. For the case of no compression, the shift factor
is simply 0; there is no need to distinguish "compressed" from
"uncompressed".
The tidied up version of the disp8 code is simple enough that it makes
more sense to inline it.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The {nf} and {zu} prefixes (or suffixes) can be used on a number of
instructions without actually change the encodings (either they don't
touch the flags at all, or they write a 32- or 64-bit register
already.)
Make this a bit more flexible, by adding an FL instruction flag for
the instructions which actually touch the flags, and a ZU instruction
flag for the instructions which zero the upper half.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Implement the JMPABS instruction, which can also be specified as JMP
ABS for consistency. Since ABS is already a keyword, this does not
pollute the namespace.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>