mirror of https://github.com/netwide-assembler/nasm.git synced 2025-10-10 00:25:06 -04:00

doc: corrections and improvements (no changes to intended meaning)

Joshua Perrett
2025-09-09 17:39:35 +01:00
parent 746fe8384d
commit 2d9b58f293
7 changed files with 85 additions and 85 deletions


@@ -51,7 +51,7 @@ mode, for 8-, 16-, 32- and 64-bit references, respectively:
This is consistent with the AMD documentation and most other
assemblers. The Intel documentation, however, uses the names
\c{R8L-R15L} for 8-bit references to the higher registers. It is
possible to use those names by definiting them as macros; similarly,
possible to use those names by defining them as macros; similarly,
if one wants to use numeric names for the low 8 registers, define them
as macros. The standard macro package \c{altreg} (see \k{pkg_altreg})
can be used for this purpose.
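
As a minimal sketch of the macro approach described here (the choice of
aliases is illustrative only, not taken from the manual), the Intel-style
byte names could be mapped onto the NASM/AMD names with single-line macros:

\c         bits 64
\c %define r8l r8b              ; Intel-style alias for the low byte of R8
\c %define r9l r9b              ; likewise for R9
\c         mov r8l,1            ; assembles as mov r8b,1
\c         mov r9l,r8l          ; assembles as mov r9b,r8b

The \c{altreg} package mentioned above defines the full set, so hand-written
aliases like these are only worthwhile when a handful of names is needed.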
@@ -95,7 +95,7 @@ Note that \c{lea rax,[rel symbol]} is position-independent, whereas
position-independent code in 64-bit mode. However, the \c{MOV}
instruction is able to reference a symbol anywhere in the 64-bit
address space, whereas \c{LEA} is only able to access a symbol within
within 2 GB of the instruction itself (see below.)
within 2 GB of the instruction itself (see below).
The only instructions which take a full \I{64-bit displacement}64-bit
\e{displacement} is loading or storing, using \c{MOV}, \c{AL}, \c{AX},


@@ -71,7 +71,7 @@ necessary.
When the \c{REX} prefix is used, the processor does not know how to
address the AH, BH, CH or DH (high 8-bit legacy) registers. Instead,
it is possible to access the the low 8-bits of the SP, BP SI and DI
it is possible to access the the low 8-bits of the SP, BP, SI and DI
registers as SPL, BPL, SIL and DIL, respectively; but only when the
REX prefix is used.
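
A short sketch of the consequence, assuming 64-bit mode: the high-byte
legacy registers and the REX-only byte registers cannot be combined in a
single instruction.

\c         bits 64
\c         mov sil,al           ; needs a REX prefix: low byte of RSI
\c         mov ah,al            ; no REX prefix: legacy high-byte register
\c ;       mov ah,sil           ; rejected: AH requires no REX, SIL requires REX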
@@ -560,7 +560,7 @@ this behaviour:
The standard macros \i\c{__?FLOAT_DAZ?__}, \i\c{__?FLOAT_ROUND?__}, and
\i\c{__?FLOAT?__} contain the current state, as long as the programmer
has avoided the use of the brackeded primitive form, (\c{[FLOAT]}).
has avoided the use of the bracketed primitive form, (\c{[FLOAT]}).
\c{__?FLOAT?__} contains the full set of floating-point settings; this
value can be saved away and invoked later to restore the setting.


@@ -452,7 +452,7 @@ Some examples (all producing exactly the same code):
A character string consists of up to eight characters enclosed in
either single quotes (\c{'...'}), double quotes (\c{"..."}) or
backquotes (\c{`...`}). Single or double quotes are equivalent to
backquotes (\c{`...`}). Single or double quotes are equivalent in
NASM (except of course that surrounding the constant with single
quotes allows double quotes to appear within it and vice versa); the
contents of those are represented verbatim. Strings enclosed in
@@ -513,7 +513,7 @@ the sense of character constants understood by the Pentium's
String constants are character strings used in the context of some
pseudo-instructions, namely the
\I\c{DW}\I\c{DD}\I\c{DQ}\I\c{DT}\I\c{DO}\I\c{DY}\i\c{DB} family and
\i\c{INCBIN} (where it represents a filename.) They are also used in
\i\c{INCBIN} (where it represents a filename). They are also used in
certain preprocessor directives.
A string constant looks like a character constant, only longer. It
@@ -543,7 +543,7 @@ The special operators \i\c{__?utf16?__}, \i\c{__?utf16le?__},
\i\c{__?utf32be?__} allows definition of Unicode strings. They take a
string in UTF-8 format and converts it to UTF-16 or UTF-32,
respectively. Unless the \c{be} forms are specified, the output is
littleendian.
little endian.
For example:
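
A minimal sketch of such a use (the macro names \c{u} and \c{w} are
arbitrary shorthands, chosen here only to keep the data lines short):

\c %define u(x) __?utf16?__(x)
\c %define w(x) __?utf32?__(x)
\c         dw u('C:\WINDOWS'), 0       ; pathname in UTF-16, little-endian
\c         dd w(`A + B = \u206a`), 0   ; string in UTF-32, little-endian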


@@ -18,7 +18,7 @@ NASM (\c{%ifusable}) or a particular package already loaded (\c{%ifusing}).
The \c{altreg} standard macro package provides alternate register
names. It provides numeric register names for all registers (not just
\c{R8}-\c{R15}), the Intel-defined aliases \c{R8L}-\c{R15L} for the
low bytes of register (as opposed to the NASM/AMD standard names
low bytes of registers (as opposed to the NASM/AMD standard names
\c{R8B}-\c{R15B}), and the names \c{R0H}-\c{R3H} (by analogy with
\c{R0L}-\c{R3L}) for \c{AH}, \c{CH}, \c{DH}, and \c{BH}.
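
As an illustration of the naming scheme (a sketch, assuming 64-bit mode
for the second instruction):

\c %use altreg
\c         bits 64
\c         mov r0l,r3h          ; same as mov al,bh
\c         mov r8l,r0l          ; same as mov r8b,al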
@@ -35,7 +35,7 @@ See also \k{reg64}.
\H{pkg_smartalign} \i\c{smartalign}\I{align, smart}: Smart \c{ALIGN} Macro
The \c{smartalign} standard macro package provides for an \i\c{ALIGN}
The \c{smartalign} standard macro package provides an \i\c{ALIGN}
macro which is more powerful than the default (and
backwards-compatible) one (see \k{align}). When the \c{smartalign}
package is enabled, when \c{ALIGN} is used without a second argument,
@@ -151,7 +151,7 @@ argument had been put in square brackets:
\c mov eax,[foo] ; memory reference
\c mov eax,dword ptr foo ; memory reference
\c mov eax,dowrd ptr flat:foo ; memory reference
\c mov eax,dword ptr flat:foo ; memory reference
\c mov eax,offset foo ; address
\c mov eax,foo ; address (ambiguous syntax in MASM)


@@ -109,8 +109,9 @@ with \i\c{vstart=}.
start address.
\b Arguments to \c{org}, \c{start}, \c{vstart}, and \c{align=} are
critical expressions. See \k{crit}. E.g. \c{align=(1 << ALIGN_SHIFT)}
- \c{ALIGN_SHIFT} must be defined before it is used here.
critical expressions. See \k{crit}. For example, in the case of
\c{align=(1 << ALIGN_SHIFT)}, \c{ALIGN_SHIFT} must be defined before
it is used here.
\b Any code which comes before an explicit \c{SECTION} directive
is directed by default into the \c{.text} section.
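
The critical-expression requirement on \c{align=} noted above can be
sketched as follows, assuming the \c{bin} output format; the value 4 is
an arbitrary choice for illustration:

\c ALIGN_SHIFT     equ 4                    ; defined before its first use
\c         section .data align=(1 << ALIGN_SHIFT)   ; known on the first pass: 16

Moving the \c{equ} line below the \c{section} line would leave the
expression unresolved at the point where its value is needed.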
@@ -179,7 +180,7 @@ for historical reasons) is the one produced by \i{MASM} and
\c{obj} provides a default output file-name extension of \c{.obj}.
\c{obj} is not exclusively a 16-bit format, though: NASM has full
\c{obj} is not exclusively a 16-bit format, though; NASM has full
support for the 32-bit extensions to the format. In particular,
32-bit \c{obj} format files are used by \i{Borland's Win32
compilers}, instead of using Microsoft's newer \i\c{win32} object
@@ -631,42 +632,42 @@ Any other section name is treated by default like \c{.text}.
\S{win32safeseh} \c{win32}: Safe Structured Exception Handling
Among other improvements in Windows XP SP2 and Windows Server 2003
Microsoft has introduced concept of "safe structured exception
handling." General idea is to collect handlers' entry points in
designated read-only table and have alleged entry point verified
against this table prior exception control is passed to the handler. In
order for an executable module to be equipped with such "safe exception
handler table," all object modules on linker command line has to comply
with certain criteria. If one single module among them does not, then
the table in question is omitted and above mentioned run-time checks
will not be performed for application in question. Table omission is by
default silent and therefore can be easily overlooked. One can instruct
linker to refuse to produce binary without such table by passing
Among other improvements in Windows XP SP2 and Windows Server 2003,
Microsoft has introduced the concept of "safe structured exception
handling." The general idea is to collect handlers' entry points
in a designated read-only table and have SEH entry points verified
against this table before exception control is passed to the
corresponding handler. In order for an executable module to be
equipped with this read-only table, all object modules on linker
command line have to comply with certain criteria. If even a single
module among them does not, then the table in question is omitted
and above mentioned run-time checks will not be performed for the
application in question. Table omission is silent by default and
therefore can be easily missed. One can instruct the linker to
refuse to produce binary without such table by passing the
\c{/safeseh} command line option.
Without regard to this run-time check merits it's natural to expect
Without regard to this run-time check, it's natural to expect
NASM to be capable of generating modules suitable for \c{/safeseh}
linking. From developer's viewpoint the problem is two-fold:
linking. From the developer's viewpoint the problem is two-fold:
\b how to adapt modules not deploying exception handlers of their own;
\b how to adapt/develop modules utilizing custom exception handling;
Former can be easily achieved with any NASM version by adding following
line to source code:
The former can be easily achieved with any NASM version by adding the
following line to the source code:
\c $@feat.00 equ 1
As of version 2.03 NASM adds this absolute symbol automatically. If
it's not already present to be precise. I.e. if for whatever reason
developer would choose to assign another value in source file, it would
still be perfectly possible.
As of version 2.03 NASM adds this absolute symbol automatically, if
it is not already present (in which case the developer can choose to
assign another value, if desired, for whatever reason).
Registering custom exception handler on the other hand requires certain
"magic." As of version 2.03 additional directive is implemented,
\c{safeseh}, which instructs the assembler to produce appropriately
formatted input data for above mentioned "safe exception handler
Registering a custom exception handler on the other hand requires
certain "magic." As of version 2.03, an additional \c{safeseh} directive
is implemented, which instructs the assembler to produce appropriately
formatted input data for the above-mentioned "safe exception handler
table." Its typical use would be:
\c section .text
@@ -699,19 +700,18 @@ table." Its typical use would be:
\c section .drectve info
\c db '/defaultlib:user32.lib /defaultlib:msvcrt.lib '
As you might imagine, it's perfectly possible to produce .exe binary
with "safe exception handler table" and yet engage unregistered
exception handler. Indeed, handler is engaged by simply manipulating
\c{[fs:0]} location at run-time, something linker has no power over,
run-time that is. It should be explicitly mentioned that such failure
to register handler's entry point with \c{safeseh} directive has
undesired side effect at run-time. If exception is raised and
unregistered handler is to be executed, the application is abruptly
terminated without any notification whatsoever. One can argue that
system could at least have logged some kind "non-safe exception
handler in x.exe at address n" message in event log, but no, literally
no notification is provided and user is left with no clue on what
caused application failure.
As you might imagine, it's perfectly possible to produce an .exe binary
with the "safe exception handler table" and yet invoke an unregistered
exception handler. A handler is invoked by manipulating \c{[fs:0]}
at run-time, something the linker has no power over. It is therefore
important to note that such failure to register a handler's entry point
with the \c{safeseh} directive will have undesired side effects at
run-time. If an exception is raised and an unregistered handler is to be
executed, the application is abruptly terminated without any notification
whatsoever. One can argue that the system should at least log some kind
of "non-safe exception handler in x.exe at address n" message in the
event log, but unfortunately the user is left without any clue as to
what might have caused the crash.
Finally, all mentions of linker in this paragraph refer to Microsoft
linker version 7.x and later. Presence of \c{@feat.00} symbol and input
@@ -749,7 +749,7 @@ references. Consider a switch dispatch table:
\c ...
Even a novice Win64 assembler programmer will soon realize that the code
is not 64-bit savvy. Most notably linker will refuse to link it with
is not 64-bit savvy. Most notably the linker will refuse to link it, showing:
\c 'ADDR32' relocation to '.text' invalid without /LARGEADDRESSAWARE:NO
@@ -758,15 +758,15 @@ So [s]he will have to split jmp instruction as following:
\c lea rbx,[rel dsptch]
\c jmp qword [rbx+rax*8]
What happens behind the scene is that effective address in \c{lea} is
encoded relative to instruction pointer, or in perfectly
What happens behind the scenes is that the effective address in \c{lea}
is encoded relative to instruction pointer, in a perfectly
position-independent manner. But this is only part of the problem!
Trouble is that in .dll context \c{caseN} relocations will make their
way to the final module and might have to be adjusted at .dll load
time. To be specific when it can't be loaded at preferred address. And
when this occurs, pages with such relocations will be rendered private
to current process, which kind of undermines the idea of sharing .dll.
But no worry, it's trivial to fix:
The issue is that in a .dll context, the \c{caseN} relocations will make
their way to the final module and might have to be adjusted at .dll load
time (specifically, when it can't be loaded at the preferred address).
When this occurs, pages with such relocations will be rendered private
to current process, which kind of undermines the idea of a shared .dll.
But not to worry, it's trivial to fix:
\c lea rbx,[rel dsptch]
\c add rbx,[rbx+rax*8]
@@ -777,11 +777,11 @@ But no worry, it's trivial to fix:
\c ...
NASM version 2.03 and later provides another alternative, \c{wrt
..imagebase} operator, which returns offset from base address of the
current image, be it .exe or .dll module, therefore the name. For those
acquainted with PE-COFF format base address denotes start of
\c{IMAGE_DOS_HEADER} structure. Here is how to implement switch with
these image-relative references:
..imagebase} operator, which returns an offset from base address of the
current image, be it .exe or .dll module, hence the name. For those
acquainted with PE-COFF format, this base address denotes the start of
the \c{IMAGE_DOS_HEADER} structure. Here is how to implement a switch
statement with these image-relative references:
\c lea rbx,[rel dsptch]
\c mov eax,[rbx+rax*4]
@@ -792,10 +792,10 @@ these image-relative references:
\c dsptch: dd case0 wrt ..imagebase
\c dd case1 wrt ..imagebase
One can argue that the operator is redundant. Indeed, snippet before
last works just fine with any NASM version and is not even Windows
specific... The real reason for implementing \c{wrt ..imagebase} will
become apparent in next paragraph.
That said, the snippet before last works just fine with any NASM version
and is not even Windows specific, which makes this operator unnecessary
in this case. The real reason for the \c{wrt ..imagebase} operator will
become apparent in the next section.
It should be noted that \c{wrt ..imagebase} is defined as 32-bit
operand only:
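
A sketch of what this restriction means in practice (\c{label} is a
hypothetical symbol used only for illustration):

\c         dd      label wrt ..imagebase           ; ok
\c         dq      label wrt ..imagebase           ; bad
\c         mov     eax,label wrt ..imagebase       ; ok
\c         mov     rax,label wrt ..imagebase       ; bad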
@@ -814,10 +814,10 @@ functions [in given executable module] is traversed and compared to the
saved program counter. Thus so called \c{UNWIND_INFO} structure is
identified. If it's not found, then offending subroutine is assumed to
be "leaf" and just mentioned lookup procedure is attempted for its
caller. In Win64 leaf function is such function that does not call any
other function \e{nor} modifies any Win64 non-volatile registers,
including stack pointer. The latter ensures that it's possible to
identify leaf function's caller by simply pulling the value from the
caller. In Win64, a leaf function is a function that does not call any
other functions \e{nor} modifies any Win64 non-volatile registers,
including the stack pointer. The latter ensures that it's possible to
identify a leaf function's caller by simply pulling the value from the
top of the stack.
While majority of subroutines written in assembler are not calling any
@@ -1187,8 +1187,8 @@ override that.
\b \i\c{merge} indicates that duplicate data elements in this section
should be merged with data elements from other object files. Data
elements can be either fixed-sized objects or null-terminatedstrings
(with the \c{strings} attribute.) A size specifier is required unless
elements can be either fixed-sized objects or null-terminated strings
(with the \c{strings} attribute). A size specifier is required unless
\c{strings} is specified, in which case the size defaults to \c{byte}.
\b \i\c{tls} defines the section to be one which contains


@@ -612,8 +612,8 @@ It's often useful to be able to handle strings in macros. NASM
supports a few simple string handling macro operators from which
more complex operations can be constructed.
All the string operators define or redefine a value (either a string
or a numeric value) to a single-line macro. When producing a string
All the string operators define or redefine a single-line macro to some
value (either a string or a numeric value). When producing a string
value, it may change the style of quoting of the input string or
strings, and possibly use \c{\\}-escapes inside \c{`}-quoted strings.
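
As a small sketch of the "define a single-line macro to some value"
behaviour (the macro names \c{greeting} and \c{len} and the 16-byte field
width are assumptions made for this example):

\c %strcat greeting "hello, ", 'world'      ; greeting becomes a quoted string
\c %strlen len greeting                     ; len becomes the number 12
\c         db greeting
\c         times 16-len db ' '              ; pad the field to 16 bytes

Note that the two input strings use different quote styles; the style of
the resulting string is chosen by NASM, as described above.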
@@ -680,7 +680,7 @@ than the description:
As with \c{%strlen} (see \k{strlen}), the first parameter is the
single-line macro to be created and the second is the string. The
third parameter specifies the first character to be selected, and the
optional fourth parameter preceded by comma) is the length. Note
optional fourth parameter (preceded by comma) is the length. Note
that the first index is 1, not 0 and the last index is equal to the
value that \c{%strlen} would assign given the same string. Index
values out of range result in an empty string. A negative length
@@ -1411,7 +1411,7 @@ iterated through in reverse order.
\S{concat} \i{Concatenating Macro Parameters}
NASM can concatenate macro parameters and macro indirection constructs
on to other text surrounding them. This allows you to declare a family
with other surrounding text. This allows you to declare a family
of symbols, for example, in a macro definition. If, for example, you
wanted to generate a table of key codes along with offsets into the
table, you could code something like
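
A sketch along those lines (the macro name \c{keytab_entry} and the key
names are illustrative):

\c %macro keytab_entry 2
\c     keypos%1    equ     $-keytab         ; %1 is pasted onto "keypos"
\c                 db      %2
\c %endmacro
\c
\c keytab:
\c           keytab_entry F1,128+1          ; defines keyposF1
\c           keytab_entry F2,128+2          ; defines keyposF2
\c           keytab_entry Return,13         ; defines keyposReturn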
@@ -2065,7 +2065,7 @@ This pushes a new context called \c{foobar} on the stack. You can have
several contexts on the stack with the same name: they can still be
distinguished. If no name is given, the context is unnamed (this is
normally used when both the \c{%push} and the \c{%pop} are inside a
single macro definition.)
single macro definition).
The directive \c{%pop}, taking one optional argument, removes the top
context from the context stack and destroys it, along with any
@@ -2237,7 +2237,7 @@ implement a block IF statement as a set of macros.
This code is more robust than the \c{REPEAT} and \c{UNTIL} macros
given in \k{ctxlocal}, because it uses conditional assembly to check
that the macros are issued in the right order (for example, not
calling \c{endif} before \c{if}) and issues a \c{%error} if they're
calling \c{endif} before \c{if}) and issues an \c{%error} if they're
not.
In addition, the \c{endif} macro has to be able to cope with the two


@@ -526,17 +526,17 @@ accepted for this option starting in NASM version 2.11.05.
\S{opt-pfix} The \i\c{--(g|l)prefix}, \i\c{--(g|l)postfix} Options.
The \c{--(g)prefix} options prepend the given argument
The \c{--gprefix} option prepends the given argument
to all \c{extern}, \c{common}, \c{static}, and \c{global} symbols, and the
\c{--lprefix} option prepends to all other symbols. Similarly,
\c{--(g)postfix} and \c{--lpostfix} options append
the argument in the exactly same way as the \c{--xxprefix} options does.
\c{--gpostfix} and \c{--lpostfix} options append
the argument, in a manner similar to the \c{--(g|l)prefix} options.
Running this:
\c nasm -f macho --gprefix _
is equivalent to place the directive with \c{%pragma macho gprefix _}
is equivalent to placing the directive \c{%pragma macho gprefix _}
at the start of the file (\k{mangling}). It will prepend the underscore
to all global and external variables, as C requires it in some, but not all,
system calling conventions.
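
A minimal sketch of the in-source form, assuming the file is assembled
with \c{-f macho} (the symbol \c{main} is just an example):

\c %pragma macho gprefix _
\c         global main
\c         section .text
\c main:   ret                  ; exported to the linker as _main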
@@ -585,7 +585,7 @@ before returning to the top-level input. Default is 100000.
\b\c{--limit-rep}: Maximum number of allowed preprocessor loop, defined
under \c{%rep}. Default is 1000000.
\b\c{--limit-eval}: This number sets the boundary condition of allowed
\b\c{--limit-eval}: This number sets the maximum allowed
expression length. Default is 8192 on most systems.
\b\c{--limit-lines}: Total number of source lines allowed to be