mirror of https://github.com/netwide-assembler/nasm.git synced 2025-09-22 10:43:39 -04:00
nasm/doc/64bit.src
H. Peter Anvin e0d5333a47 doc: add 2.17 release notes and document [dollarhex]
Add the beginnings (at least) of release notes for 2.17, and document
the [dollarhex] directive.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2025-09-03 12:31:03 -07:00


\C{64bit} Writing 64-bit Code (Unix, Win64)
This chapter attempts to cover some of the common issues involved when
writing 64-bit code, to run under \i{Win64} or Unix. It covers how to
write assembly code to interface with 64-bit C routines, and how to
write position-independent code for shared libraries.
All 64-bit code uses a flat memory model, since segmentation is not
available in 64-bit mode. The one exception is the \c{FS} and \c{GS}
registers, which still add their bases.
Position independence in 64-bit mode is significantly simpler, since
the processor supports \c{RIP}-relative addressing directly; see the
\c{REL} keyword (\k{effaddr}). On most 64-bit platforms, it is
probably desirable to make that the default, using the directive
\c{DEFAULT REL} (\k{default}).
\c{DEFAULT REL} is likely to become the default in a future version of NASM.
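As a minimal sketch of the effect of \c{DEFAULT REL}, assuming an ELF64
target and a hypothetical variable \c{counter} and function \c{bump}
(neither comes from this chapter), the memory operands below are
encoded \c{RIP}-relative without writing \c{REL} on each of them:

\c         default rel
\c
\c         section .data
\c counter: dq     0               ; hypothetical variable
\c
\c         section .text
\c         global  bump
\c bump:   inc     qword [counter] ; RIP-relative due to DEFAULT REL
\c         mov     rax,[counter]   ; likewise; returns the new value
\c         ret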
64-bit programming is relatively similar to 32-bit programming, but
of course pointers are 64 bits long; additionally, all existing
platforms pass arguments in registers rather than on the stack.
Furthermore, 64-bit platforms use SSE2 by default for floating point.
Please see the ABI documentation for your platform.
64-bit platforms differ in the sizes of the C/C++ fundamental
datatypes, not just from 32-bit platforms but from each other. If a
specific size data type is desired, it is probably best to use the
types defined in the standard C header \c{<inttypes.h>}.
All known 64-bit platforms except some embedded platforms require that
the stack is 16-byte aligned immediately before a \c{CALL} instruction.
Since \c{CALL} pushes an 8-byte return address, the stack pointer
(\c{RSP}) is at an \e{odd} multiple of 8 bytes on entry to a function.
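The following is a sketch of keeping the stack aligned around a call,
assuming a Unix (ELF64) target linked against the C library; the
callee \c{puts} and the label \c{msg} are illustrative choices only.
On entry to \c{main} the return address has just been pushed, so
adjusting \c{RSP} by 8 bytes makes it a multiple of 16 at the point
where the next \c{CALL} is issued:

\c         default rel
\c         extern  puts
\c         global  main
\c
\c         section .rodata
\c msg:    db      "hello",0
\c
\c         section .text
\c main:   sub     rsp,8           ; entry: RSP = 16n+8, now RSP = 16n
\c         lea     rdi,[msg]       ; first argument (Unix convention)
\c         call    puts wrt ..plt  ; CALL issued with RSP 16-byte aligned
\c         add     rsp,8           ; restore RSP before returning
\c         xor     eax,eax         ; return 0
\c         ret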
In 64-bit mode, the default operand size is still 32 bits. When
loading a value into a 32-bit register (but not an 8- or 16-bit
register), the upper 32 bits of the corresponding 64-bit register are
set to zero.
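The difference between the register widths can be illustrated with a
short fragment (a sketch only; the comments show the contents of
\c{RAX} after each write):

\c         mov     rax,-1          ; RAX = 0xFFFFFFFFFFFFFFFF
\c         mov     eax,2           ; RAX = 0x0000000000000002 (upper half zeroed)
\c
\c         mov     rax,-1
\c         mov     ax,2            ; RAX = 0xFFFFFFFFFFFF0002 (upper bits kept)
\c
\c         mov     rax,-1
\c         mov     al,2            ; RAX = 0xFFFFFFFFFFFFFF02 (upper bits kept)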
\H{reg64} Register Names in 64-bit Mode
NASM uses the following names for general-purpose registers in 64-bit
mode, for 8-, 16-, 32- and 64-bit references, respectively:
\c AL/AH, CL/CH, DL/DH, BL/BH, SPL, BPL, SIL, DIL, R8B-R15B
\c AX, CX, DX, BX, SP, BP, SI, DI, R8W-R15W
\c EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI, R8D-R15D
\c RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI, R8-R15
This is consistent with the AMD documentation and most other
assemblers. The Intel documentation, however, uses the names
\c{R8L-R15L} for 8-bit references to the higher registers. It is
possible to use those names by defining them as macros; similarly,
if one wants to use numeric names for the low 8 registers, define them
as macros. The standard macro package \c{altreg} (see \k{pkg_altreg})
can be used for this purpose.
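As a brief sketch of the \c{altreg} package in use (the particular
instructions are arbitrary examples), each line below assembles to the
instruction shown in its comment:

\c %use altreg
\c
\c         mov     r0l,r3h         ; mov al,bh
\c         mov     r8l,r9l         ; mov r8b,r9b

Without the package, an individual alias can be defined by hand, for
example with a line such as \c{%define r8l r8b}.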
\H{id64} Immediates and Displacements in 64-bit Mode
In 64-bit mode, immediates and displacements are generally only 32
bits wide. NASM will therefore truncate most displacements and
immediates to 32 bits.
The only instruction which takes a full \i{64-bit immediate} is:
\c MOV reg64,imm64
NASM will produce this instruction whenever the programmer uses
\c{MOV} with an immediate into a 64-bit register. If this is not
desirable, simply specify the equivalent 32-bit register, which will
be automatically zero-extended by the processor, or specify the
immediate as \c{DWORD}:
\c mov rax,foo ; 64-bit immediate
\c mov rax,qword foo ; (identical)
\c mov eax,foo ; 32-bit immediate, zero-extended
\c mov rax,dword foo ; 32-bit immediate, sign-extended
The lengths of these instructions are 10, 5 and 7 bytes, respectively
(the first two lines produce the same instruction).
If optimization is enabled and NASM can determine at assembly time
that a shorter instruction will suffice, the shorter instruction will
be emitted unless, of course, \c{STRICT QWORD} or \c{STRICT DWORD} is
specified (see \k{strict}):
\c mov rax,1 ; Assembles as "mov eax,1" (5 bytes)
\c mov rax,strict qword 1 ; Full 10-byte instruction
\c mov rax,strict dword 1 ; 7-byte instruction
\c mov rax,symbol ; 10 bytes, not known at assembly time
\c lea rax,[rel symbol] ; 7 bytes, usually preferred by the ABI
Note that \c{lea rax,[rel symbol]} is position-independent, whereas
\c{mov rax,symbol} is not. Most ABIs prefer or even require
position-independent code in 64-bit mode. However, the \c{MOV}
instruction is able to reference a symbol anywhere in the 64-bit
address space, whereas \c{LEA} is only able to access a symbol
within 2 GB of the instruction itself (see below).
The only instructions which take a full \I{64-bit displacement}64-bit
\e{displacement} are loads and stores, using \c{MOV}, of \c{AL}, \c{AX},
\c{EAX} or \c{RAX} (but no other registers) from or to an absolute 64-bit address.
Since this is a relatively rarely used instruction (64-bit code generally uses
relative addressing), the programmer has to explicitly declare the
displacement size as \c{ABS QWORD}:
\c default abs
\c
\c mov eax,[foo] ; 32-bit absolute disp, sign-extended
\c mov eax,[a32 foo] ; 32-bit absolute disp, zero-extended
\c mov eax,[qword foo] ; 64-bit absolute disp
\c
\c default rel
\c
\c mov eax,[foo] ; 32-bit relative disp
\c mov eax,[a32 foo] ; ditto, address truncated to 32 bits(!)
\c mov eax,[qword foo] ; error
\c mov eax,[abs qword foo] ; 64-bit absolute disp
A sign-extended absolute displacement can access from -2 GB to +2 GB;
a zero-extended absolute displacement can access from 0 to 4 GB.
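As a sketch of the accumulator-only restriction on 64-bit
displacements, assuming a hypothetical absolute address \c{foo}
defined with \c{EQU} (the value is made up for illustration):

\c         default rel
\c
\c foo     equ     0x0000123456789000      ; hypothetical absolute address
\c
\c         mov     al,[abs qword foo]      ; load a byte into AL
\c         mov     [abs qword foo],rax     ; store RAX as a qword
\c ;       mov     ebx,[abs qword foo]     ; no such encoding for EBX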
\H{unix64} Interfacing to 64-bit C Programs (Unix)
On Unix, the 64-bit ABI as well as the x32 ABI (32-bit ABI with the
CPU in 64-bit mode) is defined by the documents at:
\W{https://www.nasm.us/abi/unix64}\c{https://www.nasm.us/abi/unix64}
Although written for AT&T-syntax assembly, the concepts apply equally
well for NASM-style assembly. What follows is a simplified summary.
The first six integer arguments (from the left) are passed in \c{RDI},
\c{RSI}, \c{RDX}, \c{RCX}, \c{R8}, and \c{R9}, in that order.
Additional integer arguments are passed on the stack. These
registers, plus \c{RAX}, \c{R10} and \c{R11} are destroyed by function
calls, and thus are available for use by the function without saving.
Integer return values are passed in \c{RAX} and \c{RDX}, in that order.
Floating point is done using SSE registers, except for \c{long
double}, which is 80 bits (\c{TWORD}) on most platforms (Android is
one exception; there \c{long double} is 64 bits and treated the same
as \c{double}). Floating-point arguments are passed in \c{XMM0} to
\c{XMM7}; return values are passed in \c{XMM0} and \c{XMM1}.
\c{long double} values are passed on the stack, and returned in
\c{ST0} and \c{ST1}.
All SSE and x87 registers are destroyed by function calls.
On 64-bit Unix, \c{long} is 64 bits.
Integer and SSE register arguments are counted separately, so for the case of
\c void foo(long a, double b, int c)
\c{a} is passed in \c{RDI}, \c{b} in \c{XMM0}, and \c{c} in \c{ESI}.
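The register assignments above can be sketched with two small leaf
functions callable from C. The names \c{add3} and \c{scale} and their
prototypes are hypothetical, chosen only for illustration:

\c         section .text
\c         global  add3
\c         global  scale
\c
\c ; long add3(long a, long b, long c)
\c add3:   lea     rax,[rdi+rsi]   ; a in RDI, b in RSI
\c         add     rax,rdx         ; c in RDX
\c         ret                     ; result in RAX
\c
\c ; double scale(long n, double x)
\c scale:  cvtsi2sd xmm1,rdi       ; n in RDI (first integer argument)
\c         mulsd   xmm0,xmm1       ; x in XMM0 (first SSE argument)
\c         ret                     ; result in XMM0

Only caller-destroyed registers are used, so nothing needs to be saved.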
\H{win64} Interfacing to 64-bit C Programs (Win64)
The Win64 ABI is described by the document at:
\W{https://www.nasm.us/abi/win64}\c{https://www.nasm.us/abi/win64}
What follows is a simplified summary.
The first four integer arguments are passed in \c{RCX}, \c{RDX},
\c{R8} and \c{R9}, in that order. Additional integer arguments are
passed on the stack. These registers, plus \c{RAX}, \c{R10} and
\c{R11} are destroyed by function calls, and thus are available for
use by the function without saving.
Integer return values are passed in \c{RAX} only.
Floating point is done using SSE registers, except for \c{long
double}. Floating-point arguments are passed in \c{XMM0} to \c{XMM3};
return is \c{XMM0} only.
On Win64, \c{long} is 32 bits; \c{long long} or \c{_int64} is 64 bits.
Integer and SSE register arguments are counted together, so for the case of
\c void foo(long long a, double b, int c)
\c{a} is passed in \c{RCX}, \c{b} in \c{XMM1}, and \c{c} in \c{R8D}.
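As a sketch of the shared argument counting, a hypothetical leaf
function \c{scale} (the name and prototype are illustrative only)
could be written for Win64 as follows. The second argument arrives in
\c{XMM1}, not \c{XMM0}, because the integer argument has already used
the first position:

\c         section .text
\c         global  scale
\c
\c ; double scale(long long n, double x)
\c scale:  cvtsi2sd xmm0,rcx       ; n is argument 1: RCX
\c         mulsd   xmm0,xmm1       ; x is argument 2: XMM1
\c         ret                     ; result in XMM0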