\C{64bit} Writing 64-bit Code (Unix, Win64)

This chapter attempts to cover some of the common issues involved when
writing 64-bit code, to run under \i{Win64} or Unix. It covers how to
write assembly code to interface with 64-bit C routines, and how to
write position-independent code for shared libraries.

All 64-bit code uses a flat memory model, since segmentation is not
available in 64-bit mode. The one exception is the \c{FS} and \c{GS}
registers, which still add their bases.

Position independence in 64-bit mode is significantly simpler, since
the processor supports \c{RIP}-relative addressing directly; see the
\c{REL} keyword (\k{effaddr}). On most 64-bit platforms, it is
probably desirable to make that the default, using the directive
\c{DEFAULT REL} (\k{default}).

\c{DEFAULT REL} is likely to become the default in a future version of NASM.
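
As a minimal sketch, assuming a 64-bit output format (the label
\c{counter} is hypothetical and chosen only for illustration), this is
how the directive changes the meaning of a plain memory reference:

\c         default rel
\c
\c         section .data
\c counter: dq 0
\c
\c         section .text
\c         mov rax,[counter]       ; RIP-relative, same as [rel counter]
\c         mov rax,[abs counter]   ; ABS overrides the default for this operand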

64-bit programming is relatively similar to 32-bit programming, but
of course pointers are 64 bits long; additionally, all existing
platforms pass arguments in registers rather than on the stack.
Furthermore, 64-bit platforms use SSE2 by default for floating point.
Please see the ABI documentation for your platform.

64-bit platforms differ in the sizes of the C/C++ fundamental
datatypes, not just from 32-bit platforms but from each other. If a
specific size data type is desired, it is probably best to use the
types defined in the standard C header \c{<inttypes.h>}.

All known 64-bit platforms except some embedded platforms require that
the stack is 16-byte aligned at the point of a function call: the stack
pointer (\c{RSP}) needs to be a multiple of 16 immediately before the
\c{CALL} instruction. Since \c{CALL} pushes an 8-byte return address,
\c{RSP} is an \e{odd} multiple of 8 bytes at the entry to a function.
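
The following sketch illustrates one common way to meet the alignment
requirement. It assumes an ELF64 Unix target and the Unix argument
convention (first integer argument in \c{RDI}); \c{msg} is a
hypothetical label, and \c{puts} is the C library routine:

\c         default rel
\c         extern  puts
\c         global  main
\c
\c         section .rodata
\c msg:    db      "hello", 0
\c
\c         section .text
\c main:                           ; on entry RSP = 16n + 8 (return address)
\c         sub     rsp,8           ; RSP is now a multiple of 16
\c         lea     rdi,[msg]       ; argument for puts
\c         call    puts wrt ..plt  ; stack is 16-byte aligned at the CALL
\c         xor     eax,eax         ; return 0 from main
\c         add     rsp,8           ; undo the adjustment
\c         ret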

In 64-bit mode, the default operand size is still 32 bits. When
loading a value into a 32-bit register (but not an 8- or 16-bit
register), the upper 32 bits of the corresponding 64-bit register are
set to zero.
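
For example (a small illustration; the values are arbitrary), compare
the effect of writing to registers of different sizes:

\c         mov     rax,-1          ; RAX = 0xFFFFFFFFFFFFFFFF
\c         mov     eax,1           ; RAX = 0x0000000000000001 (upper half cleared)
\c
\c         mov     rax,-1
\c         mov     ax,1            ; RAX = 0xFFFFFFFFFFFF0001 (upper bits kept)
\c
\c         mov     rax,-1
\c         mov     al,1            ; RAX = 0xFFFFFFFFFFFFFF01 (upper bits kept)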

\H{reg64} Register Names in 64-bit Mode

NASM uses the following names for general-purpose registers in 64-bit
mode, for 8-, 16-, 32- and 64-bit references, respectively:

\c      AL/AH, CL/CH, DL/DH, BL/BH, SPL, BPL, SIL, DIL, R8B-R15B
\c      AX, CX, DX, BX, SP, BP, SI, DI, R8W-R15W
\c      EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI, R8D-R15D
\c      RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI, R8-R15

This is consistent with the AMD documentation and most other
assemblers. The Intel documentation, however, uses the names
\c{R8L-R15L} for 8-bit references to the higher registers. It is
possible to use those names by defining them as macros; similarly,
if one wants to use numeric names for the low 8 registers, define them
as macros. The standard macro package \c{altreg} (see \k{pkg_altreg})
can be used for this purpose.
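
A short sketch using the \c{altreg} package follows. The equivalences
given in the comments assume the package's numbering, in which \c{R0}
to \c{R3} correspond to \c{RAX}, \c{RCX}, \c{RDX} and \c{RBX}:

\c %use altreg
\c
\c         mov     r0l,r3h         ; same as: mov al,bh
\c         mov     r8l,r1l         ; same as: mov r8b,cl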

\H{id64} Immediates and Displacements in 64-bit Mode

In 64-bit mode, immediates and displacements are generally only 32
bits wide. NASM will therefore truncate most displacements and
immediates to 32 bits.

The only instruction which takes a full \i{64-bit immediate} is:

\c      MOV reg64,imm64

NASM will produce this instruction whenever the programmer uses
\c{MOV} with an immediate into a 64-bit register. If this is not
desirable, simply specify the equivalent 32-bit register, which will
be automatically zero-extended by the processor, or specify the
immediate as \c{DWORD}:

\c      mov rax,foo             ; 64-bit immediate
\c      mov rax,qword foo       ; (identical)
\c      mov eax,foo             ; 32-bit immediate, zero-extended
\c      mov rax,dword foo       ; 32-bit immediate, sign-extended

These instructions are 10, 5 and 7 bytes long, respectively; the first
two forms assemble identically.

If optimization is enabled and NASM can determine at assembly time
that a shorter instruction will suffice, the shorter instruction will
be emitted unless of course \c{STRICT QWORD} or \c{STRICT DWORD} is
specified (see \k{strict}):

\c      mov rax,1               ; Assembles as "mov eax,1" (5 bytes)
\c      mov rax,strict qword 1  ; Full 10-byte instruction
\c      mov rax,strict dword 1  ; 7-byte instruction
\c      mov rax,symbol          ; 10 bytes, not known at assembly time
\c      lea rax,[rel symbol]    ; 7 bytes, usually preferred by the ABI

Note that \c{lea rax,[rel symbol]} is position-independent, whereas
\c{mov rax,symbol} is not. Most ABIs prefer or even require
position-independent code in 64-bit mode. However, the \c{MOV}
instruction is able to reference a symbol anywhere in the 64-bit
address space, whereas \c{LEA} is only able to access a symbol
within 2 GB of the instruction itself (see below).

The only instructions which take a full \I{64-bit displacement}64-bit
\e{displacement} are loads and stores, using \c{MOV}, of \c{AL}, \c{AX},
\c{EAX} or \c{RAX} (but no other registers) from or to an absolute
64-bit address. Since this is a rarely used instruction form (64-bit
code generally uses relative addressing), the programmer has to
explicitly declare the displacement size as \c{ABS QWORD}:

\c      default abs
\c
\c      mov eax,[foo]           ; 32-bit absolute disp, sign-extended
\c      mov eax,[a32 foo]       ; 32-bit absolute disp, zero-extended
\c      mov eax,[qword foo]     ; 64-bit absolute disp
\c
\c      default rel
\c
\c      mov eax,[foo]           ; 32-bit relative disp
\c      mov eax,[a32 foo]       ; ditto, address truncated to 32 bits(!)
\c      mov eax,[qword foo]     ; error
\c      mov eax,[abs qword foo] ; 64-bit absolute disp

A sign-extended absolute displacement can access from -2 GB to +2 GB;
a zero-extended absolute displacement can access from 0 to 4 GB.

\H{unix64} Interfacing to 64-bit C Programs (Unix)

On Unix, the 64-bit ABI as well as the x32 ABI (32-bit ABI with the
CPU in 64-bit mode) is defined by the documents at:

\W{https://www.nasm.us/abi/unix64}\c{https://www.nasm.us/abi/unix64}

Although written for AT&T-syntax assembly, the concepts apply equally
well to NASM-style assembly. What follows is a simplified summary.

The first six integer arguments (from the left) are passed in \c{RDI},
\c{RSI}, \c{RDX}, \c{RCX}, \c{R8}, and \c{R9}, in that order.
Additional integer arguments are passed on the stack. These
registers, plus \c{RAX}, \c{R10} and \c{R11} are destroyed by function
calls, and thus are available for use by the function without saving.

Integer return values are passed in \c{RAX} and \c{RDX}, in that order.

Floating point is done using SSE registers, except for \c{long
double}, which is 80 bits (\c{TWORD}) on most platforms (Android is
one exception; there \c{long double} is 64 bits and treated the same
as \c{double}.) Floating-point arguments are passed in \c{XMM0} to
\c{XMM7}; return is \c{XMM0} and \c{XMM1}. \c{long double} values are
passed on the stack, and returned in \c{ST0} and \c{ST1}.

All SSE and x87 registers are destroyed by function calls.

On 64-bit Unix, \c{long} is 64 bits.

Integer and SSE register arguments are counted separately, so for the case of

\c      void foo(long a, double b, int c)

\c{a} is passed in \c{RDI}, \c{b} in \c{XMM0}, and \c{c} in \c{ESI}.
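
As an illustration, a hypothetical function \c{sum3} (not part of any
library) taking the same argument types but returning \c{long} could
be sketched as follows. It assumes that truncating \c{b} toward zero is
the desired conversion:

\c         global  sum3
\c
\c         section .text
\c sum3:                           ; long sum3(long a, double b, int c)
\c         cvttsd2si rax,xmm0      ; RAX = (long)b, b arrives in XMM0
\c         add     rax,rdi         ; add a, which arrives in RDI
\c         movsxd  rdx,esi         ; sign-extend c, which arrives in ESI
\c         add     rax,rdx         ; add c
\c         ret                     ; result is returned in RAX

\c{RDX} is used as a scratch register here because it is among the
registers that a function may destroy without saving.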

\H{win64} Interfacing to 64-bit C Programs (Win64)

The Win64 ABI is described by the document at:

\W{https://www.nasm.us/abi/win64}\c{https://www.nasm.us/abi/win64}

What follows is a simplified summary.

The first four integer arguments are passed in \c{RCX}, \c{RDX},
\c{R8} and \c{R9}, in that order. Additional integer arguments are
passed on the stack. These registers, plus \c{RAX}, \c{R10} and
\c{R11} are destroyed by function calls, and thus are available for
use by the function without saving.

Integer return values are passed in \c{RAX} only.

Floating point is done using SSE registers, except for \c{long
double}. Floating-point arguments are passed in \c{XMM0} to \c{XMM3};
return is \c{XMM0} only.

On Win64, \c{long} is 32 bits; \c{long long} or \c{_int64} is 64 bits.

Integer and SSE register arguments are counted together, so for the case of

\c      void foo(long long a, double b, int c)

\c{a} is passed in \c{RCX}, \c{b} in \c{XMM1}, and \c{c} in \c{R8D}.
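
As an illustration, a hypothetical function \c{sum3} (not part of any
library) taking the same argument types but returning \c{long long}
could be sketched as follows. It assumes that truncating \c{b} toward
zero is the desired conversion:

\c         global  sum3
\c
\c         section .text
\c sum3:                           ; long long sum3(long long a, double b, int c)
\c         cvttsd2si rax,xmm1      ; RAX = (long long)b, b is argument 2: XMM1
\c         add     rax,rcx         ; add a, argument 1: RCX
\c         movsxd  rdx,r8d         ; sign-extend c, argument 3: R8D
\c         add     rax,rdx         ; add c
\c         ret                     ; result is returned in RAX

Because the argument slots are shared, \c{b} occupies the second slot
and therefore arrives in \c{XMM1}, not \c{XMM0}.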