mirror of
				https://github.com/netwide-assembler/nasm.git
				synced 2025-10-10 00:25:06 -04:00 
			
		
		
		
	
		
			
				
	
	
		
			979 lines
		
	
	
		
			39 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
| \C{lang} The NASM Language
 | |
| 
 | |
| \H{syntax} Layout of a NASM Source Line
 | |
| 
 | |
| As in most assemblers, each NASM source line contains (unless it
 | |
| is a macro, a preprocessor directive or an assembler directive: see
 | |
| \k{preproc} and \k{directive}) some combination of the four fields:
 | |
| 
 | |
| \c label:    instruction operands        ; comment
 | |
| 
 | |
| As usual, most of these fields are optional; the presence or absence
 | |
| of any combination of a label, an instruction and a \i{comment} is
 | |
| allowed.  Of course, the operand field is either required or forbidden
 | |
| by the presence and nature of the instruction field.
 | |
| 
 | |
| NASM uses backslash (\\) as the line continuation character; if a line
 | |
| ends with backslash, the next line is considered to be a part of the
 | |
| backslash-ended line.
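As a minimal sketch of line continuation, a long data statement can be spread over two source lines. The label names \c{table} and \c{tablen} and the file name are hypothetical, chosen only for illustration.

```nasm
; Sketch only: assemble with e.g. "nasm -f bin cont.asm"
; (the file name is hypothetical).
        bits 16

table:  db      1, 2, 3, 4, \
                5, 6, 7, 8      ; still part of the db statement above

tablen  equ     $ - table       ; 8, since both lines form one statement
```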
 | |
| 
 | |
| NASM places no restrictions on white space within a line: labels may
 | |
| have white space before them, or instructions may have no space
 | |
| before them, or anything. The \i{colon} after a label is also
 | |
| optional. (Note that this means that if you intend to code \c{lodsb}
 | |
| alone on a line, and type \c{lodab} by accident, then that's still a
 | |
| valid source line which does nothing but define a label. Running
 | |
| NASM with the command-line option
 | |
| \I{label-orphan}\c{-w+orphan-labels} will cause it to warn you if
 | |
| you define a label alone on a line without a \i{trailing colon}.)
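The pitfall described above can be sketched as follows. The file name and the label \c{done} are hypothetical; the warning option is the one named in the text.

```nasm
; Sketch only: assemble with e.g.
;   nasm -f bin -w+orphan-labels typo.asm
; (the file name is hypothetical).
        bits 16

        lodab                   ; typo for lodsb: defines a label named "lodab"
        lodsb                   ; the instruction that was intended
done:                           ; trailing colon: not an orphan label
```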
 | |
| 
 | |
| \i{Valid characters} in labels are letters, numbers, \c{_}, \c{$},
 | |
| \c{#}, \c{@}, \c{~}, \c{.}, and \c{?}. The only characters which may
 | |
| be used as the \e{first} character of an identifier are letters,
 | |
| \c{.} (with special meaning: see \k{locallab}), \c{_} and \c{?}.
 | |
| An identifier may also be prefixed with a \I{$, prefix}\c{$} to
 | |
| indicate that it is intended to be read as an identifier and not a
 | |
| reserved word; thus, if some other module you are linking with
 | |
| defines a symbol called \c{eax}, you can refer to \c{$eax} in NASM
 | |
| code to distinguish the symbol from the register. Maximum length of
 | |
| an identifier is 4095 characters.
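A minimal sketch of the \c{$} identifier prefix follows. It assumes, purely so that the fragment assembles on its own, that the symbol named \c{eax} is defined in the same file; in practice it would typically come from another module.

```nasm
; Sketch only: assemble with e.g. "nasm -f bin".
        bits 32

$eax:   dd      0               ; a data symbol whose name is "eax"

        mov     ebx, [$eax]     ; load from the symbol eax
        mov     ebx, eax        ; copy the register eax
```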
 | |
| 
 | |
| The instruction field may contain any machine instruction: Pentium and
 | |
| P6 instructions, FPU instructions, MMX instructions and even
 | |
| undocumented instructions are all supported. The instruction may be
 | |
| prefixed by \c{LOCK}, \c{REP}, \c{REPE}/\c{REPZ}, \c{REPNE}/\c{REPNZ},
 | |
| \c{XACQUIRE}/\c{XRELEASE} or \c{BND}/\c{NOBND}, in the usual
 | |
| way. Explicit \I{address-size prefixes}address-size and
 | |
| \i{operand-size prefixes} \i\c{A16}, \i\c{A32}, \i\c{A64}, \i\c{O16},
 | |
| \i\c{O32} and \i\c{O64} are provided - one example of their use is
 | |
| given in \k{mixsize}. You can also use the name of a \I{segment
 | |
| override}segment register as an instruction prefix: coding \c{es mov
 | |
| [bx],ax} is equivalent to coding \c{mov [es:bx],ax}. We recommend the
 | |
| latter syntax, since it is consistent with other syntactic features of
 | |
| the language, but for instructions such as \c{LODSB}, which has no
 | |
| operands and yet can require a segment override, there is no clean
 | |
| syntactic way to proceed apart from \c{es lodsb}.
 | |
| 
 | |
| A prefix does not have to be attached to an instruction: prefixes such as
 | |
| \c{CS}, \c{A32}, \c{LOCK} or \c{REPE} can appear on a line by
 | |
| themselves, and NASM will just generate the prefix bytes.
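The prefix forms discussed above can be sketched like this, assuming 16-bit code. Since a prefix on a line by itself only emits the prefix byte, the bytes of the two final lines simply follow one another in the output.

```nasm
; Sketch only: assemble with e.g. "nasm -f bin".
        bits 16

        es lodsb                ; segment override on an instruction with no operands
        mov     [es:bx], ax     ; recommended form when there is a memory operand

        rep                     ; prefix on a line by itself: only the prefix byte
        movsb                   ; the instruction follows on the next line
```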
 | |
| 
 | |
| In addition to actual machine instructions, NASM also supports a
 | |
| number of pseudo-instructions, described in \k{pseudop}.
 | |
| 
 | |
| Instruction \i{operands} may take a number of forms: they can be
 | |
| registers, described simply by the register name (e.g. \c{ax},
 | |
| \c{bp}, \c{ebx}, \c{cr0}: NASM does not use the \c{gas}-style
 | |
| syntax in which register names must be prefixed by a \c{%} sign), or
 | |
| they can be \i{effective addresses} (see \k{effaddr}), constants
 | |
| (\k{const}) or expressions (\k{expr}).
 | |
| 
 | |
| For x87 \i{floating-point} instructions, NASM accepts a wide range of
 | |
| syntaxes: you can use two-operand forms like MASM supports, or you
 | |
| can use NASM's native single-operand forms in most cases.
 | |
| \# Details of
 | |
| \# all forms of each supported instruction are given in
 | |
| \# \k{iref}.
 | |
| For example, you can code:
 | |
| 
 | |
| \c         fadd    st1             ; this sets st0 := st0 + st1
 | |
| \c         fadd    st0,st1         ; so does this
 | |
| \c
 | |
| \c         fadd    st1,st0         ; this sets st1 := st1 + st0
 | |
| \c         fadd    to st1          ; so does this
 | |
| 
 | |
| Almost any x87 floating-point instruction that references memory must
 | |
| use one of the prefixes \i\c{DWORD}, \i\c{QWORD} or \i\c{TWORD} to
 | |
| indicate what size of \i{memory operand} it refers to.
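As a brief sketch of the size keywords on x87 memory operands (the choice of registers is arbitrary):

```nasm
; Sketch only: assemble with e.g. "nasm -f bin".
        bits 32

        fld     dword [esi]     ; load a 32-bit single-precision value
        fadd    qword [esi+4]   ; add a 64-bit double-precision value
        fstp    tword [edi]     ; store an 80-bit extended-precision value
```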
 | |
| 
 | |
| 
 | |
| \H{pseudop} \i{Pseudo-Instructions}
 | |
| 
 | |
| Pseudo-instructions are things which, though not real x86 machine
 | |
| instructions, are used in the instruction field anyway because that's
 | |
| the most convenient place to put them. The current pseudo-instructions
 | |
| are \i\c{DB}, \i\c{DW}, \i\c{DD}, \i\c{DQ}, \i\c{DT}, \i\c{DO},
 | |
| \i\c{DY} and \i\c{DZ}; their \i{uninitialized} counterparts
 | |
| \i\c{RESB}, \i\c{RESW}, \i\c{RESD}, \i\c{RESQ}, \i\c{REST},
 | |
| \i\c{RESO}, \i\c{RESY} and \i\c{RESZ}; the \i\c{INCBIN} command, the
 | |
| \i\c{EQU} command, and the \i\c{TIMES} prefix.
 | |
| 
 | |
| In this documentation, the notation "\c{D}\e{x}" and "\c{RES}\e{x}" is
 | |
| used to indicate all the \c{DB} and \c{RESB} type directives,
 | |
| respectively.
 | |
| 
 | |
| 
 | |
| \S{db} \c{D}\e{x}: Declaring Initialized Data
 | |
| 
 | |
| \i\c{DB}, \i\c{DW}, \i\c{DD}, \i\c{DQ}, \i\c{DT}, \i\c{DO}, \i\c{DY}
 | |
| and \i\c{DZ} (collectively "\c{D}\e{x}" in this documentation) are used,
 | |
| much as in MASM, to declare initialized data in the output file. They
 | |
| can be invoked in a wide range of ways:
 | |
| \I{floating-point constant}\I{character constant}\I{string constant}
 | |
| 
 | |
| \c       db    0x55                ; just the byte 0x55
 | |
| \c       db    0x55,0x56,0x57      ; three bytes in succession
 | |
| \c       db    'a',0x55            ; character constants are OK
 | |
| \c       db    'hello',13,10,'$'   ; so are string constants
 | |
| \c       dw    0x1234              ; 0x34 0x12
 | |
| \c       dw    'a'                 ; 0x61 0x00 (it's just a number)
 | |
| \c       dw    'ab'                ; 0x61 0x62 (character constant)
 | |
| \c       dw    'abc'               ; 0x61 0x62 0x63 0x00 (string)
 | |
| \c       dd    0x12345678          ; 0x78 0x56 0x34 0x12
 | |
| \c       dd    1.234567e20         ; floating-point constant
 | |
| \c       dq    0x123456789abcdef0  ; eight byte constant
 | |
| \c       dq    1.234567e20         ; double-precision float
 | |
| \c       dt    1.234567e20         ; extended-precision float
 | |
| 
 | |
| \c{DT}, \c{DO}, \c{DY} and \c{DZ} do not accept integer
 | |
| \i{numeric constants} as operands.
 | |
| 
 | |
| \I{masmdb} Starting in NASM 2.15, the following \i{MASM}-like features
 | |
| have been implemented:
 | |
| 
 | |
| \b A \I{?db}\c{?} argument to declare \i{uninitialized} storage:
 | |
| 
 | |
| \c       db    ?                   ; uninitialized
 | |
| 
 | |
| \b A superset of the \i\c{DUP} syntax. The NASM version of this has
 | |
| the following syntax specification; capital letters indicate literal
 | |
| keywords:
 | |
| 
 | |
| \c      dx      := DB | DW | DD | DQ | DT | DO | DY | DZ
 | |
| \c      type    := BYTE | WORD | DWORD | QWORD | TWORD | OWORD | YWORD | ZWORD
 | |
| \c      atom    := expression | string | float | '?'
 | |
| \c      parlist := '(' value [',' value ...] ')'
 | |
| \c      duplist := expression DUP [type] ['%'] parlist
 | |
| \c      list    := duplist | '%' parlist | type ['%'] parlist
 | |
| \c      value   := [type] atom | list
 | |
| \c
 | |
| \c      stmt    := dx value [',' value ...]
 | |
| 
 | |
| \> Note that a \e{list} needs to be prefixed with a \I{%db}\c{%} sign unless
 | |
| prefixed by either \c{DUP} or a \e{type} in order to avoid confusing it with
 | |
| a parenthesis starting an expression. The following expressions are all
 | |
| valid:
 | |
| 
 | |
| \c        db 33
 | |
| \c        db (44)		; Integer expression
 | |
| \c      ; db (44,55)		; Invalid - error
 | |
| \c        db %(44,55)
 | |
| \c        db %('XX','YY')
 | |
| \c        db ('AA')		; Integer expression - outputs single byte
 | |
| \c        db %('BB')		; List, containing a string
 | |
| \c        db ?
 | |
| \c        db 6 dup (33)
 | |
| \c        db 6 dup (33, 34)
 | |
| \c        db 6 dup (33, 34), 35
 | |
| \c        db 7 dup (99)
 | |
| \c        db 7 dup dword (?, word ?, ?)
 | |
| \c        dw byte (?,44)
 | |
| \c        dw 3 dup (0xcc, 4 dup byte ('PQR'), ?), 0xabcd
 | |
| \c        dd 16 dup (0xaaaa, ?, 0xbbbbbb)
 | |
| \c        dd 64 dup (?)
 | |
| 
 | |
| \I{baddb} The use of \c{$} (current address) in a \c{D}\e{x} statement is
 | |
| undefined in the current version of NASM, \e{except in the following
 | |
| cases}:
 | |
| 
 | |
| \b For the first expression in the statement, either a \c{DUP} or a data
 | |
| item.
 | |
| 
 | |
| \b An expression of the form "\e{value}\c{ - $}", which is converted
 | |
| to a self-relative relocation.
 | |
| 
 | |
| Future versions of NASM are likely to produce a different result or
 | |
| issue an error in this case.
 | |
| 
 | |
| There is no such restriction on using \c{$$} or section-relative
 | |
| symbols.
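The permitted uses listed above can be sketched as follows. All label names are hypothetical, and the fragment only illustrates the forms the text describes as defined.

```nasm
; Sketch only: assemble with e.g. "nasm -f bin".
        bits 32

first:  dd      $, 0            ; $ in the first expression of the statement
delta:  dd      target - $      ; the "value - $" form
offs:   dd      0, target - $$  ; $$ is not subject to the restriction

target: db      0
```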
 | |
| 
 | |
| \S{resb} \c{RESB} and Friends: Declaring \i{Uninitialized} Data
 | |
| 
 | |
| \i\c{RESB}, \i\c{RESW}, \i\c{RESD}, \i\c{RESQ}, \i\c{REST},
 | |
| \i\c{RESO}, \i\c{RESY} and \i\c{RESZ} are designed to be used in the
 | |
| BSS section of a module: they declare \e{uninitialized} storage
 | |
| space. Each takes a single operand, which is the number of bytes,
 | |
| words, doublewords or whatever to reserve. The operand to a
 | |
| \c{RESB}-type pseudo-instruction \e{would} be a \i\e{critical
 | |
| expression} (see \k{crit}), except that for legacy compatibility
 | |
| reasons, forward references are permitted; however, \e{the code will be
 | |
| extremely fragile and this should be considered a severe programming
 | |
| error.} A warning will be issued; code generating this warning should
 | |
| be remedied as quickly as possible (see the \c{forward} class in
 | |
| \k{warnings}.)
 | |
| 
 | |
| For example:
 | |
| 
 | |
| \c buffer:         resb    64              ; reserve 64 bytes
 | |
| \c wordvar:        resw    1               ; reserve a word
 | |
| \c realarray       resq    10              ; array of ten reals
 | |
| \c ymmval:         resy    1               ; one YMM register
 | |
| \c zmmvals:        resz    32              ; 32 ZMM registers
 | |
| 
 | |
| \I{masmdb} Since NASM 2.15, the MASM syntax of using \I{?db}\c{?}
 | |
| and \i\c{DUP} in the \c{D}\e{x} directives is also supported. Thus,
 | |
| the above example could also be written:
 | |
| 
 | |
| \c buffer:         db      64 dup (?)      ; reserve 64 bytes
 | |
| \c wordvar:        dw      ?               ; reserve a word
 | |
| \c realarray       dq      10 dup (?)      ; array of ten reals
 | |
| \c ymmval:         dy      ?               ; one YMM register
 | |
| \c zmmvals:        dz      32 dup (?)      ; 32 ZMM registers
 | |
| 
 | |
| 
 | |
| \S{incbin} \i\c{INCBIN}: Including External \i{Binary Files}
 | |
| 
 | |
| \c{INCBIN} includes binary file data verbatim into the output
 | |
| file. This can be handy, for example, for including \i{graphics} and
 | |
| \i{sound} data directly into a game executable file. It can be called
 | |
| in one of these three ways:
 | |
| 
 | |
| \c     incbin  "file.dat"             ; include the whole file
 | |
| \c     incbin  "file.dat",1024        ; skip the first 1024 bytes
 | |
| \c     incbin  "file.dat",1024,512    ; skip the first 1024, and
 | |
| \c                                    ; actually include at most 512
 | |
| 
 | |
| \c{INCBIN} is both a directive and a standard macro; the standard
 | |
| macro version searches for the file in the include file search path
 | |
| and adds the file to the dependency lists.  This macro can be
 | |
| overridden if desired.
 | |
| 
 | |
| 
 | |
| \S{equ} \i\c{EQU}: Defining Constants
 | |
| 
 | |
| \c{EQU} defines a symbol to a given constant value: when \c{EQU} is
 | |
| used, the source line must contain a label. The action of \c{EQU} is
 | |
| to define the given label name to the value of its (only) operand.
 | |
| This definition is absolute, and cannot change later. So, for
 | |
| example,
 | |
| 
 | |
| \c message         db      'hello, world'
 | |
| \c msglen          equ     $-message
 | |
| 
 | |
| defines \c{msglen} to be the constant 12. \c{msglen} may not then be
 | |
| redefined later. This is not a \i{preprocessor} definition either:
 | |
| the value of \c{msglen} is evaluated \e{once}, using the value of
 | |
| \c{$} (see \k{expr} for an explanation of \c{$}) at the point of
 | |
| definition, rather than being evaluated wherever it is referenced
 | |
| and using the value of \c{$} at the point of reference.
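To make the contrast with a preprocessor definition concrete, here is a small sketch. The name \c{curlen} is hypothetical and not part of the example above.

```nasm
; Sketch only: assemble with e.g. "nasm -f bin".
        bits 16

message db      'hello, world'
msglen  equ     $ - message     ; evaluated once, here: 12
%define curlen  ($ - message)   ; preprocessor definition: expanded at each use

        dw      msglen          ; stores 12
        dw      curlen          ; stores 14: $ is now 14 bytes past message
```

The second \c{dw} starts at offset 14 because the 12-byte string and the first \c{dw} precede it, so the expanded expression yields 14 rather than 12.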
 | |
| 
 | |
| 
 | |
| \S{times} \i\c{TIMES}: \i{Repeating} Instructions or Data
 | |
| 
 | |
| The \c{TIMES} prefix causes the instruction to be assembled multiple
 | |
| times. This is partly present as NASM's equivalent of the \i\c{DUP}
 | |
| syntax supported by \i{MASM}-compatible assemblers, in that you can
 | |
| code
 | |
| 
 | |
| \c zerobuf:        times 64 db 0
 | |
| 
 | |
| or similar things; but \c{TIMES} is more versatile than that. The
 | |
| argument to \c{TIMES} is not just a numeric constant, but a numeric
 | |
| \e{expression}, so you can do things like
 | |
| 
 | |
| \c buffer: db      'hello, world'
 | |
| \c         times 64-$+buffer db ' '
 | |
| 
 | |
| which will store exactly enough spaces to make the total length of
 | |
| \c{buffer} up to 64. Finally, \c{TIMES} can be applied to ordinary
 | |
| instructions, so you can code trivial \i{unrolled loops} in it:
 | |
| 
 | |
| \c         times 100 movsb
 | |
| 
 | |
| Note that there is no effective difference between \c{times 100 resb
 | |
| 1} and \c{resb 100}, except that the latter will be assembled about
 | |
| 100 times faster due to the internal structure of the assembler.
 | |
| 
 | |
| The operand to \c{TIMES} is a critical expression (\k{crit}).
 | |
| 
 | |
| Note also that \c{TIMES} can't be applied to \i{macros}: the reason
 | |
| for this is that \c{TIMES} is processed after the macro phase, which
 | |
| allows the argument to \c{TIMES} to contain expressions such as
 | |
| \c{64-$+buffer} as above. To repeat more than one line of code, or a
 | |
| complex macro, use the preprocessor \i\c{%rep} directive.
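A minimal sketch of repeating more than one line with \c{%rep}, which a single \c{TIMES} prefix cannot do:

```nasm
; Sketch only: assemble with e.g. "nasm -f bin".
        bits 16

%rep 4                          ; repeat the enclosed lines four times
        lodsb
        stosw
%endrep
```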
 | |
| 
 | |
| 
 | |
| \H{effaddr} Effective Addresses
 | |
| 
 | |
| An \i{effective address} is any operand to an instruction which
 | |
| \I{memory reference}references memory. Effective addresses, in NASM,
 | |
| have a very simple syntax: they consist of an expression evaluating
 | |
| to the desired address, enclosed in \i{square brackets}. For
 | |
| example:
 | |
| 
 | |
| \c wordvar dw      123
 | |
| \c         mov     ax,[wordvar]
 | |
| \c         mov     ax,[wordvar+1]
 | |
| \c         mov     ax,[es:wordvar+bx]
 | |
| 
 | |
| Anything not conforming to this simple system is not a valid memory
 | |
| reference in NASM, for example \c{es:wordvar[bx]}.
 | |
| 
 | |
| More complicated effective addresses, such as those involving more
 | |
| than one register, work in exactly the same way:
 | |
| 
 | |
| \c         mov     eax,[ebx*2+ecx+offset]
 | |
| \c         mov     ax,[bp+di+8]
 | |
| 
 | |
| NASM is capable of doing \i{algebra} on these effective addresses,
 | |
| so that things which don't necessarily \e{look} legal are perfectly
 | |
| all right:
 | |
| 
 | |
| \c     mov     eax,[ebx*5]             ; assembles as [ebx*4+ebx]
 | |
| \c     mov     eax,[label1*2-label2]   ; ie [label1+(label1-label2)]
 | |
| 
 | |
| Some forms of effective address have more than one assembled form;
 | |
| in most such cases NASM will generate the smallest form it can. For
 | |
| example, there are distinct assembled forms for the 32-bit effective
 | |
| addresses \c{[eax*2+0]} and \c{[eax+eax]}, and NASM will generally
 | |
| generate the latter on the grounds that the former requires four
 | |
| bytes to store a zero offset.
 | |
| 
 | |
| NASM has a hinting mechanism which will cause \c{[eax+ebx]} and
 | |
| \c{[ebx+eax]} to generate different opcodes; this is occasionally
 | |
| useful because \c{[esi+ebp]} and \c{[ebp+esi]} have different
 | |
| default segment registers.
 | |
| 
 | |
| However, you can force NASM to generate an effective address in a
 | |
| particular form by the use of the keywords \c{BYTE}, \c{WORD},
 | |
| \c{DWORD} and \c{NOSPLIT}. If you need \c{[eax+3]} to be assembled
 | |
| using a double-word offset field instead of the one byte NASM will
 | |
| normally generate, you can code \c{[dword eax+3]}. Similarly, you
 | |
| can force NASM to use a byte offset for a small value which it
 | |
| hasn't seen on the first pass (see \k{crit} for an example of such a
 | |
| code fragment) by using \c{[byte eax+offset]}. As special cases,
 | |
| \c{[byte eax]} will code \c{[eax+0]} with a byte offset of zero, and
 | |
| \c{[dword eax]} will code it with a double-word offset of zero. The
 | |
| normal form, \c{[eax]}, will be coded with no offset field.
 | |
| 
 | |
| The form described in the previous paragraph is also useful if you
 | |
| are trying to access data in a 32-bit segment from within 16-bit code.
 | |
| For more information on this see the section on mixed-size addressing
 | |
| (\k{mixaddr}). In particular, if you need to access data at a known
 | |
| offset that is too large to fit in a 16-bit value, you must specify
 | |
| that it is a dword offset; otherwise NASM will discard the high word
 | |
| of the offset.
 | |
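The offset-size overrides described above can be sketched for 32-bit code as follows. The byte sequences in the comments are the standard encodings of these forms, given here for orientation.

```nasm
; Sketch only: assemble with e.g. "nasm -f bin".
        bits 32

        mov     eax, [eax+3]            ; 8B 40 03           byte offset (default)
        mov     eax, [dword eax+3]      ; 8B 80 03 00 00 00  forced dword offset
        mov     eax, [byte eax]         ; 8B 40 00           byte offset of zero
        mov     eax, [dword eax]        ; 8B 80 00 00 00 00  dword offset of zero
        mov     eax, [eax]              ; 8B 00              no offset field
```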
| 
 | |
| Similarly, NASM will split \c{[eax*2]} into \c{[eax+eax]} because
 | |
| that allows the offset field to be absent and space to be saved; in
 | |
| fact, it will also split \c{[eax*2+offset]} into
 | |
| \c{[eax+eax+offset]}. You can combat this behaviour by the use of
 | |
| the \c{NOSPLIT} keyword: \c{[nosplit eax*2]} will force
 | |
| \c{[eax*2+0]} to be generated literally. \c{[nosplit eax*1]} also has the
 | |
| same effect. Alternatively, a split EA form \c{[0, eax*2]} can be used.
 | |
| However, \c{NOSPLIT} in \c{[nosplit eax+eax]} will be ignored because the
 | |
| user's intention here is taken to be \c{[eax+eax]}.
 | |
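A short sketch of the splitting behaviour and of \c{NOSPLIT}, following the description above:

```nasm
; Sketch only: assemble with e.g. "nasm -f bin".
        bits 32

        mov     eax, [eax*2]            ; split into [eax+eax]
        mov     eax, [nosplit eax*2]    ; kept as [eax*2+0]
        mov     eax, [nosplit eax+eax]  ; NOSPLIT ignored: assembled as [eax+eax]
```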
| 
 | |
| In 64-bit mode, NASM will by default generate absolute addresses.  The
 | |
| \i\c{REL} keyword makes it produce \c{RIP}-relative addresses. Since
 | |
| this is usually the desired behaviour, see the \c{DEFAULT}
 | |
| directive (\k{default}). The keyword \i\c{ABS} overrides \i\c{REL}.
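The following sketch shows \c{REL}, \c{ABS} and \c{DEFAULT REL} together. The symbol \c{counter}, the section layout and the output format are assumptions made for illustration only.

```nasm
; Sketch only: assemble with e.g. "nasm -f elf64 rel.asm"
; (the file name is hypothetical).
        bits 64

section .data
counter: dq     0

section .text
        mov     rax, [counter]          ; absolute, per the default described above
        mov     rax, [rel counter]      ; RIP-relative

        default rel                     ; from here on, RIP-relative is the default
        mov     rax, [counter]          ; now RIP-relative
        mov     rax, [abs counter]      ; explicitly absolute again
```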
 | |
| 
 | |
| A new syntax for split EA (effective addressing) is also supported in NASM. It's
 | |
| mainly intended for MPX instructions that use the MIB operands, but it can be
 | |
| used for any memory reference. It's described in more detail in \k{spliteas}.
 | |
| 
 | |
| When a broadcasting decorator is used, the operand-size keyword should
 | |
| match the size of each element.
 | |
| 
 | |
| \c      VDIVPS zmm4, zmm5, dword [rbx]{1to16}   ; single-precision float
 | |
| \c      VDIVPS zmm4, zmm5, zword [rbx]          ; packed 512 bit memory
 | |
| 
 | |
| 
 | |
| \H{const} \i{Constants}
 | |
| 
 | |
| NASM understands four different types of constant: numeric,
 | |
| character, string and floating-point.
 | |
| 
 | |
| 
 | |
| \S{numconst} \i{Numeric Constants}
 | |
| 
 | |
| A numeric constant is simply a number. NASM allows you to specify
 | |
| numbers in a variety of number bases, in a variety of ways: you can
 | |
| suffix \c{H} or \c{X}, \c{D} or \c{T}, \c{Q} or \c{O}, and \c{B} or
 | |
| \c{Y} for \i{hexadecimal}, \i{decimal}, \i{octal} and \i{binary}
 | |
| respectively, or you can prefix \c{0h} or \c{0x}, \c{0d} or \c{0t},
 | |
| \c{0q} or \c{0o}, and \c{0b} or \c{0y} in the style of C.  Please note
 | |
| that unlike C, a \c{0} prefix by itself does \e{not} imply an octal
 | |
| constant (this is deprecated in C23.)
 | |
| 
 | |
| Previous versions of NASM allowed prefixing \c{$} for hexadecimal in
 | |
| the style of Borland Pascal or Motorola Assemblers.  Unfortunately
 | |
| though, the \I{$, prefix}\c{$} prefix does double duty as a prefix on
 | |
| identifiers (see \k{syntax}), so a hex number prefixed with a \c{$}
 | |
| sign would have to have a digit after the \c{$} rather than a letter,
 | |
| which is \e{not} what users would typically expect. This syntax is
 | |
| strongly deprecated, and can be disabled entirely with the
 | |
| \c{[DOLLARHEX]} directive, see \k{dollarhex}.
 | |
| 
 | |
| Numeric constants can have underscores (\c{_}) interspersed to break
 | |
| up long strings.
 | |
| 
 | |
| Some examples (all producing exactly the same code):
 | |
| 
 | |
| \c         mov     ax,200          ; decimal
 | |
| \c         mov     ax,0200         ; still decimal
 | |
| \c         mov     ax,0200d        ; explicitly decimal
 | |
| \c         mov     ax,0d200        ; also decimal
 | |
| \c         mov     ax,0c8h         ; hex
 | |
| \c         mov     ax,0xc8         ; hex yet again
 | |
| \c         mov     ax,0hc8         ; still hex
 | |
| \c         mov     ax,310q         ; octal
 | |
| \c         mov     ax,310o         ; octal again
 | |
| \c         mov     ax,0o310        ; octal yet again
 | |
| \c         mov     ax,0q310        ; octal yet again
 | |
| \c         mov     ax,11001000b    ; binary
 | |
| \c         mov     ax,1100_1000b   ; same binary constant
 | |
| \c         mov     ax,1100_1000y   ; same binary constant once more
 | |
| \c         mov     ax,0b1100_1000  ; same binary constant yet again
 | |
| \c         mov     ax,0y1100_1000  ; same binary constant yet again
 | |
| \c
 | |
| \c         ; Deprecated syntax:
 | |
| \c         mov     ax,$0c8         ; hex again: the 0 is required
 | |
| 
 | |
| \S{strings} \I{string}\I{string constants}\i{Character Strings}
 | |
| 
 | |
| A character string consists of up to eight characters enclosed in
 | |
| either single quotes (\c{'...'}), double quotes (\c{"..."}) or
 | |
| backquotes (\c{`...`}).  Single or double quotes are equivalent in
 | |
| NASM (except of course that surrounding the constant with single
 | |
| quotes allows double quotes to appear within it and vice versa); the
 | |
| contents of those are represented verbatim.  Strings enclosed in
 | |
| backquotes support C-style \c{\\}-escapes for special characters.
 | |
| 
 | |
| 
 | |
| The following \i{escape sequences} are recognized by backquoted strings:
 | |
| 
 | |
| \c       \'          single quote (')
 | |
| \c       \"          double quote (")
 | |
| \c       \`          backquote (`)
 | |
| \c       \\          backslash (\)
 | |
| \c       \?          question mark (?)
 | |
| \c       \a          BEL (ASCII 7)
 | |
| \c       \b          BS  (ASCII 8)
 | |
| \c       \t          TAB (ASCII 9)
 | |
| \c       \n          LF  (ASCII 10)
 | |
| \c       \v          VT  (ASCII 11)
 | |
| \c       \f          FF  (ASCII 12)
 | |
| \c       \r          CR  (ASCII 13)
 | |
| \c       \e          ESC (ASCII 27)
 | |
| \c       \377        Up to 3 octal digits - literal byte
 | |
| \c       \xFF        Up to 2 hexadecimal digits - literal byte
 | |
| \c       \u1234      4 hexadecimal digits - Unicode character
 | |
| \c       \U12345678  8 hexadecimal digits - Unicode character
 | |
| 
 | |
| All other escape sequences are reserved.  Note that \c{\\0}, meaning a
 | |
| \c{NUL} character (ASCII 0), is a special case of the octal escape
 | |
| sequence.
 | |
| 
 | |
| \i{Unicode} characters specified with \c{\\u} or \c{\\U} are converted to
 | |
| \i{UTF-8}.  For example, the following lines are all equivalent:
 | |
| 
 | |
| \c       db `\u263a`            ; UTF-8 smiley face
 | |
| \c       db `\xe2\x98\xba`      ; UTF-8 smiley face
 | |
| \c       db 0E2h, 098h, 0BAh    ; UTF-8 smiley face
 | |
| 
 | |
| 
 | |
| \S{chrconst} \i{Character Constants}
 | |
| 
 | |
| A character constant consists of a string up to eight bytes long, used
 | |
| in an expression context.  It is treated as if it was an integer.
 | |
| 
 | |
| A character constant with more than one byte will be arranged
 | |
| with \i{little-endian} order in mind: if you code
 | |
| 
 | |
| \c           mov eax,'abcd'
 | |
| 
 | |
| then the constant generated is not \c{0x61626364}, but
 | |
| \c{0x64636261}, so that if you were then to store the value into
 | |
| memory, it would read \c{abcd} rather than \c{dcba}. This is also
 | |
| the sense of character constants understood by the Pentium's
 | |
| \i\c{CPUID} instruction.
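As a sketch of why this byte order is convenient, the vendor string returned by \c{CPUID} leaf 0 can be compared against four-character constants directly. The label \c{check_intel} and the return convention (1 in \c{EAX} on a match, 0 otherwise) are hypothetical.

```nasm
; Sketch only: assemble with e.g. "nasm -f bin".
        bits 32

check_intel:
        xor     eax, eax        ; CPUID leaf 0: vendor identification
        cpuid
        cmp     ebx, 'Genu'     ; EBX, EDX, ECX together hold "GenuineIntel"
        jne     .other
        cmp     edx, 'ineI'
        jne     .other
        cmp     ecx, 'ntel'
        jne     .other
        mov     eax, 1          ; vendor string matched
        ret
.other: xor     eax, eax        ; some other vendor
        ret
```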
 | |
| 
 | |
| 
 | |
| \S{strconst} \i{String Constants}
 | |
| 
 | |
| String constants are character strings used in the context of some
 | |
| pseudo-instructions, namely the
 | |
| \I\c{DW}\I\c{DD}\I\c{DQ}\I\c{DT}\I\c{DO}\I\c{DY}\i\c{DB} family and
 | |
| \i\c{INCBIN} (where it represents a filename). They are also used in
 | |
| certain preprocessor directives.
 | |
| 
 | |
| A string constant looks like a character constant, only longer. It
 | |
| is treated as a concatenation of maximum-size character constants
 | |
| for the conditions. So the following are equivalent:
 | |
| 
 | |
| \c       db    'hello'               ; string constant
 | |
| \c       db    'h','e','l','l','o'   ; equivalent character constants
 | |
| 
 | |
| And the following are also equivalent:
 | |
| 
 | |
| \c       dd    'ninechars'           ; doubleword string constant
 | |
| \c       dd    'nine','char','s'     ; becomes three doublewords
 | |
| \c       db    'ninechars',0,0,0     ; and really looks like this
 | |
| 
 | |
| Note that when used in a string-supporting context, quoted strings are
 | |
| treated as string constants even if they are short enough to be a
 | |
| character constant, because otherwise \c{db 'ab'} would have the same
 | |
| effect as \c{db 'a'}, which would be silly. Similarly, three-character
 | |
| or four-character constants are treated as strings when they are
 | |
| operands to \c{DW}, and so forth.
 | |
| 
 | |
| \S{unicode} \I{UTF-8}\I{UTF-16}\I{UTF-32}\i{Unicode} Strings
 | |
| 
 | |
| The special operators \i\c{__?utf16?__}, \i\c{__?utf16le?__},
 | |
| \i\c{__?utf16be?__}, \i\c{__?utf32?__}, \i\c{__?utf32le?__} and
 | |
| \i\c{__?utf32be?__} allow definition of Unicode strings.  They take a
 | |
| string in UTF-8 format and convert it to UTF-16 or UTF-32,
 | |
| respectively.  Unless the \c{be} forms are specified, the output is
 | |
| little endian.
 | |
| 
 | |
| For example:
 | |
| 
 | |
| \c %define u(x) __?utf16?__(x)
 | |
| \c %define w(x) __?utf32?__(x)
 | |
| \c
 | |
| \c       dw u('C:\WINDOWS'), 0       ; Pathname in UTF-16
 | |
| \c       dd w(`A + B = \u206a`), 0   ; String in UTF-32
 | |
| 
 | |
| The UTF operators can be applied either to strings passed to the
 | |
| \c{DB} family instructions, or to character constants in an expression
 | |
| context.
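Both uses can be sketched as follows. The labels are hypothetical, and the fragment assumes a NASM version that provides the \c{__?utf16?__} spelling of the operators.

```nasm
; Sketch only: assemble with e.g. "nasm -f bin".
        bits 16

        mov     ax, __?utf16?__('A')        ; character constant in an expression
wide:   dw      __?utf16?__('hello'), 0     ; string operand to dw, little endian
big:    dw      __?utf16be?__('hello'), 0   ; same text, big-endian code units
```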
 | |
| 
 | |
| \S{fltconst} \i{Floating-Point Constants}
 | |
| 
 | |
| \i{Floating-point} constants are acceptable only as arguments to
 | |
| \i\c{DB}, \i\c{DW}, \i\c{DD}, \i\c{DQ}, \i\c{DT}, and \i\c{DO}, or as
 | |
| arguments to the special operators \i\c{__?float8?__},
 | |
| \i\c{__?float16?__}, \i\c{__?bfloat16?__}, \i\c{__?float32?__},
 | |
| \i\c{__?float64?__}, \i\c{__?float80m?__}, \i\c{__?float80e?__},
 | |
| \i\c{__?float128l?__}, and \i\c{__?float128h?__}. See also \k{pkg_fp}.
 | |
| 
 | |
| Floating-point constants are expressed in the traditional form:
 | |
| digits, then a period, then optionally more digits, then optionally an
 | |
| \c{E} followed by an exponent. The period is mandatory, so that NASM
 | |
| can distinguish between \c{dd 1}, which declares an integer constant,
 | |
| and \c{dd 1.0} which declares a floating-point constant.
 | |
| 
 | |
| NASM also supports C99-style hexadecimal floating-point: \c{0x},
 | |
| hexadecimal digits, period, optionally more hexadecimal digits, then
 | |
| optionally a \c{P} followed by a \e{binary} (not hexadecimal) exponent
 | |
| in decimal notation.  As an extension, NASM additionally supports the
 | |
| \c{0h} and \c{$} prefixes for hexadecimal, as well as binary and octal
 | |
| floating-point, using the \c{0b} or \c{0y} and \c{0o} or \c{0q}
 | |
| prefixes, respectively. As with integers, the \c{$} prefix for
 | |
| hexadecimal is deprecated.
 | |
| 
 | |
| Underscores to break up groups of digits are permitted in
 | |
| floating-point constants as well.
 | |
| 
 | |
| Some examples:
 | |
| 
 | |
| \c       db    -0.2                    ; "Quarter precision"
 | |
| \c       dw    -0.5                    ; IEEE 754r/SSE5 half precision
 | |
| \c       dd    1.2                     ; an easy one
 | |
| \c       dd    1.222_222_222           ; underscores are permitted
 | |
| \c       dd    0x1p+2                  ; 1.0x2^2 = 4.0
 | |
| \c       dq    0x1p+32                 ; 1.0x2^32 = 4 294 967 296.0
 | |
| \c       dq    1.e10                   ; 10 000 000 000.0
 | |
| \c       dq    1.e+10                  ; synonymous with 1.e10
 | |
| \c       dq    1.e-10                  ; 0.000 000 000 1
 | |
| \c       dt    3.141592653589793238462 ; pi
 | |
| \c       do    1.e+4000                ; IEEE 754r quad precision
 | |
| 
 | |
| The 8-bit "quarter-precision" floating-point format is
 | |
| sign:exponent:mantissa = 1:4:3 with an exponent bias of 7.  This
 | |
| appears to be the most frequently used 8-bit floating-point format,
 | |
| although it is not covered by any formal standard.  This is sometimes
 | |
| called a "\i{minifloat}."
 | |
| 
 | |
| The \i\c{bfloat16} format is effectively a compressed version of the
 | |
| 32-bit single precision format, with a reduced mantissa. It is
 | |
| effectively the same as truncating the 32-bit format to the upper 16
 | |
| bits, except for rounding. There is no \c{D}\e{x} directive that
 | |
| corresponds to \c{bfloat16}, as it has the same size as the
 | |
| IEEE standard 16-bit half precision format; see, however, \k{pkg_fp}.
 | |
| 
 | |
| The special operators are used to produce floating-point numbers in
 | |
| other contexts.  They produce the binary representation of a specific
 | |
| floating-point number as an integer, and can be used anywhere integer
 | |
| constants are used in an expression.  \c{__?float80m?__} and
 | |
| \c{__?float80e?__} produce the 64-bit mantissa and 16-bit exponent of an
 | |
| 80-bit floating-point number, and \c{__?float128l?__} and
 | |
| \c{__?float128h?__} produce the lower and upper 64-bit halves of a 128-bit
 | |
| floating-point number, respectively.
 | |
| 
 | |
| For example:
 | |
| 
 | |
| \c       mov    rax,__?float64?__(3.141592653589793238462)
 | |
| 
 | |
| ... would assign the binary representation of pi as a 64-bit floating
 | |
| point number into \c{RAX}.  This is exactly equivalent to:
 | |
| 
 | |
| \c       mov    rax,0x400921fb54442d18
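The split operators for 80-bit values can be sketched in the same spirit. The labels are hypothetical; the assumption here is that emitting the mantissa followed by the exponent word reproduces the ten bytes that \c{DT} would emit for the same constant.

```nasm
; Sketch only: assemble with e.g. "nasm -f bin".
        bits 64

pi_dt:  dt      3.141592653589793238462                 ; ten bytes in one go

pi_m:   dq      __?float80m?__(3.141592653589793238462) ; 64-bit mantissa
pi_e:   dw      __?float80e?__(3.141592653589793238462) ; 16-bit exponent word
```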
 | |
| 
 | |
| NASM cannot do compile-time arithmetic on floating-point constants.
 | |
| This is because NASM is designed to be portable - although it always
 | |
| generates code to run on x86 processors, the assembler itself can
 | |
| run on any system with an ANSI C compiler. Therefore, the assembler
 | |
| cannot guarantee the presence of a floating-point unit capable of
 | |
| handling the \i{Intel number formats}, and so for NASM to be able to
 | |
| do floating arithmetic it would have to include its own complete set
 | |
| of floating-point routines, which would significantly increase the
 | |
| size of the assembler for very little benefit.
 | |
| 
 | |
| The special tokens \i\c{__?Infinity?__}, \i\c{__?QNaN?__} (or
 | |
| \i\c{__?NaN?__}) and \i\c{__?SNaN?__} can be used to generate
 | |
| \I{infinity}infinities, quiet \i{NaN}s, and signalling NaNs,
 | |
| respectively.  These are normally used as macros:
 | |
| 
 | |
| \c %define Inf __?Infinity?__
 | |
| \c %define NaN __?QNaN?__
 | |
| \c
 | |
| \c       dq    +1.5, -Inf, NaN         ; Double-precision constants
 | |
| 
 | |
| The \c{%use fp} standard macro package contains a set of convenience
 | |
| macros.  See \k{pkg_fp}.
 | |
| 
 | |
| \S{bcdconst} \I{floating-point, packed BCD constants}Packed BCD Constants
 | |
| 
 | |
| x87-style packed BCD constants can be used in the same contexts as
 | |
| 80-bit floating-point numbers.  They are suffixed with \c{p} or
 | |
| prefixed with \c{0p}, and can include up to 18 decimal digits.
 | |
| 
 | |
| As with other numeric constants, underscores can be used to separate
 | |
| digits.
 | |
| 
 | |
| For example:
 | |
| 
 | |
| \c       dt 12_345_678_901_245_678p
 | |
| \c       dt -12_345_678_901_245_678p
 | |
| \c       dt +0p33
 | |
| \c       dt 33p
 | |
| 
 | |
| 
 | |
| \H{expr} \i{Expressions}
 | |
| 
 | |
| Expressions in NASM are similar in syntax to those in C.  Expressions
 | |
| are evaluated as 64-bit integers which are then adjusted to the
 | |
| appropriate size.
 | |
| 
 | |
| NASM supports two special tokens in expressions, allowing
 | |
| calculations to involve the current assembly position: the
 | |
| \I{$, here}\c{$} and \i\c{$$} tokens. \c{$} evaluates to the assembly
 | |
| position at the beginning of the line containing the expression; so
 | |
| you can code an \i{infinite loop} using \c{JMP $}. \c{$$} evaluates
 | |
| to the beginning of the current section; so you can tell how far
 | |
| into the section you are by using \c{($-$$)}.
 | |
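| 
 | |
| As a brief sketch of both tokens in use, the following computes the
 | |
| length of a string and then pads the current section out to 512
 | |
| bytes. The names \c{message} and \c{msglen} are illustrative only, and
 | |
| the example assumes the section holds fewer than 512 bytes so far:
 | |
| 
 | |
| \c message db    'hello, world'     ; illustrative label
 | |
| \c msglen  equ   $-message          ; length of the string in bytes
 | |
| \c
 | |
| \c         times 512-($-$$) db 0    ; pad the section to 512 bytes
 | |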
| 
 | |
| The arithmetic \i{operators} provided by NASM are listed here, in
 | |
| increasing order of \i{precedence}.
 | |
| 
 | |
| A \e{boolean} value is true if nonzero and false if zero. The
 | |
| operators which return a boolean value always return 1 for true and 0
 | |
| for false.
 | |
| 
 | |
| 
 | |
| \S{exptri} \I{?op}\c{?} ... \c{:}: Conditional Operator
 | |
| 
 | |
| The syntax of this operator, similar to the C conditional operator, is:
 | |
| 
 | |
| \e{boolean} \c{?} \e{trueval} \c{:} \e{falseval}
 | |
| 
 | |
| This operator evaluates to \e{trueval} if \e{boolean} is true,
 | |
| otherwise to \e{falseval}.
 | |
| 
 | |
| Note that NASM allows \c{?} characters in symbol names. Therefore, it
 | |
| is highly advisable to always put spaces around the \c{?} and \c{:}
 | |
| characters.
 | |
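| 
 | |
| For instance, assuming a single-line macro called \c{DEBUG} (a
 | |
| hypothetical name used only for this sketch), the operator can select
 | |
| between two immediate values. Note the spaces around \c{?} and \c{:};
 | |
| without them \c{DEBUG?} would be read as a single symbol name:
 | |
| 
 | |
| \c %define DEBUG 1                    ; hypothetical macro
 | |
| \c
 | |
| \c       mov   eax, DEBUG ? 0x100 : 0x200   ; 0x100 while DEBUG != 0
 | |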
| 
 | |
| 
 | |
| \S{expbor} \i\c{||}: \i{Boolean OR} Operator
 | |
| 
 | |
| The \c{||} operator gives a boolean OR: it evaluates to 1 if either side of
 | |
| the expression is nonzero, otherwise 0.
 | |
| 
 | |
| 
 | |
| \S{expbxor} \i\c{^^}: \i{Boolean XOR} Operator
 | |
| 
 | |
| The \c{^^} operator gives a boolean XOR: it evaluates to 1 if exactly one
 | |
| side of the expression is nonzero, otherwise 0.
 | |
| 
 | |
| 
 | |
| \S{expband} \i\c{&&}: \i{Boolean AND} Operator
 | |
| 
 | |
| The \c{&&} operator gives a boolean AND: it evaluates to 1 if both sides of
 | |
| the expression are nonzero, otherwise 0.
 | |
| 
 | |
| 
 | |
| \S{exprel} \i{Comparison Operators}
 | |
| 
 | |
| NASM supports the following comparison operators:
 | |
| 
 | |
| \b \i\c{=} or \i\c{==} compare for equality.
 | |
| 
 | |
| \b \i\c{!=} or \i\c{<>} compare for inequality.
 | |
| 
 | |
| \b \i\c{<} compares signed less than.
 | |
| 
 | |
| \b \i\c{<=} compares signed less than or equal.
 | |
| 
 | |
| \b \i\c{>} compares signed greater than.
 | |
| 
 | |
| \b \i\c{>=} compares signed greater than or equal.
 | |
| 
 | |
| These operators evaluate to 0 for false or 1 for true.
 | |
| 
 | |
| \b \i\c{<=>} does a signed comparison, and evaluates to -1 for less
 | |
| than, 0 for equal, and 1 for greater than.
 | |
| 
 | |
| At this time, NASM does not provide unsigned comparison operators.
 | |
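| 
 | |
| A few constant expressions illustrate these rules. This is only a
 | |
| sketch; the operand values are arbitrary, and the results in the
 | |
| comments follow from the definitions above:
 | |
| 
 | |
| \c       db    3 < 5           ; 1 (true)
 | |
| \c       db    5 == 3          ; 0 (false)
 | |
| \c       db    -1 < 1          ; 1, since the comparison is signed
 | |
| \c       db    3 <=> 5         ; -1, stored as the byte 0xFF
 | |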
| 
 | |
| 
 | |
| \S{expor} \i\c{|}: \i{Bitwise OR} Operator
 | |
| 
 | |
| The \c{|} operator gives a bitwise OR, exactly as performed by the
 | |
| \c{OR} machine instruction.
 | |
| 
 | |
| 
 | |
| \S{expxor} \i\c{^}: \i{Bitwise XOR} Operator
 | |
| 
 | |
| \c{^} provides the bitwise XOR operation.
 | |
| 
 | |
| 
 | |
| \S{expand} \i\c{&}: \i{Bitwise AND} Operator
 | |
| 
 | |
| \c{&} provides the bitwise AND operation.
 | |
| 
 | |
| 
 | |
| \S{expshift} \i{Bit Shift} Operators
 | |
| 
 | |
| \i\c{<<} gives a bit-shift to the left, just as it does in C. So
 | |
| \c{5<<3} evaluates to 5 times 8, or 40. \i\c{>>} gives an \I{unsigned,
 | |
| bit shift}\e{unsigned} (logical) bit-shift to the right; the bits
 | |
| shifted in from the left are set to zero.
 | |
| 
 | |
| \i\c{<<<} gives a bit-shift to the left, exactly equivalent to the
 | |
| \c{<<} operator; it is included for completeness. \i\c{>>>} gives an
 | |
| \I{signed, bit shift}\e{signed} (arithmetic) bit-shift to the right;
 | |
| the bits shifted in from the left are filled with copies of the most
 | |
| significant (sign) bit.
 | |
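| 
 | |
| The difference between the two right shifts only shows up for values
 | |
| with the top bit set. As a small sketch, using an arbitrary negative
 | |
| constant evaluated as a 64-bit integer:
 | |
| 
 | |
| \c       dq    -16 >> 2        ; logical:    0x3FFFFFFFFFFFFFFC
 | |
| \c       dq    -16 >>> 2       ; arithmetic: 0xFFFFFFFFFFFFFFFC (-4)
 | |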
| 
 | |
| 
 | |
| \S{expplmi} \I{+ opaddition}\c{+} and \I{- opsubtraction}\c{-}:
 | |
| \i{Addition} and \i{Subtraction} Operators
 | |
| 
 | |
| The \c{+} and \c{-} operators do perfectly ordinary addition and
 | |
| subtraction.
 | |
| 
 | |
| 
 | |
| \S{expmul} \i{Multiplication}, \i{Division} and \i{Modulo}
 | |
| 
 | |
| \i\c{*} is the multiplication operator.
 | |
| 
 | |
| \i\c{/} and \i\c{//} are both division operators: \c{/} is
 | |
| \I{division, unsigned}\I{unsigned, division}unsigned division and \c{//} is
 | |
| \I{division, signed}\I{signed, division}signed division.
 | |
| 
 | |
| Similarly, \i\c{%} and \i\c{%%} provide \I{modulo,
 | |
| unsigned}\I{unsigned, modulo}unsigned and \I{modulo, signed}\I{signed,
 | |
| modulo}signed modulo operators respectively.
 | |
| 
 | |
| Since the \c{%} character is used extensively by the macro
 | |
| \i{preprocessor}, you should ensure that both the signed and unsigned
 | |
| modulo operators are followed by white space wherever they appear.
 | |
| 
 | |
| NASM, like ANSI C, provides no guarantees about the sensible
 | |
| operation of the signed modulo operator. On most systems it will match
 | |
| the signed division operator, such that:
 | |
| 
 | |
| \c      b * (a // b) + (a %% b) = a       (b != 0)
 | |
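| 
 | |
| The following sketch contrasts the signed and unsigned forms on an
 | |
| arbitrary negative operand. Results are given only where they follow
 | |
| directly from the definitions above, since the signed results are not
 | |
| guaranteed:
 | |
| 
 | |
| \c       dq    -7 / 2          ; unsigned: -7 is read as 2^64 - 7
 | |
| \c       dq    -7 // 2         ; signed division
 | |
| \c       dq    -7 %% 2         ; signed modulo, followed by a space
 | |
| \c       dq    7 % 2           ; unsigned modulo: 1
 | |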
| 
 | |
| 
 | |
| \S{expmul} \I{operators, unary}\i{Unary Operators}
 | |
| 
 | |
| The highest-priority operators in NASM's expression grammar are those
 | |
| which only apply to one argument.  These are:
 | |
| 
 | |
| \b \I{- opunary}\c{-} \I{arithmetic negation}negates (\i{2's complement}) its
 | |
| operand.
 | |
| 
 | |
| \b \I{+ opunary}\c{+} does nothing; it's provided for symmetry with \c{-}.
 | |
| 
 | |
| \b \I{~ opunary}\c{~} computes the \I{negation, bitwise}\i{bitwise
 | |
| negation} (\i{1's complement}) of its operand.
 | |
| 
 | |
| \b \I{! opunary}\c{!} is the \I{negation, boolean}\i{boolean negation}
 | |
| operator. It evaluates to 1 if the argument is 0, otherwise 0.
 | |
| 
 | |
| \b \c{SEG} provides the \i{segment address} of its operand (explained in
 | |
| more detail in \k{segwrt}).
 | |
| 
 | |
| \b A set of additional operators with leading and trailing double
 | |
| underscores are used to implement the \i{integer functions} of the
 | |
| \c{ifunc} macro package, see \k{pkg_ifunc}.
 | |
| 
 | |
| 
 | |
| \H{segwrt} \i\c{SEG} and \i\c{WRT}
 | |
| 
 | |
| When writing large 16-bit programs, which must be split into
 | |
| multiple \i{segments}, it is often necessary to be able to refer to
 | |
| the \I{segment address}segment part of the address of a symbol. NASM
 | |
| supports the \c{SEG} operator to perform this function.
 | |
| 
 | |
| The \c{SEG} operator evaluates to the \i\e{preferred} segment base of a
 | |
| symbol, defined as the segment base relative to which the offset of
 | |
| the symbol makes sense. So the code
 | |
| 
 | |
| \c         mov     ax,seg symbol
 | |
| \c         mov     es,ax
 | |
| \c         mov     bx,symbol
 | |
| 
 | |
| will load \c{ES:BX} with a valid pointer to the symbol \c{symbol}.
 | |
| 
 | |
| Things can be more complex than this: since 16-bit segments and
 | |
| \i{groups} may \I{overlapping segments}overlap, you might occasionally
 | |
| want to refer to some symbol using a different segment base from the
 | |
| preferred one. NASM lets you do this, by the use of the \c{WRT}
 | |
| (With Reference To) keyword. So you can do things like
 | |
| 
 | |
| \c         mov     ax,weird_seg        ; weird_seg is a segment base
 | |
| \c         mov     es,ax
 | |
| \c         mov     bx,symbol wrt weird_seg
 | |
| 
 | |
| to load \c{ES:BX} with a different, but functionally equivalent,
 | |
| pointer to the symbol \c{symbol}.
 | |
| 
 | |
| The \c{WRT} keyword is also used in far (inter-segment) calls and jumps. It is
 | |
| synonymous with the
 | |
| 
 | |
| \c	call far procedure
 | |
| 
 | |
| syntax which is documented in \k{farcall}.
 | |
| 
 | |
| \H{strict} \i\c{STRICT}: Inhibiting Optimization
 | |
| 
 | |
| When assembling with the optimizer set to level 2 or higher (see
 | |
| \k{opt-O}), NASM will use size specifiers (\c{BYTE}, \c{WORD},
 | |
| \c{DWORD}, \c{QWORD}, \c{TWORD}, \c{OWORD}, \c{YWORD} or \c{ZWORD}),
 | |
| but will give them the smallest possible size. The keyword \c{STRICT}
 | |
| can be used to inhibit optimization and force a particular operand to
 | |
| be emitted in the specified size. For example, with the optimizer on,
 | |
| and in \c{BITS 16} mode,
 | |
| 
 | |
| \c         push dword 33
 | |
| 
 | |
| is encoded in three bytes \c{66 6A 21}, whereas
 | |
| 
 | |
| \c         push strict dword 33
 | |
| 
 | |
| is encoded in six bytes, with a full dword immediate operand \c{66 68
 | |
| 21 00 00 00}.
 | |
| 
 | |
| With the optimizer off, the same code (six bytes) is generated whether
 | |
| the \c{STRICT} keyword was used or not.
 | |
| 
 | |
| 
 | |
| \H{crit} \i{Critical Expressions}
 | |
| 
 | |
| Although NASM has an optional multi-pass optimizer, there are some
 | |
| expressions which must be resolvable on the first pass. These are
 | |
| called \e{Critical Expressions}.
 | |
| 
 | |
| The first pass is used to determine the size of all the assembled
 | |
| code and data, so that the second pass, when generating all the
 | |
| code, knows all the symbol addresses the code refers to. So one
 | |
| thing NASM can't handle is code whose size depends on the value of a
 | |
| symbol declared after the code in question. For example,
 | |
| 
 | |
| \c         times (label-$) db 0
 | |
| \c label:  db      'Where am I?'
 | |
| 
 | |
| The argument to \i\c{TIMES} in this case could equally legally
 | |
| evaluate to anything at all; NASM will reject this example because
 | |
| it cannot tell the size of the \c{TIMES} line when it first sees it.
 | |
| It will just as firmly reject the slightly \I{paradox}paradoxical
 | |
| code
 | |
| 
 | |
| \c         times (label-$+1) db 0
 | |
| \c label:  db      'NOW where am I?'
 | |
| 
 | |
| in which \e{any} value for the \c{TIMES} argument is by definition
 | |
| wrong!
 | |
| 
 | |
| NASM rejects these examples by means of a concept called a
 | |
| \e{critical expression}, which is defined to be an expression whose
 | |
| value is required to be computable in the first pass, and which must
 | |
| therefore depend only on symbols defined before it. The argument to
 | |
| the \c{TIMES} prefix is a critical expression.
 | |
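| 
 | |
| By contrast, a \c{TIMES} argument that depends only on symbols defined
 | |
| earlier is acceptable. As a minimal sketch (the label \c{buffer} and
 | |
| the size 64 are illustrative assumptions):
 | |
| 
 | |
| \c buffer: db    'hello, world'        ; illustrative label
 | |
| \c         times 64-($-buffer) db ' '  ; pad buffer out to 64 bytes
 | |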
| 
 | |
| \H{locallab} \i{Local Labels}
 | |
| 
 | |
| NASM gives special treatment to symbols beginning with a \i{period}.
 | |
| A label beginning with a single period is treated as a \e{local}
 | |
| label, which means that it is associated with the previous non-local
 | |
| label. So, for example:
 | |
| 
 | |
| \c label1  ; some code
 | |
| \c
 | |
| \c .loop
 | |
| \c         ; some more code
 | |
| \c
 | |
| \c         jne     .loop
 | |
| \c         ret
 | |
| \c
 | |
| \c label2  ; some code
 | |
| \c
 | |
| \c .loop
 | |
| \c         ; some more code
 | |
| \c
 | |
| \c         jne     .loop
 | |
| \c         ret
 | |
| 
 | |
| In the above code fragment, each \c{JNE} instruction jumps to the
 | |
| line immediately before it, because the two definitions of \c{.loop}
 | |
| are kept separate by virtue of each being associated with the
 | |
| previous non-local label.
 | |
| 
 | |
| This form of local label handling is borrowed from the old Amiga
 | |
| assembler \i{DevPac}; however, NASM goes one step further, in
 | |
| allowing access to local labels from other parts of the code. This
 | |
| is achieved by means of \e{defining} a local label in terms of the
 | |
| previous non-local label: the first definition of \c{.loop} above is
 | |
| really defining a symbol called \c{label1.loop}, and the second
 | |
| defines a symbol called \c{label2.loop}. So, if you really needed
 | |
| to, you could write
 | |
| 
 | |
| \c label3  ; some more code
 | |
| \c         ; and some more
 | |
| \c
 | |
| \c         jmp label1.loop
 | |
| 
 | |
| Sometimes it is useful - in a macro, for instance - to be able to
 | |
| define a label which can be referenced from anywhere but which
 | |
| doesn't interfere with the normal local-label mechanism. Such a
 | |
| label can't be non-local because it would interfere with subsequent
 | |
| definitions of, and references to, local labels; and it can't be
 | |
| local because the macro that defined it wouldn't know the label's
 | |
| full name. NASM therefore introduces a third type of label, which is
 | |
| probably only useful in macro definitions: if a label begins with
 | |
| the \I{label prefix}special prefix \i\c{..@}, then it does nothing
 | |
| to the local label mechanism. So you could code
 | |
| 
 | |
| \c label1:                         ; a non-local label
 | |
| \c .local:                         ; this is really label1.local
 | |
| \c ..@foo:                         ; this is a special symbol
 | |
| \c label2:                         ; another non-local label
 | |
| \c .local:                         ; this is really label2.local
 | |
| \c
 | |
| \c         jmp     ..@foo          ; this will jump three lines up
 | |
| 
 | |
| NASM has the capacity to define other special symbols beginning with
 | |
| a double period: for example, \c{..start} is used to specify the
 | |
| entry point in the \c{obj} output format (see \k{dotdotstart}), and
 | |
| \c{..imagebase} is used to find out the offset from a base address
 | |
| of the current image in the \c{win64} output format (see \k{win64pic}).
 | |
| So just keep in mind that symbols beginning with a double period are
 | |
| special.
 |