mirror of
				https://github.com/netwide-assembler/nasm.git
				synced 2025-10-10 00:25:06 -04:00 
			
		
		
		
	
		
			
				
	
	
		
			1498 lines
		
	
	
		
			63 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
| \C{outfmt} \i{Output Formats}
 | |
| 
 | |
| NASM is a portable assembler, designed to be able to compile on any
 | |
| ANSI C-supporting platform and produce output to run on a variety of
 | |
| Intel x86 operating systems. For this reason, it has a large number
 | |
| of available output formats, selected using the \i\c{-f} option on
 | |
| the NASM \i{command line}. Each of these formats, along with its
 | |
| extensions to the base NASM syntax, is detailed in this chapter.
 | |
| 
 | |
| As stated in \k{opt-o}, NASM chooses a \i{default name} for your
 | |
| output file based on the input file name and the chosen output
 | |
| format. This will be generated by removing the filename \i{extension}
 | |
| (\c{.asm}, \c{.s}, or whatever you like to use) from the input file
 | |
| name, and substituting an extension defined by the output format.
 | |
| The extensions are given with each format below.
 | |
| 
 | |
| 
 | |
| \H{binfmt} \i\c{bin}: \i{Flat-Form Binary}\I{pure binary} Output
 | |
| 
 | |
| The \c{bin} format does not produce object files: it generates
 | |
| nothing in the output file except the code you wrote. Such `pure
 | |
| binary' files are used by \i{MS-DOS}: \i\c{.COM} executables and
 | |
| \i\c{.SYS} device drivers are pure binary files. Pure binary output
 | |
| is also useful for \i{operating system} and \i{boot loader}
 | |
| development.
 | |
| 
 | |
| The \c{bin} format supports \i{multiple section names}. For details of
 | |
| how NASM handles sections in the \c{bin} format, see \k{multisec}.
 | |
| 
 | |
| Using the \c{bin} format puts NASM by default into 16-bit mode (see
 | |
| \k{bits}). In order to use \c{bin} to write 32-bit or 64-bit code,
 | |
| such as an OS kernel, you need to explicitly issue the \I\c{BITS}\c{BITS 32}
 | |
| or \I\c{BITS}\c{BITS 64} directive.
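 | |
| 
 | |
| As a minimal sketch, a 32-bit flat binary might begin as shown
 | |
| below. The load address \c{0x100000} and the label names are
 | |
| assumptions made for illustration; use whatever your loader expects.
 | |
| 
 | |
| \c         bits    32              ; bin defaults to 16-bit mode
 | |
| \c         org     0x100000        ; assumed load address
 | |
| \c
 | |
| \c entry:  mov     eax,0x12345678  ; assembled with 32-bit operand size
 | |
| \c hang:   jmp     hang            ; loop forever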
 | |
| 
 | |
| \c{bin} has no default output file name extension: instead, it
 | |
| leaves your file name as it is once the original extension has been
 | |
| removed. Thus, the default is for NASM to assemble \c{binprog.asm}
 | |
| into a binary file called \c{binprog}.
 | |
| 
 | |
| It is extremely important to understand that the binary output format
 | |
| is nothing other than \e{a linker built into the NASM
 | |
| executable.} As such, NASM behaves just as it does when producing any
 | |
| other output format: notably the list file reflects the code output
 | |
| \e{before} relocation, and the addresses in the list file are
 | |
| addresses relative to the start of the current output section.
 | |
| 
 | |
| 
 | |
| \S{org} \i\c{ORG}: Binary File \i{Program Origin}
 | |
| 
 | |
| The \c{bin} format provides an additional directive to the list
 | |
| given in \k{directive}: \c{ORG}. The function of the \c{ORG}
 | |
| directive is to specify the origin address which NASM will assume
 | |
| the program begins at when it is loaded into memory.
 | |
| 
 | |
| For example, the following code will generate the longword
 | |
| \c{0x00000104}:
 | |
| 
 | |
| \c         org     0x100
 | |
| \c         dd      label
 | |
| \c label:
 | |
| 
 | |
| Unlike the \c{ORG} directive provided by MASM-compatible assemblers,
 | |
| which allows you to jump around in the object file and overwrite
 | |
| code you have already generated, NASM's \c{ORG} does exactly what
 | |
| the directive says: \e{origin}. Its sole function is to specify one
 | |
| offset which is added to all internal address references within the
 | |
| section; it does not permit any of the trickery that MASM's version
 | |
| does. See \k{proborg} for further comments.
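 | |
| 
 | |
| Because \c{ORG} cannot move the output position, any padding has to
 | |
| be generated explicitly. A common idiom, sketched here for a
 | |
| hypothetical 512-byte boot sector assumed to load at \c{0x7C00},
 | |
| uses \c{TIMES} together with \c{$-$$}:
 | |
| 
 | |
| \c         org     0x7C00          ; assumed load address
 | |
| \c
 | |
| \c start:  jmp     start           ; placeholder for real code
 | |
| \c
 | |
| \c         times   510-($-$$) db 0 ; pad the output up to offset 510
 | |
| \c         dw      0xAA55          ; signature in the last two bytes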
 | |
| 
 | |
| 
 | |
| \S{binseg} \c{bin} Extensions to the \c{SECTION}
 | |
| Directive\I{section, bin extensions to}
 | |
| 
 | |
| The \c{bin} output format extends the \c{SECTION} (or \c{SEGMENT})
 | |
| directive to allow you to specify the alignment requirements of
 | |
| segments. This is done by appending the \i\c{ALIGN} qualifier to the
 | |
| end of the section-definition line. For example,
 | |
| 
 | |
| \c section .data   align=16
 | |
| 
 | |
| switches to the section \c{.data} and also specifies that it must be
 | |
| aligned on a 16-byte boundary.
 | |
| 
 | |
| The parameter to \c{ALIGN} is the required alignment in bytes: the
 | |
| section start address is rounded up to a multiple of it. The alignment value
 | |
| given may be any power of two.\I{section alignment, in
 | |
| bin}\I{segment alignment, in bin}\I{alignment, in bin sections}
 | |
| 
 | |
| 
 | |
| \S{multisec} \i{Multisection}\I{bin, multisection} Support for the \c{bin} Format
 | |
| 
 | |
| The \c{bin} format allows the use of multiple sections, of arbitrary names,
 | |
| besides the "known" \c{.text}, \c{.data}, and \c{.bss} names.
 | |
| 
 | |
| \b Sections may be designated \i\c{progbits} or \i\c{nobits}. Default
 | |
| is \c{progbits} (except \c{.bss}, which defaults to \c{nobits},
 | |
| of course).
 | |
| 
 | |
| \b Sections can be aligned at a specified boundary following the previous
 | |
| section with \c{align=}, or at an arbitrary byte-granular position with
 | |
| \i\c{start=}.
 | |
| 
 | |
| \b Sections can be given a virtual start address, which will be used
 | |
| for the calculation of all memory references within that section
 | |
| with \i\c{vstart=}.
 | |
| 
 | |
| \b Sections can be ordered using \i\c{follows=}\c{<section>} or
 | |
| \i\c{vfollows=}\c{<section>} as an alternative to specifying an explicit
 | |
| start address.
 | |
| 
 | |
| \b Arguments to \c{org}, \c{start}, \c{vstart}, and \c{align=} are
 | |
| critical expressions. See \k{crit}. For example, in the case of
 | |
| \c{align=(1 << ALIGN_SHIFT)}, \c{ALIGN_SHIFT} must be defined before
 | |
| it is used here.
 | |
| 
 | |
| \b Any code which comes before an explicit \c{SECTION} directive
 | |
| is directed by default into the \c{.text} section.
 | |
| 
 | |
| \b If an \c{ORG} statement is not given, \c{ORG 0} is used
 | |
| by default.
 | |
| 
 | |
| \b The \c{.bss} section will be placed after the last \c{progbits}
 | |
| section, unless \c{start=}, \c{vstart=}, \c{follows=}, or \c{vfollows=}
 | |
| has been specified.
 | |
| 
 | |
| \b All sections are aligned on dword boundaries, unless a different
 | |
| alignment has been specified.
 | |
| 
 | |
| \b Sections may not overlap.
 | |
| 
 | |
| \b NASM defines the symbol \c{section.<secname>.start} for each section,
 | |
| which may be used in your code.
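 | |
| 
 | |
| The following sketch combines several of these options. The section
 | |
| name \c{high}, the addresses and the labels are hypothetical, and the
 | |
| code assumes that \c{DS} and \c{ES} are both zero on entry. The
 | |
| \c{high} section is stored directly after \c{.text} in the output
 | |
| file but assembled to run at \c{0x0600}, so the code copies it from
 | |
| where it was loaded to where its labels expect it to be:
 | |
| 
 | |
| \c         org     0x7C00
 | |
| \c
 | |
| \c section .text
 | |
| \c         cld
 | |
| \c         mov     si,section.high.start   ; address as loaded
 | |
| \c         mov     di,high_entry           ; address as assembled (vstart)
 | |
| \c         mov     cx,high_end-high_entry  ; size in bytes
 | |
| \c         rep     movsb
 | |
| \c         jmp     high_entry
 | |
| \c
 | |
| \c section high follows=.text vstart=0x0600
 | |
| \c high_entry:
 | |
| \c         hlt
 | |
| \c         jmp     high_entry
 | |
| \c high_end: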
 | |
| 
 | |
| \S{map} \i{Map Files}
 | |
| 
 | |
| Map files can be generated in \c{-f bin} format by means of the \c{[map]}
 | |
| option. Map types of \c{all} (default), \c{brief}, \c{sections}, \c{segments},
 | |
| or \c{symbols} may be specified. Output may be directed to \c{stdout}
 | |
| (default), \c{stderr}, or a specified file. E.g.
 | |
| \c{[map symbols myfile.map]}. No "user form" exists; the square
 | |
| brackets must be used.
 | |
| 
 | |
| 
 | |
| \H{ithfmt} \i\c{ith}: \i{Intel Hex} Output
 | |
| 
 | |
| The \c{ith} file format produces Intel hex-format files.  Like the
 | |
| \c{bin} format, this is a flat memory image format with no support for
 | |
| further relocation or linking.  It is usually used with ROM
 | |
| programmers and similar utilities.
 | |
| 
 | |
| From a programmer's point of view, this behaves identically to the
 | |
| \c{bin} format; the only difference is the encoding of the
 | |
| output. All extensions supported by the \c{bin} file format are also
 | |
| supported by the \c{ith} file format.
 | |
| 
 | |
| \c{ith} provides a default output file-name extension of \c{.ith}.
 | |
| 
 | |
| 
 | |
| \H{srecfmt} \i\c{srec}: \i{Motorola S-Records} Output
 | |
| 
 | |
| The \c{srec} file format produces Motorola S-record files.  Like the
 | |
| \c{bin} format, this is a flat memory image format with no support for
 | |
| relocation or linking.  It is usually used with ROM programmers and
 | |
| similar utilities.
 | |
| 
 | |
| From a programmer's point of view, this behaves identically to the
 | |
| \c{bin} format; the only difference is the encoding of the
 | |
| output. All extensions supported by the \c{bin} file format are also
 | |
| supported by the \c{srec} file format.
 | |
| 
 | |
| \c{srec} provides a default output file-name extension of \c{.srec}.
 | |
| 
 | |
| 
 | |
| \H{objfmt} \i\c{obj}: \i{Microsoft OMF}\I{OMF} Object Files
 | |
| 
 | |
| The \c{obj} file format (NASM calls it \c{obj} rather than \c{omf}
 | |
| for historical reasons) is the one produced by \i{MASM} and
 | |
| \i{TASM}, which is typically fed to 16-bit DOS linkers to produce
 | |
| \i\c{.EXE} files. It is also the format used by \i{OS/2}.
 | |
| 
 | |
| \c{obj} provides a default output file-name extension of \c{.obj}.
 | |
| 
 | |
| \c{obj} is not exclusively a 16-bit format, though; NASM has full
 | |
| support for the 32-bit extensions to the format. In particular,
 | |
| 32-bit \c{obj} format files are used by \i{Borland's Win32
 | |
| compilers}, instead of using Microsoft's newer \i\c{win32} object
 | |
| file format.
 | |
| 
 | |
| The \c{obj} format does not define any special segment names: you
 | |
| can call your segments anything you like. Typical names for segments
 | |
| in \c{obj} format files are \c{CODE}, \c{DATA} and \c{BSS}.
 | |
| 
 | |
| If your source file contains code before specifying an explicit
 | |
| \c{SEGMENT} directive, then NASM will invent its own segment called
 | |
| \i\c{__NASMDEFSEG} for you.
 | |
| 
 | |
| When you define a segment in an \c{obj} file, NASM defines the
 | |
| segment name as a symbol as well, so that you can access the segment
 | |
| address of the segment. So, for example:
 | |
| 
 | |
| \c segment data
 | |
| \c
 | |
| \c dvar:   dw      1234
 | |
| \c
 | |
| \c segment code
 | |
| \c
 | |
| \c function:
 | |
| \c         mov     ax,data         ; get segment address of data
 | |
| \c         mov     ds,ax           ; and move it into DS
 | |
| \c         inc     word [dvar]     ; now this reference will work
 | |
| \c         ret
 | |
| 
 | |
| The \c{obj} format also enables the use of the \i\c{SEG} and
 | |
| \i\c{WRT} operators, so that you can write code which does things
 | |
| like
 | |
| 
 | |
| \c extern  foo
 | |
| \c
 | |
| \c       mov   ax,seg foo            ; get preferred segment of foo
 | |
| \c       mov   ds,ax
 | |
| \c       mov   ax,data               ; a different segment
 | |
| \c       mov   es,ax
 | |
| \c       mov   ax,[ds:foo]           ; this accesses `foo'
 | |
| \c       mov   [es:foo wrt data],bx  ; so does this
 | |
| 
 | |
| 
 | |
| \S{objseg} \c{obj} Extensions to the \c{SEGMENT}
 | |
| Directive\I{SEGMENT, obj extensions to}
 | |
| 
 | |
| The \c{obj} output format extends the \c{SEGMENT} (or \c{SECTION})
 | |
| directive to allow you to specify various properties of the segment
 | |
| you are defining. This is done by appending extra qualifiers to the
 | |
| end of the segment-definition line. For example,
 | |
| 
 | |
| \c segment code private align=16
 | |
| 
 | |
| defines the segment \c{code}, but also declares it to be a private
 | |
| segment, and requires that the portion of it described in this code
 | |
| module must be aligned on a 16-byte boundary.
 | |
| 
 | |
| The available qualifiers are:
 | |
| 
 | |
| \b \i\c{PRIVATE}, \i\c{PUBLIC}, \i\c{COMMON} and \i\c{STACK} specify
 | |
| the combination characteristics of the segment. \c{PRIVATE} segments
 | |
| do not get combined with any others by the linker; \c{PUBLIC} and
 | |
| \c{STACK} segments get concatenated together at link time; and
 | |
| \c{COMMON} segments all get overlaid on top of each other rather
 | |
| than stuck end-to-end.
 | |
| 
 | |
| \b \i\c{ALIGN} is used, as shown above, to specify the alignment, in
 | |
| bytes, of the segment start address. The alignment
 | |
| value given may be any power of two from 1 to 4096; in reality, the
 | |
| only values supported are 1, 2, 4, 16, 256 and 4096, so if 8 is
 | |
| specified it will be rounded up to 16, and 32, 64 and 128 will all
 | |
| be rounded up to 256, and so on. Note that alignment to 4096-byte
 | |
| boundaries is a \i{PharLap} extension to the format and may not be
 | |
| supported by all linkers.\I{section alignment, in OBJ}\I{segment
 | |
| alignment, in OBJ}\I{alignment, in OBJ sections}
 | |
| 
 | |
| \b \i\c{CLASS} can be used to specify the segment class; this feature
 | |
| indicates to the linker that segments of the same class should be
 | |
| placed near each other in the output file. The class name can be any
 | |
| word, e.g. \c{CLASS=CODE}.
 | |
| 
 | |
| \b \i\c{OVERLAY}, like \c{CLASS}, is specified with an arbitrary word
 | |
| as an argument, and provides overlay information to an
 | |
| overlay-capable linker.
 | |
| 
 | |
| \b Segments can be declared as \i\c{USE16} or \i\c{USE32}, which has
 | |
| the effect of recording the choice in the object file and also
 | |
| ensuring that NASM's default assembly mode when assembling in that
 | |
| segment is 16-bit or 32-bit respectively.
 | |
| 
 | |
| \b When writing \i{OS/2} object files, you should declare 32-bit
 | |
| segments as \i\c{FLAT}, which causes the default segment base for
 | |
| anything in the segment to be the special group \c{FLAT}, and also
 | |
| defines the group if it is not already defined.
 | |
| 
 | |
| \b The \c{obj} file format also allows segments to be declared as
 | |
| having a pre-defined absolute segment address, although no linkers
 | |
| are currently known to make sensible use of this feature;
 | |
| nevertheless, NASM allows you to declare a segment such as
 | |
| \c{SEGMENT SCREEN ABSOLUTE=0xB800} if you need to. The \i\c{ABSOLUTE}
 | |
| and \c{ALIGN} keywords are mutually exclusive.
 | |
| 
 | |
| NASM's default segment attributes are \c{PUBLIC}, \c{ALIGN=1}, no
 | |
| class, no overlay, and \c{USE16}.
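 | |
| 
 | |
| As an illustration, a small 16-bit program might declare its
 | |
| segments as follows. The segment and class names are arbitrary
 | |
| choices made for this sketch, not names required by the format:
 | |
| 
 | |
| \c segment code  public align=16 class=CODE use16
 | |
| \c segment data  public align=2  class=DATA
 | |
| \c segment stack stack  align=16 class=STACK
 | |
| \c         resb    512             ; reserve 512 bytes of stack
 | |
| \c stacktop: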
 | |
| 
 | |
| 
 | |
| \S{group} \i\c{GROUP}: Defining Groups of Segments\I{segments, groups of}
 | |
| 
 | |
| The \c{obj} format also allows segments to be grouped, so that a
 | |
| single segment register can be used to refer to all the segments in
 | |
| a group. NASM therefore supplies the \c{GROUP} directive, whereby
 | |
| you can code
 | |
| 
 | |
| \c segment data
 | |
| \c
 | |
| \c         ; some data
 | |
| \c
 | |
| \c segment bss
 | |
| \c
 | |
| \c         ; some uninitialized data
 | |
| \c
 | |
| \c group dgroup data bss
 | |
| 
 | |
| which will define a group called \c{dgroup} to contain the segments
 | |
| \c{data} and \c{bss}. Like \c{SEGMENT}, \c{GROUP} causes the group
 | |
| name to be defined as a symbol, so that you can refer to a variable
 | |
| \c{var} in the \c{data} segment as \c{var wrt data} or as \c{var wrt
 | |
| dgroup}, depending on which segment value is currently in your
 | |
| segment register.
 | |
| 
 | |
| If you just refer to \c{var}, however, and \c{var} is declared in a
 | |
| segment which is part of a group, then NASM will default to giving
 | |
| you the offset of \c{var} from the beginning of the \e{group}, not
 | |
| the \e{segment}. Therefore \c{SEG var}, also, will return the group
 | |
| base rather than the segment base.
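 | |
| 
 | |
| The difference can be seen in a sketch such as the following, which
 | |
| reuses the \c{dgroup} example from earlier in this section; the
 | |
| variable names are hypothetical:
 | |
| 
 | |
| \c segment data
 | |
| \c var:    dw      1234
 | |
| \c
 | |
| \c segment bss
 | |
| \c buf:    resb    16
 | |
| \c
 | |
| \c group dgroup data bss
 | |
| \c
 | |
| \c segment code
 | |
| \c         mov     ax,dgroup       ; same value as `seg buf'
 | |
| \c         mov     ds,ax
 | |
| \c         mov     bx,buf          ; offset of buf from start of dgroup
 | |
| \c         mov     cx,buf wrt bss  ; offset of buf from start of bss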
 | |
| 
 | |
| NASM will allow a segment to be part of more than one group, but
 | |
| will generate a warning if you do this. Variables declared in a
 | |
| segment which is part of more than one group will default to being
 | |
| relative to the first group that was defined to contain the segment.
 | |
| 
 | |
| A group does not have to contain any segments; you can still make
 | |
| \c{WRT} references to a group which does not contain the variable
 | |
| you are referring to. OS/2, for example, defines the special group
 | |
| \c{FLAT} with no segments in it.
 | |
| 
 | |
| 
 | |
| \S{uppercase} \i\c{UPPERCASE}: Disabling Case Sensitivity in Output
 | |
| 
 | |
| Although NASM itself is \i{case sensitive}, some OMF linkers are
 | |
| not; therefore it can be useful for NASM to output single-case
 | |
| object files. The \c{UPPERCASE} format-specific directive causes all
 | |
| segment, group and symbol names that are written to the object file
 | |
| to be forced to upper case just before being written. Within a
 | |
| source file, NASM is still case-sensitive; but the object file can
 | |
| be written entirely in upper case if desired.
 | |
| 
 | |
| \c{UPPERCASE} is used alone on a line; it requires no parameters.
 | |
| 
 | |
| 
 | |
| \S{import} \i\c{IMPORT}: Importing DLL Symbols\I{DLL symbols,
 | |
| importing}\I{symbols, importing from DLLs}
 | |
| 
 | |
| The \c{IMPORT} format-specific directive defines a symbol to be
 | |
| imported from a DLL, for use if you are writing a DLL's \i{import
 | |
| library} in NASM. You still need to declare the symbol as \c{EXTERN}
 | |
| as well as using the \c{IMPORT} directive.
 | |
| 
 | |
| The \c{IMPORT} directive takes two required parameters, separated by
 | |
| white space, which are (respectively) the name of the symbol you
 | |
| wish to import and the name of the library you wish to import it
 | |
| from. For example:
 | |
| 
 | |
| \c     import  WSAStartup wsock32.dll
 | |
| 
 | |
| A third optional parameter gives the name by which the symbol is
 | |
| known in the library you are importing it from, in case this is not
 | |
| the same as the name you wish the symbol to be known by to your code
 | |
| once you have imported it. For example:
 | |
| 
 | |
| \c     import  asyncsel wsock32.dll WSAAsyncSelect
 | |
| 
 | |
| 
 | |
| \S{export} \i\c{EXPORT}: Exporting DLL Symbols\I{DLL symbols,
 | |
| exporting}\I{symbols, exporting from DLLs}
 | |
| 
 | |
| The \c{EXPORT} format-specific directive defines a global symbol to
 | |
| be exported as a DLL symbol, for use if you are writing a DLL in
 | |
| NASM. You still need to declare the symbol as \c{GLOBAL} as well as
 | |
| using the \c{EXPORT} directive.
 | |
| 
 | |
| \c{EXPORT} takes one required parameter, which is the name of the
 | |
| symbol you wish to export, as it was defined in your source file. An
 | |
| optional second parameter (separated by white space from the first)
 | |
| gives the \e{external} name of the symbol: the name by which you
 | |
| wish the symbol to be known to programs using the DLL. If this name
 | |
| is the same as the internal name, you may leave the second parameter
 | |
| off.
 | |
| 
 | |
| Further parameters can be given to define attributes of the exported
 | |
| symbol. These parameters, like the second, are separated by white
 | |
| space. If further parameters are given, the external name must also
 | |
| be specified, even if it is the same as the internal name. The
 | |
| available attributes are:
 | |
| 
 | |
| \b \c{resident} indicates that the exported name is to be kept
 | |
| resident by the system loader. This is an optimization for
 | |
| frequently used symbols imported by name.
 | |
| 
 | |
| \b \c{nodata} indicates that the exported symbol is a function which
 | |
| does not make use of any initialized data.
 | |
| 
 | |
| \b \c{parm=NNN}, where \c{NNN} is an integer, sets the number of
 | |
| parameter words for the case in which the symbol is a call gate
 | |
| between 32-bit and 16-bit segments.
 | |
| 
 | |
| \b An attribute which is just a number indicates that the symbol
 | |
| should be exported with an identifying number (ordinal), and gives
 | |
| the desired number.
 | |
| 
 | |
| For example:
 | |
| 
 | |
| \c     export  myfunc
 | |
| \c     export  myfunc TheRealMoreFormalLookingFunctionName
 | |
| \c     export  myfunc myfunc 1234  ; export by ordinal
 | |
| \c     export  myfunc myfunc resident parm=23 nodata
 | |
| 
 | |
| 
 | |
| \S{dotdotstart} \i\c{..start}: Defining the \i{Program Entry
 | |
| Point}
 | |
| 
 | |
| \c{OMF} linkers require exactly one of the object files being linked to
 | |
| define the program entry point, where execution will begin when the
 | |
| program is run. If the object file that defines the entry point is
 | |
| assembled using NASM, you specify the entry point by declaring the
 | |
| special symbol \c{..start} at the point where you wish execution to
 | |
| begin.
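 | |
| 
 | |
| A minimal sketch of a DOS program using \c{..start} might look like
 | |
| this. The segment names, the stack size and the use of DOS function
 | |
| \c{4Ch} to exit are choices made for the example:
 | |
| 
 | |
| \c segment code
 | |
| \c
 | |
| \c ..start:
 | |
| \c         mov     ax,data
 | |
| \c         mov     ds,ax           ; DS now addresses the data segment
 | |
| \c         mov     ax,stack
 | |
| \c         mov     ss,ax
 | |
| \c         mov     sp,stacktop     ; set up our own stack
 | |
| \c
 | |
| \c         mov     ax,0x4C00       ; DOS: terminate with exit code 0
 | |
| \c         int     0x21
 | |
| \c
 | |
| \c segment data
 | |
| \c
 | |
| \c segment stack stack
 | |
| \c         resb    64
 | |
| \c stacktop: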
 | |
| 
 | |
| 
 | |
| \S{objextern} \c{obj} Extensions to the \c{EXTERN}
 | |
| Directive\I{EXTERN, obj extensions to}
 | |
| 
 | |
| If you declare an external symbol with the directive
 | |
| 
 | |
| \c     extern  foo
 | |
| 
 | |
| then references such as \c{mov ax,foo} will give you the offset of
 | |
| \c{foo} from its preferred segment base (as specified in whichever
 | |
| module \c{foo} is actually defined in). So to access the contents of
 | |
| \c{foo} you will usually need to do something like
 | |
| 
 | |
| \c         mov     ax,seg foo      ; get preferred segment base
 | |
| \c         mov     es,ax           ; move it into ES
 | |
| \c         mov     ax,[es:foo]     ; and use offset `foo' from it
 | |
| 
 | |
| This is a little unwieldy, particularly if you know that an external
 | |
| is going to be accessible from a given segment or group, say
 | |
| \c{dgroup}. So if \c{DS} already contained \c{dgroup}, you could
 | |
| simply code
 | |
| 
 | |
| \c         mov     ax,[foo wrt dgroup]
 | |
| 
 | |
| However, having to type this every time you want to access \c{foo}
 | |
| can be a pain; so NASM allows you to declare \c{foo} in the
 | |
| alternative form
 | |
| 
 | |
| \c     extern  foo:wrt dgroup
 | |
| 
 | |
| This form causes NASM to pretend that the preferred segment base of
 | |
| \c{foo} is in fact \c{dgroup}; so the expression \c{seg foo} will
 | |
| now return \c{dgroup}, and the expression \c{foo} is equivalent to
 | |
| \c{foo wrt dgroup}.
 | |
| 
 | |
| This \I{default-WRT mechanism}default-\c{WRT} mechanism can be used
 | |
| to make externals appear to be relative to any group or segment in
 | |
| your program. It can also be applied to common variables: see
 | |
| \k{objcommon}.
 | |
| 
 | |
| 
 | |
| \S{objcommon} \c{obj} Extensions to the \c{COMMON}
 | |
| Directive\I{COMMON, obj extensions to}
 | |
| 
 | |
| The \c{obj} format allows common variables to be either near\I{near
 | |
| common variables} or far\I{far common variables}; NASM allows you to
 | |
| specify which your variables should be by the use of the syntax
 | |
| 
 | |
| \c common  nearvar 2:near   ; `nearvar' is a near common
 | |
| \c common  farvar  10:far   ; and `farvar' is far
 | |
| 
 | |
| Far common variables may be greater in size than 64 KB, and so the
 | |
| OMF specification says that they are declared as a number of
 | |
| \e{elements} of a given size. So a 10-byte far common variable could
 | |
| be declared as ten one-byte elements, five two-byte elements, two
 | |
| five-byte elements or one ten-byte element.
 | |
| 
 | |
| Some \c{OMF} linkers require the \I{element size, in common
 | |
| variables}\I{common variables, element size}element size, as well as
 | |
| the variable size, to match when resolving common variables declared
 | |
| in more than one module. Therefore NASM must allow you to specify
 | |
| the element size on your far common variables. This is done by the
 | |
| following syntax:
 | |
| 
 | |
| \c common  c_5by2  10:far 5        ; two five-byte elements
 | |
| \c common  c_2by5  10:far 2        ; five two-byte elements
 | |
| 
 | |
| If no element size is specified, the default is 1. Also, the \c{FAR}
 | |
| keyword is not required when an element size is specified, since
 | |
| only far commons may have element sizes at all. So the above
 | |
| declarations could equivalently be
 | |
| 
 | |
| \c common  c_5by2  10:5            ; two five-byte elements
 | |
| \c common  c_2by5  10:2            ; five two-byte elements
 | |
| 
 | |
| In addition to these extensions, the \c{COMMON} directive in \c{obj}
 | |
| also supports default-\c{WRT} specification like \c{EXTERN} does
 | |
| (explained in \k{objextern}). So you can also declare things like
 | |
| 
 | |
| \c common  foo     10:wrt dgroup
 | |
| \c common  bar     16:far 2:wrt data
 | |
| \c common  baz     24:wrt data:6
 | |
| 
 | |
| 
 | |
| \S{objdepend} Embedded File Dependency Information
 | |
| 
 | |
| Since NASM 2.13.02, \c{obj} files contain embedded dependency file
 | |
| information.  To suppress the generation of dependencies, use
 | |
| 
 | |
| \c %pragma obj nodepend
 | |
| 
 | |
| 
 | |
| \H{win32fmt} \i\c{win32}: Microsoft Win32 Object Files
 | |
| 
 | |
| The \c{win32} output format generates Microsoft Win32 object files,
 | |
| suitable for passing to Microsoft linkers such as the one shipped with \i{Visual C++}.
 | |
| Note that Borland Win32 compilers do not use this format, but use
 | |
| \c{obj} instead (see \k{objfmt}).
 | |
| 
 | |
| \c{win32} provides a default output file-name extension of \c{.obj}.
 | |
| 
 | |
| Note that although Microsoft say that Win32 object files follow the
 | |
| \c{COFF} (Common Object File Format) standard, the object files produced
 | |
| by Microsoft Win32 compilers are not compatible with COFF linkers
 | |
| such as DJGPP's, and vice versa. This is due to a difference of
 | |
| opinion over the precise semantics of PC-relative relocations. To
 | |
| produce COFF files suitable for DJGPP, use NASM's \c{coff} output
 | |
| format; conversely, the \c{coff} format does not produce object
 | |
| files that Win32 linkers can generate correct output from.
 | |
| 
 | |
| 
 | |
| \S{win32sect} \c{win32} Extensions to the \c{SECTION}
 | |
| Directive\I{SECTION, Windows extensions to}
 | |
| 
 | |
| Like the \c{obj} format, \c{win32} allows you to specify additional
 | |
| information on the \c{SECTION} directive line, to control the type
 | |
| and properties of sections you declare. Section types and properties
 | |
| are generated automatically by NASM for the \i{standard section names}
 | |
| \c{.text}, \c{.data} and \c{.bss}, but may still be overridden by
 | |
| these qualifiers.
 | |
| 
 | |
| The available qualifiers are:
 | |
| 
 | |
| \b \c{code}, or equivalently \c{text}, defines the section to be a
 | |
| code section. This marks the section as readable and executable, but
 | |
| not writable, and also indicates to the linker that the type of the
 | |
| section is code.
 | |
| 
 | |
| \b \c{data} and \c{bss} define the section to be a data section,
 | |
| analogously to \c{code}. Data sections are marked as readable and
 | |
| writable, but not executable. \c{data} declares an initialized data
 | |
| section, whereas \c{bss} declares an uninitialized data section.
 | |
| 
 | |
| \b \c{rdata} declares an initialized data section that is readable
 | |
| but not writable. Microsoft compilers place constants in this
 | |
| section.
 | |
| 
 | |
| \b \c{info} defines the section to be an \i{informational section},
 | |
| which is not included in the executable file by the linker, but may
 | |
| (for example) pass information \e{to} the linker. For example,
 | |
| declaring an \c{info}-type section called \i\c{.drectve} causes the
 | |
| linker to interpret the contents of the section as command-line
 | |
| options.
 | |
| 
 | |
| \b \c{align=}, used with a trailing number as in \c{obj}, gives the
 | |
| \I{section alignment, in win32}\I{alignment, in win32
 | |
| sections}alignment requirements of the section. The maximum you may
 | |
| specify is 64: the Win32 object file format contains no means to
 | |
| request a greater section alignment than this. If alignment is not
 | |
| explicitly specified, the defaults are 16-byte alignment for code
 | |
| sections, 8-byte alignment for rdata sections and 4-byte alignment
 | |
| for data (and BSS) sections.
 | |
| Informational sections get a default alignment of 1 byte (no
 | |
| alignment), though the value does not matter.
 | |
| 
 | |
| \b \I{comdat, win32 attribute}\c{comdat=}, followed by a number
 | |
| ("selection"), colon (acting as a separator) and a name,
 | |
| marks the section as a \I{COMDAT section, in win32}"COMDAT section".
 | |
| It allows Microsoft linkers to perform function-level linking,
 | |
| to deal with multiply defined symbols, and to eliminate dead code and data.
 | |
| 
 | |
| The "selection" number should be one of the
 | |
| \c{IMAGE_COMDAT_SELECT_*} constants from
 | |
| \W{https://github.com/MicrosoftDocs/win32/blob/docs/desktop-src/Debug/pe-format.md#comdat-sections-object-only}\c{COFF format specification};
 | |
| this value controls whether the linker allows multiply defined symbols
 | |
| and how it handles them.
 | |
| 
 | |
| The name is the \I{COMDAT symbol, in win32}"COMDAT symbol",
 | |
| essentially a new name for the section. So even though you have one
 | |
| section given by the main name (e.g. \c{.text}), it can actually
 | |
| consist of hundreds of COMDAT sections having their own name
 | |
| (and alignment).
 | |
| 
 | |
| When the "selection" is \c{IMAGE_COMDAT_SELECT_ASSOCIATIVE} (5),
 | |
| the name that follows is the "COMDAT symbol" of the associated COMDAT
 | |
| section; this way you can link a piece of code or data only when
 | |
| another piece of code or data is actually linked.
 | |
| 
 | |
| \> So, when linking a NASM-assembled file with some C code,
 | |
| the source may be structured as follows.
 | |
| Note that the default \c{.text} section is handled in a special
 | |
| way and it doesn't work well with \c{comdat}; you may want to append
 | |
| a \c{$} character and an arbitrary suffix to the section name.
 | |
| It will get linked into the \c{.text} section anyway - see the info on
 | |
| \W{https://github.com/MicrosoftDocs/win32/blob/docs/desktop-src/Debug/pe-format.md#grouped-sections-object-only}\c{Grouped Sections}.
 | |
| 
 | |
| \c    section .text$1 align=16 comdat=1:FirstFnc
 | |
| \c       ...                                        ; Code linked only if referenced from C
 | |
| \c
 | |
| \c    section .text$1 align=16 comdat=1:SecondFnc
 | |
| \c       ...                                        ; Code linked only if referenced from C
 | |
| \c
 | |
| \c    section .rdata align=32 comdat=5:FirstFnc
 | |
| \c       ...                                        ; Data linked only if the related code (FirstFnc) is linked
 | |
| \c
 | |
| 
 | |
| The defaults assumed by NASM if you do not specify the above
 | |
| qualifiers are:
 | |
| 
 | |
| \c section .text    code  align=16
 | |
| \c section .data    data  align=4
 | |
| \c section .rdata   rdata align=8
 | |
| \c section .bss     bss   align=4
 | |
| 
 | |
| The \c{win64} format also adds:
 | |
| 
 | |
| \c section .pdata   rdata align=4
 | |
| \c section .xdata   rdata align=8
 | |
| 
 | |
| Any other section name is treated by default like \c{.text}.
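 | |
| 
 | |
| For instance, the qualifiers might be used as follows. The section
 | |
| names \c{consts} and \c{tables}, the labels, and the library named in
 | |
| the \c{.drectve} section are hypothetical; \c{/defaultlib} is an
 | |
| option that Microsoft's linker accepts in such a section:
 | |
| 
 | |
| \c section consts   rdata align=16     ; read-only initialized data
 | |
| \c limit:   dd      1000
 | |
| \c
 | |
| \c section tables   data  align=32     ; writable, 32-byte aligned
 | |
| \c lookup:  times 8 dd 0
 | |
| \c
 | |
| \c section .drectve info               ; options passed to the linker
 | |
| \c          db      '/defaultlib:user32.lib '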
 | |
| 
 | |
| \S{win32safeseh} \c{win32}: Safe Structured Exception Handling
 | |
| 
 | |
| Among other improvements in Windows XP SP2 and Windows Server 2003,
 | |
| Microsoft has introduced the concept of "safe structured exception
 | |
| handling." The general idea is to collect handlers' entry points
 | |
| in a designated read-only table and have SEH entry points verified
 | |
| against this table before exception control is passed to the
 | |
| corresponding handler. In order for an executable module to be
 | |
| equipped with this read-only table, all object modules on the linker
 | |
| command line have to comply with certain criteria. If even a single
 | |
| module among them does not, then the table in question is omitted
 | |
| and the above-mentioned run-time checks will not be performed for the
 | |
| application in question. Table omission is silent by default and
 | |
| can therefore easily be missed. One can instruct the linker to
 | |
| refuse to produce a binary without such a table by passing the
 | |
| \c{/safeseh} command line option.
 | |
| 
 | |
| Without regard to this run-time check, it's natural to expect
 | |
| NASM to be capable of generating modules suitable for \c{/safeseh}
 | |
| linking. From the developer's viewpoint the problem is two-fold:
 | |
| 
 | |
| \b how to adapt modules not deploying exception handlers of their own;
 | |
| 
 | |
| \b how to adapt/develop modules utilizing custom exception handling.
 | |
| 
 | |
| The former can be easily achieved with any NASM version by adding the
 | |
| following line to the source code:
 | |
| 
 | |
| \c $@feat.00 equ 1
 | |
| 
 | |
| As of version 2.03 NASM adds this absolute symbol automatically, if
 | |
| it is not already present (in which case the developer can choose to
 | |
| assign another value, if desired, for whatever reason).
 | |
| 
 | |
| Registering a custom exception handler on the other hand requires
 | |
| certain "magic." As of version 2.03, an additional \c{safeseh} directive
 | |
| is implemented, which instructs the assembler to produce appropriately
 | |
| formatted input data for the above-mentioned "safe exception handler
 | |
| table." Its typical use would be:
 | |
| 
 | |
| \c section .text
 | |
| \c extern  _MessageBoxA@16
 | |
| \c %if     __?NASM_VERSION_ID?__ >= 0x02030000
 | |
| \c safeseh handler         ; register handler as "safe handler"
 | |
| \c %endif
 | |
| \c handler:
 | |
| \c         push    DWORD 1 ; MB_OKCANCEL
 | |
| \c         push    DWORD caption
 | |
| \c         push    DWORD text
 | |
| \c         push    DWORD 0
 | |
| \c         call    _MessageBoxA@16
 | |
| \c         sub     eax,1   ; incidentally suits as return value
 | |
| \c                         ; for exception handler
 | |
| \c         ret
 | |
| \c global  _main
 | |
| \c _main:
 | |
| \c         push    DWORD handler
 | |
| \c         push    DWORD [fs:0]
 | |
| \c         mov     DWORD [fs:0],esp ; engage exception handler
 | |
| \c         xor     eax,eax
 | |
| \c         mov     eax,DWORD[eax]   ; cause exception
 | |
| \c         pop     DWORD [fs:0]     ; disengage exception handler
 | |
| \c         add     esp,4
 | |
| \c         ret
 | |
| \c text:   db      'OK to rethrow, CANCEL to generate core dump',0
 | |
| \c caption:db      'SEGV',0
 | |
| \c
 | |
| \c section .drectve info
 | |
| \c         db      '/defaultlib:user32.lib /defaultlib:msvcrt.lib '
 | |
| 
 | |
| As you might imagine, it's perfectly possible to produce an .exe binary
 | |
| with the "safe exception handler table" and yet invoke an unregistered
 | |
| exception handler. A handler is invoked by manipulating \c{[fs:0]}
 | |
| at run-time, something the linker has no power over. It is therefore
 | |
| important to note that such failure to register a handler's entry point
 | |
| with the \c{safeseh} directive will have undesired side effects at
 | |
| run-time. If an exception is raised and an unregistered handler is to be
 | |
| executed, the application is abruptly terminated without any notification
 | |
| whatsoever. One can argue that the system should at least log some kind
 | |
| of "non-safe exception handler in x.exe at address n" message in the
 | |
| event log, but unfortunately the user is left without any clue as to
 | |
| what might have caused the crash.
 | |
| 
 | |
| Finally, all mentions of the linker in this section refer to the Microsoft
 | |
| linker version 7.x and later. The presence of the \c{@feat.00} symbol and input
 | |
| data for the "safe exception handler table" causes no backward
 | |
| incompatibilities and "safeseh" modules generated by NASM 2.03 and
 | |
| later can still be linked by earlier versions or non-Microsoft linkers.
 | |
| 
 | |
| \S{codeview} Debugging formats for Windows
 | |
| \I{Windows debugging formats}
 | |
| 
 | |
| The \c{win32} and \c{win64} formats support the Microsoft \i{CodeView
 | |
| debugging format}.  Currently CodeView version 8 format is supported
 | |
| (\i\c{cv8}), but newer versions of the CodeView debugger should be
 | |
| able to handle this format as well.
 | |
| 
 | |
| 
 | |
| \H{win64fmt} \i\c{win64}: Microsoft Win64 Object Files
 | |
| 
 | |
| The \c{win64} output format generates Microsoft Win64 object files. The
 | |
| format is nearly identical to the \c{win32} object format (\k{win32fmt}),
 | |
| except that it targets 64-bit code and the x86-64 platform. In NASM it is
 | |
| used in exactly the same way as the \c{win32} format, apart from this
 | |
| difference.
 | |
| 
 | |
| \S{win64pic} \c{win64}: Writing Position-Independent Code
 | |
| 
 | |
| While \c{REL} takes good care of RIP-relative addressing, there is one
 | |
| aspect that is easy to overlook for a Win64 programmer: indirect
 | |
| references. Consider a switch dispatch table:
 | |
| 
 | |
| \c         jmp     qword [dsptch+rax*8]
 | |
| \c         ...
 | |
| \c dsptch: dq      case0
 | |
| \c         dq      case1
 | |
| \c         ...
 | |
| 
 | |
| Even a novice Win64 assembler programmer will soon realize that the code
 | |
| is not 64-bit savvy. Most notably the linker will refuse to link it, showing:
 | |
| 
 | |
| \c 'ADDR32' relocation to '.text' invalid without /LARGEADDRESSAWARE:NO
 | |
| 
 | |
| So the \c{jmp} instruction has to be split as follows:
 | |
| 
 | |
| \c         lea     rbx,[rel dsptch]
 | |
| \c         jmp     qword [rbx+rax*8]
 | |
| 
 | |
| What happens behind the scenes is that the effective address in \c{lea}
 | |
| is encoded relative to the instruction pointer, in a perfectly
 | |
| position-independent manner. But this is only part of the problem!
 | |
| The issue is that in a .dll context, the \c{caseN} relocations will make
 | |
| their way to the final module and might have to be adjusted at .dll load
 | |
| time (specifically, when it can't be loaded at the preferred address).
 | |
| When this occurs, pages with such relocations will be rendered private
 | |
| to the current process, which undermines the idea of a shared .dll.
 | |
| But not to worry, it's trivial to fix:
 | |
| 
 | |
| \c         lea     rbx,[rel dsptch]
 | |
| \c         add     rbx,[rbx+rax*8]
 | |
| \c         jmp     rbx
 | |
| \c         ...
 | |
| \c dsptch: dq      case0-dsptch
 | |
| \c         dq      case1-dsptch
 | |
| \c         ...
 | |
| 
 | |
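The reason the offset-based table needs no load-time adjustment can be
checked with plain arithmetic. The following C sketch mirrors the three
instructions above; the function name \c{dispatch_target} and the sample
addresses are made up for illustration.

```c
#include <stdint.h>

/* Mirrors: lea rbx,[rel dsptch] / add rbx,[rbx+rax*8] / jmp rbx.
 * table_addr is wherever the table happens to be loaded; each entry
 * holds (caseN - dsptch), so no entry depends on the load address. */
uint64_t dispatch_target(uint64_t table_addr, const int64_t *offsets,
                         uint64_t index)
{
    /* Unsigned addition wraps modulo 2^64, so negative offsets work. */
    return table_addr + (uint64_t)offsets[index];
}
```

The same \c{offsets} array yields correct targets for any \c{table_addr},
which is exactly why the pages holding it can stay shared between processes.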
| NASM version 2.03 and later provides another alternative, the \c{wrt
 | |
| ..imagebase} operator, which returns an offset from the base address of the
 | |
| current image, be it an .exe or a .dll module, hence the name. For those
 | |
| acquainted with the PE-COFF format, this base address denotes the start of
 | |
| the \c{IMAGE_DOS_HEADER} structure. Here is how to implement a switch
 | |
| statement with these image-relative references:
 | |
| 
 | |
| \c         lea     rbx,[rel dsptch]
 | |
| \c         mov     eax,[rbx+rax*4]
 | |
| \c         sub     rbx,dsptch wrt ..imagebase
 | |
| \c         add     rbx,rax
 | |
| \c         jmp     rbx
 | |
| \c         ...
 | |
| \c dsptch: dd      case0 wrt ..imagebase
 | |
| \c         dd      case1 wrt ..imagebase
 | |
| 
 | |
| That said, the snippet before last works just fine with any NASM version
 | |
| and is not even Windows specific, which makes this operator unnecessary
 | |
| in this case. The real reason for the \c{wrt ..imagebase} operator will
 | |
| become apparent in the next section.
 | |
| 
 | |
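The address arithmetic of the image-relative switch in this section can be
modelled in C as follows. This is a sketch for illustration only; the
function and parameter names are hypothetical, and each parameter stands
for a value the assembly code holds in a register or as an immediate.

```c
#include <stdint.h>

/* table_addr : run-time address of dsptch  (lea rbx,[rel dsptch])
 * table_rva  : dsptch wrt ..imagebase      (32-bit immediate)
 * entry_rva  : caseN wrt ..imagebase       (32-bit table entry)   */
uint64_t image_relative_target(uint64_t table_addr, uint32_t table_rva,
                               uint32_t entry_rva)
{
    /* sub rbx,dsptch wrt ..imagebase : recover the image base */
    uint64_t image_base = table_addr - table_rva;

    /* add rbx,rax : image base plus the entry's image-relative offset */
    return image_base + entry_rva;
}
```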
| It should be noted that \c{wrt ..imagebase} is defined as a 32-bit
 | |
| operand only:
 | |
| 
 | |
| \c         dd      label wrt ..imagebase           ; ok
 | |
| \c         dq      label wrt ..imagebase           ; bad
 | |
| \c         mov     eax,label wrt ..imagebase       ; ok
 | |
| \c         mov     rax,label wrt ..imagebase       ; bad
 | |
| 
 | |
| \S{win64seh} \c{win64}: Structured Exception Handling
 | |
| 
 | |
| Structured exception handling in Win64 is completely different compared
 | |
| to Win32. When an exception occurs, the program counter is noted, and a
 | |
| linker-generated table containing start and end addresses of all the
 | |
| functions (in a given executable module) is traversed and compared to
 | |
| the saved program counter. This is used to identify the corresponding
 | |
| \c{UNWIND_INFO} structure. If missing, then the offending subroutine is
 | |
| assumed to be "leaf" and this lookup procedure is instead attempted for
 | |
| its caller. In Win64, a leaf function is a function that does not call
 | |
| any other functions \e{nor} modifies any Win64 non-volatile registers,
 | |
| including the stack pointer. The latter ensures that it's possible to
 | |
| identify a leaf function's caller by simply pulling the value from the
 | |
| top of the stack.
 | |
| 
 | |
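The table lookup described above can be sketched in C. The structure and
function names below are hypothetical stand-ins, the table is assumed to be
sorted by start address, and the end address is assumed to be exclusive;
the real system routines are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of one table entry: three image-relative values. */
struct func_entry {
    uint32_t begin_rva;     /* function start */
    uint32_t end_rva;       /* function end (assumed exclusive) */
    uint32_t unwind_rva;    /* location of the function's UNWIND_INFO */
};

/* Returns the entry covering pc_rva, or NULL if there is none, in
 * which case the interrupted function is assumed to be a leaf. */
const struct func_entry *find_entry(const struct func_entry *table,
                                    size_t count, uint32_t pc_rva)
{
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (pc_rva < table[mid].begin_rva)
            hi = mid;
        else if (pc_rva >= table[mid].end_rva)
            lo = mid + 1;
        else
            return &table[mid];
    }
    return NULL;
}
```

When the result is \c{NULL}, the caller's return address is taken from the
top of the stack and the lookup is repeated with that address.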
| While the majority of subroutines written in assembler are not calling
 | |
| any other functions, they may not qualify as "leaf" functions in the
 | |
| Win64 sense. The requirement for non-volatile registers to be
 | |
| unchanged leaves the developer with not more than 7 registers and no
 | |
| stack frame, which is not necessarily what they counted on.
 | |
| Customarily one would meet this requirement by saving non-volatile
 | |
| registers on stack and restoring them upon return. However, if (and
 | |
| only if) an exception is raised at run-time and no \c{UNWIND_INFO}
 | |
| structure is associated with such a "leaf" function, the stack unwind
 | |
| procedure will expect to find the caller's return address on the top of
 | |
| the stack immediately followed by its frame. Given that the developer
 | |
| pushed the caller's non-volatile registers onto the stack, the value
 | |
| on top will no longer point to the right place. The developer can
 | |
| attempt to copy the caller's return address to the top of stack, which
 | |
| would work in some very specific circumstances. But unless the
 | |
| developer can guarantee that these circumstances are always met, it's
 | |
| more appropriate to assume the worst, i.e. the stack unwind procedure
 | |
| goes berserk, abruptly terminating without any notification whatsoever
 | |
| (just like in the Win32 case).
 | |
| 
 | |
| Now that we understand the significance of the \c{UNWIND_INFO} structure,
 | |
| let us discuss what is in it and how it is processed. First, it is
 | |
| checked for the presence of a reference to a custom language-specific
 | |
| exception handler. If there is one, then it is invoked. Depending on
 | |
| the return value, execution flow is resumed (the exception is said to be
 | |
| "handled"), \e{or} the rest of the \c{UNWIND_INFO} structure is
 | |
| processed as follows. Aside from an optional reference to a custom
 | |
| handler, it carries information about the current callee's stack frame
 | |
| and where non-volatile registers are saved. The information is detailed
 | |
| enough to be able to reconstruct the contents of the caller's
 | |
| non-volatile registers on entry to the current callee. And so the
 | |
| caller's context is reconstructed, at which point the unwind procedure
 | |
| is repeated, using the \c{UNWIND_INFO} structure associated with the
 | |
| caller's instruction pointer. The procedure is repeated recursively
 | |
| until the exception is handled. As a last resort, the system "handles"
 | |
| it by generating a memory dump and terminating the application.
 | |
| 
 | |
| As of this writing, NASM unfortunately does not facilitate generation
 | |
| of the above-mentioned detailed information about stack frame layout. But
 | |
| as of version 2.03, it implements building blocks for generating
 | |
| structures involved in stack unwinding. Here is a simple example
 | |
| showing how to deploy a custom exception handler for a leaf function:
 | |
| 
 | |
| \c default rel
 | |
| \c section .text
 | |
| \c extern  MessageBoxA
 | |
| \c handler:
 | |
| \c         sub     rsp,40
 | |
| \c         mov     rcx,0
 | |
| \c         lea     rdx,[text]
 | |
| \c         lea     r8,[caption]
 | |
| \c         mov     r9,1    ; MB_OKCANCEL
 | |
| \c         call    MessageBoxA
 | |
| \c         sub     eax,1   ; incidentally suits as return value
 | |
| \c                         ; for exception handler
 | |
| \c         add     rsp,40
 | |
| \c         ret
 | |
| \c global  main
 | |
| \c main:
 | |
| \c         xor     rax,rax
 | |
| \c         mov     rax,QWORD[rax]  ; cause exception
 | |
| \c         ret
 | |
| \c main_end:
 | |
| \c text:   db      'OK to rethrow, CANCEL to generate core dump',0
 | |
| \c caption:db      'SEGV',0
 | |
| \c
 | |
| \c section .pdata  rdata align=4
 | |
| \c         dd      main wrt ..imagebase
 | |
| \c         dd      main_end wrt ..imagebase
 | |
| \c         dd      xmain wrt ..imagebase
 | |
| \c section .xdata  rdata align=8
 | |
| \c xmain:  db      9,0,0,0
 | |
| \c         dd      handler wrt ..imagebase
 | |
| \c section .drectve info
 | |
| \c         db      '/defaultlib:user32.lib /defaultlib:msvcrt.lib '
 | |
| 
 | |
| What you see is that the \c{.pdata} section contains a single-element
 | |
| table, containing function start and end addresses, along with references
 | |
| to associated \c{UNWIND_INFO} structures (only one in this case). The
 | |
| \c{.xdata} section contains the referenced \c{UNWIND_INFO} structure,
 | |
| describing a function with no frame, but with a designated exception handler.
 | |
| These references are \e{required} to be image-relative (which is the real
 | |
| reason for implementing the \c{wrt ..imagebase} operator). It should be
 | |
| noted that \c{rdata align=n}, as well as \c{wrt ..imagebase}, are actually
 | |
| optional in the context of these two segments (they apply even when omitted);
 | |
| \e{all} 32-bit references placed into these two segments will be image-relative.
 | |
| This is important to understand, as the developer is allowed to append
 | |
| handler-specific data to the \c{UNWIND_INFO} structure, and any 32-bit
 | |
| references that are added may require adjustment to obtain the real pointer.
 | |
| 
 | |
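To see what the four bytes \c{9,0,0,0} at \c{xmain} encode, the fixed
header of \c{UNWIND_INFO} can be decoded as in the C sketch below. The
field layout follows Microsoft's published x64 exception handling
documentation; the structure, function and macro names are hypothetical and
chosen for this illustration.

```c
#include <stdint.h>

#define FLAG_EHANDLER 0x1   /* same value as UNW_FLAG_EHANDLER */

struct unwind_header {
    unsigned version;        /* low 3 bits of byte 0 */
    unsigned flags;          /* high 5 bits of byte 0 */
    unsigned prolog_size;    /* byte 1 */
    unsigned code_count;     /* byte 2: number of unwind code slots */
    unsigned frame_register; /* low 4 bits of byte 3 */
    unsigned frame_offset;   /* high 4 bits of byte 3 */
};

struct unwind_header decode_unwind_header(const uint8_t bytes[4])
{
    struct unwind_header h;

    h.version        = bytes[0] & 0x07;
    h.flags          = bytes[0] >> 3;
    h.prolog_size    = bytes[1];
    h.code_count     = bytes[2];
    h.frame_register = bytes[3] & 0x0f;
    h.frame_offset   = bytes[3] >> 4;
    return h;
}
```

For the example above, the first byte 9 splits into version 1 and the
"exception handler present" flag, with no prolog, no unwind codes and no
frame register, which matches the description of a frameless function with
a designated handler.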
| As already mentioned, in Win64 terms, a leaf function is one that neither
 | |
| calls any other function \e{nor} modifies any non-volatile registers,
 | |
| including the stack pointer. But it is not uncommon for the programmer
 | |
| to intend to utilize every single register and sometimes even have a
 | |
| variable stack frame, requiring a more complicated \c{UNWIND_INFO} structure
 | |
| than in the example above. Is there anything one can do with these simpler
 | |
| building blocks, and avoid manually composing fully-fledged \c{UNWIND_INFO}
 | |
| structures, which would surely be considered error-prone? Yes, there is.
 | |
| Recall that an exception handler is called first, before the stack layout
 | |
| is analyzed. As it turns out, it is perfectly possible to manipulate
 | |
| the current callee's context in a custom handler in a manner that permits
 | |
| further stack unwinding. The general idea is that the handler would not
 | |
| actually "handle" the exception, but instead restore the callee's context
 | |
| (to its state at the entry point) and thus mimic a Win64 leaf function.
 | |
| In other words, the handler would effectively undertake part of the
 | |
| unwinding procedure. Consider the following example:
 | |
| 
 | |
| \c function:
 | |
| \c         mov     rax,rsp         ; copy rsp to volatile register
 | |
| \c         push    r15             ; save non-volatile registers
 | |
| \c         push    rbx
 | |
| \c         push    rbp
 | |
| \c         mov     r11,rsp         ; prepare variable stack frame
 | |
| \c         sub     r11,rcx
 | |
| \c         and     r11,-64
 | |
| \c         mov     QWORD[r11],rax  ; check for exceptions
 | |
| \c         mov     rsp,r11         ; allocate stack frame
 | |
| \c         mov     QWORD[rsp],rax  ; save original rsp value
 | |
| \c magic_point:
 | |
| \c         ...
 | |
| \c         mov     r11,QWORD[rsp]  ; pull original rsp value
 | |
| \c         mov     rbp,QWORD[r11-24]
 | |
| \c         mov     rbx,QWORD[r11-16]
 | |
| \c         mov     r15,QWORD[r11-8]
 | |
| \c         mov     rsp,r11         ; destroy frame
 | |
| \c         ret
 | |
| 
 | |
| The key is that until \c{magic_point}, the original \c{rsp} value
 | |
| remains in the chosen volatile register, and no non-volatile register
 | |
| except for \c{rsp} is modified. After \c{magic_point}, \c{rsp} remains
 | |
| constant till the very end of the \c{function}. In this case a custom
 | |
| language-specific exception handler would look like this:
 | |
| 
 | |
| \c EXCEPTION_DISPOSITION handler (EXCEPTION_RECORD *rec,ULONG64 frame,
 | |
| \c         CONTEXT *context,DISPATCHER_CONTEXT *disp)
 | |
| \c {   ULONG64 *rsp;
 | |
| \c     if (context->Rip<(ULONG64)magic_point)
 | |
| \c         rsp = (ULONG64 *)context->Rax;
 | |
| \c     else
 | |
| \c     {   rsp = ((ULONG64 **)context->Rsp)[0];
 | |
| \c         context->Rbp = rsp[-3];
 | |
| \c         context->Rbx = rsp[-2];
 | |
| \c         context->R15 = rsp[-1];
 | |
| \c     }
 | |
| \c     context->Rsp = (ULONG64)rsp;
 | |
| \c
 | |
| \c     memcpy (disp->ContextRecord,context,sizeof(CONTEXT));
 | |
| \c     RtlVirtualUnwind(UNW_FLAG_NHANDLER,disp->ImageBase,
 | |
| \c         disp->ControlPc,disp->FunctionEntry,disp->ContextRecord,
 | |
| \c         &disp->HandlerData,&disp->EstablisherFrame,NULL);
 | |
| \c     return ExceptionContinueSearch;
 | |
| \c }
 | |
| 
 | |
| As this custom handler allows the example function to mimic a Win64 leaf
 | |
| function, the corresponding \c{UNWIND_INFO} structure does not need to
 | |
| contain any information about the stack frame and its layout.
 | |
| 
 | |
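The variable stack frame in the example \c{function} above is obtained by
subtracting the requested size from \c{rsp} and rounding the result down to
a 64-byte boundary. A C rendering of that arithmetic, with the hypothetical
name \c{frame_base}, might look like this:

```c
#include <stdint.h>

/* Mirrors: mov r11,rsp / sub r11,rcx / and r11,-64.
 * Returns the new stack pointer: at least `size` bytes below rsp
 * and rounded down to a 64-byte boundary. */
uint64_t frame_base(uint64_t rsp, uint64_t size)
{
    return (rsp - size) & ~(uint64_t)63;   /* -64 and ~63 are the same mask */
}
```

Because the result depends on a run-time value, the distance between the
new and the original \c{rsp} is not a constant, which is why the example
stores the original value at the bottom of the frame.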
| \H{cofffmt} \i\c{coff}: \i{Common Object File Format}
 | |
| 
 | |
| The \c{coff} output type produces \c{COFF} object files suitable for
 | |
| linking with the \i{DJGPP} linker.
 | |
| 
 | |
| \c{coff} provides a default output file-name extension of \c{.o}.
 | |
| 
 | |
| The \c{coff} format supports the same extensions to the \c{SECTION}
 | |
| directive as \c{win32} does, except that the \c{align} qualifier and
 | |
| the \c{info} section type are not supported.
 | |
| 
 | |
| \H{machofmt} \I{Mach-O}\i\c{macho32} and \i\c{macho64}: \i{Mach Object File Format}
 | |
| 
 | |
| The \c{macho32} and \c{macho64} output formats produce Mach-O
 | |
| object files suitable for linking with the \i{MacOS X} linker.
 | |
| \i\c{macho} is a synonym for \c{macho32}.
 | |
| 
 | |
| \c{macho} provides a default output file-name extension of \c{.o}.
 | |
| 
 | |
| \S{machosect} \c{macho} extensions to the \c{SECTION} Directive
 | |
| \I{SECTION, macho extensions to}
 | |
| 
 | |
| The \c{macho} output format specifies section names in the format
 | |
| "\e{segment}\c{,}\e{section}".  No spaces are allowed around the
 | |
| comma.  The following flags can also be specified:
 | |
| 
 | |
| \b \c{data} - this section contains initialized data items
 | |
| 
 | |
| \b \c{code} - this section contains code exclusively
 | |
| 
 | |
| \b \c{mixed} - this section contains both code and data
 | |
| 
 | |
| \b \c{bss} - this section is uninitialized and filled with zero
 | |
| 
 | |
| \b \c{zerofill} - same as \c{bss}
 | |
| 
 | |
| \b \c{no_dead_strip} - inhibit dead code stripping for this section
 | |
| 
 | |
| \b \c{live_support} - set the live support flag for this section
 | |
| 
 | |
| \b \c{strip_static_syms} - strip static symbols for this section
 | |
| 
 | |
| \b \c{debug} - this section contains debugging information
 | |
| 
 | |
| \b \c{align=}\e{alignment} - specify section alignment
 | |
| 
 | |
| The default is \c{data}, unless the section name is \c{__text} or
 | |
| \c{__bss} in which case the default is \c{text} or \c{bss},
 | |
| respectively.
 | |
| 
 | |
| For compatibility with other Unix platforms, the following standard
 | |
| names are also supported:
 | |
| 
 | |
| \c .text    = __TEXT,__text  text
 | |
| \c .rodata  = __DATA,__const data
 | |
| \c .data    = __DATA,__data  data
 | |
| \c .bss     = __DATA,__bss   bss
 | |
| 
 | |
| If the \c{.rodata} section contains no relocations, it is instead put
 | |
| into the \c{__TEXT,__const} section unless this section has already
 | |
| been specified explicitly.  However, it is probably better to specify
 | |
| \c{__TEXT,__const} and \c{__DATA,__const} explicitly as appropriate.
 | |
| 
 | |
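The alias table above amounts to a simple name lookup. The C sketch below
models only that table; it does not model the relocation-dependent
placement of \c{.rodata} just described, and the function name
\c{macho_standard_name} is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Returns the "segment,section" name for a standard Unix-style
 * section name, or NULL if the name is not one of the four aliases. */
const char *macho_standard_name(const char *name)
{
    static const char *const map[][2] = {
        { ".text",   "__TEXT,__text"  },
        { ".rodata", "__DATA,__const" },
        { ".data",   "__DATA,__data"  },
        { ".bss",    "__DATA,__bss"   },
    };

    for (size_t i = 0; i < sizeof map / sizeof map[0]; i++)
        if (strcmp(name, map[i][0]) == 0)
            return map[i][1];
    return NULL;
}
```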
| \S{machotls} \i{Thread Local Storage in Mach-O}\I{TLS}: \c{macho} special
 | |
| symbols and \i\c{WRT}
 | |
| 
 | |
| Mach-O defines the following special symbols that can be used on the
 | |
| right-hand side of the \c{WRT} operator:
 | |
| 
 | |
| \b \c{..tlvp} is used to specify access to thread-local storage.
 | |
| 
 | |
| \b \c{..gotpcrel} is used to specify references to the Global Offset
 | |
|    Table.  The GOT is supported in the \c{macho64} format only.
 | |
| 
 | |
| \S{macho-ssvs} \c{macho} specific directive \i\c{subsections_via_symbols}
 | |
| 
 | |
| The directive \c{subsections_via_symbols} sets the
 | |
| \c{MH_SUBSECTIONS_VIA_SYMBOLS} flag in the Mach-O header, which effectively
 | |
| separates a block (or a subsection) based on a symbol. It is often used
 | |
| to let the linker eliminate dead code.
 | |
| 
 | |
| This directive takes no arguments.
 | |
| 
 | |
| This is a macro implemented as a \c{%pragma}.  It can also be
 | |
| specified in its \c{%pragma} form, in which case it will not affect
 | |
| non-Mach-O builds of the same source code:
 | |
| 
 | |
| \c      %pragma macho subsections_via_symbols
 | |
| 
 | |
| \S{macho-ssvs} \c{macho} specific directive \i\c{no_dead_strip}
 | |
| 
 | |
| The directive \c{no_dead_strip} sets the Mach-O \c{SH_NO_DEAD_STRIP}
 | |
| section flag on the section containing a specific symbol.  This
 | |
| directive takes a list of symbols as its arguments.
 | |
| 
 | |
| This is a macro implemented as a \c{%pragma}.  It can also be
 | |
| specified in its \c{%pragma} form, in which case it will not affect
 | |
| non-Mach-O builds of the same source code:
 | |
| 
 | |
| \c      %pragma macho no_dead_strip symbol...
 | |
| 
 | |
| \S{macho-pext} \c{macho} specific extensions to the \c{GLOBAL}
 | |
| Directive: \i\c{private_extern}
 | |
| 
 | |
| The directive extension to \c{GLOBAL} marks the symbol with limited
 | |
| global scope. For example, you can specify the global symbol with
 | |
| this extension:
 | |
| 
 | |
| \c global foo:private_extern
 | |
| \c foo:
 | |
| \c          ; code
 | |
| 
 | |
| Linking with the static linker clears the private extern attribute,
 | |
| but a linker option such as \c{-keep_private_externs} can prevent this.
 | |
| 
 | |
| \H{elffmt} \i\c{elf32}, \i\c{elf64}, \i\c{elfx32}:
 | |
| \I{ELF}\I{linux, elf}Executable and Linkable Format Object Files
 | |
| 
 | |
| The \c{elf32}, \c{elf64} and \c{elfx32} output formats generate
 | |
| \c{ELF32} and \c{ELF64} (Executable and Linkable Format) object files, as
 | |
| used by \i{Linux} as well as \i{Unix System V}, including \i{Solaris x86},
 | |
| \i{UnixWare} and \i{SCO Unix}. ELF provides a default output
 | |
| file-name extension of \c{.o}. \c{elf} is a synonym for \c{elf32}.
 | |
| 
 | |
| The \c{elfx32} file format is an ELF32 file containing 64-bit x86
 | |
| code, and is used for the \i{x32} ABI, which runs the CPU in 64-bit
 | |
| mode while using 32-bit values for pointers to reduce memory
 | |
| footprint. Thus, code intended to be used with the x32 ABI should be
 | |
| assembled with \c{BITS 64}.
 | |
| 
 | |
| \S{abisect} ELF specific directive \i\c{osabi}
 | |
| 
 | |
| The ELF header specifies the application binary interface for the
 | |
| target operating system (OSABI).  This field can be set by using the
 | |
| \c{osabi} directive with the numeric value (0-255) of the target
 | |
| system. If this directive is not used, the default value will be "UNIX
 | |
| System V ABI" (0) which will work on most systems which support ELF.
 | |
| 
 | |
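The OSABI value occupies a single byte of the identification array at the
start of every ELF file. As a sketch, it can be read back from an object
file's first 16 bytes like this; the function name \c{elf_osabi} is
hypothetical, while the byte index is the one the ELF specification calls
\c{EI_OSABI}.

```c
#include <stdint.h>

enum { OSABI_INDEX = 7 };   /* EI_OSABI in the ELF specification */

/* ident points at the first 16 bytes (e_ident) of an ELF file. */
unsigned elf_osabi(const uint8_t ident[16])
{
    return ident[OSABI_INDEX];
}
```

An object assembled without the \c{osabi} directive would be expected to
yield 0 here.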
| \S{elfsect} ELF extensions to the \c{SECTION} Directive
 | |
| \I{SECTION, ELF extensions to}
 | |
| 
 | |
| Like the \c{obj} format, \c{elf} allows you to specify additional
 | |
| information on the \c{SECTION} directive line, to control the type
 | |
| and properties of sections you declare. Section types and properties
 | |
| are generated automatically by NASM for the \i{standard section
 | |
| names}, but may still be
 | |
| overridden by these qualifiers.
 | |
| 
 | |
| The available qualifiers are:
 | |
| 
 | |
| \b \i\c{alloc} defines the section to be one which is loaded into
 | |
| memory when the program is run. \i\c{noalloc} defines it to be one
 | |
| which is not, such as an informational or comment section.
 | |
| 
 | |
| \b \i\c{exec} defines the section to be one which should have execute
 | |
| permission when the program is run. \i\c{noexec} defines it as one
 | |
| which should not.
 | |
| 
 | |
| \b \i\c{write} defines the section to be one which should be writable
 | |
| when the program is run. \i\c{nowrite} defines it as one which should
 | |
| not.
 | |
| 
 | |
| \b \i\c{progbits} defines the section to be one with explicit contents
 | |
| stored in the object file: an ordinary code or data section, for
 | |
| example.
 | |
| 
 | |
| \b \i\c{nobits} defines the section to be one with no explicit
 | |
| contents given, such as a BSS section.
 | |
| 
 | |
| \b \i\c{note} indicates that this section contains ELF notes. The
 | |
| contents of ELF notes are specified using normal assembly instructions;
 | |
| it is up to the programmer to ensure these are valid ELF notes.
 | |
| 
 | |
| \b \i\c{preinit_array} indicates that this section contains function
 | |
| addresses to be called before any other initialization has happened.
 | |
| 
 | |
| \b \i\c{init_array} indicates that this section contains function
 | |
| addresses to be called during initialization.
 | |
| 
 | |
| \b \i\c{fini_array} indicates that this section contains function
 | |
| pointers to be called during termination.
 | |
| 
 | |
| \b \I{align, ELF attribute}\c{align=}, used with a trailing number as in \c{obj}, gives the
 | |
| \I{section alignment, in elf}\I{alignment, in elf sections}alignment
 | |
| requirements of the section.
 | |
| 
 | |
| \b \c{byte}, \c{word}, \c{dword}, \c{qword}, \c{tword}, \c{oword},
 | |
| \c{yword}, or \c{zword} with an optional \c{*}\i{multiplier} specify
 | |
| the fundamental data item size for a section which contains either
 | |
| fixed-sized data structures or strings; it also sets a default
 | |
| alignment. This is generally used with the \c{strings} and \c{merge}
 | |
| attributes (see below.) For example \c{byte*4} defines a unit size of
 | |
| 4 bytes, with a default alignment of 1; \c{dword} also defines a unit
 | |
| size of 4 bytes, but with a default alignment of 4. The \c{align=}
 | |
| attribute, if specified, overrides this default alignment.
 | |
| 
 | |
| \b \I{pointer, ELF attribute}\c{pointer} is equivalent to \c{dword}
 | |
| for \c{elf32} or \c{elfx32}, and \c{qword} for \c{elf64}.
 | |
| 
 | |
| \b \I{strings, ELF attribute}\c{strings} indicates that this section
 | |
| contains exclusively null-terminated strings. By default these are
 | |
| assumed to be byte strings, but a size specifier can be used to
 | |
| override that.
 | |
| 
 | |
| \b \i\c{merge} indicates that duplicate data elements in this section
 | |
| should be merged with data elements from other object files. Data
 | |
| elements can be either fixed-sized objects or null-terminated strings
 | |
| (with the \c{strings} attribute). A size specifier is required unless
 | |
| \c{strings} is specified, in which case the size defaults to \c{byte}.
 | |
| 
 | |
| \b \i\c{tls} defines the section to be one which contains
 | |
| thread local variables.
 | |
| 
 | |
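The relationship between a size specifier, its multiplier, the unit size
and the default alignment can be written down as a small C function. This
sketch assumes that the default alignment is always the size of the keyword
itself, which the text above states only for \c{byte} and \c{dword}; the
names are hypothetical.

```c
struct item_size {
    unsigned unit;    /* fundamental data item size in bytes */
    unsigned align;   /* default alignment, unless align= overrides it */
};

/* base_bytes: size of the keyword (byte = 1, word = 2, dword = 4, ...);
 * multiplier: the optional *n, or 1 when it is absent. */
struct item_size section_item_size(unsigned base_bytes, unsigned multiplier)
{
    struct item_size s = { base_bytes * multiplier, base_bytes };
    return s;
}
```

With this model, \c{byte*4} gives a unit of 4 with alignment 1, and
\c{dword} gives a unit of 4 with alignment 4, as in the examples above.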
| The defaults assumed by NASM if you do not specify the above
 | |
| qualifiers are:
 | |
| 
 | |
| \I\c{.text} \I\c{.rodata} \I\c{.lrodata} \I\c{.data} \I\c{.ldata}
 | |
| \I\c{.bss} \I\c{.lbss} \I\c{.tdata} \I\c{.tbss} \I\c{.comment}
 | |
| 
 | |
| \c section .text          progbits      alloc   exec    nowrite  align=16
 | |
| \c section .rodata        progbits      alloc   noexec  nowrite  align=4
 | |
| \c section .lrodata       progbits      alloc   noexec  nowrite  align=4
 | |
| \c section .data          progbits      alloc   noexec  write    align=4
 | |
| \c section .ldata         progbits      alloc   noexec  write    align=4
 | |
| \c section .bss           nobits        alloc   noexec  write    align=4
 | |
| \c section .lbss          nobits        alloc   noexec  write    align=4
 | |
| \c section .tdata         progbits      alloc   noexec  write    align=4   tls
 | |
| \c section .tbss          nobits        alloc   noexec  write    align=4   tls
 | |
| \c section .comment       progbits      noalloc noexec  nowrite  align=1
 | |
| \c section .preinit_array preinit_array alloc   noexec  nowrite  pointer
 | |
| \c section .init_array    init_array    alloc   noexec  nowrite  pointer
 | |
| \c section .fini_array    fini_array    alloc   noexec  nowrite  pointer
 | |
| \c section .note          note          noalloc noexec  nowrite  align=4
 | |
| \c section other          progbits      alloc   noexec  nowrite  align=1
 | |
| 
 | |
| (Any section name other than those in the above table
 | |
|  is treated by default like \c{other} in the above table.
 | |
|  Please note that section names are case sensitive.)
 | |
| 
 | |
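In the object file, the \c{alloc}, \c{exec}, \c{write} and \c{tls}
qualifiers end up as bits in the section header's flags word. The C sketch
below shows how the defaults in the table combine. The bit values are those
the ELF specification assigns to \c{SHF_WRITE}, \c{SHF_ALLOC},
\c{SHF_EXECINSTR} and \c{SHF_TLS}; the \c{FLAG_} names and the function are
hypothetical, and this is not NASM's own code.

```c
#include <stdint.h>

enum {
    FLAG_WRITE = 0x1,     /* SHF_WRITE */
    FLAG_ALLOC = 0x2,     /* SHF_ALLOC */
    FLAG_EXEC  = 0x4,     /* SHF_EXECINSTR */
    FLAG_TLS   = 0x400    /* SHF_TLS */
};

/* Each argument is nonzero when the corresponding qualifier is set. */
uint32_t section_flags(int alloc, int exec, int write, int tls)
{
    uint32_t f = 0;

    if (alloc) f |= FLAG_ALLOC;
    if (exec)  f |= FLAG_EXEC;
    if (write) f |= FLAG_WRITE;
    if (tls)   f |= FLAG_TLS;
    return f;
}
```

For example, the defaults for \c{.text} (\c{alloc exec nowrite}) combine to
0x6, and those for \c{.data} (\c{alloc noexec write}) to 0x3.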
| 
 | |
| \S{elfwrt} \i{Position-Independent Code}\I{PIC}: ELF Special
 | |
| Symbols and \i\c{WRT}
 | |
| 
 | |
| Since \c{ELF} does not support segment-base references, the \c{WRT}
 | |
| operator is not used for its normal purpose; therefore NASM's
 | |
| \c{elf} output format makes use of \c{WRT} for a different purpose,
 | |
| namely the PIC-specific \I{relocations, PIC-specific}relocation
 | |
| types.
 | |
| 
 | |
| \c{elf} defines five special symbols which you can use as the
 | |
| right-hand side of the \c{WRT} operator to obtain PIC relocation
 | |
| types. They are \i\c{..gotpc}, \i\c{..gotoff}, \i\c{..got},
 | |
| \i\c{..plt} and \i\c{..sym}. Their functions are summarized here:
 | |
| 
 | |
| \b Referring to the symbol marking the global offset table base
 | |
| using \c{wrt ..gotpc} will end up giving the distance from the
 | |
| beginning of the current section to the global offset table.
 | |
| (\i\c{_GLOBAL_OFFSET_TABLE_} is the standard symbol name used to
 | |
| refer to the \i{GOT}.) So you would then need to add \i\c{$$} to the
 | |
| result to get the real address of the GOT.
 | |
| 
 | |
| \b Referring to a location in one of your own sections using \c{wrt
 | |
| ..gotoff} will give the distance from the beginning of the GOT to
 | |
| the specified location, so that adding on the address of the GOT
 | |
| would give the real address of the location you wanted.
 | |
| 
 | |
| \b Referring to an external or global symbol using \c{wrt ..got}
 | |
| causes the linker to build an entry \e{in} the GOT containing the
 | |
| address of the symbol, and the reference gives the distance from the
 | |
| beginning of the GOT to the entry; so you can add on the address of
 | |
| the GOT, load from the resulting address, and end up with the
 | |
| address of the symbol.
 | |
| 
 | |
| \b Referring to a procedure name using \c{wrt ..plt} causes the
 | |
| linker to build a \i{procedure linkage table} entry for the symbol,
 | |
| and the reference gives the address of the \i{PLT} entry. You can
 | |
| only use this in contexts which would generate a PC-relative
 | |
| relocation normally (i.e. as the destination for \c{CALL} or
 | |
| \c{JMP}), since ELF contains no relocation type to refer to PLT
 | |
| entries absolutely.
 | |
| 
 | |
| \b Referring to a symbol name using \c{wrt ..sym} causes NASM to
 | |
| write an ordinary relocation, but instead of making the relocation
 | |
| relative to the start of the section and then adding on the offset
 | |
| to the symbol, it will write a relocation record aimed directly at
 | |
| the symbol in question. The distinction is a necessary one due to a
 | |
| peculiarity of the dynamic linker.
 | |
| 
 | |
| A fuller explanation of how to use these relocation types to write
 | |
| shared libraries entirely in NASM is given in \k{picdll}.
 | |
| 
 | |
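The arithmetic behind \c{wrt ..gotoff} and \c{wrt ..got} can be illustrated
with a small 32-bit model in C. Everything here is a simplified sketch: the
function names are hypothetical, and the GOT is represented by an ordinary
array rather than a table built by the linker.

```c
#include <stdint.h>

/* wrt ..gotoff: the reference is (location - GOT), so adding the
 * GOT address yields the real address of the location. */
uint32_t from_gotoff(uint32_t got_addr, int32_t gotoff)
{
    return got_addr + (uint32_t)gotoff;   /* wraps modulo 2^32 */
}

/* wrt ..got: the reference is the byte offset of a GOT slot, and the
 * slot itself holds the symbol's address. `got` models the table. */
uint32_t from_got(const uint32_t *got, uint32_t slot_offset)
{
    return got[slot_offset / sizeof got[0]];
}
```

The first function needs only an addition, while the second needs a load
from memory, which corresponds to the extra "load from the resulting
address" step described for \c{wrt ..got} above.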
| \S{elftls} \i{Thread Local Storage in ELF}\I{TLS}: \c{elf} Special
 | |
| Symbols and \i\c{WRT}
 | |
| 
 | |
| \b In ELF32 mode, referring to an external or global symbol using
 | |
| \c{wrt ..tlsie} \I\c{..tlsie}
 | |
| causes the linker to build an entry \e{in} the GOT containing the
 | |
| offset of the symbol within the TLS block, so you can access the value
 | |
| of the symbol with code such as:
 | |
| 
 | |
| \c	  mov  eax,[tid wrt ..tlsie]
 | |
| \c	  mov  [gs:eax],ebx
 | |
| 
 | |
| 
 | |
| \b In ELF64 or ELFx32 mode, referring to an external or global symbol using
 | |
| \c{wrt ..gottpoff} \I\c{..gottpoff}
 | |
| causes the linker to build an entry \e{in} the GOT containing the
 | |
| offset of the symbol within the TLS block, so you can access the value
 | |
| of the symbol with code such as:
 | |
| 
 | |
| \c	  mov	rax,[rel tid wrt ..gottpoff]
 | |
| \c	  mov	rcx,[fs:rax]
 | |
| 
 | |
| 
 | |
| \S{elfglob} \c{elf} Extensions to the \c{GLOBAL} Directive\I{GLOBAL,
 | |
| elf extensions to}\I{GLOBAL, aoutb extensions to}
 | |
| 
 | |
| \c{ELF} object files can contain more information about a global
 | |
| symbol than just its address: they can contain the \I{symbols,
 | |
| specifying sizes}\I{size, of symbols}size of the symbol and its
 | |
| \I{symbols, specifying types}\I{type, of symbols}type as well. These
 | |
| are not merely debugger conveniences, but are actually necessary when
 | |
| the program being written is a \I{elf shared library}shared
 | |
| library. NASM therefore supports some extensions to the \c{GLOBAL}
 | |
| directive, allowing you to specify these features.
 | |
| 
 | |
| You can specify whether a global symbol is a function or a data
 | |
| object by suffixing the name with a colon and the word
 | |
| \i\c{function} or \i\c{data}. (\i\c{object} is a synonym for
 | |
| \c{data}.) For example:
 | |
| 
 | |
| \c global   hashlookup:function, hashtable:data
 | |
| 
 | |
| exports the global symbol \c{hashlookup} as a function and
 | |
| \c{hashtable} as a data object.
 | |
| 
 | |
| Optionally, you can control the ELF visibility of the symbol.  Just
 | |
| add one of the \I{elf visibility}visibility keywords:
 | |
| \I{default, elf}\c{default},
 | |
| \I{internal, elf}\c{internal},
 | |
| \I{hidden, elf}\c{hidden},
 | |
| or \I{protected, elf}\c{protected}.  The default is
 | |
| \c{default} of course.  For example, to make \c{hashlookup} hidden:
 | |
| 
 | |
| \c global   hashlookup:function hidden
 | |
| 
 | |
| Since version 2.15, it is possible to specify the symbol binding. The keywords
 | |
| are \i\c{weak}, to generate a weak symbol, or \i\c{strong}. The default is \i\c{strong}.
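 | |
| 
 | |
| As a sketch, assuming the binding keyword is written after the colon
 | |
| in the same way as the visibility keywords, \c{hashlookup} might be
 | |
| exported as a weak function symbol like this:
 | |
| 
 | |
| \c global   hashlookup:function weak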
 | |
| 
 | |
| You can also specify the size of the data associated with the
 | |
| symbol, as a numeric expression (which may involve labels, and even
 | |
| forward references) after the type specifier, like this:
 | |
| 
 | |
| \c global  hashtable:data (hashtable.end - hashtable)
 | |
| \c
 | |
| \c hashtable:
 | |
| \c         db this,that,theother  ; some data here
 | |
| \c .end:
 | |
| 
 | |
| This makes NASM automatically calculate the length of the table and
 | |
| place that information into the \c{ELF} symbol table.
 | |
| 
 | |
| Declaring the type and size of global symbols is necessary when
 | |
| writing shared library code. For more information, see
 | |
| \k{picglobal}.
 | |
| 
 | |
| 
 | |
| \S{elfextrn} \c{elf} Extensions to the \c{EXTERN} Directive\I{EXTERN,
 | |
| elf extensions to}
 | |
| 
 | |
| Since version 2.15, it is possible to specify the keyword \i\c{weak} to generate a weak external
 | |
| reference. Example:
 | |
| 
 | |
| \c extern weak_ref:weak
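 | |
| 
 | |
| A weak reference is typically tested before it is used. The sketch
 | |
| below assumes 32-bit code that is not position-independent, and
 | |
| relies on the usual ELF convention that a weak reference left
 | |
| undefined at link time resolves to zero; the label \c{.skip} is
 | |
| hypothetical:
 | |
| 
 | |
| \c         extern  weak_ref:weak
 | |
| \c
 | |
| \c         mov     eax,weak_ref    ; zero if nothing defines weak_ref
 | |
| \c         test    eax,eax
 | |
| \c         jz      .skip           ; symbol absent: do not call it
 | |
| \c         call    eax
 | |
| \c .skip: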
 | |
| 
 | |
| 
 | |
| \S{elfcomm} \c{elf} Extensions to the \c{COMMON} Directive
 | |
| \I{COMMON, elf extensions to}
 | |
| 
 | |
| \c{ELF} also allows you to specify alignment requirements \I{common
 | |
| variables, alignment in elf}\I{alignment, of elf common variables}on
 | |
| common variables. This is done by putting a number (which must be a
 | |
| power of two) after the name and size of the common variable,
 | |
| separated (as usual) by a colon. For example, an array of
 | |
| doublewords would benefit from 4-byte alignment:
 | |
| 
 | |
| \c common  dwordarray 128:4
 | |
| 
 | |
| This declares the total size of the array to be 128 bytes, and
 | |
| requires that it be aligned on a 4-byte boundary.
 | |
| 
 | |
| 
 | |
| \S{elf16} 16-bit code and ELF
 | |
| \I{ELF, 16-bit code}
 | |
| 
 | |
| Older versions of the \c{ELF32} specification did not provide
 | |
| relocations for 8- and 16-bit values. They are now part of the formal
 | |
| specification, and any sufficiently recent linker should support them.
 | |
| 
 | |
| ELF currently has no support for segmented programming.
 | |
| 
 | |
| \S{elfdbg} Debug formats and ELF
 | |
| \I{ELF, debug formats}
 | |
| 
 | |
| ELF provides debug information in \c{STABS} and \c{DWARF} formats.
 | |
| Line number information is generated for all executable sections, but please
 | |
| note that only the \c{.text} section is executable by default.
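 | |
| 
 | |
| As an illustration, debug information is requested on the command
 | |
| line with the \c{-g} option, and the debug format is selected with
 | |
| \c{-F}. The file name \c{myprog.asm} is hypothetical:
 | |
| 
 | |
| \c nasm -f elf64 -g -F dwarf myprog.asm
 | |
| 
 | |
| To obtain line number information for code placed in a section other
 | |
| than \c{.text}, that section would need to be declared executable,
 | |
| for instance with the \c{exec} qualifier (the section name
 | |
| \c{.mycode} is hypothetical):
 | |
| 
 | |
| \c section .mycode exec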
 | |
| 
 | |
| \H{aoutfmt} \i\c{aout}: Linux \I{a.out, Linux version}\I{linux, a.out}\c{a.out} Object Files
 | |
| 
 | |
| The \c{aout} format generates \c{a.out} object files, in the form used
 | |
| by early \i{Linux} systems (current Linux systems use ELF; see
 | |
| \k{elffmt}). These differ from other \c{a.out} object files in that
 | |
| the magic number in the first four bytes of the file is
 | |
| different; also, some implementations of \c{a.out}, for example
 | |
| NetBSD's, support position-independent code, which Linux's
 | |
| implementation does not.
 | |
| 
 | |
| \c{aout} provides a default output file-name extension of \c{.o}.
 | |
| 
 | |
| \c{a.out} is a very simple object format. It supports no special
 | |
| directives, no special symbols, no use of \c{SEG} or \c{WRT}, and no
 | |
| extensions to any standard directives. It supports only the three
 | |
| \i{standard section names} \i\c{.text}, \i\c{.data} and \i\c{.bss}.
 | |
| 
 | |
| 
 | |
| \H{aoutfmt} \i\c{aoutb}: \i{NetBSD}/\i{FreeBSD}/\i{OpenBSD}
 | |
| \I{a.out, BSD version}\c{a.out} Object Files
 | |
| 
 | |
| The \c{aoutb} format generates \c{a.out} object files, in the form
 | |
| used by the various free \c{BSD Unix} clones, \c{NetBSD}, \c{FreeBSD}
 | |
| and \c{OpenBSD}. For simple object files, this object format is exactly
 | |
| the same as \c{aout} except for the magic number in the first four bytes
 | |
| of the file. However, the \c{aoutb} format supports
 | |
| \I{PIC}\i{position-independent code} in the same way as the \c{elf}
 | |
| format, so you can use it to write \c{BSD} \i{shared libraries}.
 | |
| 
 | |
| \c{aoutb} provides a default output file-name extension of \c{.o}.
 | |
| 
 | |
| \c{aoutb} supports no special directives, no special symbols, and
 | |
| only the three \i{standard section names} \i\c{.text}, \i\c{.data}
 | |
| and \i\c{.bss}. However, it also supports the same use of \i\c{WRT} as
 | |
| \c{elf} does, to provide position-independent code relocation types.
 | |
| See \k{elfwrt} for full documentation of this feature.
 | |
| 
 | |
| \c{aoutb} also supports the same extensions to the \c{GLOBAL}
 | |
| directive as \c{elf} does: see \k{elfglob} for documentation of
 | |
| this.
 | |
| 
 | |
| 
 | |
| \H{as86fmt} \c{as86}: \i{Minix}/Linux\I{linux, as86} \i\c{as86} Object Files
 | |
| 
 | |
| The Minix/Linux 16-bit assembler \c{as86} has its own non-standard
 | |
| object file format. Although its companion linker \i\c{ld86} produces
 | |
| something close to ordinary \c{a.out} binaries as output, the object
 | |
| file format used to communicate between \c{as86} and \c{ld86} is not
 | |
| itself \c{a.out}.
 | |
| 
 | |
| NASM supports this format, just in case it is useful, as \c{as86}.
 | |
| \c{as86} provides a default output file-name extension of \c{.o}.
 | |
| 
 | |
| \c{as86} is a very simple object format (from the NASM user's point
 | |
| of view). It supports no special directives, no use of \c{SEG} or \c{WRT},
 | |
| and no extensions to any standard directives. It supports only the three
 | |
| \i{standard section names} \i\c{.text}, \i\c{.data} and \i\c{.bss}.  The
 | |
| only special symbol supported is \c{..start}.
 | |
| 
 | |
| 
 | |
| \H{dbgfmt} \i\c{dbg}: Debugging Format
 | |
| 
 | |
| The \c{dbg} format does not output an object file as such; instead,
 | |
| it outputs a text file which contains a complete list of all the
 | |
| transactions between the main body of NASM and the output-format
 | |
| back end module. It is primarily intended to aid people who want to
 | |
| write their own output drivers, so that they can get a clearer idea
 | |
| of the various requests the main program makes of the output driver,
 | |
| and in what order they happen.
 | |
| 
 | |
| For simple files, one can easily use the \c{dbg} format like this:
 | |
| 
 | |
| \c nasm -f dbg filename.asm
 | |
| 
 | |
| which will generate a diagnostic file called \c{filename.dbg}.
 | |
| However, this will not work well on files which were designed for a
 | |
| different object format, because each object format defines its own
 | |
| macros (usually user-level forms of directives), and those macros
 | |
| will not be defined in the \c{dbg} format. Therefore it can be
 | |
| useful to run NASM twice, in order to do the preprocessing with the
 | |
| native object format selected:
 | |
| 
 | |
| \c nasm -e -f elf32 -o elfprog.i elfprog.asm
 | |
| \c nasm -a -f dbg elfprog.i
 | |
| 
 | |
| This preprocesses \c{elfprog.asm} into \c{elfprog.i}, keeping the
 | |
| \c{elf32} object format selected in order to make sure ELF special
 | |
| directives are converted into primitive form correctly. Then the
 | |
| preprocessed source is fed through the \c{dbg} format to generate the
 | |
| final diagnostic output.
 | |
| 
 | |
| This workaround will still typically not work for programs intended
 | |
| for \c{obj} format, because the \c{obj} \c{SEGMENT} and \c{GROUP}
 | |
| directives have side effects of defining the segment and group names
 | |
| as symbols; \c{dbg} will not do this, so the program will not
 | |
| assemble. You will have to work around that by defining the symbols
 | |
| yourself (using \c{EXTERN}, for example) if you really need to get a
 | |
| \c{dbg} trace of an \c{obj}-specific source file.
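 | |
| 
 | |
| For example, if the \c{obj} source refers to a segment and a group
 | |
| by name (the names \c{mydata} and \c{mygroup} here are hypothetical),
 | |
| a line such as the following, added for the \c{dbg} run only, may be
 | |
| enough to let it assemble:
 | |
| 
 | |
| \c extern  mydata, mygroup     ; stand-ins for the symbols obj would define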
 | |
| 
 | |
| \c{dbg} accepts any section name and any directives at all, and logs
 | |
| them all to its output file.
 | |
| 
 | |
| \c{dbg} accepts and logs any \c{%pragma}, but the specific
 | |
| \c{%pragma}:
 | |
| 
 | |
| \c      %pragma dbg maxdump <size>
 | |
| 
 | |
| where \c{<size>} is either a number or \c{unlimited}, can be used to
 | |
| control the maximum size for dumping the full contents of a
 | |
| \c{rawdata} output object.
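 | |
| 
 | |
| For instance, either of the following forms could appear in the
 | |
| source file; the value 64 is an arbitrary choice for illustration:
 | |
| 
 | |
| \c      %pragma dbg maxdump 64
 | |
| \c      %pragma dbg maxdump unlimited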
 |