mirror of
https://github.com/netwide-assembler/nasm.git
synced 2025-09-22 10:43:39 -04:00
Make the source code for the documentation a little easier to deal with by breaking it into individual chapter files. Add support to rdsrc.pl for auto-generating dependencies. Signed-off-by: H. Peter Anvin <hpa@zytor.com>
169 lines
7.1 KiB
Plaintext
\A{ndisasm} \i{Ndisasm}

The Netwide Disassembler, NDISASM

\H{ndisintro} Introduction

The Netwide Disassembler is a small companion program to the Netwide
Assembler, NASM. It seemed a shame to have an x86 assembler,
complete with a full instruction table, and not make as much use of
it as possible, so here's a disassembler which shares the
instruction table (and some other bits of code) with NASM.

The Netwide Disassembler does nothing except produce
disassemblies of \e{binary} source files. Unlike \c{objdump}, NDISASM
does not have any understanding of object file formats, and unlike
\c{debug} it will not understand \c{DOS .EXE} files. It just
disassembles.



\H{ndisrun} Running NDISASM

To disassemble a file, you will typically use a command of the form

\c ndisasm -b {16|32|64} filename

NDISASM can disassemble 16-, 32- or 64-bit code equally easily,
provided of course that you remember to specify which it is to work
with. If no \i\c{-b} switch is present, NDISASM works in 16-bit mode
by default. The \i\c{-u} switch (for USE32) also invokes 32-bit mode.

Two more command line options are \i\c{-r} which reports the version
number of NDISASM you are running, and \i\c{-h} which gives a short
summary of command line options.



\S{ndiscom} Specifying the Input Origin

To disassemble a \c{DOS .COM} file correctly, a disassembler must assume
that the first instruction in the file is loaded at address \c{0x100},
rather than at zero. NDISASM, which assumes by default that any file
you give it is loaded at zero, will therefore need to be informed of
this.

The \i\c{-o} option allows you to declare a different origin for the
file you are disassembling. Its argument may be expressed in any of
the NASM numeric formats: decimal by default; \c{hex} if it begins
with `\c{$}' or `\c{0x}' or ends in `\c{H}'; \c{octal} if it ends in
`\c{Q}'; and \c{binary} if it ends in `\c{B}'.

Hence, to disassemble a \c{.COM} file, the command

\c ndisasm -o100h filename.com

will do the trick.

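As a rough illustration of the numeric formats just listed, the following
Python sketch converts an origin argument to an integer. It is not
NDISASM's own parser: the name \c{parse_number} is hypothetical, it covers
only the forms described above, and it assumes the prefix and suffix
letters are case-insensitive.

```python
def parse_number(text: str) -> int:
    """Parse a number written in one of the formats listed above.

    Hypothetical helper for illustration; not NDISASM's real parser.
    Assumption: prefix and suffix letters are case-insensitive.
    """
    s = text.strip().lower()
    if not s:
        raise ValueError("empty number")
    if s.startswith("0x"):      # 0x prefix: hex
        return int(s[2:], 16)
    if s.startswith("$"):       # $ prefix: hex
        return int(s[1:], 16)
    if s.endswith("h"):         # H suffix: hex
        return int(s[:-1], 16)
    if s.endswith("q"):         # Q suffix: octal
        return int(s[:-1], 8)
    if s.endswith("b"):         # B suffix: binary
        return int(s[:-1], 2)
    return int(s, 10)           # no marker: decimal


if __name__ == "__main__":
    for arg in ("100h", "0x100", "$100", "400q", "100000000b", "256"):
        print(arg, "->", parse_number(arg))
```

The prefixes are tested before the suffixes so that a value such as
\c{0x1b} is read as hex rather than as a malformed binary number.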


\S{ndissync} Code Following Data: Synchronization

Suppose you are disassembling a file which contains some data which
isn't machine code, and \e{then} contains some machine code. NDISASM
will faithfully plough through the data section, producing machine
instructions wherever it can (although most of them will look
bizarre, and some may have unusual prefixes, e.g. `\c{FS OR AX,0x240A}'),
and generating `\c{db}' instructions every so often if it's totally
stumped. Then it will reach the code section.

Suppose NDISASM has just finished generating a strange machine
instruction from part of the data section, and its file position is
now one byte \e{before} the beginning of the code section. It's
entirely possible that another spurious instruction will get
generated, starting with the final byte of the data section, and
then the correct first instruction in the code section will not be
seen because the starting point skipped over it. This isn't really
ideal.

To avoid this, you can specify a `\i{synchronization}' point, or indeed
as many synchronization points as you like (although NDISASM can
only handle 2147483647 sync points internally). The definition of a sync
point is this: NDISASM guarantees to hit sync points exactly during
disassembly. If it is thinking about generating an instruction which
would cause it to jump over a sync point, it will discard that
instruction and output a `\c{db}' instead. So it \e{will} start
disassembly exactly from the sync point, and so you \e{will} see all
the instructions in your code section.

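The rule above can be sketched in a few lines of Python. This is a toy
model, not NDISASM's code: \c{walk} and \c{toy_length} are hypothetical
names, and the "decoder" knows only two opcodes (\c{0x90}, a one-byte NOP,
and \c{0xB8}, a three-byte \c{MOV AX,imm16} in 16-bit mode), treating
everything else as undecodable.

```python
from typing import Callable, Iterable, List, Tuple


def toy_length(data: bytes, pos: int) -> int:
    """Toy decoder: instruction length at pos, or 0 if unknown."""
    op = data[pos]
    if op == 0x90:   # NOP
        return 1
    if op == 0xB8:   # MOV AX,imm16 (16-bit mode)
        return 3
    return 0


def walk(data: bytes,
         length_of: Callable[[bytes, int], int],
         sync_points: Iterable[int] = (),
         origin: int = 0) -> List[Tuple[int, str, int]]:
    """Return (address, kind, size) records, kind being 'insn' or 'db'."""
    syncs = sorted(set(sync_points))
    out = []
    pos = 0
    while pos < len(data):
        addr = origin + pos
        size = length_of(data, pos)
        # First sync point lying strictly after the current address.
        nxt = next((s for s in syncs if s > addr), None)
        crosses_sync = nxt is not None and addr + size > nxt
        if size == 0 or pos + size > len(data) or crosses_sync:
            # Undecodable, truncated, or would jump a sync point: emit db.
            out.append((addr, "db", 1))
            pos += 1
        else:
            out.append((addr, "insn", size))
            pos += size
    return out


if __name__ == "__main__":
    # One data byte (0xB8), then code: MOV AX,0x1234 ; NOP
    image = bytes([0xB8, 0xB8, 0x34, 0x12, 0x90])
    print(walk(image, toy_length))                   # MOV at address 1 is lost
    print(walk(image, toy_length, sync_points=[1]))  # MOV at address 1 is seen
```

Without the sync point, the stray data byte swallows the start of the real
instruction. With a sync point at address 1, the spurious instruction is
replaced by a single `\c{db}' and decoding resumes exactly at the code.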

Sync points are specified using the \i\c{-s} option: they are measured
in terms of the program origin, not the file position. So if you
want to synchronize after 32 bytes of a \c{.COM} file, you would have to
do

\c ndisasm -o100h -s120h file.com

rather than

\c ndisasm -o100h -s20h file.com

As stated above, you can specify multiple sync markers if you need
to, just by repeating the \c{-s} option.

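The arithmetic is simply origin plus file position. The Python sketch
below builds the corresponding option list from file positions; the helper
name \c{sync_args} is hypothetical, and it assumes the `\c{0x}' hex form
described for \c{-o} is accepted by \c{-s} as well.

```python
from typing import Iterable, List


def sync_args(origin: int, file_offsets: Iterable[int]) -> List[str]:
    """Build -o/-s options from an origin and sync positions in the file.

    Hypothetical helper: each sync point is expressed relative to the
    program origin, i.e. origin + file position.
    """
    args = [f"-o0x{origin:x}"]
    for off in file_offsets:
        args.append(f"-s0x{origin + off:x}")
    return args


if __name__ == "__main__":
    # Sync after 32 bytes of a .COM file loaded at 0x100.
    print(["ndisasm"] + sync_args(0x100, [32]) + ["file.com"])
```

For the example above this yields \c{-o0x100 -s0x120}, the same values as
\c{-o100h -s120h}.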


\S{ndisisync} Mixed Code and Data: Automatic (Intelligent) Synchronization
\I\c{auto-sync}

Suppose you are disassembling the boot sector of a \c{DOS} floppy (maybe
it has a virus, and you need to understand the virus so that you
know what kinds of damage it might have done to you). Typically, this
will contain a \c{JMP} instruction, then some data, then the rest of the
code. So there is a very good chance of NDISASM being \e{misaligned}
when the data ends and the code begins. Hence a sync point is
needed.

On the other hand, why should you have to specify the sync point
manually? What you'd do in order to find where the sync point would
be, surely, would be to read the \c{JMP} instruction, and then to use
its target address as a sync point. So can NDISASM do that for you?

The answer, of course, is yes: using either of the synonymous
switches \i\c{-a} (for automatic sync) or \i\c{-i} (for intelligent
sync) will enable \c{auto-sync} mode. Auto-sync mode automatically
generates a sync point for any forward-referring PC-relative jump or
call instruction that NDISASM encounters. (Since NDISASM is one-pass,
if it encounters a PC-relative jump whose target has already been
processed, there isn't much it can do about it...)

Only PC-relative jumps are processed, since an absolute jump is
either through a register (in which case NDISASM doesn't know what
the register contains) or involves a segment address (in which case
the target code isn't in the same segment that NDISASM is working
in, and so the sync point can't be placed anywhere useful).

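To make the idea concrete, here is a small Python sketch of how a
forward PC-relative target might be computed. It is an illustration
under stated assumptions, not NDISASM's implementation: the function name
\c{forward_target} is hypothetical, only 16-bit mode is modelled, only
three opcodes are recognised (\c{EB} for \c{JMP rel8}, \c{E9} for
\c{JMP rel16}, \c{E8} for \c{CALL rel16}), and address wrap-around is
ignored.

```python
import struct
from typing import Optional


def forward_target(data: bytes, pos: int, origin: int = 0) -> Optional[int]:
    """Target address of a forward PC-relative JMP/CALL at pos, else None.

    Toy 16-bit model: the displacement is added to the address of the
    *next* instruction. Backward targets return None, since a one-pass
    disassembler has already gone past them.
    """
    op = data[pos]
    if op == 0xEB and pos + 2 <= len(data):             # JMP rel8
        (disp,) = struct.unpack_from("<b", data, pos + 1)
        size = 2
    elif op in (0xE8, 0xE9) and pos + 3 <= len(data):   # CALL/JMP rel16
        (disp,) = struct.unpack_from("<h", data, pos + 1)
        size = 3
    else:
        return None
    next_addr = origin + pos + size
    target = next_addr + disp
    return target if target >= next_addr else None


if __name__ == "__main__":
    # A short jump over 0x3C bytes, followed by a NOP.
    print(forward_target(bytes([0xEB, 0x3C, 0x90]), 0))
    # A jump to itself is backward-referring, so no sync point results.
    print(forward_target(bytes([0xEB, 0xFE]), 0))
```

Each non-\c{None} result would become a sync point, exactly as if it had
been given by hand with \c{-s}.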

For some kinds of file, this mechanism will automatically put sync
points in all the right places, and save you from having to place
any sync points manually. However, it should be stressed that
auto-sync mode is \e{not} guaranteed to catch all the sync points, and
you may still have to place some manually.

Auto-sync mode doesn't prevent you from declaring manual sync
points: it just adds automatically generated ones to the ones you
provide. It's perfectly feasible to specify \c{-i} \e{and} some \c{-s}
options.

Another caveat with auto-sync mode is that if, by some unpleasant
fluke, something in your data section should disassemble to a
PC-relative call or jump instruction, NDISASM may obediently place a
sync point in a totally random place, for example in the middle of
one of the instructions in your code section. So you may end up with
a wrong disassembly even if you use auto-sync. Again, there isn't
much I can do about this. If you have problems, you'll have to use
manual sync points, or use the \c{-k} option (documented below) to
suppress disassembly of the data area.



\S{ndisother} Other Options

The \i\c{-e} option skips a header on the file, by ignoring the first N
bytes. This means that the header is \e{not} counted towards the
disassembly offset: if you give \c{-e10 -o10}, disassembly will start
at byte 10 in the file, and this will be given offset 10, not 20.

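The relationship between file position, skipped header and origin can be
written out as a one-line calculation. The Python function below is a
sketch of that arithmetic only; the name \c{display_offset} is
hypothetical.

```python
def display_offset(file_pos: int, skip: int = 0, origin: int = 0) -> int:
    """Offset shown for the byte at file_pos, given -e skip and -o origin.

    Hypothetical helper: the skipped header does not count towards the
    disassembly offset, so it is subtracted before the origin is added.
    """
    if file_pos < skip:
        raise ValueError("byte lies inside the skipped header")
    return origin + (file_pos - skip)


if __name__ == "__main__":
    # -e10 -o10: the byte at file position 10 is shown at offset 10.
    print(display_offset(10, skip=10, origin=10))
```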

The \i\c{-k} option is provided with two comma-separated numeric
arguments, the first of which is an assembly offset and the second
is a number of bytes to skip. This \e{will} count the skipped bytes
towards the assembly offset: its use is to suppress disassembly of a
data section which wouldn't contain anything you wanted to see
anyway.

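The effect on addressing can be modelled as cutting a hole out of the
address range while leaving the numbering on either side unchanged. The
Python sketch below shows that model only; \c{disassembled_ranges} is a
hypothetical name, and it says nothing about what NDISASM prints for the
skipped region.

```python
from typing import List, Tuple


def disassembled_ranges(origin: int, length: int,
                        skip_start: int, skip_len: int) -> List[Tuple[int, int]]:
    """Half-open address ranges still disassembled with -k skip_start,skip_len.

    Hypothetical model: skip_start is an assembly offset (origin-relative
    address), and skipped bytes still count, so addresses after the hole
    are unchanged.
    """
    end = origin + length
    lo = max(origin, min(skip_start, end))             # clamp hole start
    hi = max(lo, min(skip_start + skip_len, end))      # clamp hole end
    ranges = []
    if lo > origin:
        ranges.append((origin, lo))
    if hi < end:
        ranges.append((hi, end))
    return ranges


if __name__ == "__main__":
    # 0x40-byte image at origin 0x100, skipping 0x1D data bytes from 0x103.
    for start, stop in disassembled_ranges(0x100, 0x40, 0x103, 0x1D):
        print(f"{start:#x}..{stop:#x}")
```

In this example, code following the skipped data is still reported
starting at \c{0x120}, not at \c{0x103}.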