| harpy-0.2: Runtime code generation for x86 machine code | Contents | Index |
|
Harpy.X86Disassembler | Portability | portable | Stability | provisional | Maintainer | {magr,klee}@cs.tu-berlin.de |
|
|
|
|
|
Description |
Disassembler for x86 machine code.
This is a disassembler for object code for the x86 architecture. It
provides functions for disassembling byte arrays, byte lists and files
containing raw binary code.
Features:
- Disassembles memory blocks, lists or arrays of bytes into lists of
instructions.
- Abstract instructions provide as much information as possible about
opcodes, addressing modes or operand sizes, allowing for detailed
output.
- Provides functions for displaying instructions in Intel or AT&T
style (like the GNU tools)
Differences to GNU tools, like gdb or objdump:
- Displacements are shown in decimal, with sign if negative.
Missing:
- LOCK and repeat prefixes are recognized, but not contained in the
opcodes of instructions.
- Support for 16-bit addressing modes. Could be added when needed.
- Complete disassembly of all 64-bit instructions. I have tried to
disassemble them properly but have been limited to the information
in the docs, because I have no 64-bit machine to test on. This will
probably change when I get GNU as to produce 64-bit object files.
- Not all MMX and SSESSE2SSE3 instructions are decoded yet. This is
just a matter of missing time.
- segment override prefixes are decoded, but not appended to memory
references
On the implementation:
This disassembler uses the Parsec parser combinators, working on byte
lists. This proved to be very convenient, as the combinators keep
track of the current position, etc.
|
|
Synopsis |
|
|
|
|
Types
|
|
data Opcode |
All opcodes are represented by this enumeration type.
| Instances | |
|
|
data Operand |
All operands are in one of the following locations:
- Constants in the instruction stream
- Memory locations
- Registers
Memory locations are referred to by on of several addressing modes:
- Absolute (address in instruction stream)
- Register-indirect (address in register)
- Register-indirect with displacement
- Base-Index with scale
- Base-Index with scale and displacement
Displacements can be encoded as 8 or 32-bit immediates in the
instruction stream, but are encoded as Int in instructions for
simplicity.
| Constructors | |
|
|
data InstrOperandSize |
Some opcodes can operate on data of several widths. This information
is encoded in instructions using the following enumeration type..
| Constructors | OPNONE | No operand size specified
| OP8 | 8-bit integer operand
| OP16 | 16-bit integer operand
| OP32 | 32-bit integer operand
| OP64 | 64-bit integer operand
| OP128 | 128-bit integer operand
| OPF32 | 32-bit floating point operand
| OPF64 | 64-bit floating point operand
| OPF80 | 80-bit floating point operand
|
| Instances | |
|
|
data Instruction |
The disassembly routines return lists of the following datatype. It
encodes both invalid byte sequences (with a useful error message, if
possible), or a valid instruction. Both variants contain the list of
opcode bytes from which the instruction was decoded and the address of
the instruction.
| Constructors | BadInstruction Word8 String Int [Word8] | Invalid instruction
| Instruction | Valid instruction
| opcode :: Opcode | Opcode of the instruction
| opsize :: InstrOperandSize | Operand size, if any
| operands :: [Operand] | Instruction operands
| address :: Int | Start address of instruction
| bytes :: [Word8] | Instruction bytes
|
|
| Instances | |
|
|
data ShowStyle |
Instructions can be displayed either in Intel or AT&T style (like in
GNU tools).
Intel style:
- Destination operand comes first, source second.
- No register or immediate prefixes.
- Memory operands are annotated with operand size.
- Hexadecimal numbers are suffixed with H and prefixed with 0 if
necessary.
AT&T style:
- Source operand comes first, destination second.
- Register names are prefixes with %.
- Immediates are prefixed with $.
- Hexadecimal numbers are prefixes with 0x
- Opcodes are suffixed with operand size, when ambiguous otherwise.
| Constructors | IntelStyle | Show in Intel style
| AttStyle | Show in AT&T style
|
|
|
|
Functions
|
|
disassembleBlock :: Ptr Word8 -> Int -> IO (Either ParseError [Instruction]) |
Disassemble a block of memory. Starting at the location
pointed to by the given pointer, the given number of bytes are
disassembled.
|
|
disassembleList :: Monad m => [Word8] -> m (Either ParseError [Instruction]) |
Disassemble the contents of the given list.
|
|
disassembleArray :: (Monad m, IArray a Word8, Ix i) => a i Word8 -> m (Either ParseError [Instruction]) |
Disassemble the contents of the given array.
|
|
showIntel :: Instruction -> [Char] |
Show an instruction in Intel style.
|
|
showAtt :: Instruction -> [Char] |
Show an instruction in AT&T style.
|
|
testFile :: FilePath -> ShowStyle -> IO () |
Test function for disassembling the contents of a binary file and
displaying it in the provided style (IntelStyle or AttStyle).
|
|
Produced by Haddock version 0.8 |