 | harpy-0.2: Runtime code generation for x86 machine code | Contents | Index |
|
| Harpy.X86Disassembler | | Portability | portable | | Stability | provisional | | Maintainer | {magr,klee}@cs.tu-berlin.de |
|
|
|
|
|
| Description |
Disassembler for x86 machine code.
This is a disassembler for object code for the x86 architecture. It
provides functions for disassembling byte arrays, byte lists and files
containing raw binary code.
Features:
- Disassembles memory blocks, lists or arrays of bytes into lists of
instructions.
- Abstract instructions provide as much information as possible about
opcodes, addressing modes or operand sizes, allowing for detailed
output.
- Provides functions for displaying instructions in Intel or AT&T
style (like the GNU tools)
Differences to GNU tools, like gdb or objdump:
- Displacements are shown in decimal, with sign if negative.
Missing:
- LOCK and repeat prefixes are recognized, but not contained in the
opcodes of instructions.
- Support for 16-bit addressing modes. Could be added when needed.
- Complete disassembly of all 64-bit instructions. I have tried to
disassemble them properly but have been limited to the information
in the docs, because I have no 64-bit machine to test on. This will
probably change when I get GNU as to produce 64-bit object files.
- Not all MMX and SSESSE2SSE3 instructions are decoded yet. This is
just a matter of missing time.
- segment override prefixes are decoded, but not appended to memory
references
On the implementation:
This disassembler uses the Parsec parser combinators, working on byte
lists. This proved to be very convenient, as the combinators keep
track of the current position, etc.
|
|
| Synopsis |
|
|
|
|
| Types
|
|
| data Opcode |
| All opcodes are represented by this enumeration type.
| Instances | |
|
|
| data Operand |
All operands are in one of the following locations:
- Constants in the instruction stream
- Memory locations
- Registers
Memory locations are referred to by on of several addressing modes:
- Absolute (address in instruction stream)
- Register-indirect (address in register)
- Register-indirect with displacement
- Base-Index with scale
- Base-Index with scale and displacement
Displacements can be encoded as 8 or 32-bit immediates in the
instruction stream, but are encoded as Int in instructions for
simplicity.
| | Constructors | |
|
|
| data InstrOperandSize |
| Some opcodes can operate on data of several widths. This information
is encoded in instructions using the following enumeration type..
| | Constructors | | OPNONE | No operand size specified
| | OP8 | 8-bit integer operand
| | OP16 | 16-bit integer operand
| | OP32 | 32-bit integer operand
| | OP64 | 64-bit integer operand
| | OP128 | 128-bit integer operand
| | OPF32 | 32-bit floating point operand
| | OPF64 | 64-bit floating point operand
| | OPF80 | 80-bit floating point operand
|
| Instances | |
|
|
| data Instruction |
| The disassembly routines return lists of the following datatype. It
encodes both invalid byte sequences (with a useful error message, if
possible), or a valid instruction. Both variants contain the list of
opcode bytes from which the instruction was decoded and the address of
the instruction.
| | Constructors | | BadInstruction Word8 String Int [Word8] | Invalid instruction
| | Instruction | Valid instruction
| | opcode :: Opcode | Opcode of the instruction
| | opsize :: InstrOperandSize | Operand size, if any
| | operands :: [Operand] | Instruction operands
| | address :: Int | Start address of instruction
| | bytes :: [Word8] | Instruction bytes
|
|
| Instances | |
|
|
| data ShowStyle |
Instructions can be displayed either in Intel or AT&T style (like in
GNU tools).
Intel style:
- Destination operand comes first, source second.
- No register or immediate prefixes.
- Memory operands are annotated with operand size.
- Hexadecimal numbers are suffixed with H and prefixed with 0 if
necessary.
AT&T style:
- Source operand comes first, destination second.
- Register names are prefixes with %.
- Immediates are prefixed with $.
- Hexadecimal numbers are prefixes with 0x
- Opcodes are suffixed with operand size, when ambiguous otherwise.
| | Constructors | | IntelStyle | Show in Intel style
| | AttStyle | Show in AT&T style
|
|
|
|
| Functions
|
|
| disassembleBlock :: Ptr Word8 -> Int -> IO (Either ParseError [Instruction]) |
| Disassemble a block of memory. Starting at the location
pointed to by the given pointer, the given number of bytes are
disassembled.
|
|
| disassembleList :: Monad m => [Word8] -> m (Either ParseError [Instruction]) |
| Disassemble the contents of the given list.
|
|
| disassembleArray :: (Monad m, IArray a Word8, Ix i) => a i Word8 -> m (Either ParseError [Instruction]) |
| Disassemble the contents of the given array.
|
|
| showIntel :: Instruction -> [Char] |
| Show an instruction in Intel style.
|
|
| showAtt :: Instruction -> [Char] |
| Show an instruction in AT&T style.
|
|
| testFile :: FilePath -> ShowStyle -> IO () |
| Test function for disassembling the contents of a binary file and
displaying it in the provided style (IntelStyle or AttStyle).
|
|
| Produced by Haddock version 0.8 |