Copyright | (c) Martin Grabmueller and Dirk Kleeblatt |
---|---|
License | BSD3 |
Maintainer | martin@grabmueller.de,klee@cs.tu-berlin.de |
Stability | provisional |
Portability | portable |
Safe Haskell | None |
Language | Haskell98 |
Text.Disassembler.X86Disassembler
Description
Disassembler for x86 machine code.
This is a disassembler for object code for the x86 architecture. It provides functions for disassembling byte arrays, byte lists and memory blocks containing raw binary code.
Features:
- Disassembles memory blocks, lists or arrays of bytes into lists of instructions.
- Abstract instructions provide as much information as possible about opcodes, addressing modes or operand sizes, allowing for detailed output.
- Provides functions for displaying instructions in Intel or AT&T style (like the GNU tools)
Differences to GNU tools, like gdb or objdump:
- Displacements are shown in decimal, with sign if negative.
Missing:
- LOCK and repeat prefixes are recognized, but not contained in the opcodes of instructions.
- Support for 16-bit addressing modes. Could be added when needed.
- Complete disassembly of all 64-bit instructions. I have tried to disassemble them properly but have been limited to the information in the docs, because I have no 64-bit machine to test on. This will probably change when I get GNU as to produce 64-bit object files.
- Not all MMX and SSESSE2SSE3 instructions are decoded yet. This is just a matter of missing time.
- segment override prefixes are decoded, but not appended to memory references
On the implementation:
This disassembler uses the Parsec parser combinators, working on byte lists. This proved to be very convenient, as the combinators keep track of the current position, etc.
- data Opcode
- data Operand
- data InstrOperandSize
- data Instruction
- = BadInstruction Word8 String Int [Word8]
- | PseudoInstruction Int String
- | Instruction { }
- data ShowStyle
- data Config = Config {
- confDefaultBitMode :: OperandSize
- confOperandBitMode :: OperandSize
- confAddressBitMode :: OperandSize
- confIn64BitMode :: Bool
- confStartAddr :: Word32
- disassembleBlock :: Ptr Word8 -> Int -> IO (Either ParseError [Instruction])
- disassembleList :: Monad m => [Word8] -> m (Either ParseError [Instruction])
- disassembleArray :: (Monad m, IArray a Word8, Ix i) => a i Word8 -> m (Either ParseError [Instruction])
- disassembleFile :: FilePath -> IO (Either ParseError [Instruction])
- disassembleBlockWithConfig :: Config -> Ptr Word8 -> Int -> IO (Either ParseError [Instruction])
- disassembleListWithConfig :: Monad m => Config -> [Word8] -> m (Either ParseError [Instruction])
- disassembleArrayWithConfig :: (Monad m, IArray a Word8, Ix i) => Config -> a i Word8 -> m (Either ParseError [Instruction])
- disassembleFileWithConfig :: Config -> FilePath -> IO (Either ParseError [Instruction])
- showIntel :: Instruction -> [Char]
- showAtt :: Instruction -> [Char]
- defaultConfig :: Config
Types
All opcodes are represented by this enumeration type.
All operands are in one of the following locations:
- Constants in the instruction stream
- Memory locations
- Registers
Memory locations are referred to by on of several addressing modes:
- Absolute (address in instruction stream)
- Register-indirect (address in register)
- Register-indirect with displacement
- Base-Index with scale
- Base-Index with scale and displacement
Displacements can be encoded as 8 or 32-bit immediates in the instruction stream, but are encoded as Int in instructions for simplicity.
Constructors
OpImm Word32 | Immediate value |
OpAddr Word32 InstrOperandSize | Absolute address |
OpReg String Int | Register |
OpFPReg Int | Floating-point register |
OpInd String InstrOperandSize | Register-indirect |
OpIndDisp String Int InstrOperandSize | Register-indirect with displacement |
OpBaseIndex String String Int InstrOperandSize | Base plus scaled index |
OpIndexDisp String Int Int InstrOperandSize | Scaled index with displacement |
OpBaseIndexDisp String String Int Int InstrOperandSize | Base plus scaled index with displacement |
data InstrOperandSize Source
Some opcodes can operate on data of several widths. This information is encoded in instructions using the following enumeration type..
Constructors
OPNONE | No operand size specified |
OP8 | 8-bit integer operand |
OP16 | 16-bit integer operand |
OP32 | 32-bit integer operand |
OP64 | 64-bit integer operand |
OP128 | 128-bit integer operand |
OPF32 | 32-bit floating point operand |
OPF64 | 64-bit floating point operand |
OPF80 | 80-bit floating point operand |
Instances
data Instruction Source
The disassembly routines return lists of the following datatype. It encodes both invalid byte sequences (with a useful error message, if possible), or a valid instruction. Both variants contain the list of opcode bytes from which the instruction was decoded and the address of the instruction.
Constructors
BadInstruction Word8 String Int [Word8] | Invalid instruction |
PseudoInstruction Int String | Pseudo instruction, e.g. label |
Instruction | Valid instruction |
Instances
Instructions can be displayed either in Intel or AT&T style (like in GNU tools).
Intel style:
- Destination operand comes first, source second.
- No register or immediate prefixes.
- Memory operands are annotated with operand size.
- Hexadecimal numbers are suffixed with
H
and prefixed with0
if necessary.
AT&T style:
- Source operand comes first, destination second.
- Register names are prefixes with
%
. - Immediates are prefixed with
$
. - Hexadecimal numbers are prefixes with
0x
- Opcodes are suffixed with operand size, when ambiguous otherwise.
Constructors
IntelStyle | Show in Intel style |
AttStyle | Show in AT&T style |
Constructors
Config | |
Fields
|
Functions
disassembleBlock :: Ptr Word8 -> Int -> IO (Either ParseError [Instruction]) Source
Disassemble a block of memory. Starting at the location pointed to by the given pointer, the given number of bytes are disassembled.
disassembleList :: Monad m => [Word8] -> m (Either ParseError [Instruction]) Source
Disassemble the contents of the given list.
disassembleArray :: (Monad m, IArray a Word8, Ix i) => a i Word8 -> m (Either ParseError [Instruction]) Source
Disassemble the contents of the given array.
disassembleFile :: FilePath -> IO (Either ParseError [Instruction]) Source
disassembleBlockWithConfig :: Config -> Ptr Word8 -> Int -> IO (Either ParseError [Instruction]) Source
disassembleListWithConfig :: Monad m => Config -> [Word8] -> m (Either ParseError [Instruction]) Source
disassembleArrayWithConfig :: (Monad m, IArray a Word8, Ix i) => Config -> a i Word8 -> m (Either ParseError [Instruction]) Source
disassembleFileWithConfig :: Config -> FilePath -> IO (Either ParseError [Instruction]) Source
showIntel :: Instruction -> [Char] Source
Show an instruction in Intel style.
showAtt :: Instruction -> [Char] Source
Show an instruction in AT&T style.