harpy-0.2: Runtime code generation for x86 machine code
Harpy.X86Disassembler
Portability: portable
Stability: provisional
Maintainer: {magr,klee}@cs.tu-berlin.de
Contents
Types
Functions
Description

Disassembler for x86 machine code.

This is a disassembler for object code for the x86 architecture. It provides functions for disassembling byte arrays, byte lists and files containing raw binary code.

Features:

  • Disassembles memory blocks, lists or arrays of bytes into lists of instructions.
  • Abstract instructions provide as much information as possible about opcodes, addressing modes or operand sizes, allowing for detailed output.
  • Provides functions for displaying instructions in Intel or AT&T style (like the GNU tools).

Differences from GNU tools such as gdb or objdump:

  • Displacements are shown in decimal, with sign if negative.

Missing:

  • LOCK and repeat prefixes are recognized, but not contained in the opcodes of instructions.
  • Support for 16-bit addressing modes. Could be added when needed.
  • Complete disassembly of all 64-bit instructions. I have tried to disassemble them properly but have been limited to the information in the docs, because I have no 64-bit machine to test on. This will probably change when I get GNU as to produce 64-bit object files.
  • Not all MMX and SSE/SSE2/SSE3 instructions are decoded yet. This is only for lack of time.
  • Segment override prefixes are decoded, but not appended to memory references.

On the implementation:

This disassembler uses the Parsec parser combinators, working on byte lists. This proved to be very convenient, as the combinators keep track of the current position, etc.
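The idea of running parser combinators over a byte list can be sketched as follows. This is not the library's decoder: it assumes the `Text.Parsec` API of the current `parsec` package, and the `Insn` type and the names `satisfyByte`, `imm32` and `decode` are hypothetical. It recognizes only three encodings (`90` NOP, `C3` RET, `B8 imm32` for a 32-bit `mov eax, imm32`) and uses the source column as the byte offset.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word8, Word32)
import Text.Parsec (Parsec, ParseError, (<|>), count, eof, getPosition, many, parse, tokenPrim)
import Text.Parsec.Pos (incSourceColumn, sourceColumn)

-- A parser over a list of bytes with no user state.
type ByteParser = Parsec [Word8] ()

-- Hypothetical toy instruction type, far smaller than the real Instruction.
data Insn = Nop | Ret | MovEaxImm Word32
  deriving (Eq, Show)

-- Accept one byte satisfying the predicate and advance the column by one,
-- so that (column - 1) is the offset of the next byte.
satisfyByte :: (Word8 -> Bool) -> ByteParser Word8
satisfyByte p = tokenPrim show next test
  where
    next pos _ _ = incSourceColumn pos 1
    test b = if p b then Just b else Nothing

anyByte :: ByteParser Word8
anyByte = satisfyByte (const True)

byte :: Word8 -> ByteParser Word8
byte = satisfyByte . (==)

-- Four bytes, least significant first (x86 is little-endian).
imm32 :: ByteParser Word32
imm32 = do
  bs <- count 4 anyByte
  return (foldr (\b acc -> (acc `shiftL` 8) .|. fromIntegral b) 0 bs)

-- One instruction paired with the offset at which it starts.
insn :: ByteParser (Int, Insn)
insn = do
  off <- subtract 1 . sourceColumn <$> getPosition
  i <- (Nop <$ byte 0x90)
         <|> (Ret <$ byte 0xC3)
         <|> (MovEaxImm <$> (byte 0xB8 *> imm32))
  return (off, i)

-- Decode a whole byte list, failing on the first unrecognized byte.
decode :: [Word8] -> Either ParseError [(Int, Insn)]
decode = parse (many insn <* eof) "<bytes>"
```

Because the combinators track the position, the offset of each instruction comes for free, which is the convenience the paragraph above refers to.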

Synopsis
data Opcode
data Operand
= OpImm Word32
| OpAddr Word32 InstrOperandSize
| OpReg String Int
| OpFPReg Int
| OpInd String InstrOperandSize
| OpIndDisp String Int InstrOperandSize
| OpBaseIndex String String Int InstrOperandSize
| OpIndexDisp String Int Int InstrOperandSize
| OpBaseIndexDisp String String Int Int InstrOperandSize
data InstrOperandSize
= OPNONE
| OP8
| OP16
| OP32
| OP64
| OP128
| OPF32
| OPF64
| OPF80
data Instruction
= BadInstruction Word8 String Int [Word8]
| Instruction {
opcode :: Opcode
opsize :: InstrOperandSize
operands :: [Operand]
address :: Int
bytes :: [Word8]
}
data ShowStyle
= IntelStyle
| AttStyle
disassembleBlock :: Ptr Word8 -> Int -> IO (Either ParseError [Instruction])
disassembleList :: Monad m => [Word8] -> m (Either ParseError [Instruction])
disassembleArray :: (Monad m, IArray a Word8, Ix i) => a i Word8 -> m (Either ParseError [Instruction])
showIntel :: Instruction -> [Char]
showAtt :: Instruction -> [Char]
testFile :: FilePath -> ShowStyle -> IO ()
Types
data Opcode
All opcodes are represented by this enumeration type.
data Operand

All operands are in one of the following locations:

  • Constants in the instruction stream
  • Memory locations
  • Registers

Memory locations are referred to by one of several addressing modes:

  • Absolute (address in instruction stream)
  • Register-indirect (address in register)
  • Register-indirect with displacement
  • Base-Index with scale
  • Base-Index with scale and displacement

Displacements can be encoded as 8- or 32-bit immediates in the instruction stream, but are represented as Int in instructions for simplicity.

Constructors
OpImm Word32 -- Immediate value
OpAddr Word32 InstrOperandSize -- Absolute address
OpReg String Int -- Register
OpFPReg Int -- Floating-point register
OpInd String InstrOperandSize -- Register-indirect
OpIndDisp String Int InstrOperandSize -- Register-indirect with displacement
OpBaseIndex String String Int InstrOperandSize -- Base plus scaled index
OpIndexDisp String Int Int InstrOperandSize -- Scaled index with displacement
OpBaseIndexDisp String String Int Int InstrOperandSize -- Base plus scaled index with displacement
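To make the addressing modes concrete, here is a minimal sketch that renders each constructor in a bracketed, Intel-like notation. The types are local copies of the declarations above so that the block compiles without harpy. The field order (base, index, scale, displacement) is an assumption inferred from the constructor descriptions, and `render` and `disp` are hypothetical helpers, not part of the module. Displacements are printed in decimal with a sign, as described under the differences from GNU tools.

```haskell
import Data.Word (Word32)

-- Local copies of the documented types.
data InstrOperandSize
  = OPNONE | OP8 | OP16 | OP32 | OP64 | OP128 | OPF32 | OPF64 | OPF80
  deriving (Eq, Show)

data Operand
  = OpImm Word32
  | OpAddr Word32 InstrOperandSize
  | OpReg String Int
  | OpFPReg Int
  | OpInd String InstrOperandSize
  | OpIndDisp String Int InstrOperandSize
  | OpBaseIndex String String Int InstrOperandSize
  | OpIndexDisp String Int Int InstrOperandSize
  | OpBaseIndexDisp String String Int Int InstrOperandSize
  deriving (Eq, Show)

-- Signed decimal displacement, e.g. "+8" or "-4".
disp :: Int -> String
disp d
  | d < 0     = show d
  | otherwise = '+' : show d

-- Hypothetical renderer. Assumed field order: base, index, scale, displacement.
-- The operand size and the Int carried by OpReg are ignored here.
render :: Operand -> String
render (OpImm i)                   = show i
render (OpAddr a _)                = "[" ++ show a ++ "]"
render (OpReg r _)                 = r
render (OpFPReg n)                 = "st(" ++ show n ++ ")"
render (OpInd r _)                 = "[" ++ r ++ "]"
render (OpIndDisp r d _)           = "[" ++ r ++ disp d ++ "]"
render (OpBaseIndex b i s _)       = "[" ++ b ++ "+" ++ i ++ "*" ++ show s ++ "]"
render (OpIndexDisp i s d _)       = "[" ++ i ++ "*" ++ show s ++ disp d ++ "]"
render (OpBaseIndexDisp b i s d _) = "[" ++ b ++ "+" ++ i ++ "*" ++ show s ++ disp d ++ "]"
```

Under these assumptions, `render (OpIndDisp "ebp" (-4) OP32)` yields `[ebp-4]`. The module's own `showIntel` and `showAtt` may format operands differently.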
data InstrOperandSize
Some opcodes can operate on data of several widths. This information is encoded in instructions using the following enumeration type.
Constructors
OPNONE -- No operand size specified
OP8 -- 8-bit integer operand
OP16 -- 16-bit integer operand
OP32 -- 32-bit integer operand
OP64 -- 64-bit integer operand
OP128 -- 128-bit integer operand
OPF32 -- 32-bit floating point operand
OPF64 -- 64-bit floating point operand
OPF80 -- 80-bit floating point operand
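As a small illustration of how client code might consume this enumeration, the sketch below maps each size to its width in bits, following the constructor descriptions above. The type is a local copy, and `sizeInBits` and `isFloatSize` are hypothetical helpers that the module does not export.

```haskell
-- Local copy of the documented enumeration.
data InstrOperandSize
  = OPNONE | OP8 | OP16 | OP32 | OP64 | OP128 | OPF32 | OPF64 | OPF80
  deriving (Eq, Show)

-- Width in bits, or Nothing when no operand size is specified.
sizeInBits :: InstrOperandSize -> Maybe Int
sizeInBits OPNONE = Nothing
sizeInBits OP8    = Just 8
sizeInBits OP16   = Just 16
sizeInBits OP32   = Just 32
sizeInBits OP64   = Just 64
sizeInBits OP128  = Just 128
sizeInBits OPF32  = Just 32
sizeInBits OPF64  = Just 64
sizeInBits OPF80  = Just 80

-- True for the floating point sizes.
isFloatSize :: InstrOperandSize -> Bool
isFloatSize s = s `elem` [OPF32, OPF64, OPF80]
```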
data Instruction
The disassembly routines return lists of the following datatype. Each element is either an invalid byte sequence (with a useful error message, if possible) or a valid instruction. Both variants contain the list of opcode bytes from which the instruction was decoded and the address of the instruction.
Constructors
BadInstruction Word8 String Int [Word8] -- Invalid instruction
Instruction -- Valid instruction
  opcode :: Opcode -- Opcode of the instruction
  opsize :: InstrOperandSize -- Operand size, if any
  operands :: [Operand] -- Instruction operands
  address :: Int -- Start address of instruction
  bytes :: [Word8] -- Instruction bytes
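The record fields `address` and `bytes` exist only on the `Instruction` constructor, so applying them to a `BadInstruction` would fail at run time. A sketch of total accessors and a hex-dump line follows. The types are trimmed local copies: `Opcode` is a stand-in because its constructors are not listed here. It is also an assumption that the `Int` and `[Word8]` fields of `BadInstruction` are the address and the bytes. `instrAddress`, `instrBytes` and `dumpLine` are hypothetical names.

```haskell
import Data.Word (Word8)
import Numeric (showHex)

-- Trimmed local stand-ins for the documented types.
newtype Opcode = Opcode String deriving (Eq, Show)
data InstrOperandSize = OPNONE | OP8 | OP16 | OP32 deriving (Eq, Show)
data Operand = OpReg String Int deriving (Eq, Show)

data Instruction
  = BadInstruction Word8 String Int [Word8]
  | Instruction
      { opcode   :: Opcode
      , opsize   :: InstrOperandSize
      , operands :: [Operand]
      , address  :: Int
      , bytes    :: [Word8]
      }
  deriving (Eq, Show)

-- Total accessors that work for both variants.
-- Assumption: in BadInstruction the Int is the address, the list the bytes.
instrAddress :: Instruction -> Int
instrAddress (BadInstruction _ _ addr _) = addr
instrAddress i                           = address i

instrBytes :: Instruction -> [Word8]
instrBytes (BadInstruction _ _ _ bs) = bs
instrBytes i                         = bytes i

-- Two lowercase hex digits per byte.
hexByte :: Word8 -> String
hexByte b = let s = showHex b "" in if length s < 2 then '0' : s else s

-- Eight-digit address (assumed non-negative) followed by the raw bytes.
dumpLine :: Instruction -> String
dumpLine i =
  pad 8 (showHex (instrAddress i) "") ++ "  " ++ unwords (map hexByte (instrBytes i))
  where
    pad n s = replicate (n - length s) '0' ++ s
```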
data ShowStyle

Instructions can be displayed either in Intel or AT&T style (like in GNU tools).

Intel style:

  • Destination operand comes first, source second.
  • No register or immediate prefixes.
  • Memory operands are annotated with operand size.
  • Hexadecimal numbers are suffixed with H and prefixed with 0 if necessary.

AT&T style:

  • Source operand comes first, destination second.
  • Register names are prefixed with %.
  • Immediates are prefixed with $.
  • Hexadecimal numbers are prefixed with 0x.
  • Opcodes are suffixed with the operand size when they would otherwise be ambiguous.
Constructors
IntelStyleShow in Intel style
AttStyleShow in AT&T style
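The two sets of conventions can be illustrated with a short sketch that formats a register-immediate move in both styles. It is not the module's implementation: `ShowStyle` is a local copy, and `hexIntel`, `hexAtt` and `movImm` are hypothetical helpers. Uppercase digits in Intel style and the `", "` operand separator are assumptions.

```haskell
import Data.Char (isDigit, toUpper)
import Data.Word (Word32)
import Numeric (showHex)

-- Local copy of the documented type.
data ShowStyle = IntelStyle | AttStyle
  deriving (Eq, Show)

-- Intel style: suffix H, and a leading 0 when the first digit is a letter.
hexIntel :: Word32 -> String
hexIntel n = lead ++ digits ++ "H"
  where
    digits = map toUpper (showHex n "")
    lead = case digits of
      (c : _) | not (isDigit c) -> "0"
      _                         -> ""

-- AT&T style: prefix 0x.
hexAtt :: Word32 -> String
hexAtt n = "0x" ++ showHex n ""

-- Format "move immediate into register" in the requested style.
-- Intel puts the destination first; AT&T puts the source first and
-- adds the % and $ prefixes. No size suffix is needed because the
-- register already determines the operand size.
movImm :: ShowStyle -> String -> Word32 -> String
movImm IntelStyle reg n = "mov " ++ reg ++ ", " ++ hexIntel n
movImm AttStyle   reg n = "mov $" ++ hexAtt n ++ ", %" ++ reg
```

For example, `movImm IntelStyle "eax" 255` gives `mov eax, 0FFH`, while `movImm AttStyle "eax" 255` gives `mov $0xff, %eax`.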
Functions
disassembleBlock :: Ptr Word8 -> Int -> IO (Either ParseError [Instruction])
Disassemble a block of memory. Starting at the location pointed to by the given pointer, the given number of bytes are disassembled.
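A pointer and length of the required shape can be obtained from a byte list with `withArrayLen` from `Foreign.Marshal.Array`. The sketch below wraps that call in a hypothetical helper, `withCodeBuffer`, which takes the consumer as a parameter so that the block compiles without harpy. `readBack` is only a stand-in consumer used to exercise it.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Array (peekArray, withArrayLen)
import Foreign.Ptr (Ptr)

-- Copy the bytes into a temporary buffer and hand pointer and length
-- to the continuation. The buffer is only valid inside the continuation.
withCodeBuffer :: [Word8] -> (Ptr Word8 -> Int -> IO r) -> IO r
withCodeBuffer code k = withArrayLen code (\len ptr -> k ptr len)

-- Stand-in consumer: read the bytes back out of the buffer.
readBack :: Ptr Word8 -> Int -> IO [Word8]
readBack ptr len = peekArray len ptr
```

With `Harpy.X86Disassembler` imported, `withCodeBuffer code disassembleBlock` should typecheck against the signature above. Since the buffer is released when the continuation returns, the result should be fully consumed or forced if there is any doubt about laziness.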
disassembleList :: Monad m => [Word8] -> m (Either ParseError [Instruction])
Disassemble the contents of the given list.
disassembleArray :: (Monad m, IArray a Word8, Ix i) => a i Word8 -> m (Either ParseError [Instruction])
Disassemble the contents of the given array.
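Any immutable array type with `Word8` elements fits the constraint `(IArray a Word8, Ix i)`. As one possibility, the sketch below builds a zero-based `UArray Int Word8` from a byte list using the `array` package. `codeArray` is a hypothetical helper name.

```haskell
import Data.Array.Unboxed (UArray, bounds, elems, listArray)
import Data.Word (Word8)

-- Pack a byte list into an unboxed array indexed from 0.
-- An empty list gives the empty bounds (0, -1).
codeArray :: [Word8] -> UArray Int Word8
codeArray bs = listArray (0, length bs - 1) bs
```

A value built this way has a type that matches the argument of `disassembleArray`.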
showIntel :: Instruction -> [Char]
Show an instruction in Intel style.
showAtt :: Instruction -> [Char]
Show an instruction in AT&T style.
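The disassembly functions return an `Either`, so callers have to handle the failure case before showing anything. A minimal sketch of that step follows, written polymorphically so that it compiles without harpy. `report` is a hypothetical helper that takes the rendering function (for instance `showIntel` or `showAtt`) as an argument.

```haskell
-- Turn a disassembly result into printable lines: one line per
-- instruction on success, a single error line on failure.
report :: Show e => (i -> String) -> Either e [i] -> [String]
report _      (Left err)    = ["disassembly failed: " ++ show err]
report render (Right insns) = map render insns
```

In `IO`, and with the module imported, a call might look like `disassembleList code >>= mapM_ putStrLn . report showIntel`. This relies on the `Show` instance that Parsec provides for `ParseError`.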
testFile :: FilePath -> ShowStyle -> IO ()
Test function for disassembling the contents of a binary file and displaying it in the provided style (IntelStyle or AttStyle).
Produced by Haddock version 0.8