llvm-analysis-0.3.0: A Haskell library for analyzing LLVM bitcode

Safe Haskell: None




This module defines an abstraction over field accesses of structures called AccessPaths. A concrete access path is rooted at a value, while an abstract access path is rooted at a type. Both include a list of AccessTypes that denote dereferences of pointers, field accesses, and array references.



data AccessPath




accessPathBaseValue :: Value
accessPathBaseType :: Type

If bitcasts are in play, this type records the real type of this path, even if the base was an unrelated value that was bitcast. The real type is the type that was cast to.

accessPathEndType :: Type
accessPathTaggedComponents :: [(Type, AccessType)]

data AccessType


AccessField !Int

Field access of the field with this index


A union access. The union discriminator is the type that this AccessType is tagged with in the AccessPath. Unions in LLVM do not have an explicit representation of their fields, so there is no index possible here.


An array access; all array elements are treated as a unit


A plain pointer dereference
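To make the shape of an access path concrete, the following is a minimal, self-contained sketch. It does not use the library's definitions: `Ty`, `Step`, `step`, and `endType` are hypothetical stand-ins invented for illustration, union accesses and the per-component type tags are left out, and the real Type and AccessType are richer than this.

```haskell
import Control.Monad (foldM)

-- Hypothetical stand-in for LLVM types (not the library's Type).
data Ty
  = TInt
  | TPtr Ty
  | TArray Ty
  | TStruct String [Ty]   -- struct name and field types, in order
  deriving (Eq, Show)

-- Hypothetical stand-in for AccessType (union accesses omitted).
data Step
  = Field Int   -- field access by index
  | Array       -- array access; all elements treated as a unit
  | Deref       -- plain pointer dereference
  deriving (Eq, Show)

-- Apply one step to a type, yielding the type reached if the step fits.
step :: Ty -> Step -> Maybe Ty
step (TPtr t) Deref = Just t
step (TArray t) Array = Just t
step (TStruct _ fs) (Field i)
  | i >= 0 && i < length fs = Just (fs !! i)
step _ _ = Nothing

-- Follow every step from a base type to the end type of the path.
endType :: Ty -> [Step] -> Maybe Ty
endType = foldM step

-- A struct with an int field and an int-array field.
point :: Ty
point = TStruct "struct.point" [TInt, TArray TInt]

-- Dereference a pointer to the struct, select field 1, then index the array.
example :: Maybe Ty
example = endType (TPtr point) [Deref, Field 1, Array]
```

In this model an abstract path is just a base type plus a list of steps; a concrete path would carry a base value as well.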


accessPath :: Failure AccessPathError m => Instruction -> m AccessPath

For Store, RMW, and CmpXchg instructions, the returned access path describes the field stored to. For Load instructions, the returned access path describes the field loaded. For GetElementPtrInsts, the returned access path describes the field whose address was taken/computed.
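The per-instruction rule above can be illustrated with a toy model. Everything in this sketch is hypothetical: `Inst`, `Operand`, and `describedAddress` are not part of the library, `Either String` stands in for the `Failure AccessPathError` constraint, and it is only an assumption that other instructions are rejected.

```haskell
-- Hypothetical operand names, e.g. "%p".
type Operand = String

-- Hypothetical, heavily simplified instruction model
-- (not the library's Instruction type).
data Inst
  = Load Operand                          -- address loaded from
  | Store Operand Operand                 -- value, address stored to
  | AtomicRMW Operand Operand             -- address, value
  | CmpXchg Operand Operand Operand       -- address, expected, new
  | GetElementPtr Operand Operand [Int]   -- result, base, indices
  | Add Operand Operand                   -- no memory access involved
  deriving (Eq, Show)

-- Pick the address whose access path would be described.
describedAddress :: Inst -> Either String Operand
describedAddress inst = case inst of
  Load addr -> Right addr                     -- the field loaded
  Store _ addr -> Right addr                  -- the field stored to
  AtomicRMW addr _ -> Right addr              -- the field updated
  CmpXchg addr _ _ -> Right addr              -- the field compared and swapped
  GetElementPtr result _ _ -> Right result    -- the address computed
  Add _ _ -> Left "instruction does not access memory"
```

For example, `describedAddress (Store "%v" "%p")` selects `"%p"`, the destination of the store, not the value being stored.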

reduceAccessPath :: Failure AccessPathError m => AbstractAccessPath -> m AbstractAccessPath

If the access path has more than one field access component, take the first field access and the base type to compute a new base type (the type of the indicated field) and the rest of the path components. Also allows for the discarding of array accesses.

Each call reduces the access path by one component.
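A single reduction step can be sketched on a simplified model. This is not the library's implementation: `Ty`, `Step`, `Path`, and `reduce` are hypothetical names, `Either String` stands in for `Failure AccessPathError`, and the exact conditions under which the real function fails are assumptions.

```haskell
-- Hypothetical stand-ins for the library's types.
data Ty
  = TInt
  | TArray Ty
  | TStruct String [Ty]   -- struct name and field types, in order
  deriving (Eq, Show)

data Step = Field Int | Array
  deriving (Eq, Show)

-- An abstract path: a base type plus the remaining components.
data Path = Path { pathBase :: Ty, pathSteps :: [Step] }
  deriving (Eq, Show)

isField :: Step -> Bool
isField (Field _) = True
isField _ = False

-- Drop the leading component and rebase the path at the type it reaches.
reduce :: Path -> Either String Path
reduce (Path (TStruct _ fs) (Field i : rest))
  | i < 0 || i >= length fs = Left "field index out of range"
  | not (any isField rest) = Left "no further field access to reduce to"
  | otherwise = Right (Path (fs !! i) rest)
-- A leading array access is discarded; the element type becomes the base.
reduce (Path (TArray t) (Array : rest)) = Right (Path t rest)
reduce _ = Left "cannot reduce this path"

inner, outer :: Ty
inner = TStruct "struct.inner" [TInt, TInt]
outer = TStruct "struct.outer" [TInt, inner]

-- Path "outer.1.0" reduces to "inner.0".
reduced :: Either String Path
reduced = reduce (Path outer [Field 1, Field 0])
```

Here `reduced` is rooted at `inner` with the single component `Field 0`; reducing it again fails in this sketch because only one field access remains.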

externalizeAccessPath :: Failure AccessPathError m => AbstractAccessPath -> m (String, [AccessType])

Convert an AbstractAccessPath to a format that can be written to disk and read back into another process. The format is a pair. The first element is the base name of the structure field being accessed, with the struct. prefix stripped off and any numeric suffix (which is added by LLVM) chopped off. The second element is the actual list of AccessTypes, which is preserved unchanged.

The struct name mangling here assumes that the types exposed via the access path abstraction have the same definition in all compilation units. Guaranteeing this between runs is essentially impossible, but in practice it is almost always the case.
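The name normalization described above can be sketched as follows. The helper names `dropStructPrefix`, `dropNumericSuffix`, and `externalName` are hypothetical and not part of the library, and the sketch assumes the suffix has the form of a dot followed by digits (as in `struct.foo.12`) and that at most one such suffix is removed.

```haskell
import Data.Char (isDigit)
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)

-- Remove a leading "struct." if it is present.
dropStructPrefix :: String -> String
dropStructPrefix s = fromMaybe s (stripPrefix "struct." s)

-- Remove one trailing ".<digits>" suffix, if the name ends with one.
dropNumericSuffix :: String -> String
dropNumericSuffix s =
  case span isDigit (reverse s) of
    (_ : _, '.' : rest) -> reverse rest   -- at least one digit, then a dot
    _ -> s                                -- no numeric suffix; keep as-is

-- Normalize a struct type name for writing to disk.
externalName :: String -> String
externalName = dropStructPrefix . dropNumericSuffix
```

With these helpers, `externalName "struct.list_node.42"` and `externalName "struct.list_node"` both give `"list_node"`, which is why two compilation units that define the same struct can be matched up by name.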