|Version 4 (modified by simonmar, 6 years ago)|
In GHC we "tag" pointers to heap objects with information about the object they point to. The tag goes in the low 2 bits of the pointer, which would normally be zero since heap objects are always word-aligned (3 bits on a 64-bit platform).
The way the tag bits are used depends on the type of object pointed to:
- If the object is a constructor, the tag bits contain the constructor tag, if the number of constructors in the datatype is less than 4 (less than 8 on a 64-bit platform). If the number of constructors in the datatype is more than 4 (resp 8), then the tag bits have the value 1.
- If the object is a function, the tag bits contain the arity of the function, if the arity fits in the tag bits.
- For a pointer to any other object, the tag bits are always zero.
Pointer tagging is optional: if all the tags were zero, everything would still work. The presence of tag bits enables certain optimisations, however:
- In a case-expression, if the variable being scrutinised has non-zero tag bits, then we know that it points directly to a constructor and we can avoid entering it to evaluate it. Furthermore, for datatypes with only a few constructors, the tag bits will tell us which constructor it is, eliminating a further memory load to extract the constructor tag from the info table.
- In a generic apply, if the function being applied has a tag value that indicates it has exactly the right arity for the number of arguments being applied, we can jump directly to the function, instead of inspecting its info table first.
Pointer-tagging is a fairly significant optimisation: we measured 10-14% depending on platform. A large proportion of this comes from eliminating the indirect jumps in a case expression, which are hard to predict by branch-prediction. The paper has full results and analysis.
The garbage collector maintains tag bits on the pointers it traverses; additionally when it eliminates an indirection it takes the tag bits from the pointer inside the indirection.
In the current implementation, all pointers are guaranteed to be tagged, except for pointers to static constructors or functions. We cannot guarantee to correctly tag a reference to a static closure, because the compiler does not necessarily know what the tag value should be if the static closure resides in another module.
When optimisation is on, we do know the arities of external functions, and this information is indeed used to tag pointers to imported functions, but when optimisations is off we do not have this information. For constructors, the interface doesn't contain information about the constructor tag, except that there may be an unfolding, but the unfolding is not necessarily reliable (the unfolding may be a constructor application, but in reality the closure may be a CAF, e.g. if any of the fields are references outside the current shared library).
Do we ever assume that a pointer is tagged? Yes, in the following places:
- In the continuation of an algebraic case, R1 is assumed tagged
- On entry to a non-top-level function, R1 is assumed tagged
These assumptions make the code faster: when extracting values from the closure pointed to be R1, we just subtract the (known) tag from the offset.
So each place that enters one of these code fragments must ensure that it is correctly tagging R1. Here are the cases for an algebraic case alternative:
- the scrutinee of the case jumps directly to the alternative, if R1 is already tagged
- the constructor entry code returns to an alternative. This code adds the correct tag.
- if the case alternative fails a heap or stack check, then the RTS will re-enter the alternative after GC. In this case, our re-entry arranges to enter the constructor, so we get the correct tag by virtue of going through the constructor entry code.
Here are the cases for a function:
- we can assume that pointers to non-top-level functions are always tagged, so entering directly is safe.
- unknown function application goes via stg_ap_XXX (see [wiki:Commentary/Rts/HaskellExecution/FunctionCalls#GenericApply Generic Apply). The generic apply functions must therefore arrange to correctly tag R1 before entering the function.
Compacting GC also needs to tag pointers, because it needs to distinguish between a heap pointer and an info pointer quickly. Unfortunately, this means we lose one tag value (not tag bit) in our space of tags. Therefore we have 3 tag values (including 0) available on a 32-bit machine, and 7 on a 64-bit machine.
Dealing with tags in the code
Every time we dereference a pointer to a heap object, we must first zero the tag bits. In the RTS, this is done with the macro UNTAG_CLOSURE(); in .cmm code this is done with the UNTAG() macro. Surprisingly few places needed untagging to be added.