|Version 64 (modified by ezyang, 2 years ago)|
GHC's glorious new code generator
Last updated: March 2010.
This page summarises work that Norman Ramsey, Simon M, Simon PJ, and John Dias are doing on re-architecting GHC's back end. Here is the state of play; see also work on the LLVM back end.
- John D has built a complete new codegen pipeline, running alongside the old one, enabled by -fuse-new-codegen. It is described here: Commentary/Compiler/NewCodeGenPipeline. It uses a new representation for Cmm, mostly with "Z" in the name. (Let's call the original Cmm OldCmm and this new one CmmZ.) It has a new conversion STG->CmmZ, and then sequence of passes that optimise and cps-convert the Cmm. Finally, it is converted back to the old Cmm so that it can flow to the old code generators.
- Compiling through the new pipeline passes all tests and GHC is bootstrappable. There is a patch pending for a pointer tagging bug that manifests under rare conditions with the compacting GC.
- Separately, we have developed yet another, and still better, Cmm representation, the subject of an upcoming ICFP 2010 submission. It uses phantom types and GADTs to add very useful open/closed invariants. This isn't in GHC at all yet. I'll call it CmmGADT for easy reference.
Generally we want to keep old and new pipelines working simultaneously, so that we can swith only when we are sure the new stuff works. Next steps in this grand plan are:
- Check the impact on compilation time of the new route.
- Finalise CmmGADT and make the new pipeline use it.
- Make the Cmm parser (which parses .cmm files from the RTS) produce CmmGADT, and push that down the new pipeline.
- Implement the many refactorings and improvements to the new pipeline described in http://darcs.haskell.org/ghc/compiler/cmm/cmm-notes.
- Instead of converting new Cmm to old Cmm, make the downstream code generators consume CmmGADT, and convert old Cmm to CmmGADT.
- Expand the capability of the new pipeline so that it does native code generation too, and we can ultimately discard the existing code generators. The design of this stage is here: Commentary/Compiler/IntegratedCodeGen
Workflow for the new code generator and Hoopl
We have the following repositories:
- HEAD: the main GHC Darcs repo. http://darcs.haskell.org/ghc
- HooplMaster: the master Hoopl Git repository.
(Physical location: linux.cs.tufts.edu:/r/ghc/www/hoopl/hoopl.git)
- HooplMirror: a Darcs mirror of the HooplMaster repo. Any push
to HooplMaster is automatically mirrored to HooplMirror. Do not (ever)
push to HooplMirror.
(Physical location: 'linux.cs.tufts.edu:/r/ghc/darcs-mirrors/hoopl')
- HooplLag: a Darcs repo that is guaranteed to work with GHC HEAD. It is
not automatically updated by pushes to HooplMaster. Instead a manual
process (below) updates it; hence "lag".
Normal GHC developers, who are uninterested in Hoopl, ignore all this. If they download HEAD, and then do darcs-all get they'll get HooplLag, which is always guaranteed to work with HEAD.
Developers who work on GHC and also need to modify Hoopl have a more torrid time.
- Create two trees:
- Development tree (mixed Darcs and Git):
- Download HEAD and darcs-all get
- Remove libraries/hoopl and do a git clone of HooplMaster instead.
- Validation tree (pure Darcs):
- Download HEAD and darcs-all get
- Development tree (mixed Darcs and Git):
- Hack away in the development tree. You can pull freely:
- Subsequent darcs-all pull operations will [check!] no-op in the libraries/hoopl directory.
- For hoopl you have to manually do git pull in `libraries/hoopl to pull patches from HooplMaster.
- Record Darcs patches and Hoopl patches.
- Validate changes as follows:
- Run validate in the development tree
- git push patches to HooplMaster
- Wait for HooplMirror to see these patches
- In the validation tree,
- darcs-all pull patches from the development tree (this will [check!] no-op for hoopl, because the dev-tree has a Git repo)
- Manually pull patches from HooplMirror
- Validate in the validation tree
- Push the validation tree to HEAD and the HooplLag Darcs repo
Bug list (code-gen related bugs that we may be able to fix):
- #1498 (avoid redundant heap check on the fast path)
- #3552 (unreachable code)
- #3462 (a feature)
Below here is old stuff, mostly out of date
Loose ends as at October 2008
- Check nofib performance regressions (one or two allocation (odd)); several on time.
- Compile libraries and test
- Nuke CmmCPSGen, CmmCPS, and the two Maybes in CmmInfo data type.
- Remove optionallyConvertAndOrCPS from main pipeline.
- Fix if-then-else special case in StgCmmExpr
- Perhaps some obvious CSE opportunities
- We only use one GC entry point, but there should be a bunch of canned ones.
- To Convention add StaticNative (which does not use R1), and use it appropriately. Replace GC by gc = Native.
- Allow .cmx files to come in before the new pipeline, to aid graceful migration of RTS.
- Migrate RTS to use procedures
- Explore new calling conventions
- Get rid of SRTs and live-variable info from STG, and from the Core-to-Stg phase.
- Do not split proc-points into separate CmmProc. Not a trivial change, because it involves attaching info tables to blocks, not just to CmmProcs.
- Nuke old code gen, and associated Cmm cruft
- Simplify dataflow framework, by doing deep rewriting only. If possible kill LastExit (Simon's favourite hate).
Notes about the state of play in August, 2008
These notes are largely out of date, but I don't want to dump them till we're sure that we've sucked all the juice out of them.
- Code generator: first draft done.
- Control-flow opt: simple ones done
- Common block elimination: done
- Block concatenation: done
- Adams optimisation: currently done in compiler/cmm/CmmProcPointZ.hs. The Adams optimization should be divorced from this module and replaced with common-block elimination, to be done after the proc-point transformation. In principle this combination may be slightly less effective than the current code, since the selection of proc-point protocols is guided by Adams's criteria, but NR thinks it will be easy to get the common, important cases nailed. Note that the Adams optimization is designed to avoid generating extra continuations in the case of heap checks after a function call.
- Proc-point analysis and transformation: working, although there is plenty of room for experimentation with the calling conventions at proc points. In practice NR recommends the following procedure:
- All optional proc points to be generated with no parameters (all live variables on the stack)
- This situation to be remedied when the code generator is reorganized along the lines NR proposed in July 2007, i.e., the register allocator runs on C-- with calls (as opposed to C-- with jumps only) and therefore before proc-point analysis
- Bypassing proc-point analysis for native back ends: not done.
- Add spill/reload: Implemented to NR's satisfaction in compiler/cmm/CmmSpillReload.hs, with the proviso that spilling is done to abstract stack slots rather than real stack positions (see comments below on stack-slot allocation)
- Stack slot allocation: implemented with a greedy algorithm. There is room for improvement here.
- Make stack explicit: done.
- Split into multiple CmmProcs: done.
ToDo: main issues
- How do we write continuations in the RTS? E.g. the update-frame continuation? Michael Adams had a syntax with two sets of parameters, the the ones on the stack and the return values.
- Review code gen for calls with lots of args. In the existing codegen we push magic continuations that say "apply the return value to N more args". Do we want to do this? ToDo: how rare is it to have too many args?
- Figure out how PAPs work. This may interact with the GC check and stack check at the start of a function call. The concern is how to enter the garbage collector with an infotable that properly describes the live variables. Now that we generate info tables on demand at the end of the pipeline, we can enter the gc with a regular procedure call and expect that the proper info table will be generated.
- Was there something about sinking spills and hoisting reloads?
ToDo: small issues
- Shall we rename Branch to GoTo?!
- Change the C-- parser (which parses RTS .cmm files) to directly construct CmmGraph.
- (SLPJ) See let-no-escape todos in StgCmmExpr.