GHC's glorious new code generator
This page summarises work that Norman Ramsey, Simon M, Simon PJ, and John Dias are doing on re-architecting GHC's back end. Here is the state of play; see also work on the LLVM back end.
- Bug list (code-gen related bugs that we may be able to fix):
- (Sept 12) New code generator is live. Here's the page listing clean-up tasks that we can now do.
- Simon M added a Blog Post about the new code generator status
- Link to Krzysztof Wos's project fixed link, in which he reports great performance improvements by turning tail recursion into loops in C--.
- Norman added a Hoopl performance page
- Edward Yang has a wiki page that describes shortcomings of the code generated by the new pipeline: Commentary/Compiler/NewCodeGenStupidity
- John D has built a complete new codegen pipeline, running alongside the old one, enabled by -fuse-new-codegen. It is described here: Commentary/Compiler/NewCodeGenPipeline. It uses a new representation for Cmm, mostly with "Z" in the name. (Let's call the original Cmm OldCmm and this new one CmmZ.) It has a new conversion STG->CmmZ, and then sequence of passes that optimise and cps-convert the Cmm. Finally, it is converted back to the old Cmm so that it can flow to the old code generators.
- Compiling through the new pipeline passes all tests and GHC is bootstrappable.
- Separately, we have developed yet another, and still better, Cmm representation, the subject of an upcoming ICFP 2010 submission. It uses phantom types and GADTs to add very useful open/closed invariants. This isn't in GHC at all yet. I'll call it CmmGADT for easy reference.
Generally we want to keep old and new pipelines working simultaneously, so that we can switch only when we are sure the new stuff works. Next steps in this grand plan are:
- Check the impact on compilation time of the new route.
- Finalise CmmGADT and make the new pipeline use it.
- Make the Cmm parser (which parses .cmm files from the RTS) produce CmmGADT, and push that down the new pipeline.
- Implement the many refactorings and improvements to the new pipeline described in http://darcs.haskell.org/ghc/compiler/cmm/cmm-notes. See also: Commentary/Compiler/NewCodeGenStupidity
- Instead of converting new Cmm to old Cmm, make the downstream code generators consume CmmGADT, and convert old Cmm to CmmGADT.
- Expand the capability of the new pipeline so that it does native code generation too, and we can ultimately discard the existing code generators. The design of this stage is here: Commentary/Compiler/IntegratedCodeGen
Workflow for the new code generator and Hoopl
We have the following repositories:
- HEAD: the main GHC git repo. http://darcs.haskell.org/ghc.git
- HooplMaster: the master Hoopl Git repository.
(Physical location: linux.cs.tufts.edu:/r/ghc/www/hoopl/hoopl.git)
- HooplLag: a Git repo that is guaranteed to work with GHC HEAD. It is
not automatically updated by pushes to HooplMaster. Instead a manual
process (below) updates it; hence "lag".
Normal GHC developers, who are uninterested in Hoopl, ignore all this. If they download HEAD, and then do ./sync-all get they'll get HooplLag, which is always guaranteed to work with HEAD.
Developers who work on GHC and also need to modify Hoopl need to ensure their changes end up in both repositories.
- In your hoopl directory in your development tree, add HooplMaster as a remote and update your reference there.
- Hack away in the development tree.
- Record Hoopl commits.
- Run validate in the development tree
- Push the commits in hoopl to the HooplMaster Git repo
- Wait for the mirrors to update (the impatient can run /srv/darcs/do_mirrors on darcs.haskell.org)
- Push the commits in hoopl to the HooplLag Git repo (probably the origin remote)
Status report April 2011
Term’s starting up for me again, and as a result, I will have less time to throw at the new code generator. As such, here is the state of play, as of April 2011.
We currently have a code generator that is correct, but slow and stupid. The primary optimization problem that I was tackling at the end of Spring Break was the following: newly compiled programs used a lot more stack space than their old equivalents, due to poor usage of call areas for register reloading and an over-conservative stack layout algorithm. Our new plan is to overhaul the spiller and the stack layout engine in the following manner:
1. We teach the register spiller about another location: a value may be in a register, on the stack in a register slot, OR (and this is the new bit) inside a call area. If it’s in a call area, reload from that instead of of its slot location. This might require borrowing some code from the code motion improvements in my private branch ( http://github.com/ezyang/ghc).
2. We change the stack layout code to perform interference on a per-word, rather than per-area level. Because this may cause a very huge interference graph, we bound the number of conflicts we remember; instead of recording every slot a slot may conflict with, we remember some number, and if there are more, we just say it conflicts with everything. (There is some potential for statistics and tuning parameters here.)