Support for Generic Ghc Api Ast Traversals
This Wiki page is a spin off of GhcApiStatus, focussing on information related to generic traversals (transformations/queries) over the Ghc Ast types.
Data.Traversable
David Waern has been working on deriving Data.Traversable for the Ast types, to get mapM for use in Haddock 2. He points out that, by default, SYB-style traversals do not seem to support type-changing maps.
Data.Generics
Since Ghc does support (standalone) deriving of Data and Typeable, it seems natural to want those classes instantiated for the Ghc Api types. In case that would lead to blowup of standard releases, it would be even better if those instances could be provided as an add-on to a release.
- for recent code/documentation, see the syb-utils darcs repo (rss feed of darcs changes):
darcs get http://www.cs.kent.ac.uk/~cr3/toolbox/haskell/syb-utils
- until recently, standalone deriving of Data/Typeable was broken (#2378, fixed in HEAD, 01.07.08)
- standalone deriving of Typeable fails if data constructors are not in scope - this looks like a bug (#2433, fixed in HEAD, 04.08.08)
- standalone deriving of Data fails if data constructors are not in scope - this means that Data instances are in conflict with the intended data abstraction for some Ghc Ast types!
- using the internal CPP hackery of Data.Generics for Typeable, standalone deriving of Data for non-abstract types, and the traditional manual dummy Data instances for abstract types, it seems we can get initial Data.Generics support over the Ghc Ast types!
- since I want to be able to identify the abstract types, I've replaced the traditional toConstr _ = error "toConstr":
abstractConstr n = mkConstr (abstractDataType n) ("{abstract:"++n++"}") [] Prefix
abstractDataType n = mkDataType n [abstractConstr n]
-- TypeableINSTANCE_TYPEABLE0(SrcSpan,srcSpanTc,"SrcSpan")
instance Data SrcSpan where
toConstr _ = abstractConstr "SrcSpan"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNorepType "SrcSpan"
TODO: Is that ok, or is it going to cause problems?
- putting all those instances in a single file seems to trigger a memory bug in the compiler (#2438), but splitting them into two files appears to avoid that. See the attached Instances and Instances0
- for starters, here's a Data-based show that shows the constructors/abstract types instead of pretty-printing them:
-- generic Data-based show, with special cases for GHC Ast types,
-- showing abstract types abstractly and avoiding known potholes
showData :: Data a => Stage -> Int -> a -> String
showData stage n =
generic `ext1Q` list `extQ` string `extQ` fastString `extQ` srcSpan
`extQ` name `extQ` occName `extQ` moduleName `extQ` var `extQ` dataCon
`extQ` bagName `extQ` bagRdrName `extQ` bagVar `extQ` nameSet
`extQ` postTcType `extQ` fixity
where generic :: Data a => a -> String
generic t = indent n ++ "(" ++ showConstr (toConstr t)
++ space (concat (intersperse " " (gmapQ (showData stage (n+1)) t))) ++ ")"
space "" = ""
space s = ' ':s
indent n = "\n" ++ replicate n ' '
string = show :: String -> String
fastString = ("{FastString: "++) . (++"}") . show :: FastString -> String
list l = indent n ++ "["
++ concat (intersperse "," (map (showData stage (n+1)) l)) ++ "]"
name = ("{Name: "++) . (++"}") . showSDoc . ppr :: Name -> String
occName = ("{OccName: "++) . (++"}") . OccName.occNameString
moduleName = ("{ModuleName: "++) . (++"}") . showSDoc . ppr :: ModuleName -> String
srcSpan = ("{"++) . (++"}") . showSDoc . ppr :: SrcSpan -> String
var = ("{Var: "++) . (++"}") . showSDoc . ppr :: Var -> String
dataCon = ("{DataCon: "++) . (++"}") . showSDoc . ppr :: DataCon -> String
bagRdrName:: Bag (Located (HsBind RdrName)) -> String
bagRdrName = ("{Bag(Located (HsBind RdrName)): "++) . (++"}") . list . bagToList
bagName :: Bag (Located (HsBind Name)) -> String
bagName = ("{Bag(Located (HsBind Name)): "++) . (++"}") . list . bagToList
bagVar :: Bag (Located (HsBind Var)) -> String
bagVar = ("{Bag(Located (HsBind Var)): "++) . (++"}") . list . bagToList
nameSet | stage `elem` [Parser,TypeChecker]
= const ("{!NameSet placeholder here!}") :: NameSet -> String
| otherwise
= ("{NameSet: "++) . (++"}") . list . nameSetToList
postTcType | stage<TypeChecker = const "{!type placeholder here?!}" :: PostTcType -> String
| otherwise = showSDoc . ppr :: Type -> String
fixity | stage<Renamer = const "{!fixity placeholder here?!}" :: GHC.Fixity -> String
| otherwise = ("{Fixity: "++) . (++"}") . showSDoc . ppr :: GHC.Fixity -> String
For example usage, see the attached APISybTesting: it parses a TestModule, prettyprints and shows, does an identity transform, and an example query (extract classes and family declarations).
- some of the predefined instances in Data.Generics.Instances are dummies, preventing traversals of substructures or pretending that non-concrete types are Data - this should probably be changed by splitting that module and not re-exporting the dummy part from Data.Generics by default. See http://www.haskell.org/pipermail/generics/2008-June/000347.html, http://www.haskell.org/pipermail/generics/2008-June/000346.html, http://www.haskell.org/pipermail/generics/2008-June/000342.html for more info.
- it seems as if the issue of type-changing traversals can be addressed within SYB, although those dummy instances cause trouble here.
- Traversable Functor Data,or: X marks the spot, a technique based on gfoldl plus one nasty and two non-nasty uses of unsafeCoerce. The nasty use interacts badly with the dummy instances.
- gMap in SYB1, a technique based on gfoldl, gunfoldl, and some Dynamic types in the middle. Avoids unsafeCoerce, and uses the fact that the dummy instances are slightly more honest about gunfoldl than about gfoldl, but gMap cannot distinguish functor parameters from equivalent types (no fmap) and its type is too general for direct use (runtime type errors instead of compile time type errors)
- http://www.haskell.org/pipermail/generics/2008-July/000351.html, combining the advantages of the previous two techniques to define a Data-based fmap. The remaining uses of unsafeCoerce only distinguish the functor parameter type from equivalent types occuring elsewhere in the functor, so seem to be ok?
- performance: Syb traversals are sometimes associated with bad performance. Unfortunately, none of the anecdotes I'm aware of really investigates these performance issues in any detail (if you have useful data to add here, please do!). So I can only say here that
- there are various more or less obvious ways in which naive use of Syb traversal schemes can lead to very inefficient traversals
- until proven wrong, I'll assume that all of these traps can be avoided by careful tuning of traversal schemes!-)
- three example issues are listed in Data.Generics with GPS (using Maps to avoid getting lost in Data), which also provides a generalisation of the main Uniplate PlateData (Uniplate over Syb) optimisation, in a form that can be used to address one of the performance issues of everywhere and everything (traversal of irrelevant substructures) - this gives the expected speedup of naive traversal schemes
- I've been trying to address another of those three issues (too many runtime type checks, specifically linear lookup), but the overhead of working a map of functions of different types means that this isn't really a win (for the typical few specific cases, linear lookup is faster, and as the number of specific cases grows, they tend to be organised so as to reduce type checks and traversals, so having a map is still not necessarily a win)
Attachments
- APItest.hs (45 bytes) -
oops, wrong file, please delete
, added by claus on 07/11/08 07:06:18. - APISybTesting.2.hs (31 bytes) -
please delete this attachment
, added by claus on 07/13/08 15:00:46. - Utils.hs (3.7 kB) -
Data-based show and queries
, added by claus on 07/15/08 06:07:32. - Instances0.hs (4.8 kB) -
derived instances Data/Typeable, part1
, added by claus on 07/15/08 06:08:20. - Instances.hs (5.6 kB) -
derived instances Data/Typeable, part2
, added by claus on 07/15/08 06:09:09. - TestModule.hs (441 bytes) -
a test input module for APISybTesting
, added by claus on 07/15/08 06:09:59. - APISybTesting.hs (2.3 kB) -
some example uses of SYB traversals, using derived instances and Data-based Utils over TestModule?
, added by claus on 07/15/08 06:11:18. - CoreSynTraverse.hs (2.7 kB) -
Slightly eccentric CoreSyn? instances
, added by batterseapower on 07/16/08 06:39:36.
