Language.Preprocessor.Cpphs.HashDefine
Portability: All. Stability: experimental. Maintainer: Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>

ArgOrText: Macro expansion text is divided into sections, each of which is classified as one of three kinds: a formal argument (Arg), plain text (Text), or a stringised formal argument (Str).

symbolReplacement: Smart constructor to avoid warnings from ghc (undefined fields).

expandMacro: Expand an instance of a macro. Precondition: got a match on the macro name.

parseHashDefine: Parse a #define or #undef, ignoring other # directives.

Language.Preprocessor.Cpphs.Position

Posn: Source positions contain a filename, line, column, and an inclusion point, which is itself another source position, recursively.

Constructor: newfile
Updates: addcol, newline, tab, newlines, newpos
Projections: lineno, filename, directory
cpp-style printing: cppline

dirname: Strip non-directory suffix from file name (analogous to the shell command of the same name).

Language.Preprocessor.Cpphs.ReadFirst

readFirst: Attempt to read the given file from any location within the search path. The first location found is returned, together with the file content. (The directory of the calling file is always searched first, then the current directory, finally any specified search path.)

Arguments, in order:
 * filename
 * inclusion point
 * search path
 * report warnings?
Result: discovered filepath, and file contents

Language.Preprocessor.Cpphs.Tokenise

WordStyle: Each token is classified as one of Ident, Other, or Cmd:
 * Ident is a word that could potentially match a macro name.
 * Cmd is a complete cpp directive (#define etc).
 * Other is anything else.

SubMode: Submodes are required to deal correctly with nesting of lexical structures.

Mode: A Mode value describes whether to tokenise a la Haskell, or a la Cpp. The main difference is that in Cpp mode we should recognise line continuation characters.

linesCpp: linesCpp is, broadly speaking, Prelude.lines, except that on a line beginning with a #, line continuation characters are recognised. In a line continuation, the newline character is preserved, but the backslash is not.

reslash: Put back the line-continuation characters.

tokenise: tokenise is, broadly speaking, Prelude.words, except that:
 * the input is already divided into lines
 * each word-like token is categorised as one of {Ident,Other,Cmd}
 * #defines are parsed and returned out-of-band using the Cmd variant
 * All whitespace is preserved intact as tokens.
 * C-comments are converted to white-space (depending on first param)
 * Parens and commas are tokens in their own right.
 * Any cpp line continuations are respected.
No errors can be raised. The inverse of tokenise is (concatMap deWordStyle).

parseMacroCall: Parse a possible macro call, returning argument list and remaining input.

Text.ParserCombinators.HuttonMeijer

Parser: The parser monad.

Language.Preprocessor.Cpphs.SymTab
Portability: All. Stability: Stable. Maintainer: Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>

IndTree: Index Trees (storing indexes at nodes).

SymTab: Symbol Table. Stored values are polymorphic, but the keys are always strings.

Language.Preprocessor.Unlit

Takes a filename (for error reports), and transforms the given string, to eliminate the literate comments from the program text.

Language.Preprocessor.Cpphs.Options

RawOption: Raw command-line options. This is an internal intermediate data structure, used during option parsing only.

BoolOptions: Options representable as Booleans.
 * macros: Leave #define and #undef in output of ifdef?
 * locations: Place #line droppings in output?
 * pragma: Keep #pragma in final output?
 * strip: Remove C comments everywhere?
 * lang: Lex input as Haskell code?
 * ansi: Permit stringise # and catenate ## operators?
 * layout: Retain newlines in macro expansions?
 * literate: Remove literate markup?
 * warnings: Issue warnings?

CpphsOptions: Cpphs options structure.

defaultCpphsOptions: Default options.

defaultBoolOptions: Default settings of boolean options.

rawOption: Parse a single raw command-line option. Parse failure is indicated by result Nothing.

boolOpts: Convert a list of RawOption to a BoolOptions structure.
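
The two-step option handling described above (parse each flag into a raw option, with Nothing signalling failure, then convert the list of raw options into a record of Booleans) might look roughly like the following. This is a minimal sketch, not the cpphs source: the types RawOpt and Bools, the functions parseRaw and toBools, and the four flag spellings are simplified stand-ins chosen for illustration, and the defaults they imply are assumptions.

```haskell
-- Hypothetical stand-in for the intermediate raw option type.
data RawOpt = NoMacro | NoLine | Strip | Layout
  deriving (Eq, Show)

-- Hypothetical, cut-down record of Boolean options.
data Bools = Bools
  { macros    :: Bool  -- leave #define and #undef in the output?
  , locations :: Bool  -- place #line markers in the output?
  , strip     :: Bool  -- remove C comments everywhere?
  , layout    :: Bool  -- retain newlines in macro expansions?
  } deriving (Eq, Show)

-- Parse one flag; an unrecognised flag yields Nothing.
-- The flag spellings here are assumptions for illustration.
parseRaw :: String -> Maybe RawOpt
parseRaw "--nomacro" = Just NoMacro
parseRaw "--noline"  = Just NoLine
parseRaw "--strip"   = Just Strip
parseRaw "--layout"  = Just Layout
parseRaw _           = Nothing

-- Convert the raw options into the Boolean record. Negative flags
-- switch a setting off; positive flags switch one on.
toBools :: [RawOpt] -> Bools
toBools raws = Bools
  { macros    = NoMacro `notElem` raws
  , locations = NoLine  `notElem` raws
  , strip     = Strip   `elem` raws
  , layout    = Layout  `elem` raws
  }
```

With these definitions, `toBools <$> traverse parseRaw args` gives a `Maybe Bools` that is Nothing as soon as any single flag fails to parse. Keeping the raw type separate from the record means the parser never has to know how flags combine.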
parseOptions: Parse all command-line options.

Language.Preprocessor.Cpphs.MacroPass

macroPass: Walk through the document, replacing calls of macros with the expanded RHS.

Arguments, in order:
 * Pre-defined symbols and their values
 * Options that alter processing style
 * The input file content
Result: The file after processing

preDefine: Turn command-line definitions (from -D) into HashDefines.

defineMacro: Turn a string representing a macro definition into a HashDefine.

macroProcess: Trundle through the document, one word at a time, using the WordStyle classification introduced by tokenise to decide whether to expand a word or macro. Encountering a #define or #undef causes that symbol to be overwritten in the symbol table. Any other remaining cpp directives are discarded and replaced with blanks, except for #line markers. All valid identifiers are checked for the presence of a definition of that name in the symbol table, and if so, expanded appropriately. (Bool arguments are: keep pragmas? retain layout? haskell language?)

Language.Preprocessor.Cpphs.CppIfdef

KeepState: Internal state for whether lines are being kept or dropped. In Drop n b, n is the depth of nesting, b is whether we have already succeeded in keeping some lines in a chain of elif's.

cppIfdef: Run a first pass of cpp, evaluating #ifdef's and processing #include's, whilst taking account of #define's and #undef's as we encounter them.

Arguments, in order:
 * File for error reports
 * Pre-defined symbols and their values
 * Search path for #includes
 * Options controlling output style
 * The input file content
Result: The file after processing (in lines)

cpp: Return just the list of lines that the real cpp would decide to keep.

file: Determine filename in #include.

Package: cpphs-1.5

Modules:
 * Language.Preprocessor.Unlit
 * Language.Preprocessor.Cpphs
 * Language.Preprocessor.Cpphs.HashDefine
 * Language.Preprocessor.Cpphs.Position
 * Language.Preprocessor.Cpphs.ReadFirst
 * Language.Preprocessor.Cpphs.Tokenise
 * Text.ParserCombinators.HuttonMeijer
 * Language.Preprocessor.Cpphs.SymTab
 * Language.Preprocessor.Cpphs.Options
 * Language.Preprocessor.Cpphs.MacroPass
 * Language.Preprocessor.Cpphs.CppIfdef
 * Language.Preprocessor.Cpphs.RunCpphs

Names:
 * Unlit: unlit, Classified (Pre, Include, Comment, Blank, Program), classify, unclassify, adjacent, message, inlines
 * Options: BoolOptions (macros, locations, pragma, strip, lang, ansi, layout, literate, warnings), CpphsOptions (infiles, outfiles, defines, includes, boolopts), defaultCpphsOptions, defaultBoolOptions, parseOptions, RawOption (Path, Macro, SuppressWarnings, Unlit, Layout, Ansi, Strip, NoLine, NoMacro), flags, rawOption, trailing, boolOpts
 * MacroPass: macroPass, noPos, preDefine, defineMacro, macroProcess
 * CppIfdef: cppIfdef, KeepState (Drop, Keep), cpp, gatherDefined, parseBoolExp, parseExp1, parseExp0, parseOp, parseSymOrCall, recursivelyExpand, parseSym, file
 * RunCpphs: runCpphs
 * HashDefine: ArgOrText (Str, Text, Arg), HashDefine (MacroExpansion, SymbolReplacement, Pragma, LineDrop), arguments, expansion, replacement, linebreaks, name, symbolReplacement, expandMacro, parseHashDefine
 * Position: Posn (Pn), newfile, addcol, newline, tab, newlines, newpos, lineno, filename, directory, cppline, dirname
 * ReadFirst: readFirst
 * Tokenise: WordStyle (Cmd, Other, Ident), SubMode (CComment, NestComment, LineComment, String, Pred, Any), Mode (Cpp, Haskell), linesCpp, reslash, other, deWordStyle, tokenise, parseMacroCall
 * HuttonMeijer: Parser (P), Token, item, first, papply, +++, sat, many, many1, sepby, sepby1, chainl, chainl1, chainr, chainr1, ops, bracket, char, digit, lower, upper, letter, alphanum, string, ident, nat, int, spaces, comment, junk, skip, token, natural, integer, symbol, identifier
 * SymTab: Hashable (hashWithMax, hash), IndTree (Fork, Leaf), SymTab, emptyST, insertST, deleteST, lookupST, definedST, itgen, itiap, itind, maxHash