2004 Malcolm Wallace; LGPL; Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>; experimental; All; Safe-Inferred
Macro expansion text is divided into sections, each of which is classified as one of three kinds: a formal argument (Arg), plain text (Text), or a stringised formal argument (Str).
smart constructor to avoid warnings from ghc (undefined fields)
Expand an instance of a macro. Precondition: got a match on the macro name.
Parse a #define, or #undef, ignoring other # directives
Pretty-print hash defines to a simpler format, as key-value pairs.

2000-2004 Malcolm Wallace; LGPL; Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>; Stable; All; Safe-Inferred
Index Trees (storing indexes at nodes).
Symbol Table. Stored values are polymorphic, but the keys are always strings.

Safe-Inferred
unlit takes a filename (for error reports), and transforms the given string, to eliminate the literate comments from the program text.

2006 Malcolm Wallace; LGPL; Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>; experimental; All; Safe-Inferred
Raw command-line options. This is an internal intermediate data structure, used during option parsing only.
Options representable as Booleans.
Leave #define and #undef in output of ifdef?
Place #line droppings in output?
Write #line or {-# LINE #-} ?
Keep #pragma in final output?
Remove C eol (//) comments everywhere?
Remove C inline (/**/) comments everywhere?
Lex input as Haskell code?
Permit stringise # and catenate ## operators?
Retain newlines in macro expansions?
Remove literate markup?
Issue warnings?
Cpphs options structure.
Files to #include before anything else
Default options.
Default settings of boolean options.
Parse a single raw command-line option. Parse failure is indicated by result Nothing.
Trim trailing elements of the second list that match any from the first list.
Typically used to remove trailing forward/back slashes from a directory path.
Convert a list of RawOption to a BoolOptions structure.
Parse all command-line options.

2000-2004 Malcolm Wallace; LGPL; Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>; experimental; All; Safe-Inferred
Source positions contain a filename, line, column, and an inclusion point, which is itself another source position, recursively.
Constructor. Argument is filename.
Increment column number by given quantity.
Increment row number, reset column to 1.
Increment column number, tab stops are every 8 chars.
Increment row number by given quantity.
Update position with a new row, and possible filename.
Project the line number.
Project the filename.
Project the directory of the filename.
cpp-style printing of file position
haskell-style printing of file position
Conversion from a cpp-style "#line" to haskell-style pragma.
Strip non-directory suffix from file name (analogous to the shell command of the same name).
Sigh. Mixing Windows filepaths with unix is bad. Make sure there is a canonical path separator.

2004 Malcolm Wallace; LGPL; Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>; experimental; All; Safe-Inferred
Attempt to read the given file from any location within the search path. The first location found is returned, together with the file content. (The directory of the calling file is always searched first, then the current directory, finally any specified search path.)
filename; inclusion point; search path; report warnings?; discovered filepath, and file contents

2004 Malcolm Wallace; LGPL; Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>; experimental; All; Safe-Inferred
Each token is classified as one of Ident, Other, or Cmd:
* Ident is a word that could potentially match a macro name.
* Cmd is a complete cpp directive (#define etc).
* Other is anything else.
Submodes are required to deal correctly with nesting of lexical structures.
A Mode value describes whether to tokenise a la Haskell, or a la Cpp. The main difference is that in Cpp mode we should recognise line continuation characters.
linesCpp is, broadly speaking, Prelude.lines, except that on a line beginning with a #, line continuation characters are recognised. In a line continuation, the newline character is preserved, but the backslash is not.
Put back the line-continuation characters.
tokenise is, broadly-speaking, Prelude.words, except that:
* the input is already divided into lines
* each word-like "token" is categorised as one of {Ident,Other,Cmd}
* #define's are parsed and returned out-of-band using the Cmd variant
* All whitespace is preserved intact as tokens.
* C-comments are converted to white-space (depending on first param)
* Parens and commas are tokens in their own right.
* Any cpp line continuations are respected.
No errors can be raised. The inverse of tokenise is (concatMap deWordStyle).
Parse a possible macro call, returning argument list and remaining input

2004 Malcolm Wallace; LGPL; Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>; experimental; All; None
Walk through the document, replacing calls of macros with the expanded RHS.
auxiliary
Walk through the document, replacing calls of macros with the expanded RHS. Additionally returns the active symbol table after processing.
Turn command-line definitions (from -D) into HashDefines.
Turn a string representing a macro definition into a HashDefine.
Trundle through the document, one word at a time, using the WordStyle classification introduced by tokenise to decide whether to expand a word or macro. Encountering a #define or #undef causes that symbol to be overwritten in the symbol table. Any other remaining cpp directives are discarded and replaced with blanks, except for #line markers.
All valid identifiers are checked for the presence of a definition of that name in the symbol table, and if so, expanded appropriately. (Bool arguments are: keep pragmas? retain layout? haskell language?) The result lazily intersperses output text with symbol tables. Lines are emitted as they are encountered. A symbol table is emitted after each change to the defined symbols, and always at the end of processing.
Useful helper function.
Useful helper function.
Pre-defined symbols and their values; Options that alter processing style; The input file content; The file after processing
Pre-defined symbols and their values; Options that alter processing style; The input file content; The file and symbol table after processing

1999-2004 Malcolm Wallace; LGPL; Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>; experimental; All; None
Internal state for whether lines are being kept or dropped. In Drop n b ps, n is the depth of nesting, b is whether we have already succeeded in keeping some lines in a chain of elif's, and ps is the stack of positions of open #if contexts, used for error messages in case EOF is reached too soon.
Run a first pass of cpp, evaluating #ifdef's and processing #include's, whilst taking account of #define's and #undef's as we encounter them.
Return just the list of lines that the real cpp would decide to keep.
Auxiliary IO functions
The preprocessor must expand all macros (recursively) before evaluating the conditional.
Expansion of symbols.
Return the expansion of the symbol (if there is one).
The standard "parens" parser does not work for us here.
Define our own.
Determine filename in #include
File for error reports; Pre-defined symbols and their values; Search path for #includes; Options controlling output style; The input file content; The file after processing (in lines)

None

2000-2006 Malcolm Wallace; LGPL; Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>; experimental; All; None

cpphs-1.18.9
Language.Preprocessor.Unlit
Language.Preprocessor.Cpphs
Language.Preprocessor.Cpphs.HashDefine
Language.Preprocessor.Cpphs.SymTab
Language.Preprocessor.Cpphs.Options
Language.Preprocessor.Cpphs.Position
Language.Preprocessor.Cpphs.ReadFirst
Language.Preprocessor.Cpphs.Tokenise
Language.Preprocessor.Cpphs.MacroPass
Language.Preprocessor.Cpphs.CppIfdef
Language.Preprocessor.Cpphs.RunCpphs

unlit BoolOptions macros locations hashline pragma stripEol stripC89 lang ansi layout literate warnings CpphsOptions infiles outfiles defines includes preInclude boolopts defaultCpphsOptions defaultBoolOptions parseOptions Posn Pn newfile addcol newline tab newlines newpos lineno filename directory cppline haskline cpp2hask cleanPath macroPass macroPassReturningSymTab cppIfdef runCpphs runCpphsReturningSymTab ArgOrText symbolReplacement expandMacro parseHashDefine simplifyHashDefines Str Text Arg HashDefine MacroExpansion arguments expansion SymbolReplacement replacement AntiDefined linebreaks Pragma LineDrop name IndTree SymTab Hashable hashWithMax hash Fork Leaf emptyST insertST deleteST lookupST definedST flattenST itgen itiap itind itfold maxHash $fHashable[] Classified Pre Include Comment Blank Program classify unclassify adjacent message inlines RawOption rawOption trailing boolOpts IgnoredForCompatibility PreInclude Path Macro SuppressWarnings Unlit Layout Ansi StripEol Strip LinePragma NoLine NoMacro flags dirname $fShowPosn readFirst WordStyle SubMode Mode linesCpp reslash tokenise parseMacroCall Cmd Other Ident CLineComment CComment NestComment LineComment String Pred Any Cpp Haskell other
deWordStyle onlyRights preDefine defineMacro macroProcess emit emitSymTab noPos KeepState cpp emitOne preExpand expandSymOrCall parseSymOrCall parenthesis file Drop Keep emitMany gatherDefined parseBoolExp parseExp1 parseExp0 parseArithExp1 parseArithExp0 parseNumber parseCmpOp parseArithOp1 parseArithOp0 recursivelyExpand parseSym notIdent skip