unlit takes a filename (for error reports), and transforms the given string, to eliminate the literate comments from the program text.

2000-2004 Malcolm Wallace. LGPL. Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>. Stable. All. Safe.

Index Trees (storing indexes at nodes).
Symbol Table. Stored values are polymorphic, but the keys are always strings.

2000-2004 Malcolm Wallace. LGPL. Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>. experimental. All. Safe.

Source positions contain a filename, line, column, and an inclusion point, which is itself another source position, recursively.
Constructor. Argument is filename.
Increment column number by given quantity.
Increment row number, reset column to 1.
Increment column number, tab stops are every 8 chars.
Increment row number by given quantity.
Update position with a new row, and possible filename.
Project the line number.
Project the filename.
Project the directory of the filename.
cpp-style printing of file position.
haskell-style printing of file position.
Conversion from a cpp-style "#line" to haskell-style pragma.
Strip non-directory suffix from file name (analogous to the shell command of the same name).
Sigh. Mixing Windows filepaths with unix is bad. Make sure there is a canonical path separator.

2004 Malcolm Wallace. LGPL. Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>. experimental. All. Safe.

Attempt to read the given file from any location within the search path. The first location found is returned, together with the file content. (The directory of the calling file is always searched first, then the current directory, finally any specified search path.)
Arguments, in order, then result:
  filename
  inclusion point
  search path
  report warnings?
  discovered filepath, and file contents

2006 Malcolm Wallace. LGPL. Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>. experimental. All. Safe.

Raw command-line options.
This is an internal intermediate data structure, used during option parsing only.
Options representable as Booleans.
Leave #define and #undef in output of ifdef?
Place #line droppings in output?
Write #line or {-# LINE #-} ?
Keep #pragma in final output?
Remove C eol (//) comments everywhere?
Remove C inline (/**/) comments everywhere?
Lex input as Haskell code?
Permit stringise # and catenate ## operators?
Retain newlines in macro expansions?
Remove literate markup?
Issue warnings?
Cpphs options structure.
Files to #include before anything else.
Default options.
Default settings of boolean options.
Parse a single raw command-line option. Parse failure is indicated by result Nothing.
Trim trailing elements of the second list that match any from the first list. Typically used to remove trailing forward/back slashes from a directory path.
Convert a list of RawOption to a BoolOptions structure.
Parse all command-line options.

2004 Malcolm Wallace. LGPL. Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>. experimental. All. Safe.

Macro expansion text is divided into sections, each of which is classified as one of three kinds: a formal argument (Arg), plain text (Text), or a stringised formal argument (Str).
'smart' constructor to avoid warnings from ghc (undefined fields).
Expand an instance of a macro. Precondition: got a match on the macro name.
Parse a #define, or #undef, ignoring other # directives.
Pretty-print hash defines to a simpler format, as key-value pairs.

2004 Malcolm Wallace. LGPL. Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>. experimental. All. Safe.

Each token is classified as one of Ident, Other, or Cmd:
* Ident is a word that could potentially match a macro name.
* Cmd is a complete cpp directive (#define etc).
* Other is anything else.
Submodes are required to deal correctly with nesting of lexical structures.
A Mode value describes whether to tokenise a la Haskell, or a la Cpp. The main difference is that in Cpp mode we should recognise line continuation characters.
linesCpp is, broadly speaking, Prelude.lines, except that on a line beginning with a #, line continuation characters are recognised. In a line continuation, the newline character is preserved, but the backslash is not.
Put back the line-continuation characters.
tokenise is, broadly speaking, Prelude.words, except that:
* the input is already divided into lines
* each word-like "token" is categorised as one of {Ident,Other,Cmd}
* #define's are parsed and returned out-of-band using the Cmd variant
* All whitespace is preserved intact as tokens.
* C-comments are converted to white-space (depending on first param)
* Parens and commas are tokens in their own right.
* Any cpp line continuations are respected.
No errors can be raised. The inverse of tokenise is (concatMap deWordStyle).
Parse a possible macro call, returning argument list and remaining input.

2004 Malcolm Wallace. LGPL. Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>. experimental. All. None.

Walk through the document, replacing calls of macros with the expanded RHS.
auxiliary
Walk through the document, replacing calls of macros with the expanded RHS. Additionally returns the active symbol table after processing.
Turn command-line definitions (from -D) into HashDefines.
Turn a string representing a macro definition into a HashDefine.
Trundle through the document, one word at a time, using the WordStyle classification introduced by tokenise to decide whether to expand a word or macro. Encountering a #define or #undef causes that symbol to be overwritten in the symbol table. Any other remaining cpp directives are discarded and replaced with blanks, except for #line markers.
All valid identifiers are checked for the presence of a definition of that name in the symbol table, and if so, expanded appropriately. (Bool arguments are: keep pragmas? retain layout? haskell language?) The result lazily intersperses output text with symbol tables. Lines are emitted as they are encountered. A symbol table is emitted after each change to the defined symbols, and always at the end of processing.
Useful helper function.
Useful helper function.
Arguments, in order, then result:
  Pre-defined symbols and their values
  Options that alter processing style
  The input file content
  The file after processing
Arguments, in order, then result:
  Pre-defined symbols and their values
  Options that alter processing style
  The input file content
  The file and symbol table after processing

1999-2004 Malcolm Wallace. LGPL. Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>. experimental. All. None.

Internal state for whether lines are being kept or dropped. In Drop n b ps, n is the depth of nesting, b is whether we have already succeeded in keeping some lines in a chain of elif's, and ps is the stack of positions of open #if contexts, used for error messages in case EOF is reached too soon.
Run a first pass of cpp, evaluating #ifdef's and processing #include's, whilst taking account of #define's and #undef's as we encounter them.
Return just the list of lines that the real cpp would decide to keep.
Auxiliary IO functions.
The preprocessor must expand all macros (recursively) before evaluating the conditional.
Expansion of symbols.
Return the expansion of the symbol (if there is one).
The standard "parens" parser does not work for us here. Define our own.
Determine filename in #include.
Arguments, in order, then result:
  File for error reports
  Pre-defined symbols and their values
  Search path for #includes
  Options controlling output style
  The input file content
  The file after processing (in lines)

2000-2006 Malcolm Wallace. LGPL. Malcolm Wallace <Malcolm.Wallace@cs.york.ac.uk>. experimental. All. None.
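The keep/drop state described for conditional processing can be illustrated with a much-reduced model. The sketch below is not the cpphs implementation: `keptLines` is a hypothetical name, the state omits the stack of source positions, only `#ifdef`, `#ifndef`, `#else` and `#endif` are handled (no `#if`, `#elif`, `#include` or `#define`), directives must start in column 1 with no space after the `#`, and dropped lines are simply discarded.

```haskell
-- Simplified stand-in for the documented state (assumption: no position stack).
--   Keep       : lines are currently being emitted
--   Drop n b   : lines are being discarded; n is the nesting depth inside the
--                dropped region, b records whether an earlier branch of the
--                same conditional was already kept
data KeepState
  = Keep
  | Drop Int Bool

-- | Hypothetical helper: given the list of defined symbols, return only the
--   lines that survive #ifdef/#ifndef/#else/#endif processing.
keptLines :: [String] -> [String] -> [String]
keptLines defined = go Keep
  where
    go _ [] = []
    go st (l:ls) =
      case (st, words l) of
        -- While keeping: a conditional either continues keeping or starts dropping.
        (Keep, ["#ifdef", s])     -> go (branch (s `elem` defined)) ls
        (Keep, ["#ifndef", s])    -> go (branch (s `notElem` defined)) ls
        -- An #else seen while keeping means the first branch was taken.
        (Keep, ["#else"])         -> go (Drop 1 True) ls
        (Keep, ["#endif"])        -> go Keep ls
        (Keep, _)                 -> l : go Keep ls
        -- While dropping: nested conditionals only adjust the depth.
        (Drop n b, "#ifdef" : _)  -> go (Drop (n + 1) b) ls
        (Drop n b, "#ifndef" : _) -> go (Drop (n + 1) b) ls
        -- Outermost dropped branch, nothing kept yet: #else switches to keeping.
        (Drop 1 False, ["#else"]) -> go Keep ls
        (Drop 1 _, ["#endif"])    -> go Keep ls
        (Drop n b, ["#endif"])    -> go (Drop (n - 1) b) ls
        (Drop n b, _)             -> go (Drop n b) ls

    branch True  = Keep
    branch False = Drop 1 False
```

For example, with only `A` defined, the `#else` branch of an `#ifdef B` block is kept while anything nested inside the failed branch is skipped, whatever its own condition says.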
cpphs-1.20.6-3eXY9mfZ2bxBqmUemUvWAp

Language.Preprocessor.Unlit
Language.Preprocessor.Cpphs
Language.Preprocessor.Cpphs.SymTab
Language.Preprocessor.Cpphs.Position
Language.Preprocessor.Cpphs.ReadFirst
Language.Preprocessor.Cpphs.Options
Language.Preprocessor.Cpphs.HashDefine
Language.Preprocessor.Cpphs.Tokenise
Language.Preprocessor.Cpphs.MacroPass
Language.Preprocessor.Cpphs.CppIfdef
Language.Preprocessor.Cpphs.RunCpphs

unlit Posn Pn newfile addcol newline tab newlines newpos lineno filename directory cppline haskline cpp2hask cleanPath
BoolOptions macros locations hashline pragma stripEol stripC89 lang ansi layout literate warnings
CpphsOptions infiles outfiles defines includes preInclude boolopts defaultCpphsOptions defaultBoolOptions parseOptions
WordStyle Ident Other Cmd tokenise
macroPass macroPassReturningSymTab cppIfdef runCpphs runCpphsPass1 runCpphsPass2 runCpphsReturningSymTab
Classified Program Blank Comment Include Pre classify unclassify adjacent message inlines
IndTree SymTab Hashable hashWithMax hash Leaf Fork emptyST insertST deleteST lookupST definedST flattenST itgen itiap itind itfold maxHash $fHashable[]
dirname $fShowPosn
readFirst readFileUTF8 writeFileUTF8
RawOption rawOption trailing boolOpts NoMacro NoLine LinePragma Pragma Text Strip StripEol Ansi Layout Unlit SuppressWarnings Macro Path PreInclude IgnoredForCompatibility flags
ArgOrText symbolReplacement expandMacro parseHashDefine simplifyHashDefines Arg Str HashDefine LineDrop AntiDefined SymbolReplacement MacroExpansion name linebreaks replacement arguments expansion
SubMode Mode linesCpp reslash parseMacroCall Any Pred String LineComment NestComment CComment CLineComment Haskell Cpp other deWordStyle
onlyRights preDefine defineMacro macroProcess emit emitSymTab noPos
KeepState cpp emitOne preExpand expandSymOrCall parseSymOrCall parenthesis file Keep Drop emitMany gatherDefined notComment parseBoolExp parseExp1 parseExp0 parseArithExp1 parseArithExp0 parseNumber parseCmpOp parseArithOp1 parseArithOp0 recursivelyExpand parseSym notIdent skip
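
The behaviour documented for linesCpp (like Prelude.lines, except that a line beginning with # honours backslash-newline continuations, keeping the newline and dropping the backslash) can be pictured with a small stand-alone function. This is a sketch written from that description only, not the library's code; `linesCppSketch` and `directive` are hypothetical names.

```haskell
-- | Split text into lines. A line starting with '#' may be continued with a
--   trailing backslash: the backslash is removed, but the newline stays
--   inside the resulting logical line.
linesCppSketch :: String -> [String]
linesCppSketch ""         = []
linesCppSketch s@('#':_)  = directive "" s
linesCppSketch s          =
  let (l, rest) = break (== '\n') s
  in  l : linesCppSketch (drop 1 rest)   -- ordinary line: same as Prelude.lines

-- Accumulate one directive line (in reverse), following continuations.
directive :: String -> String -> [String]
directive acc ('\\':'\n':cs) = directive ('\n' : acc) cs        -- keep newline, drop backslash
directive acc ('\n':cs)      = reverse acc : linesCppSketch cs  -- end of the directive
directive acc (c:cs)         = directive (c : acc) cs
directive acc []             = [reverse acc]
```

Applied to a two-line `#define` followed by an ordinary line, the sketch yields two elements: the directive with an embedded newline, and the ordinary line unchanged.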