A module to construct indentation-aware parsers. Many programming languages have indentation-based syntax rules, e.g. Python and Haskell. This module exports combinators to create such parsers.
The input source can be thought of as a list of tokens. Abstractly, each token occurs at a line and a column and has a width. The column number of a token measures its indentation. If t1 and t2 are two tokens, then we say that the indentation of t1 is more than that of t2 if the column number of t1 is greater than that of t2.
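The token model described above can be sketched in plain Haskell. This is only an illustration: the `Token` type and the names `tokWidth`, `indentation` and `moreIndentedThan` are hypothetical and are not exported by this module.

```haskell
-- Hypothetical token model; none of these names come from the module itself.
data Token = Token
  { tokText :: String  -- the characters making up the token
  , tokLine :: Int     -- line at which the token occurs
  , tokCol  :: Int     -- column of the token's first character
  } deriving (Show, Eq)

-- The width of a token, taken here to be the length of its text.
tokWidth :: Token -> Int
tokWidth = length . tokText

-- The indentation of a token is its column number.
indentation :: Token -> Int
indentation = tokCol

-- t1 is more indented than t2 when its column number is greater.
moreIndentedThan :: Token -> Token -> Bool
t1 `moreIndentedThan` t2 = indentation t1 > indentation t2
```

Note that only the column matters for the comparison; the line number plays no role in deciding which of two tokens is more indented.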
Currently this module supports two kinds of indentation-based syntactic structures, which we now describe:
- Block
- A block of indentation c is a sequence of tokens with indentation at least c. An example of a block is a where clause of Haskell with no explicit braces.
- Line fold
- A line fold starting at line l and indentation c is a sequence of tokens that start at line l and possibly continue to subsequent lines as long as the indentation is greater than c. Such a sequence of lines needs to be folded into a single line. An example is MIME headers. Binding separation based on line folding is used in Haskell as well.
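The two definitions above can be made concrete on a plain list of tokens. The following is a minimal sketch that does not use this module's combinators: the `Tok` type and the functions `takeBlock` and `takeLineFold` are hypothetical names introduced only for illustration.

```haskell
-- Hypothetical token model (not part of the module): text, line, column.
data Tok = Tok { tokText :: String, tokLine :: Int, tokCol :: Int }
  deriving (Show, Eq)

-- A block of indentation c: the longest run of tokens whose column is
-- at least c. Returns the block and the remaining tokens.
takeBlock :: Int -> [Tok] -> ([Tok], [Tok])
takeBlock c = span (\t -> tokCol t >= c)

-- A line fold starting at line l with indentation c: the tokens on line l,
-- plus tokens on later lines as long as their column is greater than c.
takeLineFold :: Int -> Int -> [Tok] -> ([Tok], [Tok])
takeLineFold l c = span inFold
  where
    inFold t = tokLine t == l || (tokLine t > l && tokCol t > c)

-- Tokens of the three-line input:
--   x = 1 +
--       2
--   y = 3
sample :: [Tok]
sample =
  [ Tok "x" 1 1, Tok "=" 1 3, Tok "1" 1 5, Tok "+" 1 7
  , Tok "2" 2 5
  , Tok "y" 3 1, Tok "=" 3 3, Tok "3" 3 5
  ]
```

With this model, `takeLineFold 1 1 sample` splits off the tokens of `x = 1 + 2` and leaves those of `y = 3`, because `y` sits at column 1, which is not greater than the indentation of the fold.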
The module exports three combinators: indentParser, block and lineFold. To construct parsers for indentation-based grammars, one typically applies indentParser to all the tokenisers. A block can then be parsed using the combinator block and a line fold using lineFold.
Generating indentation-aware tokenisers can be tricky. Given a language description via the Text.ParserCombinators.Parsec.Language.LanguageDef record, use the accompanying token module to generate its tokeniser (this will apply indentParser to all tokenisers, after which the user can forget about indentParser).
Internally, indentations are implemented using parser states. If one wants to use parser states as well, then use the getState and setState functions exported by this module instead of those exported from the parsec library. Also use the runParser and parseTest functions exported from this module instead of those exported from the parsec library.
- type IndentParser tok st a = GenParser tok (st, IndentState) a
- type IndentCharParser st a = IndentParser Char st a
- data IndentMode
- indentParser :: IndentParser tok st a -> IndentParser tok st a
- noIndent :: IndentParser tok st a -> IndentParser tok st a
- block :: IndentParser tok st a -> IndentParser tok st a
- lineFold :: IndentParser tok st a -> IndentParser tok st a
- betweenOrBlock :: IndentParser tok st open -> IndentParser tok st close -> IndentParser tok st a -> IndentParser tok st a
- betweenOrLineFold :: IndentParser tok st open -> IndentParser tok st close -> IndentParser tok st a -> IndentParser tok st a
- getState :: IndentParser tok st st
- setState :: st -> IndentParser tok st ()
- runParser :: IndentParser tok st a -> st -> IndentMode -> SourceName -> [tok] -> Either ParseError a
- parseTest :: Show a => IndentParser tok () a -> [tok] -> IO ()
The mode of the indentation parser.
The combinator indentParser makes its input parser indentation aware. Usually one would want to make all the tokenisers indentation aware.
noIndent p parses p ignoring any indentation-based structure. This can be used to parse, for example, an explicitly braced where clause in Haskell.
block p parses a block of p.
lineFold p parses a folded line of p.
Similar to betweenOrBlock but uses lineFold instead of block.