A module to construct indentation aware parsers. Many programming language have indentation based syntax rules e.g. python and Haskell. This module exports combinators to create such parsers.
The input source can be thought of as a list of tokens. Abstractly each token occurs at a line and a column and has a width. The column number of a token measures is indentation. If t1 and t2 are two tokens then we say that indentation of t1 is more than t2 if the column number of occurrence of t1 is greater than that of t2.
Currently this module supports two kind of indentation based syntactic structures which we now describe:
- Block
- A block of indentation c is a sequence of tokens with indentation at least c. Examples for a block is a where clause of Haskell with no explicit braces.
- Line fold
- A line fold starting at line l and indentation c is a sequence of tokens that start at line l and possibly continue to subsequent lines as long as the indentation is greater than c. Such a sequence of lines need to be folded to a single line. An example is MIME headers. Line folding based binding separation is used in Haskell as well.
The module exports three combinators are
, indentParser
and block
. To construct parsers for indentation based grammars
one typically applies the lineFold
. A block can then be
parsed using the combinator indentParser
and a line fold using
block
. Generating indentation aware tokenisers could be tricky.
Given a language description via the
lineFold
Text.ParserCombinators.Parsec.Language.LanguageDef
record use module
to generate its
tokeiser (this will apply Text.ParserCombinators.Parsec.IndentParser.Token
on all tokenisers and then
the user can forget about indentParser
combinator).
indentParser
Warning:
Internally indentations are implemented using Parser states. If one
wants to use parser states as well then use the getState
and
setState
functions exported by this module instead of those exported
from the parsec library. Also use the parseTest
and runParser
function exported from this module instead of the one exported from
Parsec.
- type IndentParser tok st a = GenParser tok (st, IndentState) a
- type IndentCharParser st a = IndentParser Char st a
- data IndentMode
- indentParser :: IndentParser tok st a -> IndentParser tok st a
- noIndent :: IndentParser tok st a -> IndentParser tok st a
- block :: IndentParser tok st a -> IndentParser tok st a
- lineFold :: IndentParser tok st a -> IndentParser tok st a
- betweenOrBlock :: IndentParser tok st open -> IndentParser tok st close -> IndentParser tok st a -> IndentParser tok st a
- betweenOrLineFold :: IndentParser tok st open -> IndentParser tok st close -> IndentParser tok st a -> IndentParser tok st a
- getState :: IndentParser tok st st
- setState :: st -> IndentParser tok st ()
- runParser :: IndentParser tok st a -> st -> IndentMode -> SourceName -> [tok] -> Either ParseError a
- parse :: IndentParser tok () a -> SourceName -> [tok] -> Either ParseError a
- parseFromFile :: IndentCharParser () a -> SourceName -> IO (Either ParseError a)
- parseTest :: Show a => IndentParser tok () a -> [tok] -> IO ()
Parser type.
type IndentParser tok st a = GenParser tok (st, IndentState) aSource
An indentation aware parser.
type IndentCharParser st a = IndentParser Char st aSource
data IndentMode Source
The mode of the indentation parser.
indentParser :: IndentParser tok st a -> IndentParser tok st aSource
The combinator indentParser makes its input parser indentation aware. Usually one would want to make all the tokenisers indentation aware.
noIndent :: IndentParser tok st a -> IndentParser tok st aSource
The parser
runs noIndent
pp
ignoring any indentation based
structure. This can be used to parse for example an explicitly braced
where clause in Haskell.
block :: IndentParser tok st a -> IndentParser tok st aSource
The parser
parses a block of block
pp
.
lineFold :: IndentParser tok st a -> IndentParser tok st aSource
The parser lineFold p
parses a folded line of p
.
betweenOrBlock :: IndentParser tok st open -> IndentParser tok st close -> IndentParser tok st a -> IndentParser tok st aSource
betweenOrLineFold :: IndentParser tok st open -> IndentParser tok st close -> IndentParser tok st a -> IndentParser tok st aSource
Similar to betweenOrBlock but uses lineFold instead of block.
User state manipulation.
getState :: IndentParser tok st stSource
Gets the current user state.
setState :: st -> IndentParser tok st ()Source
Sets the user state.
Running and testing.
:: IndentParser tok st a | the parser to be run |
-> st | the initial state |
-> IndentMode | the indentation mode |
-> SourceName | the source file name |
-> [tok] | the list of tokens |
-> Either ParseError a | the result of parsing |
The most generic way to run an IndentParser. Use parseTest
for
testing your parser instead.
:: IndentParser tok () a | The parser to run |
-> SourceName | The name of the source (to report errors) |
-> [tok] | The input to the parser |
-> Either ParseError a |
Runs the given parser on a given input stream and returns either the result or parse error.
:: IndentCharParser () a | The parser to run |
-> SourceName | The file on which to run. |
-> IO (Either ParseError a) |
Like
but use the contents of parse
SourceName
as the input
tokens.