Text.ParserCombinators.Parsec.IndentParser

Description
A module to construct indentation-aware parsers. Many programming
languages have indentation-based syntax rules, e.g. Python and Haskell.
This module exports combinators to create such parsers.
The input source can be thought of as a list of tokens. Abstractly,
each token occurs at a line and a column and has a width. The column
number of a token measures its indentation. If t1 and t2 are two tokens,
then we say that the indentation of t1 is more than that of t2 if the
column number at which t1 occurs is greater than that of t2.
Currently this module supports two kinds of indentation-based syntactic
structures, which we now describe:
- Block
- A block of indentation c is a sequence of tokens with
indentation at least c. An example of a block is a where clause of
Haskell with no explicit braces.
- Line fold
- A line fold starting at line l and indentation c is a
sequence of tokens that start at line l and possibly continue to
subsequent lines as long as the indentation is greater than c. Such
a sequence of lines needs to be folded into a single line. An example
is MIME headers. Line-folding-based binding separation is used in
Haskell as well.
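The two definitions above can be illustrated with a small, self-contained sketch. It does not use this module at all: the Tok type and the functions positioned, blockSpan and lineFoldSpan are hypothetical names introduced only for illustration, and the tokeniser assumes tokens are separated by spaces (tabs are not handled).

```haskell
-- Hypothetical model, not the library's types: a token with its position.
data Tok = Tok { tokLine :: Int, tokCol :: Int, tokText :: String }
  deriving (Eq, Show)

-- Split a source string into space-separated tokens, recording the
-- 1-based line and column at which each token starts.
positioned :: String -> [Tok]
positioned src = concat (zipWith lineToks [1 ..] (lines src))
  where
    lineToks l = go 1
      where
        go _ [] = []
        go c s@(x : xs)
          | x == ' '  = go (c + 1) xs
          | otherwise =
              let (w, rest) = break (== ' ') s
              in Tok l c w : go (c + length w) rest

-- A block of indentation c: the longest prefix of tokens whose
-- indentation is at least c.
blockSpan :: Int -> [Tok] -> [Tok]
blockSpan c = takeWhile (\t -> tokCol t >= c)

-- A line fold starting at line l with indentation c: tokens on line l,
-- plus tokens on later lines indented strictly more than c.
lineFoldSpan :: Int -> Int -> [Tok] -> [Tok]
lineFoldSpan l c = takeWhile (\t -> tokLine t == l || tokCol t > c)
```

For instance, after dropping the leading where token of "where\n  x = 1\n  y = 2\nmain", blockSpan 3 keeps the six tokens of the two bindings and stops at main. Similarly, lineFoldSpan 1 1 applied to the tokens of "Subject: hello\n  world\nFrom: me" keeps Subject:, hello and world, and stops at From:.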
Warning:
Internally, indentation is implemented using parser states. If one
wants to use parser states as well, then use the getState and
setState functions exported by this module instead of those exported
from the Parsec library. Also use the parseTest and runParser
functions exported from this module instead of those exported from
Parsec.
|
Parser type
|
|
|
An indentation-aware parser. A parser should be of this type to
make it possible to parse indentation-based grammatical structures.
|
|
Parser Combinators.
|
|
The module exports three combinators: indentParser, block
and lineFold. To construct parsers for indentation-based grammars,
one typically applies indentParser to all tokenisers. In
conjunction with the Text.ParserCombinators.Parsec.Token module, one
would want to apply indentParser to all the fields of the
Text.ParserCombinators.Parsec.Token.TokenParser record except
Text.ParserCombinators.Parsec.Token.whiteSpace. A block can then
be parsed using the combinator block and a line fold using
lineFold. To generate an indentation-aware tokeniser from the
corresponding Text.ParserCombinators.Parsec.Language.LanguageDef
record, see the module Text.ParserCombinators.Parsec.IndentToken.
|
|
|
The combinator indentParser makes its input parser indentation
aware. Usually one would want to make all the tokenisers indentation
aware.
|
|
|
The parser noIndent p runs p ignoring any indentation-based
structure. This can be used to parse, for example, an explicitly braced
where clause in Haskell.
|
|
|
The parser block p parses a block of p with the block
indentation being the current column number.
|
|
|
The parser lineFold p parses a folded line of p. The current line
is the starting line. The indentation of the line depends on where in
the source we are. If we are in a block then the indentation is the
indentation of the block. Otherwise the current column is the
indentation.
|
|
|
The parser betweenOrBlock open close p parses p between open
and close. If open is matched, p is parsed in NoIndent mode; otherwise
a block of p is parsed in Block mode. For example, a parser for a
Haskell where clause would look like
whereClause = do reserved "where"; betweenOrBlock open close bindings
where open and close stand for the parsers of the explicit delimiters
(for instance, braces) and bindings parses the bindings.
|
|
|
Similar to betweenOrBlock but uses lineFold instead of block.
|
|
Primitive Parsers
|
|
We now describe the primitives that are used to build the combinators
block, noIndent and lineFold. An indentation parser can be in
one of the following modes:
- NoIndent
- In this mode the parser ignores all indentation
constraints. All tokens, regardless of their indentation, are accepted.
- Block c
- In this mode a parser accepts only tokens which have
indentation at least c. A parser parsing a block that is indented
at least c columns will be in this mode.
- LineFold l c
- In this mode a parser accepts tokens as long as they are
on line l or are indented more than c. When parsing a
folded line starting at line l with indentation c, the parser
will be in this mode.
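The acceptance rule of each mode can be written down as a predicate on token positions. The following is only a sketch of the description above, not the module's implementation: the constructor names follow the text, while the Int fields, the Pos type and the accepts function are assumptions made for illustration.

```haskell
-- Hypothetical model mirroring the three modes described in the text.
data IndentMode
  = NoIndent          -- ignore indentation entirely
  | Block Int         -- Block c: tokens must have indentation at least c
  | LineFold Int Int  -- LineFold l c: tokens on line l, or indented more than c
  deriving (Eq, Show)

-- Position of a token as (line, column); an assumption of this sketch.
type Pos = (Int, Int)

-- Does a parser in the given mode accept a token at this position?
accepts :: IndentMode -> Pos -> Bool
accepts NoIndent       _           = True
accepts (Block c)      (_, col)    = col >= c
accepts (LineFold l c) (line, col) = line == l || col > c
```

Note the asymmetry between the two indentation-sensitive modes: Block uses a non-strict comparison, so a token at exactly column c continues the block, whereas LineFold uses a strict one, so a token on a later line at column c ends the fold.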
|
|
|
|
|
|
The parser state used by Indentation Parsers.
|
|
|
|
|
|
|
|
The parser saveIndentMode p saves the current indentation mode and
returns the result of running p. It restores the old indentation
mode once p has finished executing.
|
|
|
This parser returns the current indentation mode.
|
|
|
This parser sets the current indentation mode.
|
|
User state manipulation.
|
|
Indentation awareness is built into the indentation parser by using
parser states. To distinguish this state from the actual user-defined
state, we call the former the indentation state and the latter the user
state.
|
|
|
Gets the current user state. Use this instead of the one exported
from the Parsec module.
|
|
|
This parser sets the current state of the parser to the given input
state. Use this function instead of the one exported by the parsec
library.
|
|
Testing and Running.
|
|
The most generic way to run an IndentParser. Use parseTest for
testing your parser instead.
|
|
|
|
|
|
This function is analogous to parseTest of the Parsec module.
Given an indent parser p :: IndentParser tok () a and a list of
tokens, it runs the parser and prints the result.
|
|
Produced by Haddock version 2.6.0