Text.ParserCombinators.Parsec.IndentParser

IndentParser-0.1: Combinators for parsing indentation based syntactic structures

Contents

Parser type
Parser Combinators.
Primitive Parsers
User state manipulation.
Testing and Running.

Description

A module to construct indentation aware parsers. Many programming languages have indentation based syntax rules, e.g. Python and Haskell. This module exports combinators to create such parsers.

The input source can be thought of as a list of tokens. Abstractly, each token occurs at a line and a column and has a width. The column number of a token measures its indentation. If t1 and t2 are two tokens, then we say that the indentation of t1 is more than that of t2 if the column number at which t1 occurs is greater than that of t2.
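The token model described above can be sketched in plain Haskell. This is only an illustration: the type Tok, its fields and the function moreIndented are hypothetical names that the module does not export.

```haskell
-- Hypothetical token model; Tok, its fields and moreIndented are
-- illustrative names, not part of IndentParser.
data Tok = Tok
  { tokLine   :: Int     -- line on which the token occurs
  , tokColumn :: Int     -- column at which the token starts (its indentation)
  , tokWidth  :: Int     -- number of columns the token occupies
  , tokText   :: String  -- the text of the token
  } deriving (Show, Eq)

-- t1 is indented more than t2 when its column number is greater.
moreIndented :: Tok -> Tok -> Bool
moreIndented t1 t2 = tokColumn t1 > tokColumn t2

main :: IO ()
main = do
  let kw   = Tok 1 1 5 "where"
      bind = Tok 2 3 1 "x"
  print (moreIndented bind kw)  -- True
  print (moreIndented kw bind)  -- False
```

Only the column takes part in the comparison; the line and the width are carried along because the definitions of blocks and line folds refer to them.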

Currently this module supports two kinds of indentation based syntactic structures, which we now describe:

Block: A block of indentation c is a sequence of tokens with indentation at least c. An example of a block is a where clause of Haskell with no explicit braces.
Line fold: A line fold starting at line l and indentation c is a sequence of tokens that starts at line l and possibly continues on subsequent lines as long as the indentation is greater than c. Such a sequence of lines needs to be folded into a single line. An example is MIME headers. Line folding based binding separation is used in Haskell as well.

Warning:

Internally, indentation is implemented using parser states. If you want to use parser states as well, use the getState and setState functions exported by this module instead of those exported by the Parsec library. Also use the parseTest and runParser functions exported by this module instead of the ones exported by Parsec.

Synopsis

type IndentParser tok st a = GenParser tok (IndentState st) a

indentParser :: IndentParser tok st a -> IndentParser tok st a

noIndent :: IndentParser tok st a -> IndentParser tok st a

block :: IndentParser tok st a -> IndentParser tok st a

lineFold :: IndentParser tok st a -> IndentParser tok st a

betweenOrBlock :: IndentParser tok st open -> IndentParser tok st close -> IndentParser tok st a -> IndentParser tok st a

betweenOrLineFold :: IndentParser tok st open -> IndentParser tok st close -> IndentParser tok st a -> IndentParser tok st a

data IndentMode

= NoIndent

| Block Column

| LineFold Line Column

data IndentState st

state :: IndentState st -> st

indentMode :: IndentState st -> IndentMode

saveIndentMode :: IndentParser tok st a -> IndentParser tok st a

getIndentMode :: IndentParser tok st IndentMode

setIndentMode :: IndentMode -> IndentParser tok st ()

getState :: IndentParser tok st st

setState :: st -> IndentParser tok st ()

runParser :: IndentParser tok st a -> st -> IndentMode -> SourceName -> [tok] -> Either ParseError a

parseTest :: Show a => IndentParser tok () a -> [tok] -> IO ()

type IndentParser tok st a = GenParser tok (IndentState st) a

An indentation aware parser. A parser must be of this type to be able to parse indentation based grammatical structures.

Parser Combinators.

The three main combinators exported by this module are indentParser, block and lineFold. To construct parsers for indentation based grammars, one typically applies indentParser to all tokenisers. In conjunction with the Text.ParserCombinators.Parsec.Token module, one would want to apply indentParser to all the fields of the Text.ParserCombinators.Parsec.Token.TokenParser record except Text.ParserCombinators.Parsec.Token.whiteSpace. A block can then be parsed using the combinator block, and a line fold using lineFold. To generate an indentation aware tokeniser from the corresponding Text.ParserCombinators.Parsec.Language.LanguageDef record, see the module Text.ParserCombinators.Parsec.IndentToken.

indentParser :: IndentParser tok st a -> IndentParser tok st a

The combinator indentParser makes its input parser indentation aware. Usually one would want to make all the tokenisers indentation aware.

noIndent :: IndentParser tok st a -> IndentParser tok st a

The parser noIndent p runs p ignoring any indentation based structure. This can be used to parse, for example, an explicitly braced where clause in Haskell.

block :: IndentParser tok st a -> IndentParser tok st a

The parser block p parses a block of p with the block indentation being the current column number.

lineFold :: IndentParser tok st a -> IndentParser tok st a

The parser lineFold p parses a folded line of p. The current line is the starting line. The indentation of the line depends on where in the source we are. If we are in a block then the indentation is the indentation of the block. Otherwise the current column is the indentation.
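The rules by which block and lineFold choose their indentation can be written down as pure functions. The sketch below is a model of the descriptions above, not the library's code: IndentMode is redefined locally so the example compiles on its own, and blockMode and foldMode are hypothetical helper names.

```haskell
type Line   = Int
type Column = Int

-- Local copy of the documented constructors, so that the sketch is
-- self-contained.
data IndentMode
  = NoIndent
  | Block Column
  | LineFold Line Column
  deriving (Show, Eq)

-- Hypothetical helper: a block takes the current column as its indentation.
blockMode :: Column -> IndentMode
blockMode col = Block col

-- Hypothetical helper: a line fold starts at the current line. Inside a
-- block its indentation is that of the block; otherwise it is the current
-- column.
foldMode :: IndentMode -> Line -> Column -> IndentMode
foldMode (Block c) line _   = LineFold line c
foldMode _         line col = LineFold line col

main :: IO ()
main = do
  print (blockMode 5)             -- Block 5
  print (foldMode (Block 3) 10 7) -- LineFold 10 3
  print (foldMode NoIndent 10 7)  -- LineFold 10 7
```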

betweenOrBlock :: IndentParser tok st open -> IndentParser tok st close -> IndentParser tok st a -> IndentParser tok st a

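The parser betweenOrBlock open close p parses p between open and close. If open is matched, p is parsed in NoIndent mode; otherwise block p is parsed in Block mode. For example, a parser for a Haskell where clause might look like the following, assuming that reserved and symbol come from an indentation aware token parser and that bindings is a parser for the bindings:

 whereClause = do reserved "where"; betweenOrBlock (symbol "{") (symbol "}") bindings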

betweenOrLineFold :: IndentParser tok st open -> IndentParser tok st close -> IndentParser tok st a -> IndentParser tok st a

Similar to betweenOrBlock but uses lineFold instead of block.

Primitive Parsers

We now describe the primitives that are used to build the combinators block, noIndent and lineFold. An indentation parser can be in one of the following modes:

NoIndent: In this mode the parser ignores all indentation constraints. All tokens regardless of their indentation are accepted.
Block c: In this mode a parser accepts only tokens which have indentation at least c. A parser parsing a block of indentation c will be in this mode.
LineFold l c: In this mode a parser accepts a token as long as it is on line l or is indented more than c. When parsing a folded line starting at line l with indentation c, the parser will be in this mode.
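The three modes amount to a predicate on the position of a token. The following is a minimal sketch of that predicate under the descriptions above; accepts is a hypothetical name and is not exported by this module, and IndentMode is redefined locally so the example is self-contained.

```haskell
type Line   = Int
type Column = Int

-- Local copy of the documented constructors.
data IndentMode
  = NoIndent
  | Block Column
  | LineFold Line Column
  deriving (Show, Eq)

-- Hypothetical predicate: is a token at the given line and column accepted
-- in the given mode?
accepts :: IndentMode -> Line -> Column -> Bool
accepts NoIndent       _    _   = True                -- no constraint
accepts (Block c)      _    col = col >= c            -- at least c
accepts (LineFold l c) line col = line == l || col > c -- on line l, or deeper than c

main :: IO ()
main = do
  print (accepts NoIndent 7 1)        -- True
  print (accepts (Block 5) 3 5)       -- True
  print (accepts (Block 5) 3 4)       -- False
  print (accepts (LineFold 2 1) 2 1)  -- True: still on the starting line
  print (accepts (LineFold 2 1) 3 4)  -- True: continuation line, indented more
  print (accepts (LineFold 2 1) 3 1)  -- False: the fold has ended
```

Note the asymmetry: a block accepts tokens at exactly column c, whereas a continuation line of a fold must be indented strictly more than c.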

data IndentMode

Constructors

NoIndent
Block Column
LineFold Line Column

Instances

Show IndentMode

data IndentState st

The parser state used by Indentation Parsers.

state :: IndentState st -> st

indentMode :: IndentState st -> IndentMode

saveIndentMode :: IndentParser tok st a -> IndentParser tok st a

The parser saveIndentMode p saves the current indentation mode, runs p and returns its result. It restores the old indentation mode once p has finished executing.
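The save and restore behaviour can be illustrated with a small model in which a parser is just a function on the indentation state. This is a sketch under simplifying assumptions: the constructor name IndentState, the type Step and the functions saveMode and enterBlock are hypothetical, and real parsers can also fail, which the model ignores.

```haskell
type Line   = Int
type Column = Int

data IndentMode
  = NoIndent
  | Block Column
  | LineFold Line Column
  deriving (Show, Eq)

-- Assumed shape of the state, based on the documented accessors state and
-- indentMode; the constructor name is a guess.
data IndentState st = IndentState
  { state      :: st
  , indentMode :: IndentMode
  } deriving Show

-- Hypothetical model of a parser: a function that threads the state.
type Step st a = IndentState st -> (a, IndentState st)

-- Model of saveIndentMode: run p, then put back the mode seen on entry.
saveMode :: Step st a -> Step st a
saveMode p s0 =
  let (x, s1) = p s0
  in  (x, s1 { indentMode = indentMode s0 })

-- A step that switches to Block 4 and leaves the mode changed.
enterBlock :: Step st String
enterBlock s = ("parsed", s { indentMode = Block 4 })

main :: IO ()
main = do
  let s0 = IndentState { state = (), indentMode = NoIndent }
  print (indentMode (snd (enterBlock s0)))           -- Block 4
  print (indentMode (snd (saveMode enterBlock s0)))  -- NoIndent
  print (fst (saveMode enterBlock s0))               -- "parsed"
```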

getIndentMode :: IndentParser tok st IndentMode

This parser returns the current indentation mode.

setIndentMode :: IndentMode -> IndentParser tok st ()

This parser sets the current indentation mode.

User state manipulation.

Indentation awareness is built into indentation parsers by using parser states. To distinguish this state from the actual user defined state, we call the former the indentation state and the latter the user state.

getState :: IndentParser tok st st

Gets the current user state. Use this instead of the one exported by the Parsec library.

setState :: st -> IndentParser tok st ()

This parser sets the current user state to the given input state. Use this function instead of the one exported by the Parsec library.
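The reason for the dedicated getState and setState can be seen in a small model of the combined state. The sketch assumes that IndentState is a record with the documented accessors state and indentMode; the constructor name and the functions getUser and setUser are hypothetical stand-ins for what the exported getState and setState are described as doing.

```haskell
type Line   = Int
type Column = Int

data IndentMode
  = NoIndent
  | Block Column
  | LineFold Line Column
  deriving (Show, Eq)

-- Assumed shape of the combined state: the user state plus the
-- indentation mode.
data IndentState st = IndentState
  { state      :: st
  , indentMode :: IndentMode
  } deriving Show

-- Model of getState: read only the user part.
getUser :: IndentState st -> st
getUser = state

-- Model of setState: replace only the user part and keep the mode.
setUser :: st -> IndentState st -> IndentState st
setUser u s = s { state = u }

main :: IO ()
main = do
  let s0 = IndentState { state = 0 :: Int, indentMode = Block 2 }
      s1 = setUser 42 s0
  print (getUser s1)     -- 42
  print (indentMode s1)  -- Block 2
```

Parsec's own setState would replace the whole IndentState value, indentation mode included, which is why the functions exported by this module should be used instead.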

Testing and Running.

runParser

:: IndentParser tok st a	the parser to run
-> st	the initial state
-> IndentMode	the indentation mode
-> SourceName	the source file name
-> [tok]	the list of tokens
-> Either ParseError a	the result of parsing

The most generic way to run an IndentParser. Use parseTest for testing your parser instead.

parseTest :: Show a => IndentParser tok () a -> [tok] -> IO ()

This function is analogous to parseTest of the Parsec library. Given an indent parser p :: IndentParser tok () a and a list of tokens, it runs the parser and prints the result.
