 | IndentParser-0.1: Combinators for parsing indentation based syntatic structures | Contents | Index |
|
| Text.ParserCombinators.Parsec.IndentToken |
|
|
|
|
| Description |
A module for constructing indentation aware tokeniser that can be used
in conjuction with Token. All the
combinator takes a TokenParser
as its first argument. For every field foo of
TokenParser this module
exports a combinator foo. To define a tokeniser for an indentation
based language a user first defines the appropriate
LanguageDef record, applies
the combinator makeTokenParser
to get a TokenParser record
say tokP and then, instead of selecting the field foo of tokP,
applies the combinator foo exported from this module to tokP. The
semantics of the combinator foo is essentially same as that of the
field foo of TokenParser but
the returned parsers are indentation aware. Apart from these there are
certain new combinators that are defined specifically for parsing
certain indentation based syntactic constructs. (We have not defined
squares use brackets instead)
There are two important classes of parser combinator exported by this
module:
- Grouping Parser Combinator
- A grouping parser combinator takes as
input a parser say p and returns a parser that parses p between
two grouping delimiters. There are three flavours of grouping
parsers: foo, fooOrBlock and fooOrLineFold where foo can be
one of angles, braces, parens, brackets or To
illustrate we take foo to be braces. The parser braces tokP p
parses p delimited by '{' and '}'. In this case p does not care
about indentation (i.e. the parser p is run in
NoIndent mode). The
parser bracesOrBlock tokP p is like braces tokP p but if no
explicit delimiting braces are given parses p within an indented
block. Similarly bracesOrLineFold tokP p parses p between '{'
and '}' and uses line fold when no explicit braces are given. These
can be two varients can be defined as follows
bracesOrBlock tokP p = braces tokP $ noIndent p <|> block p
bracesOrLineFold tokP p = braces tokP $ noIndent p <|> lineFold p
- Seperator Parser Combinator
- A seperator parser combinator takes as
input a parser say p and returns a parser that parses a list of p
seperated by a seperator. The module exports the combinators fooSep,
fooSep1, fooOrNewLineSep and fooOrNewLineSep1, where foo is
either semi (in which case the seperator is a semicolon ';') or
comma (in which case the seperator is a comma ',').
To illustrate the use of this module we now give, as an incomplete
example, a parser that parses a where clause in Haskell which
illustrates the use of this module.
import qualified Text.ParserCombinators.Parsec.Language as L
import qualified Text.ParserCombinators.Parsec.Toke as T
import qualified Text.ParserCombinator.Parsec.IndentToken as IT
tokP = T.makeTokenParser L.haskellDef
semiOrNewLineSep = IT.semiOrNewLineSep tokP
bracesOrBlock = IT.bracesOrBlock tokP
identifier = IT.identifier tokP
....
symbol = IT.symbol tokP
binding = semiOrNewLineSep bind
bind = do id <- identifier
symbol (char '=')
e <- expr
return (id,e)
whereClause = do reserved "where"; braceOrBlock binding
|
|
| Synopsis |
|
|
|
|
| Types
|
|
| type IndentCharParser st a = IndentParser Char st a |
|
| type LanguageDef st = LanguageDef (IndentState st) |
|
| type TokenParser st = TokenParser (IndentState st) |
|
| Combinators
|
|
| identifier :: TokenParser st -> IndentCharParser st String |
| Indentation aware parser to match a valid identifier of the
language.
|
|
| reserved |
|
|
| operator :: TokenParser st -> IndentCharParser st String |
| The parser reserved tokP keyword parses the reserved word keyword.
The string keyword should have been declared as a reserved word in the
LanguageDef
-record.
|
|
| reservedOp |
| :: TokenParser st | | | -> String | The reserved operator to be matched. Should have
been declared as a reserved operator in the
LanguageDef
record.
| | -> IndentCharParser st () | | | Indentation aware parser to match a reserved operator of the language.
|
|
|
| charLiteral :: TokenParser st -> IndentCharParser st Char |
| Indentation aware parser to match a character literal (the syntax
is assumend to be that of Hasekell which matches that of most
programming language).
|
|
| stringLiteral :: TokenParser st -> IndentCharParser st String |
| Indentation aware parser to match a string literal (the syntax is
assumend to be that of Hasekell which matches that of most programming
language).
|
|
| natural :: TokenParser st -> IndentCharParser st Integer |
| Indentation aware parser to match a natural number.
|
|
| integer :: TokenParser st -> IndentCharParser st Integer |
| Indentation aware parser to match an integer.
|
|
| float :: TokenParser st -> IndentCharParser st Double |
| Indentation aware parser to match a floating point number.
|
|
| naturalOrFloat :: TokenParser st -> IndentCharParser st (Either Integer Double) |
| Indentation aware parser to match either a natural number or
Floating point number.
|
|
| decimal :: TokenParser st -> IndentCharParser st Integer |
| Indentation aware parser to match an integer in decimal.
|
|
| hexadecimal :: TokenParser st -> IndentCharParser st Integer |
| Indentation aware parser to match an integer in hexadecimal.
|
|
| octal :: TokenParser st -> IndentCharParser st Integer |
| Indentation aware parser to match an integer in ocatal.
|
|
| semi :: TokenParser st -> IndentCharParser st String |
| Matches a semicolon and returns ';'.
|
|
| colon :: TokenParser st -> IndentCharParser st String |
| Matches a colon and returns :.
|
|
| dot :: TokenParser st -> IndentCharParser st String |
| Matches a dot and returns .
|
|
| lexeme |
|
|
| symbol :: TokenParser st -> String -> IndentCharParser st String |
| Indentation aware parser that is equvalent to string str.
|
|
| whiteSpace :: TokenParser st -> IndentCharParser st () |
| The parser whiteSpace skips spaces and comments. This does not
care about indentation as skipping spaces should be done irrespective
of the indentation.
|
|
| Separator parser combinators
|
|
| semiSep :: TokenParser st -> IndentCharParser st a -> IndentCharParser st [a] |
| Given an indentation aware parser p as argument semiSep tokP
returns a parser that parses zero or more occurances of p
seperated by semicolon (';')
|
|
| semiSep1 |
| :: TokenParser st | | | -> IndentCharParser st a | The input Parser
| | -> IndentCharParser st [a] | | | Given an indentation aware parser p as argument semiSep1 tokP
returns a parser that parses one or more occurances of p seperated
by semicolon (';')
|
|
|
| commaSep |
| :: TokenParser st | | | -> IndentCharParser st a | The input Parser
| | -> IndentCharParser st [a] | | | Given an indentation aware parser p as argument commaSep tokP
returns a parser that parses zero or more occurances of p
seperated by comma (',')
|
|
|
| commaSep1 |
| :: TokenParser st | | | -> IndentCharParser st a | The input Parser
| | -> IndentCharParser st [a] | | | Given an indentation aware parser p as argument commaSep1 tokP
returns a parser that parses one or more occurances of p seperated
by comma (',')
|
|
|
| semiOrNewLineSep |
| :: TokenParser st | | | -> IndentCharParser st a | The input Parser
| | -> IndentCharParser st [a] | | | Given an indentation aware parser p as argument semiOrNewLineSep
tokP returns a parser that parses zero or more occurances of p
seperated by either semicolons (';') or newlines. To seperate multiple
occurance of p in the same line use an explicit semicolon (';').
|
|
|
| semiOrNewLineSep1 |
| :: TokenParser st | | | -> IndentCharParser st a | The input Parser
| | -> IndentCharParser st [a] | | | Given an indentation aware parser p as argument semiOrNewLineSep1
tokP returns a parser that parses one or more occurances of p
seperated by either semicolons (';') or newline. To seperate multiple
occurance of p in the same line use an explicit semicolon (';').
|
|
|
| commaOrNewLineSep |
| :: TokenParser st | | | -> IndentCharParser st a | The input Parser
| | -> IndentCharParser st [a] | | | Given an indentation aware parser p as argument commaOrNewLineSep
tokP returns a parser that parses zero or more occurances of p
seperated by either comma (',') or newlines. To seperate multiple
occurance of p in the same line use an explicit comma (',').
|
|
|
| commaOrNewLineSep1 |
| :: TokenParser st | | | -> IndentCharParser st a | The input Parser
| | -> IndentCharParser st [a] | | | Given an indentation aware parser p as argument commaOrNewLineSep1
tokP returns a parser that parses one or more occurances of p
seperated by either comma (',') or newline. To seperate multiple
occurance of p in the same line use an explicit comma (',').
|
|
|
| Grouping parser combinator
|
|
| parens |
|
|
| parensOrBlock |
|
|
| parensOrLineFold |
|
|
| braces |
|
|
| bracesOrBlock |
|
|
| bracesOrLineFold |
|
|
| angles |
|
|
| anglesOrBlock |
|
|
| anglesOrLineFold |
|
|
| brackets |
|
|
| bracketsOrBlock |
|
|
| bracketsOrLineFold |
|
|
| Produced by Haddock version 0.8 |