luthor-0.0.2: Tools for lexing and utilizing lexemes that integrate with Parsec.

Safe HaskellSafe
LanguageHaskell98

Text.Luthor.Indent

Contents

Description

Provides a set of tools built atop Parsec's user state mechanism to aid in building indentation-sensitive parsers. Also redefines some familiar functions to hide the state tracking from consideration.

The indentation state tracks a stack of indentation depth. It is configured to know about what characters are allowed as indentation and how to count them. It should also be configured with a list of linear whitespace parsers (this including comments). With those, the algorithms in this module will be able to skip over blank lines. Indentation may also be enabled/disabled, such as when parsing between parens or braces.

WARNING: do not attempt to build an indentation-sensitive lexer using Lex and this module. It is fundamentally broken, and I don't want to sink that much time into figuring out what's wrong with it.

Synopsis

Types

type ParsecIT s u m = ParsecT s (u, IndentState s u m) m Source

Type for Parsec parsers tracking indentation.

type ParsecI s u = Parsec s (u, IndentState s u Identity) Source

ParsecIT over the identity monad.

data IndentState s u m Source

Opaque type tracking indentation state.

data IndentPolicy Source

Determine how the depth of indentation is calculated.

Constructors

DontMix [Char]

Any of the passed Chars can be used, but allow only one kind of character in a line. Depth is number of those characters.

Convert [(Char, Int)]

Allow any mix of of the passed Chars. Calculate depth by assigning a number to each of character by kind and summing.

Run Indentation-sensitive Parsers

runParserIT Source

Arguments

:: Stream s m Char 
=> ParsecIT s u m a

the parser to run

-> IndentPolicy

what characters count as leading space and how they should be counted

-> [ParsecIT s u m ()]

a list of linear whitespace parsers

-> u

an initial user state

-> SourceName

name of the source file from which the input was gathered

-> s

input stream

-> m (Either ParseError a) 

The most general way to run a parser. runParserIT p state filePath input runs parser p on the input list of tokens input, obtained from source filePath with the initial user state st. Indentation is initially enabled and the depth stack begins as [0]. The filePath is only used in error messages and may be the empty string. Returns a computation in the underlying monad m that return either a ParseError (Left) or a value of type a (Right).

runParserI Source

Arguments

:: Stream s Identity Char 
=> ParsecI s u a

the parser to run

-> IndentPolicy

what characters count as leading space and how they should be counted

-> [ParsecI s u ()]

a list of linear whitespace parsers

-> u

an initial user state

-> SourceName

name of the source file from which the input was gathered

-> s

input stream

-> Either ParseError a 

As runParserIT, but over the Identity monad.

runPIT :: Stream s m Char => ParsecIT s u m a -> IndentPolicy -> [ParsecIT s u m ()] -> u -> SourceName -> s -> m (Either ParseError a) Source

Shortcut for runParserIT

runPI :: Stream s Identity Char => ParsecI s u a -> IndentPolicy -> [ParsecI s u ()] -> u -> SourceName -> s -> Either ParseError a Source

Shortcut for runParserI

Parse Indentation

plusBlankline :: Stream s m Char => [ParsecIT s u m ()] -> ParsecIT s u m () Source

Take a list of some linear whitespace tokens and return a parser that advances over linear whitespace and blank lines. Also, when indentation is disabled, also advance over lineBreaks.

You will almost always want to use this combinator before indent, nextline and dedent to make sure indentation is always detected. Actual linear whitespace is usually not enough: remember to add parsers for comments as well. Line comments shouldn't eat the newline; the lineComment combinator is acceptable.

indent :: Stream s m Char => ParsecIT s u m () Source

Parse an indent: as dentation ensuring the result is greater than the current indentation level. Pushes the indentation depth stack.

nextline :: Stream s m Char => ParsecIT s u m () Source

Parse an indent: as dentation ensuring the result is equal to the current indentation level.

dedent :: Stream s m Char => ParsecIT s u m () Source

Parse an indent: as dentation ensuring the result is less than the current indentation level. Pops the indentation depth stack. Consumes no input, thus multiple dedents might be parsed at a single position.

dedent' :: Stream s m Char => ParsecIT s u m () Source

As dedent, but also consume the nextline. The leading whitespace is left intact.

startIndent :: Stream s m Char => IndentPolicy -> [ParsecIT s u m ()] -> IndentState s u m Source

Create a starting IndentationState: indentation is initially enabled and the indentation depth stack starts with [0].

endIndent :: Stream s m Char => ParsecIT s u m () Source

Succeed only when the indentation stack is suitably empty: is empty or equal to [0], or if indentation is disabled.

Read/Write Indentation State

isIndentEnabled :: Stream s m t => ParsecIT s u m Bool Source

Test if indentation is enabled.

getIndentDepth :: Stream s m t => ParsecIT s u m Int Source

Obtain the current indentation depth. Fails if indentation is disabled.

withIndentation :: Stream s m t => ParsecIT s u m a -> ParsecIT s u m a Source

Run the passed parser with indentation enabled.

withoutIndentation :: Stream s m t => ParsecIT s u m a -> ParsecIT s u m a Source

Run the passed parser with indentation disabled.

State Manipulation

getState :: Monad m => ParsecIT s u m u Source

Alternate version of Parsec's getState suited for indentation-sensitive parsers.

putState :: Monad m => u -> ParsecIT s u m () Source

Alternate version of Parsec's puttState suited for indentation-sensitive parsers.

modifyState :: Monad m => (u -> u) -> ParsecIT s u m () Source

Alternate version of Parsec's modifyState suited for indentation-sensitive parsers.

peekIndentation :: Monad m => ParsecIT s u m Int Source

Peek the top of the depth stack.

popIndentation :: Monad m => ParsecIT s u m Int Source

Pop the top of the depth stack and return the popped depth.

pushIndentation :: Monad m => Int -> ParsecIT s u m () Source

Push to the top of the depth stack.

Re-exports

data ParseError :: *

The abstract data type ParseError represents parse errors. It provides the source position (SourcePos) of the error and a list of error messages (Message). A ParseError can be returned by the function parse. ParseError is an instance of the Show and Eq classes.

errorPos :: ParseError -> SourcePos

Extracts the source position from the parse error

data SourcePos :: *

The abstract data type SourcePos represents source positions. It contains the name of the source (i.e. file name), a line number and a column number. SourcePos is an instance of the Show, Eq and Ord class.

type Line = Int

type Column = Int

sourceName :: SourcePos -> SourceName

Extracts the name of the source from a source position.

sourceLine :: SourcePos -> Line

Extracts the line number from a source position.

sourceColumn :: SourcePos -> Column

Extracts the column number from a source position.