- The Classes Defining the Interface
- The type describing parsers:
`P`

- Additional useful combinators
- Controlling the text of error reporting:
`<?>`

- An alternative for the Alternative, which is greedy:
`<<|>`

- Parsers can be disambiguated using micro-steps:
`micro`

- Dealing with (non-empty) Ambigous parsers:
`amb`

- Parse errors can be retreived from the state:
`pErrors`

- Starting and finalising the parsing process:

and`pEnd`

`parse`

- The state may be temporarily change type:
`pSwitch`

- A more efficient version for

from the module`<$`

`Control.Applicative`

- Controlling the text of error reporting:
- Maintaining Progress Information
- Auxiliary functions and types
- Checking for non-sensical combinations:

and`must_be_non_empty`

`must_be_non_empties`

- The type

for describing the minimal number of tokens consumed`Nat`

- Checking for non-sensical combinations:

The module `Core`

contains the basic functionality of the parser library.
It uses the breadth-first module to realise online generation of results, the error
correction administration, dealing with ambigous grammars; it defines the types of the elementary parsers
and recognisers involved.For typical use cases of the libray see the module `Text.ParserCombinators.UU.Examples`

- class (Applicative p, ExtApplicative p, Alternative p) => IsParser p
- class ExtApplicative p where
- (<$) :: a -> p b -> p a

- class Symbol p symbol token | p symbol -> token where
- class Provides state symbol token | state symbol -> token where
- splitState :: symbol -> (token -> state -> Steps a) -> state -> Steps a

- class Eof state where
- eof :: state -> Bool
- deleteAtEnd :: state -> Maybe (Cost, state)

- data P st a = P (forall r. (a -> st -> Steps r) -> st -> Steps r) (forall r. (st -> Steps r) -> st -> Steps (a, r)) (forall r. (st -> Steps r) -> st -> Steps r) Nat (Maybe a)
- (<?>) :: P state a -> String -> P state a
- amb :: P st a -> P st [a]
- class Stores state error | state -> error where
- getErrors :: state -> ([error], state)

- pErrors :: Stores st error => P st [error]
- pEnd :: (Stores st error, Eof st) => P st [error]
- parse :: Eof t => P t a -> t -> a
- pSwitch :: (st1 -> (st2, st2 -> st1)) -> P st2 a -> P st1 a
- type Cost = Int
- type Progress = Int
- type Strings = [String]
- data Steps a where
- eval :: Steps a -> a
- push :: v -> Steps r -> Steps (v, r)
- apply :: Steps (b -> a, (b, r)) -> Steps (a, r)
- pushapply :: (b -> a) -> Steps (b, r) -> Steps (a, r)
- norm :: Steps a -> Steps a
- best :: Steps a -> Steps a -> Steps a
- best' :: Steps b -> Steps b -> Steps b
- getCheapest :: Int -> [(Int, Steps a)] -> Steps a
- traverse :: Int -> Steps a -> Int -> Int -> Int
- removeEnd_h :: Steps (a, r) -> Steps r
- removeEnd_f :: Steps r -> Steps [r]
- must_be_non_empty :: [Char] -> P t t1 -> t2 -> t2
- must_be_non_empties :: [Char] -> P t1 t -> P t3 t2 -> t4 -> t4
- data Nat
- module Control.Applicative

# The Classes Defining the Interface

`IsParser`

class (Applicative p, ExtApplicative p, Alternative p) => IsParser p Source

This class collects a number of classes which together defines what a `Parser`

should provide.
Since it is just a predicate we have prefixed the name by the phrase `Is`

(Applicative p, ExtApplicative p, Alternative p) => IsParser p |

`ExtApplicative`

class ExtApplicative p whereSource

The module Control.Applicative contains the definition for `<$`

which cannot be changed .
Since we want to give optimised implementations of this combinator, we hide its definition, and define a class containing its signature.

ExtApplicative (P st) | In the new module |

`Symbol`

class Symbol p symbol token | p symbol -> token whereSource

Many parsing libraries do not make a distinction between the terminal symbols of the language recognised and the tokens actually constructed from the input. This happens e.g. if we want to recognise an integer or an identifier: we are also interested in which integer occurred in the input, or which identifier.

The function `pSym`

takes as argument a value of some type `symbol`

, and returns a value of type `token`

. The parser will in general depend on some
state which is maintained holding the input. The functional dependency fixes the `token`

type, based on the `symbol`

type and the type of the parser `p`

.
Since `pSym`

is overloaded both the type and the value of symbol determine how to decompose the input in a `token`

and the remaining input.
`pSymExt`

is the actual function, which takes two extra parameters: one describing the minimal numer of tokens recognised,
and the second whether the symbol can recognise the empty string and the value which is to be returned in that case

pSymExt :: Nat -> Maybe token -> symbol -> p tokenSource

The first parameter to `pSymExt`

is a `Nat`

which describes the minimal numer of tokens accepted by this parser. It is used in the abstract interpretation
which computes this property for each parser. It's main use is in choosinga non-recursive alternative in case a non-terminal has to be inserted.
The second parameter indicates whether this parser can also skip recognising anything and just return a value of type a, hence a `Maybe a`

`Provides`

class Provides state symbol token | state symbol -> token whereSource

The function `splitStae`

playes a crucial role in splitting up the state. The `symbol`

parameter tells us what kind of thing, and even which value of that kind, is expected from the input.
The state and and the symbol type together determine what kind of token has to be returned. Since the function is overloaded we do not have to invent
all kind of different names for our elementary parsers.

splitState :: symbol -> (token -> state -> Steps a) -> state -> Steps aSource

`Eof`

# The type describing parsers: `P`

`P`

P (forall r. (a -> st -> Steps r) -> st -> Steps r) (forall r. (st -> Steps r) -> st -> Steps (a, r)) (forall r. (st -> Steps r) -> st -> Steps r) Nat (Maybe a) |

Monad (P st) | |

Functor (P state) | |

MonadPlus (P st) | |

Applicative (P state) | |

Alternative (P state) | |

ExtApplicative (P st) | In the new module |

Provides state symbol token => Symbol (P state) symbol token |

## Parsers are functors: `fmap`

`fmap`

## Parsers are Applicative: `<*>`

, `<*`

, `*>`

and `pure`

`<*>`

`<*`

`*>`

`pure`

## Parsers are Alternative: `<|>`

and `empty`

`<|>`

`empty`

## Parsers can recognise single tokens: `pSym`

and `pSymExt`

`pSym`

`pSymExt`

## Parsers are Monads: `>>=`

and `return`

`>>=`

`return`

# Additional useful combinators

## Controlling the text of error reporting: `<?>`

`<?>`

(<?>) :: P state a -> String -> P state aSource

The parsers build a list of symbols which are expected at a specific point.
This list is used to report errors.
Quite often it is more informative to get e.g. the name of the non-terminal.
The

combinator replaces this list of symbols by it's righ-hand side argument.
`<?>`

## An alternative for the Alternative, which is greedy: `<<|>`

`<<|>`

## Parsers can be disambiguated using micro-steps: `micro`

`micro`

## Dealing with (non-empty) Ambigous parsers: `amb`

`amb`

## Parse errors can be retreived from the state: `pErrors`

`pErrors`

class Stores state error | state -> error whereSource

`getErrors`

retreives the correcting steps made since the last time the function was called. The result can,
using a monad, be used to control how to-- proceed with the parsing process.

## Starting and finalising the parsing process: `pEnd`

and `parse`

`pEnd`

`parse`

pEnd :: (Stores st error, Eof st) => P st [error]Source

The function `pEnd`

should be called at the end of the parsing process. It deletes any unsonsumed input, and reports its preence as an eror.

## The state may be temporarily change type: `pSwitch`

`pSwitch`

pSwitch :: (st1 -> (st2, st2 -> st1)) -> P st2 a -> P st1 aSource

`pSwitch`

takes the current state and modifies it to a different type of state to which its argument parser is applied.
The second component of the result is a function which converts the remaining state of this parser back into a valuee of the original type.

## A more efficient version for `<$`

from the module `Control.Applicative`

`<$`

`Control.Applicative`

# Maintaining Progress Information

The data type

is the core data type around which the parsers are constructed.
It is a describes a tree structure of streams containing (in an interleaved way) both the online result of the parsing process,
and progress information. Recognising an input token should correspond to a certain amount of `Steps`

,
which tells how much of the input state was consumed.
The `Progress`

is used to implement the breadth-first search process, in which alternatives are
examined in a more-or-less synchonised way. The meaning of the various `Progress`

constructors is as follows:
`Step`

`Step`

- A token was succesfully recognised, and as a result the input was
`advanced`

by the distance`Progress`

`Apply`

- The type of value represented by the
`Steps`

changes by applying the function parameter. `Fail`

- A correcting step has to made to the input; the first parameter contains information about what was expected in the input,
and the second parameter describes the various corrected alternatives, each with an associated
`Cost`

`Micro`

- A small cost is inserted in the sequence, which is used to disambiguate. Use with care!

The last two alternatives play a role in recognising ambigous non-terminals. For a full description see the technical report referred to from the README file..

removeEnd_h :: Steps (a, r) -> Steps rSource

removeEnd_f :: Steps r -> Steps [r]Source

# Auxiliary functions and types

## Checking for non-sensical combinations: `must_be_non_empty`

and `must_be_non_empties`

`must_be_non_empty`

`must_be_non_empties`

must_be_non_empty :: [Char] -> P t t1 -> t2 -> t2Source

The function checks wehther its second argument is a parser which can recognise the mety sequence. If so an error message is given using the name of the context. If not then the third argument is returned. This is useful in testing for loogical combinations. For its use see the module Text>parserCombinators.UU.Derived

must_be_non_empties :: [Char] -> P t1 t -> P t3 t2 -> t4 -> t4Source

This function is similar to the above, but can be used in situations where we recognise a sequence of elements separated by other elements. This does not make sense if both parsers can recognise the empty string. Your grammar is then highly ambiguous.

## The type `Nat`

for describing the minimal number of tokens consumed

`Nat`

The data type

is used to represent the minimal length of a parser.
Care should be taken in order to not evaluate the right hand side of the binary functions `Nat`

and `nat_min`

``nat-add``

more than necesssary.

module Control.Applicative