Safe Haskell | None |
---|
In this tutorial you will learn how to use this library. The Simple example section should be enough to get you going, but you can keep reading if you want to better understand how to deal with complex parsing scenarios.
You may import this module and try the subsequent examples as you go.
- data Name = Name Text
- hello :: Parser Name
- input1 :: [Text]
- input2 :: [Text]
- helloPipe1 :: (Proxy p, Monad m) => () -> Pipe p Text Name m r
- helloPipe2 :: (Proxy p, Monad m) => () -> Pipe p Text Name m r
- helloPipe3 :: (Proxy p, Monad m) => () -> Pipe (EitherP BadInput p) Text Name m r
- helloPipe4 :: (Proxy p, Monad m) => () -> Pipe (EitherP BadInput p) Text Name m r
- helloPipe5 :: (Proxy p, Monad m) => () -> Pipe (EitherP BadInput p) Text Name m r
- helloPipe6 :: (Proxy p, Monad m) => () -> Pipe p Text Name m r
- skipPartialResults :: (Monad m, Proxy p, AttoparsecInput a) => ParserStatus a -> p (ParserStatus a) (ParserSupply a) (ParserStatus a) (ParserSupply a) m r
Simple example
We'll write a simple Parser
that turns Text
like “Hello John Doe.”
into
, and then make a Name
"John Doe"Pipe
that turns
those Text
values flowing downstream into Name
values flowing
downstream using that Parser
.
In this example we are using Text
, but we may as well use
ByteString
. Also, the OverloadedStrings
language
extension lets us write our parser easily.
{-# LANGUAGE OverloadedStrings #-} import Control.Proxy import Control.Proxy.Attoparsec import Control.Proxy.Trans.Either import Data.Attoparsec.Text import Data.Text data Name = Name Text deriving (Show) hello :: Parser Name hello = fmap Name $ "Hello " .*> takeWhile1 (/='.') <*. "."
We are done with our parser, now lets make a simple parsing Pipe
with it.
helloPipe1 :: (Proxy p, Monad m) => () -> Pipe p Text Name m r helloPipe1 = parserInputD >-> parserD hello
As the type indicates, this Pipe
receives Text
values from upstream and
sends Name
values downstream. This Pipe
is made of two smaller
two smaller cooperating Proxy
s:
-
parserInputD
: Preparesa
input received from upstream to be consumed by a downstream parsingProxy
. -
parserD
: Repeatedly runs a given
on inputParser
a ba
from upstream, and sendsb
values downstream.
We need some sample input to test our simple helloPipe1
.
input1 :: [Text] input1 = [ "Hello Kate." , "Hello Mary.Hello Jef" , "f." , "Hel" , "lo Tom." ]
We'll use
as our input source, which sends
downstream one element from the list at a time. We'll call each of these
elements a chunk. So, fromListS
input1
sends 5 chunks of
fromListS
input1Text
downstream.
Notice how some of our chunks are not, by themselves, complete inputs
for our hello
Parser
. This is fine; we want to be able to feed the
Parser
with either partial or complete input as soon as it's
received from upstream. More input will be requested when needed.
Attoparsec's Parser
handles partial parsing just fine.
>>>
runProxy $ fromListS input1 >-> helloPipe1 >-> printD
Name "Kate" Name "Mary" Name "Jeff" Name "Tom"
We have accomplished our simple goal: We've made a Pipe
that parses
downstream flowing input using our Parser
hello
.
Parsing control proxies
Handling parser errors
Let's try with some more complex input.
input2 :: [Text] input2 = [ "Hello Amy." , "Hello, Hello Tim." , "Hello Bob." , "Hello James" , "Hello" , "Hello World." , "HexHello Jon." , "H" , "ello Ann" , "." , "Hello Jean-Luc." ]
>>>
runProxy $ fromListS input2 >-> helloPipe1 >-> printD
Name "Amy" Name "Bob" Name "JamesHelloHello World" Name "Ann" Name "Jean-Luc"
The simple helloPipe1
we built skips chunks of input that fail to be
parsed, and then continues parsing new input. That approach might be
enough if you are certain your input is always well-formed, but
sometimes you may prefer to act differently on these extraordinary
situations.
Instead of just using parserInputD
and parserD
to build our
helloPipe1
, we could have used an additional Proxy
in between them to
handle these situations. The module Control.Proxy.Attoparsec.Control
exports some useful Proxy
s that serve this purpose. The default
behavior just mentioned resembles the one provided by
skipMalformedChunks
.
Here are some other examples:
retryLeftovers
- On parsing failures, keep retrying with any left-over input, skipping individual bits each time. If there are no left-overs, then more input is requests form upstream.
helloPipe2 :: (Proxy p, Monad m) => () -> Pipe p Text Name m r helloPipe2 = parserInputD >-> retryLeftovers >-> parserD hello
>>>
runProxy $ fromListS input2 >-> helloPipe2 >-> printD
Name "Amy" Name "Tim" Name "Bob" Name "JamesHelloHello World" Name "Jon" Name "Ann" Name "Jean-Luc"
throwParsingErrors
-
When a parsing error arises, aborts execution by throwing
MalformedInput
in theEitherP
proxy transformer.
helloPipe3 :: (Proxy p, Monad m) => () -> Pipe (EitherP BadInput p) Text Name m r helloPipe3 = parserInputD >-> throwParsingErrors >-> parserD hello
>>>
runProxy . runEitherK $ fromListS input2 >-> helloPipe3 >-> printD
Name "Amy" Left (MalformedInput {miParserErrror = ParserError {errorContexts = [], errorMessage = "Failed reading: takeWith"}})
limitInputLength
n-
If a
has consumed inputParser
a ba
of length longer thann
without producing ab
value, and it's still requesting more input, then throwInputTooLong
in theEitherP
proxy transformer.
helloPipe4 :: (Proxy p, Monad m) => () -> Pipe (EitherP BadInput p) Text Name m r helloPipe4 = parserInputD >-> limitInputLength 10 >-> parserD hello
>>>
runProxy . runEitherK $ fromListS input2 >-> helloPipe4 >-> printD
Name "Amy" Name "Bob" Left (InputTooLong {itlLenght = 11})
Notice that by default, as mentioned earlier, parsing errors are ignored by skipping the malformed chunk. That's why we didn't get any complaint about the malformed input between “Amy” and “Bob”.
Composing
These Proxy
s that control the parsing behavior can be easily plugged
together with (
to achieve a combined functionality. Keep in
mind that the order in which these >->
)Proxy
s are used is important.
Suppose you don't want to parse inputs of length longer than 10, and on parsing failures, you want to retry feeding the parser with any left-overs. You could achieve that behaviour like this:
helloPipe5 :: (Proxy p, Monad m) => () -> Pipe (EitherP BadInput p) Text Name m r helloPipe5 = parserInputD >-> limitInputLength 10 >-> retryLeftovers >-> parserD hello
>>>
runProxy . runEitherK $ fromListS input2 >-> helloPipe5 >-> printD
Name "Amy" Name "Tim" Name "Bob" Left (InputTooLong {itlLenght = 11})
Custom behavior
In case the Proxy
s provided by Control.Proxy.Attoparsec.Control are not
enough for your needs, you can create your custom parsing control Proxy
.
Through a parsing control Proxy
,
values flow
upstream and ParserStatus
a
values flow downstream. A parsing
control ParserSupply
aProxy
may use, replace, or alter the flow of these values to
achieve its purpose. A
value received from downstream
reports the status of a ParserStatus
aparserD
parsing Proxy
, and in exchange,
downstream expects a
value.
ParserSupply
a
values carry raw input to be parsed and directives on how it
should be used. ParserSupply
a
is just a type synonym for
ParserSupply
a(
. The SupplyUse
, a)a
value is the input chunk, and the SupplyUse
value
could be either Resume
if the parser currently waiting for input should be
fed, or Start
, if a new parser should be started and fed the input,
effectively aborting any parsing activity currently waiting for more input.
See the documentation about ParserStatus
and ParserSupply
for more details.
Suppose you want to write a parsing control Proxy
that never provides
additional input to partial parsing results. Let's first take a look at the
type of this Proxy
:
skipPartialResults :: (Monad m, Proxy p, AttoparsecInput a) => ParserStatus a -> p (ParserStatus a) (ParserSupply a) (ParserStatus a) (ParserSupply a) m r
Just like we said,
values flow upstream and ParserStatus
a
values flow downstream.
ParserSupply
a
Now to a simple implementation: If we receive
from downstream,
then we know there is a partial parsing result waiting for more input. If we
were to respond to this request with Parsing
n(
, then the partial parser
would continue consuming input, but if we change our response to Resume
, a)(
,
then the partial parser would be aborted and a new parser would start
consuming the given input. A simple implementation is quite straightforward:
Start
, a)
skipPartialResults = runIdentityK . foreverK $ go where go x@(Parsing _) = request x >>= \(_, a) -> respond (Start, a) go x = request x >>= respond
We forward upstream the requests we get from downstream. However, in the case
of a
status, we replace with Parsing
nStart
the first value in the
pair we receive from upstream before responding.
ParserSupply
a
Now we can use this parsing control Proxy
with some simple input and see it
working.
helloPipe6 :: (Proxy p, Monad m) => () -> Pipe p Text Name m r helloPipe6 = parserInputD >-> skipPartialResults >-> parserD hello
>>>
runProxy $ fromListS input1 >-> helloPipe6 >-> printD
Name "Kate" Name "Mary"
Try for yourself
This module exports the following previous examples so that you can try them.
skipPartialResults :: (Monad m, Proxy p, AttoparsecInput a) => ParserStatus a -> p (ParserStatus a) (ParserSupply a) (ParserStatus a) (ParserSupply a) m rSource