Safe Haskell | None |
---|---|
Language | Haskell2010 |
- data Parser m a
- type Error = Maybe ErrorDetails
- data ErrorDetails
- run :: Monad m => Parser m a -> ListT m Token -> m (Either Error a)
- eoi :: Monad m => Parser m ()
- token :: Monad m => Parser m Token
- rawToken :: Monad m => Parser m Token
- space :: Monad m => Parser m Text
- openingTag :: Monad m => Parser m OpeningTag
- closingTag :: Monad m => Parser m Identifier
- text :: Monad m => Parser m Text
- comment :: Monad m => Parser m Text
- html :: Monad m => Parser m Builder
- properHTML :: Monad m => Parser m Builder
- xmlNode :: Monad m => Parser m Node
- many1 :: Monad m => Parser m a -> Parser m [a]
- manyTill :: Monad m => Parser m a -> Parser m b -> Parser m ([a], b)
- skipTill :: Monad m => Parser m a -> Parser m a
- total :: Monad m => Parser m a -> Parser m a
Documentation
A backtracking HTML-tokens stream parser.
type Error = Maybe ErrorDetails
data ErrorDetails
Constructors
ErrorDetails_Message Text | A text message |
ErrorDetails_UnexpectedToken | Unexpected token |
ErrorDetails_EOI | End of input |
Instances
Eq ErrorDetails
Show ErrorDetails
Monad m => MonadError Error (Parser m)
run :: Monad m => Parser m a -> ListT m Token -> m (Either Error a)
Run a parser on a stream of HTML tokens, consuming only as many as needed.
Parsers
token :: Monad m => Parser m Token
A token with HTML entities decoded and with spaces filtered out.
rawToken :: Monad m => Parser m Token
An HTML token as is: HTML entities are not decoded and space tokens are not filtered out.
space :: Monad m => Parser m Text
A text token that consists entirely of characters satisfying the isSpace predicate.
openingTag :: Monad m => Parser m OpeningTag
An opening tag with HTML entities in values decoded.
closingTag :: Monad m => Parser m Identifier
A closing tag.
html :: Monad m => Parser m Builder
The auto-repaired textual HTML representation of an HTML-tree node.
Useful for consuming HTML-formatted snippets.
E.g., when the following parser:
openingTag *> html
is run against the following HTML snippet:
<ul> <li>I'm not your friend, <b>buddy</b>!</li> <li>I'm not your buddy, <b>guy</b>!</li> <li>He's not your guy, <b>friend</b>!</li> <li>I'm not your friend, <b>buddy</b>!</li> </ul>
it'll produce the following text builder value:
<li>I'm not your friend, <b>buddy</b>!</li>
If you want to consume all children of a node, it's recommended to use properHTML in combination with many or many1. For details consult the docs on properHTML.
This parser tolerates and repairs broken HTML:
- It repairs unclosed tags, interpreting them as closed singletons. E.g., <br> will be consumed as <br/>.
- It handles orphan closing tags by ignoring them. E.g., it'll consume the input <a></b></a> as <a></a>.
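The two repair rules above can be illustrated with a toy model. This is only a sketch over a simplified token type: `Tok`, `htmlModel` and `childrenUntil` are hypothetical names that are not part of this module, attributes and space handling are left out, and the library's actual implementation may well differ.

```haskell
-- Toy stand-in for the tokenizer's token type (an assumption, not the real Token).
data Tok = Open String | Close String | Txt String
  deriving (Eq, Show)

-- | Render the tree node at the head of the stream and return the
-- unconsumed tokens. Nothing means there is no node to consume.
htmlModel :: [Tok] -> Maybe (String, [Tok])
htmlModel [] = Nothing
htmlModel (Txt t : rest) = Just (t, rest)
-- Rule 2: an orphan closing tag is skipped.
htmlModel (Close _ : rest) = htmlModel rest
htmlModel (Open n : rest) =
  case childrenUntil n rest of
    Just (inner, rest') -> Just (concat ["<", n, ">", inner, "</", n, ">"], rest')
    -- Rule 1: no matching closing tag was found, so backtrack and
    -- treat the opening tag as a closed singleton.
    Nothing -> Just ("<" ++ n ++ "/>", rest)

-- | Render children until the closing tag named n is reached.
-- Fails if the stream ends before that closing tag.
childrenUntil :: String -> [Tok] -> Maybe (String, [Tok])
childrenUntil n (Close m : rest)
  | m == n = Just ("", rest)
  | otherwise = childrenUntil n rest -- orphan closing tag among the children
childrenUntil n toks = do
  (c, rest) <- htmlModel toks
  (cs, rest') <- childrenUntil n rest
  Just (c ++ cs, rest')
```

Under this model, `htmlModel [Open "br"]` gives `Just ("<br/>", [])` and `htmlModel [Open "a", Close "b", Close "a"]` gives `Just ("<a></a>", [])`, which matches the two cases listed above.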
properHTML :: Monad m => Parser m Builder
Same as html, but fails if the input begins with an orphan closing tag.
E.g., the input "</a><b></b>" will make this parser fail.
This parser is particularly useful for consuming all children in the current context. E.g., running the following parser:
openingTag *> (mconcat <$> many properHTML)
on the following input:
<ul> <li>I'm not your friend, <b>buddy</b>!</li> <li>I'm not your buddy, <b>guy</b>!</li> <li>He's not your guy, <b>friend</b>!</li> <li>I'm not your friend, <b>buddy</b>!</li> </ul>
will produce a merged text builder, which consists of the following nodes:
<li>I'm not your friend, <b>buddy</b>!</li> <li>I'm not your buddy, <b>guy</b>!</li> <li>He's not your guy, <b>friend</b>!</li> <li>I'm not your friend, <b>buddy</b>!</li>
Notice that, unlike with html, it's safe to assume that it will not consume the following closing </ul> tag, because it does not begin a valid HTML-tree node.
Notice also that, although it fails when the input begins with an orphan closing tag, this parser handles broken tokens in all other positions the same way as html.
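The difference between the strict and the lenient behaviour when collecting children can be sketched with a toy model. Everything here is an assumption made for illustration: `Tok`, `nodeWith`, `childrenUntil`, `manyNodes` and `listBody` are hypothetical names, the flag merely imitates the html versus properHTML distinction described above, and the real parser works on the library's own token stream.

```haskell
-- Toy stand-in for the tokenizer's token type (an assumption, not the real Token).
data Tok = Open String | Close String | Txt String
  deriving (Eq, Show)

-- | One tree node. With True a leading orphan closing tag is skipped
-- (like html); with False it makes the parser fail (like properHTML).
nodeWith :: Bool -> [Tok] -> Maybe (String, [Tok])
nodeWith _ [] = Nothing
nodeWith _ (Txt t : rest) = Just (t, rest)
nodeWith lenient (Close _ : rest)
  | lenient = nodeWith lenient rest
  | otherwise = Nothing
nodeWith _ (Open n : rest) =
  case childrenUntil n rest of
    Just (inner, rest') -> Just (concat ["<", n, ">", inner, "</", n, ">"], rest')
    Nothing -> Just ("<" ++ n ++ "/>", rest) -- unclosed tag becomes a singleton

-- | Children up to the closing tag named n; always lenient inside a node.
childrenUntil :: String -> [Tok] -> Maybe (String, [Tok])
childrenUntil n (Close m : rest)
  | m == n = Just ("", rest)
  | otherwise = childrenUntil n rest
childrenUntil n toks = do
  (c, rest) <- nodeWith True toks
  (cs, rest') <- childrenUntil n rest
  Just (c ++ cs, rest')

-- | Zero or more nodes, in the spirit of `many`: stops without consuming
-- anything further as soon as the node parser fails.
manyNodes :: Bool -> [Tok] -> (String, [Tok])
manyNodes lenient toks =
  case nodeWith lenient toks of
    Just (c, rest) ->
      let (cs, rest') = manyNodes lenient rest in (c ++ cs, rest')
    Nothing -> ("", toks)

-- | The tokens that follow an already consumed <ul> opening tag.
listBody :: [Tok]
listBody =
  [ Open "li", Txt "x", Close "li"
  , Open "li", Txt "y", Close "li"
  , Close "ul", Txt "after"
  ]
```

In this model `manyNodes False listBody` stops in front of the closing `</ul>` and returns `("<li>x</li><li>y</li>", [Close "ul", Txt "after"])`, whereas `manyNodes True listBody` skips `</ul>` as an orphan and runs on into the following content, returning `("<li>x</li><li>y</li>after", [])`.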
xmlNode :: Monad m => Parser m Node
Works the same way as properHTML, but constructs an XML tree.
Combinators
manyTill :: Monad m => Parser m a -> Parser m b -> Parser m ([a], b)
Apply a parser multiple times until another parser is satisfied. Returns results of both parsers.
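The shape of the result can be illustrated with a minimal list-based model. This is a sketch under stated assumptions, not the library's definition: `P`, `manyTillModel`, `anyTok` and `exactly` are hypothetical names, and trying the terminating parser before each repetition is one plausible reading of "until another parser is satisfied".

```haskell
-- | A toy backtracking parser over a plain list of tokens (an assumption;
-- the real Parser runs over a ListT stream in an arbitrary monad).
type P t a = [t] -> Maybe (a, [t])

-- | Repeat p until end succeeds; return the results of p together with
-- the result of end. Fails if p fails before end ever succeeds.
manyTillModel :: P t a -> P t b -> P t ([a], b)
manyTillModel p end toks =
  case end toks of
    Just (b, rest) -> Just (([], b), rest)
    Nothing -> do
      (a, rest) <- p toks
      ((as, b), rest') <- manyTillModel p end rest
      Just ((a : as, b), rest')

-- | Consume any single token.
anyTok :: P t t
anyTok (x : xs) = Just (x, xs)
anyTok [] = Nothing

-- | Consume one token equal to the given one.
exactly :: Eq t => t -> P t t
exactly t (x : xs) | x == t = Just (x, xs)
exactly _ _ = Nothing
```

For example, `manyTillModel anyTok (exactly ';') "ab;cd"` evaluates to `Just (("ab", ';'), "cd")`, while on the input `"ab"` it evaluates to `Nothing` because the terminator never matches.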