An integration of the tagsoup and hexpat packages, allowing HTML to be parsed to a hexpat tree, tolerant of errors.
The real work is done by Neil Mitchell's tagsoup package.
- parseTags :: (StringLike s, GenericXMLString text) => s -> UNode text
- parseTagsOptions :: (StringLike s, GenericXMLString text) => ParseOptions s -> s -> UNode text
- data ParseOptions str = ParseOptions {
  - optTagPosition :: Bool
  - optTagWarning :: Bool
  - optEntityData :: (str, Bool) -> [Tag str]
  - optEntityAttrib :: (str, Bool) -> (str, [Tag str])
  - optTagTextMerge :: Bool
  - }
- parseOptions :: StringLike str => ParseOptions str
- parseOptionsFast :: StringLike str => ParseOptions str
Documentation
parseTags :: (StringLike s, GenericXMLString text) => s -> UNode text
Parse tags using TagSoup, invoke canonicalizeTags to convert them all to lower case, automatically self-close tags like img and input, then convert to a hexpat tree.
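A minimal usage sketch for parseTags follows. It assumes this page documents the module Text.XML.Expat.TagSoup from the hexpat-tagsoup package (the module name is not stated here), and it uses format from hexpat's Text.XML.Expat.Format to print the result. The helper names htmlToTree and renderXml are illustrative only.

```haskell
-- Assumption: the functions on this page live in Text.XML.Expat.TagSoup
-- (hexpat-tagsoup); adjust the import if the module name differs.
import qualified Data.ByteString.Lazy.Char8 as L
import Text.XML.Expat.Format (format)
import Text.XML.Expat.TagSoup (parseTags)
import Text.XML.Expat.Tree (UNode)

-- Parse lenient HTML into a hexpat tree whose text type is String.
-- The annotation fixes the otherwise ambiguous 'text' type variable.
htmlToTree :: String -> UNode String
htmlToTree = parseTags

-- Serialise the resulting tree back to XML with hexpat's formatter.
renderXml :: String -> L.ByteString
renderXml = format . htmlToTree

main :: IO ()
main = L.putStrLn (renderXml "<P>Hello<BR><IMG SRC='a.png'>world")
```

Because the result type is polymorphic in text, a type annotation (or other context) is usually needed to pick String, ByteString or Text.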
parseTagsOptions :: (StringLike s, GenericXMLString text) => ParseOptions s -> s -> UNode text
Variant of parseTags that accepts TagSoup ParseOptions.
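The sketch below passes an explicit options value and also shows that the input and output string types may differ (strict ByteString in, Text out). The module name Text.XML.Expat.TagSoup is an assumption, and parseStrict and elementNames are hypothetical helpers written for this example.

```haskell
-- Assumption: module name Text.XML.Expat.TagSoup (hexpat-tagsoup).
import qualified Data.ByteString.Char8 as B
import qualified Data.Text as T
import Text.XML.Expat.TagSoup (parseOptions, parseTagsOptions)
import Text.XML.Expat.Tree (NodeG (..), UNode)

-- Parse a strict ByteString with the default options, producing Text nodes.
parseStrict :: B.ByteString -> UNode T.Text
parseStrict = parseTagsOptions parseOptions

-- Collect the names of all elements in the tree, in document order.
-- In a UNode the tag names are plain Strings.
elementNames :: UNode T.Text -> [String]
elementNames (Element name _ children) = name : concatMap elementNames children
elementNames (Text _) = []

main :: IO ()
main = mapM_ putStrLn (elementNames (parseStrict (B.pack "<UL><LI>one<LI>two</UL>")))
```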
data ParseOptions str
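These options control how parseTags works.
Constructors
ParseOptions
- optTagPosition :: Bool
- optTagWarning :: Bool
- optEntityData :: (str, Bool) -> [Tag str]
- optEntityAttrib :: (str, Bool) -> (str, [Tag str])
- optTagTextMerge :: Bool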
parseOptions :: StringLike str => ParseOptions str
The default parse options value, described in ParseOptions.
parseOptionsFast :: StringLike str => ParseOptions str
A ParseOptions structure optimised for speed, following the fast options.
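As a hedged illustration, the fast options can be supplied through parseTagsOptions. The module name Text.XML.Expat.TagSoup is an assumption and parseFast is a hypothetical helper. In tagsoup the fast options are understood to turn off optTagTextMerge, so the resulting tree may contain adjacent text nodes where the default options would give one; check the tagsoup documentation before relying on this.

```haskell
-- Assumption: module name Text.XML.Expat.TagSoup (hexpat-tagsoup).
import qualified Data.ByteString.Lazy.Char8 as L
import Text.XML.Expat.Format (format)
import Text.XML.Expat.TagSoup (parseOptionsFast, parseTagsOptions)
import Text.XML.Expat.Tree (UNode)

-- Parse with the speed-oriented options instead of the defaults.
parseFast :: String -> UNode String
parseFast = parseTagsOptions parseOptionsFast

main :: IO ()
main = L.putStrLn (format (parseFast "<p>Tom &amp; Jerry<br>again"))
```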