fast-tagsoup-utf8-only-1.0.4: Fast parser for tagsoup package

Safe Haskell: None



Very fast TagSoup parser.

Works only with strict bytestrings. Correctly handles HTML <script> and <style> tags.

This module is intended to be used in conjunction with the original tagsoup package:

 import Text.HTML.TagSoup hiding (parseTags, renderTags)
 import Text.HTML.TagSoup.Fast.Utf8Only

Note that tags are returned in lower case and comments are not returned.
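As a minimal sketch of how the two packages fit together, the program below collects link targets from a page. The helper name linkTargets and the sample markup are hypothetical; only parseTags and the Tag constructors come from the packages. Matching on the lower-case name "a" relies on the lower-casing behaviour described above.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.ByteString                 (ByteString)
import qualified Data.ByteString.Char8           as B
import           Text.HTML.TagSoup               hiding (parseTags, renderTags)
import           Text.HTML.TagSoup.Fast.Utf8Only

-- | Hypothetical helper: collect the href attribute of every <a> tag.
-- The pattern uses the lower-case name "a" because the parser is
-- documented to return tags in lower case.
linkTargets :: ByteString -> [ByteString]
linkTargets html =
  [ href
  | TagOpen "a" attrs <- parseTags html      -- keep only opening <a> tags
  , Just href <- [lookup "href" attrs]       -- skip anchors without href
  ]

main :: IO ()
main =
  mapM_ B.putStrLn
    (linkTargets "<p><A href=\"/one\">1</A> <a href=\"/two\">2</a></p>")
```

Hiding parseTags and renderTags from Text.HTML.TagSoup avoids a name clash while keeping the Tag type and the rest of the tagsoup helpers available.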

In long-running multithreaded applications, it is generally recommended to use parseTagsT and work with [Tag Text] to reduce memory fragmentation.
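A small sketch of the Text-based workflow, assuming the goal is to pull the text content out of a page. The function name pageText is hypothetical, and the whitespace normalisation is an illustrative choice rather than something the package does.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.ByteString                 (ByteString)
import           Data.Text                       (Text)
import qualified Data.Text                       as T
import qualified Data.Text.IO                    as TIO
import           Text.HTML.TagSoup               (Tag (..))
import           Text.HTML.TagSoup.Fast.Utf8Only (parseTagsT)

-- | Hypothetical helper: concatenate all text nodes and collapse
-- runs of whitespace into single spaces.
pageText :: ByteString -> Text
pageText html =
  T.unwords (T.words (T.concat [t | TagText t <- parseTagsT html]))

main :: IO ()
main = TIO.putStrLn (pageText "<h1>Title</h1><p>Hello, <b>world</b>!</p>")
```

The input is still a strict ByteString; only the parsed tags carry Text, so the rest of the application can hold on to Text values instead of ByteString ones.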



parseTags :: ByteString -> [Tag ByteString]

Parse a strict ByteString into a list of tags.

 parseTags "<div>&amp;<script>x<y</script>" ==
   [TagOpen "div" [],TagText "&",TagOpen "script" [],TagText "x<y",TagClose "script"]

renderTags :: [Tag ByteString] -> ByteString

Show a list of tags, as they might have been parsed.
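One way to combine parsing and rendering is a filter that edits the tag list and writes it back out. The sketch below removes script elements; dropScripts and stripScripts are hypothetical names, and the exact bytes produced by renderTags are not asserted here.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.ByteString                 (ByteString)
import qualified Data.ByteString.Char8           as B
import           Text.HTML.TagSoup               (Tag (..))
import           Text.HTML.TagSoup.Fast.Utf8Only (parseTags, renderTags)

-- | Hypothetical helper: drop every <script> element, from its opening
-- tag up to and including the next closing </script> tag.
dropScripts :: [Tag ByteString] -> [Tag ByteString]
dropScripts [] = []
dropScripts (TagOpen "script" _ : rest) =
  -- skip the script body, then skip the closing tag itself (if present)
  dropScripts (drop 1 (dropWhile (/= TagClose "script") rest))
dropScripts (t : rest) = t : dropScripts rest

-- | Parse, filter, and render back to a ByteString.
stripScripts :: ByteString -> ByteString
stripScripts = renderTags . dropScripts . parseTags

main :: IO ()
main = B.putStrLn (stripScripts "<div>&amp;<script>x<y</script>kept</div>")
```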

parseTagsT :: ByteString -> [Tag Text]

Alternative to parseTags that works with Text.

renderTagsT :: [Tag Text] -> Text

Alternative to renderTags that works with Text.

ensureUtf8Xml :: ByteString -> ByteString

Convert XML to UTF-8 using the encoding attribute of the <?xml> tag.
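Because the parser handles UTF-8 only, a plausible use of ensureUtf8Xml is as a preprocessing step before parseTags. The sketch below assumes that composition; parseXml is a hypothetical name, and the file paths are taken from the command line.

```haskell
import           Data.ByteString                 (ByteString)
import qualified Data.ByteString.Char8           as B
import           System.Environment              (getArgs)
import           Text.HTML.TagSoup               (Tag)
import           Text.HTML.TagSoup.Fast.Utf8Only (ensureUtf8Xml, parseTags)

-- | Hypothetical helper: convert the document to UTF-8 first,
-- then hand it to the UTF-8-only parser.
parseXml :: ByteString -> [Tag ByteString]
parseXml = parseTags . ensureUtf8Xml

main :: IO ()
main = do
  paths <- getArgs
  -- print the number of tags found in each file named on the command line
  mapM_ (\p -> B.readFile p >>= print . length . parseXml) paths
```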

escapeHtml :: ByteString -> ByteString

Escape characters that are unsafe in HTML.
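A typical use is escaping untrusted text before splicing it into markup. The following is a minimal sketch; paragraph is a hypothetical helper, and the exact set of characters that escapeHtml rewrites is not stated here, so no output is shown.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.ByteString                 (ByteString)
import qualified Data.ByteString.Char8           as B
import           Text.HTML.TagSoup.Fast.Utf8Only (escapeHtml)

-- | Hypothetical helper: wrap untrusted text in a <p> element,
-- escaping it first so it cannot introduce markup of its own.
paragraph :: ByteString -> ByteString
paragraph body = B.concat ["<p>", escapeHtml body, "</p>"]

main :: IO ()
main = B.putStrLn (paragraph "1 < 2 && \"quotes\"")
```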

escapeHtmlT :: Text -> Text

Alternative to escapeHtml that works with Text.

unescapeHtml :: ByteString -> ByteString

Convert escaped HTML back to raw text.

unescapeHtmlT :: Text -> Text

Alternative to unescapeHtml that works with Text.