The fast-tagsoup-utf8-only package

[ Tags: bsd3, library, xml ]

Fast TagSoup parser. Observed speeds range from 20 to 200 MB/s.

Works only with strict bytestrings.

This library is intended to be used in conjunction with the original tagsoup package:

import Text.HTML.TagSoup hiding (parseTags, renderTags)
import Text.HTML.TagSoup.Fast.Utf8Only

Besides speed, fast-tagsoup correctly handles HTML <script> and <style> tags and converts tags to lower case. This fork purposefully removes support for parsing non-UTF-8 documents, to avoid a dependency on text-icu. If you need to handle other encodings, refer to the original package: http://hackage.haskell.org/package/fast-tagsoup
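
A minimal usage sketch, not taken from the package documentation. It assumes that parseTags in this module has the type ByteString -> [Tag ByteString] (consistent with the note about strict bytestrings) and that the input is already UTF-8. The helper linkTargets is a hypothetical name introduced only for this example. Explicit import lists are used here instead of the hiding form so that it is obvious which parseTags is in scope.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as B8
-- The Tag type and its constructors come from the original tagsoup package.
import Text.HTML.TagSoup (Tag (..))
-- The parser comes from this package.
-- Assumption: parseTags :: ByteString -> [Tag ByteString]
import Text.HTML.TagSoup.Fast.Utf8Only (parseTags)

-- | Hypothetical helper: collect the href value of every <a> tag.
-- Tags are expected in lower case because the parser lower-cases them.
linkTargets :: ByteString -> [ByteString]
linkTargets html =
  [ href
  | TagOpen "a" attrs <- parseTags html
  , Just href <- [lookup "href" attrs]
  ]

main :: IO ()
main =
  mapM_ B8.putStrLn $
    linkTargets "<p><a href=\"https://example.com/\">example</a></p>"
```

Because the result is an ordinary list of Tag values, the combinators from Text.HTML.TagSoup (for example sections or innerText) should work on it in the same way as on the output of the original parser.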

This parser is used in production in the BazQux Reader feed and comment crawler.

Properties

Versions: 1.0.4, 1.0.5
Dependencies: base (==4.*), bytestring, tagsoup, text
License: BSD3
Copyright: Vladimir Shabanov 2011-2012
Author: Vladimir Shabanov <vshabanoff@gmail.com>
Maintainer: Vladimir Shabanov <vshabanoff@gmail.com>
Category: XML
Home page: https://github.com/vshabanov/fast-tagsoup
Source repository: head: git clone https://github.com/exbb2/fast-tagsoup
Uploaded: Sat Feb 9 10:56:39 UTC 2013 by MikhailKuddah
Distributions: NixOS:1.0.5
Downloads: 655 total (13 in the last 30 days)
