fast-tagsoup-utf8-only: Fast parser for tagsoup package
Fast TagSoup parser. Speeds of 20-200MB/sec were observed.
Works only with strict bytestrings.
This library is intended to be used in conjunction with the original tagsoup package:
    import Text.HTML.TagSoup hiding (parseTags, renderTags)
    import Text.HTML.TagSoup.Fast.Utf8Only
Besides being faster, fast-tagsoup correctly handles HTML <script> and <style> tags and converts tags to lower case.
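As a rough illustration of the import pattern above, here is a minimal sketch. It assumes that `parseTags` from `Text.HTML.TagSoup.Fast.Utf8Only` has the type `ByteString -> [Tag ByteString]`; the sample input and the `hrefs` helper are hypothetical and exist only for this example. The helper functions `isTagOpenName` and `fromAttrib` come from the original tagsoup package.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.ByteString.Char8 as B
-- Hide tagsoup's own parser so the fast one is used instead.
import Text.HTML.TagSoup hiding (parseTags, renderTags)
-- Assumption: parseTags :: ByteString -> [Tag ByteString]
import Text.HTML.TagSoup.Fast.Utf8Only (parseTags)

-- Hypothetical sample input: upper-case tag names and a script body
-- containing a '<' character.
sampleHtml :: B.ByteString
sampleHtml =
  "<DIV><script>if (a < b) go();</script><A href=\"/feed\">Feed</A></DIV>"

-- Hypothetical helper: collect the href attribute of every <a> tag.
-- It matches on the lower-case name "a", relying on the parser
-- converting tags to lower case as described above.
hrefs :: [Tag B.ByteString] -> [B.ByteString]
hrefs tags = [fromAttrib "href" t | t <- tags, isTagOpenName "a" t]

main :: IO ()
main = do
  let tags = parseTags sampleHtml
  mapM_ print tags                 -- inspect the parsed tag stream
  mapM_ B.putStrLn (hrefs tags)    -- print the links that were found
```

Because the result is an ordinary list of tagsoup `Tag` values, the rest of the tagsoup API can be applied to it unchanged; only the parsing step is swapped out.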
This fork purposefully removes support for parsing non-utf8 documents, to avoid dependency on text-icu.
If you need to handle other encodings, refer to the original package: http://hackage.haskell.org/package/fast-tagsoup
This parser is used in production in the BazQux Reader feed and comment crawler.
Downloads
- fast-tagsoup-utf8-only-1.0.5.tar.gz (Cabal source package)
| Versions | 1.0.4, 1.0.5 |
|---|---|
| Dependencies | base (>=4 && <5), bytestring, tagsoup, text |
| License | BSD-3-Clause | 
| Copyright | Vladimir Shabanov 2011-2012 | 
| Author | Vladimir Shabanov <vshabanoff@gmail.com> | 
| Maintainer | Vladimir Shabanov <vshabanoff@gmail.com> | 
| Category | XML | 
| Home page | https://github.com/exbb2/fast-tagsoup | 
| Source repo | head: git clone https://github.com/exbb2/fast-tagsoup | 
| Uploaded | by MikhailKuddah at 2013-12-11T20:23:49Z | 
| Reverse Dependencies | 1 direct, 0 indirect |
| Downloads | 2254 total (6 in the last 30 days) | 