fast-tagsoup: Fast parsing and extracting information from (possibly malformed) HTML/XML documents
Fast TagSoup parser. Speeds of 20-200 MB/s have been observed.
It works only with strict bytestrings.
This library is intended to be used in conjunction with the original tagsoup package:
    import Text.HTML.TagSoup hiding (parseTags, renderTags)
    import Text.HTML.TagSoup.Fast
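As a minimal sketch of how the two packages fit together, the program below takes the `Tag` type from tagsoup and the parser from fast-tagsoup. It assumes that `parseTags` from `Text.HTML.TagSoup.Fast` takes a strict `ByteString` and returns `[Tag ByteString]`; `extractLinks` and the sample input are hypothetical names introduced only for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.ByteString.Char8 as B
-- Tag type and helpers come from the original tagsoup package.
import Text.HTML.TagSoup hiding (parseTags, renderTags)
-- The parser itself comes from fast-tagsoup (assumed signature:
-- parseTags :: ByteString -> [Tag ByteString]).
import Text.HTML.TagSoup.Fast (parseTags)

-- | Hypothetical helper: collect the href attribute of every <a> tag.
-- Tag names are matched in lower case, relying on the lowercasing
-- described in the package summary.
extractLinks :: B.ByteString -> [B.ByteString]
extractLinks html =
  [ href
  | TagOpen "a" attrs <- parseTags html
  , Just href <- [lookup "href" attrs]
  ]

main :: IO ()
main =
  mapM_ B.putStrLn
    (extractLinks "<p><A href=\"https://example.org\">example</A></p>")
```

Hiding `parseTags` and `renderTags` from `Text.HTML.TagSoup` avoids a name clash, so the rest of the tagsoup API can still be used on the parsed tags.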
fast-tagsoup correctly handles HTML <style> tags, converts tag names to lower case, and can decode non-UTF-8 XML for you.
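The decoding step might be used roughly as follows. This sketch assumes the module exports `ensureUtf8Xml :: ByteString -> ByteString` (re-encoding XML to UTF-8 based on its declared encoding) and `parseTagsT :: ByteString -> [Tag Text]`; check the module documentation before relying on either. `textNodes` and the file name `feed.xml` are hypothetical.

```haskell
module Main (main) where

import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Text.HTML.TagSoup hiding (parseTags, renderTags)
-- Assumed exports of fast-tagsoup; see the lead-in above.
import Text.HTML.TagSoup.Fast (ensureUtf8Xml, parseTagsT)

-- | Hypothetical helper: return the text nodes of a document,
-- converting the input to UTF-8 first.
textNodes :: B.ByteString -> [T.Text]
textNodes raw = [t | TagText t <- parseTagsT (ensureUtf8Xml raw)]

main :: IO ()
main = do
  -- "feed.xml" is a placeholder path for an XML document
  -- that may not be UTF-8 encoded.
  raw <- B.readFile "feed.xml"
  mapM_ TIO.putStrLn (textNodes raw)
```

Working with `Text` after parsing keeps the rest of the program independent of the encoding the document originally arrived in.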
This parser is used in production in the BazQux Reader feed and comment crawler.
|Versions||1.0.0, 1.0.1, 1.0.2, 1.0.3, 1.0.4, 1.0.5, 1.0.6, 1.0.7, 1.0.8, 1.0.9, 1.0.10, 1.0.11, 1.0.12, 1.0.13, 1.0.14|
|Dependencies||base (==4.*), bytestring, containers, tagsoup (>=0.13.10), text, text-icu|
|Copyright||Vladimir Shabanov 2011-2017|
|Author||Vladimir Shabanov <firstname.lastname@example.org>|
|Maintainer||Vladimir Shabanov <email@example.com>|
|Source repo||head: git clone https://github.com/vshabanov/fast-tagsoup|
|Uploaded||by VladimirShabanov at Tue Jul 4 17:36:00 UTC 2017|
|Downloads||4970 total (206 in the last 30 days)|
Docs available
Last success reported on 2017-07-04