fast-tagsoup-utf8-only: Fast parser for tagsoup package

[ bsd3, library, xml ]

Fast TagSoup parser. Parsing speeds of 20-200 MB/sec were observed.

Works only with strict bytestrings.

This library is intended to be used in conjunction with the original tagsoup package:

import Text.HTML.TagSoup hiding (parseTags, renderTags)
import Text.HTML.TagSoup.Fast.Utf8Only
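
As a minimal sketch of how these imports might be used together, the example below collects the `href` values of all `<a>` tags in a document. It assumes that `Text.HTML.TagSoup.Fast.Utf8Only` exports a `parseTags` that takes a strict `ByteString` and returns `[Tag ByteString]` (the `hiding` clause above suggests a replacement `parseTags` exists, but the exact type is an assumption here). `linkTargets` is a hypothetical helper name; `isTagOpenName` and `fromAttrib` come from the original tagsoup package.

```haskell
module Main where

import qualified Data.ByteString.Char8 as B
import Text.HTML.TagSoup hiding (parseTags, renderTags)
import Text.HTML.TagSoup.Fast.Utf8Only

-- Hypothetical helper: collect the href attribute of every opening <a> tag.
-- Assumption: parseTags :: B.ByteString -> [Tag B.ByteString]
linkTargets :: B.ByteString -> [B.ByteString]
linkTargets html =
  [ fromAttrib (B.pack "href") tag       -- empty string if the attribute is missing
  | tag <- parseTags html                -- fast parser from this package
  , isTagOpenName (B.pack "a") tag       -- tag predicates from the original tagsoup
  ]

main :: IO ()
main =
  mapM_ B.putStrLn
    (linkTargets (B.pack "<a href=\"https://example.com\">example</a>"))
```

The parser comes from this package, while the `Tag` type and the query functions still come from tagsoup, which is why both imports are needed.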

Besides being fast, fast-tagsoup correctly handles HTML <script> and <style> tags and converts tags to lower case. This fork purposefully removes support for parsing non-UTF-8 documents in order to avoid a dependency on text-icu. If you need to handle other encodings, use the original package: http://hackage.haskell.org/package/fast-tagsoup

This parser is used in production in the BazQux Reader feed and comment crawler.

Modules

Text
- HTML
  - TagSoup
    - Fast
      - Text.HTML.TagSoup.Fast.Utf8Only

Downloads

fast-tagsoup-utf8-only-1.0.5.tar.gz (Cabal source package)
Package description (as included in the package)

Maintainer's Corner

Package maintainers

MikhailKuddah

Versions	1.0.4, 1.0.5
Dependencies	base (>=4 && <5), bytestring, tagsoup, text
License	BSD-3-Clause
Copyright	Vladimir Shabanov 2011-2012
Author	Vladimir Shabanov <vshabanoff@gmail.com>
Maintainer	Vladimir Shabanov <vshabanoff@gmail.com>
Category	XML
Home page	https://github.com/exbb2/fast-tagsoup
Source repo	head: git clone https://github.com/exbb2/fast-tagsoup
Uploaded	by MikhailKuddah at 2013-12-11T20:23:49Z
Reverse Dependencies	1 direct, 0 indirect
Downloads	2260 total (3 in the last 30 days)
Status	Docs available; successful builds reported (1 report)