The html-parse package

[Tags:benchmark, bsd3, library]

This package provides a fast and reasonably robust HTML5 tokenizer built upon the attoparsec library. The parsing strategy is based upon the HTML5 parsing specification with few deviations.

The package targets similar use-cases to the venerable tagsoup library, but is significantly more efficient, achieving parsing speeds of over 50 megabytes per second on modern hardware with typical web documents.
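As a rough illustration of the tagsoup-style use-case, the sketch below tokenizes a small document and collects the link targets. It assumes that the Text.HTML.Parser module exports parseTokens together with the Token constructor TagOpen and the Attr constructor, as in the 0.2.0.0 documentation. The hrefs helper and the sample document are hypothetical, so check the names against the module index for the version you install.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.Text as T
import qualified Data.Text.IO as TIO
-- Assumed exports of html-parse 0.2.0.0; verify against the installed version.
import Text.HTML.Parser (Attr (..), Token (..), parseTokens)

-- Hypothetical helper: collect the href value of every <a> open tag.
-- Tokens and attributes that do not match the patterns are skipped.
hrefs :: [Token] -> [T.Text]
hrefs tokens =
  [ value
  | TagOpen "a" attrs <- tokens
  , Attr "href" value <- attrs
  ]

main :: IO ()
main = do
  -- Sample input made up for this sketch.
  let doc = "<p>See <a href=\"http://example.com/\">this page</a>.</p>"
  mapM_ TIO.putStrLn (hrefs (parseTokens doc))
```

Because the tokenizer yields a flat list of tokens rather than a tree, simple extraction tasks like this one reduce to ordinary list processing.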

Properties

Versions 0.1.0.0, 0.2.0.0
Dependencies attoparsec (==0.13.*), base (>=4.8 && <4.10), deepseq (==1.4.*), text (==1.2.*)
License BSD3
Copyright (c) 2016 Ben Gamari
Author Ben Gamari
Maintainer ben@smart-cactus.org
Category Text
Home page http://github.com/bgamari/html-parse
Source repository head: git clone git://github.com/bgamari/html-parse
Uploaded Wed Nov 23 17:02:09 UTC 2016 by BenGamari
Distributions NixOS:0.2.0.0
Downloads 79 total (4 in the last 30 days)
Votes 0
Status Docs uploaded by user
Build status unknown [no reports yet]
