The html-parse package

[Tags: benchmark, bsd3, library]

This package provides a fast and reasonably robust HTML5 tokenizer built upon the attoparsec library. The parsing strategy is based upon the HTML5 parsing specification with few deviations.

The package targets use cases similar to those of the venerable tagsoup library but is significantly more efficient, achieving parsing speeds of over 50 megabytes per second on modern hardware with typical web documents.
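
As a rough illustration of the tagsoup-style use case described above, the sketch below tokenizes a small document and collects the href attribute of every anchor tag. It assumes the package's Text.HTML.Parser module exports parseTokens together with the Token and Attr types (with TagOpen and Attr constructors); these names are recalled from the package documentation and should be checked against the installed version. The helper hrefs and the sample markup are purely illustrative.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.Text as T
import qualified Data.Text.IO as TIO
-- Assumed exports of html-parse; verify against the version you install.
import Text.HTML.Parser (Attr (..), Token (..), parseTokens)

-- | Hypothetical helper: collect the value of every href attribute
-- found on an opening <a> tag in a token stream.
hrefs :: [Token] -> [T.Text]
hrefs tokens =
  [ value
  | TagOpen "a" attrs <- tokens   -- keep only opening <a> tags
  , Attr "href" value <- attrs    -- keep only their href attributes
  ]

main :: IO ()
main = do
  -- Illustrative input; any strict Text value can be tokenized.
  let doc = "<p>See <a href=\"https://example.com\">this page</a>.</p>"
  mapM_ TIO.putStrLn (hrefs (parseTokens doc))
```

Because the tokenizer yields a flat list of tokens rather than a tree, simple extraction tasks like this one reduce to ordinary list processing.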


Dependencies attoparsec (==0.13.*), base (>=4.8 && <4.10), deepseq (==1.4.*), text (==1.2.*)
License BSD3
Copyright (c) 2016 Ben Gamari
Author Ben Gamari
Category Text
Home page
Source repository head: git clone git://
Uploaded Wed Nov 23 17:02:09 UTC 2016 by BenGamari
Distributions NixOS:
Downloads 109 total (18 in the last 30 days)
Status Docs uploaded by user
Build status unknown [no reports yet]
Hackage Matrix CI
